From andymac at bullseye.apana.org.au  Wed Sep  1 00:11:57 2004
From: andymac at bullseye.apana.org.au (Andrew MacIntyre)
Date: Wed Sep  1 03:14:52 2004
Subject: [Python-Dev]  Re: [Python-checkins] python/dist/src/Misc NEWS,
	1.1125, 1.1126
In-Reply-To: <20040831140352.GC15320@rogue.amk.ca>
References: <E1C292N-00084D-Q5@sc8-pr-cvs1.sourceforge.net>
	<20040831140352.GC15320@rogue.amk.ca>
Message-ID: <20040901080957.D88920@bullseye.apana.org.au>

On Tue, 31 Aug 2004, A.M. Kuchling wrote:

> On Tue, Aug 31, 2004 at 06:51:03AM -0700, akuchling@users.sourceforge.net wrote:
> > Add news item.
> > +- The mpz, rotor, and xreadlines modules, all deprecated in earlier
> > +  versions of Python, have now been removed.
> > +
>
> Well, *that* was messier than I was expecting...  Done now.
>
> I haven't touched the Makefiles for the PC and OS2 ports to remove
> these modules; if the maintainers want me to do that, please let me
> know.

I'll take care of the OS2 fixes (though I probably won't be able to do so
for a week or so).

Regards,
Andrew

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  (pref) | Snail: PO Box 370
        andymac@pcug.org.au             (alt) |        Belconnen  ACT  2616
Web:    http://www.andymac.org/               |        Australia
From gvanrossum at gmail.com  Wed Sep  1 06:59:25 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  1 06:59:30 2004
Subject: [Python-Dev] Rejecting the J2 decorators proposal
Message-ID: <ca471dc20408312159440c689b@mail.gmail.com>

Robert and Python-dev,

I've read the J2 proposal up and down several times, pondered all the
issues, and slept on it for a night, and I still don't like it enough
to accept it. The only reason to accept it would be to pacify the
supporters of the proposal, and that just isn't a good enough reason
in language design.

However, it got pretty darn close! I'm impressed with how the
community managed to pull together and face the enormous challenge of
picking a single alternative (from more than two dozen on the Wiki!)
and arguing consistently. I expect to see more proposals like this in
the future, and I'm sure that some of them will be good enough to make
it into the language.

I've also (again) learned a lesson: dramatic changes must be discussed
with the community at large. In a large enough group there are no
uncontroversial proposals, so this will take time, but it's worth it
-- one of the main issues with the @decorator syntax was not technical
but socio-political, in the sense that it hadn't been properly
discussed outside a *very* small circle. I take the full blame for
that, and I don't want to hide behind my current lack of time which,
realistically, won't change until either my ESI stock options earn me
an early retirement, or the PSF strikes it rich and can pay me full
time :-).

So let me explain why I'm not choosing J2, and what's next.

There are two major issues and one minor that made me decide against J2.

Major issue one: the syntactic form of an indented block strongly
suggests that its contents should be a sequence of statements, but in
fact it is not -- only expressions are allowed, and there is an
implicit "collecting" of these expressions going on until they can be
applied to the subsequent function definition. To me, this is a more
serious problem than the namespace questions brought up in the
proposal (unfortunately that particular section of the proposal is its
most confused part; but even if the text had been crystal clear, the
problem remains). The best counter-argument to this I've heard is
"you'll get used to it", which is also what I'm saying of @decorators;
and many people have already testified that they indeed got used to it
and even liked it.

Major issue two: the keyword starting the line that heads a block
draws a lot of attention to it. This is true for "if", "while", "for",
"try", "def" and "class". But the "using" keyword (or any other
keyword in its place) doesn't deserve that attention; the emphasis
should be on the decorator or decorators inside the suite, since those
are the important modifiers to the function definition that follows.
When a function definition carries one or more decorators, the most
important information is not the fact that it has decorators, but the
specific decorators used. A classmethod or staticmethod decorator adds
a completely different flavor than a decorator that provides an
external linkage hint for ObjC, or one that adds synchronization, or
one that declares deprecation. I expect that at least 80% of the use
of decorators will have a single decorator per function, and it's a
pain for that decorator to be hiding behind a content-free keyword.
(This is *not* a number-of-keystrokes argument. You know I don't care
much about that.)

Minor issue: "using" is a poor choice of keyword. It resembles C#'s
"using" and perhaps Perl's "use", both of which have completely
different meanings. But there don't seem to be any better alternatives
(the best I could come up with was "transmogrify" :-).

So, what's next? In Python 2.4a3 (to be released this Thursday),
everything remains as currently in CVS. For 2.4b1, I will consider a
change of @ to some other single character, even though I think that @
has the advantage of being the same character used by a similar
feature in Java. It's been argued that it's not quite the same, since
@ in Java is used for attributes that don't change semantics. But
Python's dynamic nature makes that its syntactic elements never mean
quite the same thing as similar constructs in other languages, and
there is definitely significant overlap. Regarding the impact on 3rd
party tools: IPython's author doesn't think there's going to be much
impact; Leo's author has said that Leo will survive (although it will
cause him and his users some transitional pain). I actually expect
that picking a character that's already used elsewhere in Python's
syntax might be harder for external tools to adapt to, since parsing
will have to be more subtle in that case. But I'm frankly undecided,
so there's some wiggle room here. I don't want to consider further
syntactic alternatives at this point: the buck has to stop at some
point, everyone has had their say, and the show must go on.

In the coming years I hope that as a community we'll gain enough
experience with decorators to decide whether we need to adopt a
different syntax for Python 3000 or not. One of the difficulties with
choosing a decorator syntax has definitely been that nobody can
predict how they are going to be used predominantly. Different
alternatives look better depending on whether there are many or few
decorators per function, whether they have long argument lists or not,
and perhaps also whether their use is for transformation or for
annotation. Despite the novelty of using the @ character, I personally
feel that prefix decorators are a huge improvement over the "f =
staticmethod(f)" style of decorating.

A warning: some people have shown examples of extreme uses of
decorators. I've seen decorators proposed for argument and return type
annotations, and even one that used a decorator to create an object
that did a regular expression substitution. Those uses are cute, but I
recommend being conservative when deciding between using a decorator
or some other approach, especially in code that will see a large
audience (like 3rd party library packages). Using decorators for type
annotations in particular looks tedious, and this particular
application is so important that I expect Python 3000 will have
optional type declarations integrated into the argument list.

Thanks to everyone who read until the end of this message!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From kbk at shore.net  Wed Sep  1 07:56:36 2004
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Sep  1 07:56:41 2004
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200409010556.i815uaF7026858@h006008a7bda6.ne.client2.attbi.com>

Patch / Bug Summary
___________________

Patches :  247 open (-12) /  2596 closed (+23) /  2843 total (+11)
Bugs    :  758 open (+13) /  4415 closed (+10) /  5173 total (+23)
RFE     :  148 open ( -2) /   131 closed ( +1) /   279 total ( -1)

New / Reopened Patches
______________________

compiler.transformer: correct lineno attribute when possible  (2004-08-25)
       http://python.org/sf/1015989  opened by  Thenault Sylvain

configure.in change to allow compilation on AIX 5  (2004-08-25)
CLOSED http://python.org/sf/1016224  opened by  Trent Mick

bsddb's DB.keys() method ignores transaction argument  (2004-08-27)
       http://python.org/sf/1017405  opened by  Jp Calderone

Fix for bug 1017546  (2004-08-27)
       http://python.org/sf/1017550  opened by  Michael

ifdeffery patch  (2004-08-28)
CLOSED http://python.org/sf/1018291  opened by  Ilya Sandler

fix for several sre escaping bugs (fixes #776311)  (2004-08-28)
       http://python.org/sf/1018386  opened by  Mike Coleman

fix bug 807871 : tkMessageBox.askyesno wrong result  (2004-08-29)
       http://python.org/sf/1018509  opened by  Jiba

Multi-line strings and unittest  (2004-08-30)
       http://python.org/sf/1019220  opened by  Felix Wiemann

Bad test for HAVE_UINTPTR_T in PC/pyconfig.h  (2004-08-31)
       http://python.org/sf/1020042  opened by  Scott David Daniels

Py_CLEAR to implicitly cast its argument to PyObject *  (2004-09-01)
       http://python.org/sf/1020185  opened by  Dima Dorfman

Use Py_CLEAR where necessary to avoid crashes  (2004-09-01)
       http://python.org/sf/1020188  opened by  Dima Dorfman

Patches Closed
______________

expose lowlevel setlocale  (2003-07-24)
       http://python.org/sf/776854  closed by  mhammond

Docs claim that coerce can return None  (2004-08-24)
       http://python.org/sf/1015021  closed by  loewis

bug in tarfile.ExFileObject.readline  (2004-08-24)
       http://python.org/sf/1014992  closed by  loewis

decode message attachments in email.Message  (2003-08-28)
       http://python.org/sf/796908  closed by  loewis

More urllib2 examples  (2003-08-31)
       http://python.org/sf/798244  closed by  loewis

interpreter final destination location  (2003-05-13)
       http://python.org/sf/736857  closed by  loewis

docs for interpreter final destination location   (2003-05-13)
       http://python.org/sf/736859  closed by  loewis

Use a better BuildRoot tag  (2004-06-10)
       http://python.org/sf/970019  closed by  loewis

Generate a working spec even with wrong version of software   (2004-06-10)
       http://python.org/sf/970015  closed by  loewis

platform-specific entropy  (2004-04-14)
       http://python.org/sf/934711  closed by  loewis

configure.in change to allow compilation on AIX 5  (2004-08-25)
       http://python.org/sf/1016224  closed by  tmick

Expose current parse location to XMLParser  (2004-08-24)
       http://python.org/sf/1014930  closed by  davecole

Improve markup and punctuation in libsocket.tex  (2004-08-24)
       http://python.org/sf/1015012  closed by  davecole

help on re-exported names (bug 925628)  (2004-04-13)
       http://python.org/sf/934356  closed by  jlgijsbers

socketmodule on OpenBSD/sparc64 (64bit machine)  (2004-08-01)
       http://python.org/sf/1001610  closed by  loewis

ifdeffery patch  (2004-08-28)
       http://python.org/sf/1018291  closed by  loewis

difflib side by side diff support, diff.py s/b/s HTML option  (2004-03-12)
       http://python.org/sf/914575  closed by  loewis

Fix for compilation with runtime_library_dirs  (2004-06-15)
       http://python.org/sf/973204  closed by  loewis

AUTH_TYPE and REMOTE_USER for CGIHTTPServer.py:run_cgi()  (2003-04-25)
       http://python.org/sf/727483  closed by  loewis

Backport of recent sre fixes.  (2003-04-19)
       http://python.org/sf/723940  closed by  loewis

Fixes for bug 940578 (glob.glob on broken symlinks)  (2004-04-24)
       http://python.org/sf/941486  closed by  jlgijsbers

fix for bugs 976878, 926369, 875404 (pdb bkpt handling)  (2004-08-05)
       http://python.org/sf/1003640  closed by  jlgijsbers

Multi-line imports implementation  (2004-08-11)
       http://python.org/sf/1007189  closed by  anthonybaxter

New / Reopened Bugs
___________________

__setitem__ for __dict__ ignored  (2004-08-25)
CLOSED http://python.org/sf/1015792  opened by  Viktor A Danilov

Don't define _SGAPI on IRIX  (2003-04-27)
       http://python.org/sf/728330  reopened by  loewis

os.system segmentation fault   (2004-08-25)
       http://python.org/sf/1015937  opened by  Tomasz Kowaltowski

"reversed" gives its name as "reverse" in docstring  (2004-08-25)
CLOSED http://python.org/sf/1016181  opened by  Hamish Lawson

urllib2 bug in proxy auth  (2004-08-26)
       http://python.org/sf/1016563  opened by  Christoph Mussenbrock

distutils support for swig is under par  (2004-08-26)
       http://python.org/sf/1016626  opened by  Sjoerd Mullender

urllib.urlretrieve silently truncates downloads  (2004-08-26)
       http://python.org/sf/1016880  opened by  David Abrahams

email.Message does not allow iteration  (2004-08-26)
       http://python.org/sf/1017329  opened by  Paul McGuire

including Python.h redefines _POSIX_C_SOURCE  (2004-08-27)
       http://python.org/sf/1017450  opened by  Jon Kåre Hellan

including Python.h redefines _POSIX_C_SOURCE  (2004-08-27)
CLOSED http://python.org/sf/1017455  opened by  Jon Kåre Hellan

test_inspect.py fails to clean up upon failure  (2004-08-27)
       http://python.org/sf/1017546  opened by  Michael

filemode() in tarfile.py makes wrong file mode strings  (2004-08-27)
       http://python.org/sf/1017553  opened by  Peter Loje Hansen

Case sensitivity bug in ConfigParser  (2004-08-27)
       http://python.org/sf/1017864  opened by  Dani

IDLE DOES NOT START ON WinXP Pro  (2004-08-27)
       http://python.org/sf/1017978  opened by  Snake

__new__ not defined?  (2004-08-28)
       http://python.org/sf/1018315  opened by  Skip Montanaro

Solaris: reentrancy issues  (2004-08-29)
       http://python.org/sf/1018492  opened by  Simon Harrison

inspect.getmodule symlink-related failur  (2002-06-18)
       http://python.org/sf/570300  reopened by  amitar

re.sub: two-digit group-reference hangs  (2004-08-29)
       http://python.org/sf/1018815  opened by  Michael Dyck

__metaclass__ in locals is ignored  (2004-08-30)
       http://python.org/sf/1019048  opened by  Jeff Epler

"rich comparison" methods hide stack overflow  (2004-08-30)
       http://python.org/sf/1019129  opened by  boyanb

distutils ignores configure's --includedir  (2004-08-31)
       http://python.org/sf/1019715  opened by  Joseph Winston

wrong socket error returned  (2004-08-31)
       http://python.org/sf/1019808  opened by  Federico Schwindt

hotshot start / stop stats bug  (2004-08-31)
       http://python.org/sf/1019882  opened by  Barry A. Warsaw

httplib.HTTPConnection sends extra blank line  (2004-08-31)
       http://python.org/sf/1019956  opened by  Antonio Rodriguez

Bugs Closed
___________

__setitem__ for __dict__ ignored  (2004-08-25)
       http://python.org/sf/1015792  closed by  nnorwitz

Building with --disable-toolbox-glue fails  (2004-07-15)
       http://python.org/sf/991962  closed by  bcannon

"reversed" gives its name as "reverse" in docstring  (2004-08-25)
       http://python.org/sf/1016181  closed by  rhettinger

including Python.h redefines _POSIX_C_SOURCE  (2004-08-27)
       http://python.org/sf/1017455  closed by  nnorwitz

glob.glob inconsistent about broken symlinks  (2004-04-23)
       http://python.org/sf/940578  closed by  jlgijsbers

PDB: unreliable breakpoints on functions  (2004-06-21)
       http://python.org/sf/976878  closed by  jlgijsbers

global stmt causes breakpoints to be ignored  (2004-01-12)
       http://python.org/sf/875404  closed by  jlgijsbers

pdb sometimes sets breakpoints in the wrong location  (2004-03-31)
       http://python.org/sf/926369  closed by  jlgijsbers

help does not help with imported objects  (2004-03-29)
       http://python.org/sf/925628  closed by  jlgijsbers

Misc/NEWS no valid reStructuredText  (2004-08-24)
       http://python.org/sf/1014770  closed by  jlgijsbers

Misc/NEWS.help  (2004-08-24)
       http://python.org/sf/1014775  closed by  jlgijsbers

"make pdf" failure w/ 2.4 docs  (2004-07-30)
       http://python.org/sf/1000841  closed by  jlgijsbers

RFE Closed
__________

array.array objects should support sequences  (2004-07-17)
       http://python.org/sf/992967  closed by  rhettinger

From anthony at interlink.com.au  Wed Sep  1 07:11:31 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed Sep  1 08:16:01 2004
Subject: [Python-Dev] Freezing for alpha3 - trunk FROZEN from 2004-09-01
	13:00 UTC
Message-ID: <41355A03.6070405@interlink.com.au>

I plan to start alpha3 in about 24 hours time. From about 12 hours from
now, the trunk should be considered frozen - that's starting at about
1700 UTC on 2004-09-01.

If your name isn't Anthony, Fred or Martin, please do NOT check in while
we're doing the release. Really. Really, really, really. I'm cc'ing
python-checkins as well this time, as a few folks missed this last time.
The trunk will stay frozen until about 6 hours or so after the release
is done - this makes it easier for me to do an emergency brown-paper-bag
release in the case of a cockup <wink>

Thanks,
Anthony
From pf_moore at yahoo.co.uk  Wed Sep  1 08:44:58 2004
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Wed Sep  1 08:44:53 2004
Subject: [Python-Dev] Re: Right curry considered harmful
References: <20040831164226.96D151E4008@bag.python.org>
	<4134DE7D.9020204@blueyonder.co.uk>
Message-ID: <uoekqlc7p.fsf@yahoo.co.uk>

Peter Harris <scav@blueyonder.co.uk> writes:

> I think we'll see if partial() is a useful enough feature to be
> worth optimising once it actually makes it into a build and gets
> used.

We've now had a couple of comments regarding efficiency (Raymond
Hettinger made this point as well). As a C implementation exists, and
I can also imagine that this is the sort of thing that could get used
in performance-sensitive areas, why not use the C implementation?

Paul.
-- 
Ooh, how Gothic. Barring the milk.

From pedronis at bluewin.ch  Wed Sep  1 14:00:41 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Wed Sep  1 13:58:25 2004
Subject: [Python-Dev] @ character choice and Jython (was: Rejecting the
	J2 decorators proposal)
In-Reply-To: <ca471dc20408312159440c689b@mail.gmail.com>
References: <ca471dc20408312159440c689b@mail.gmail.com>
Message-ID: <4135B9E9.9020906@bluewin.ch>

Guido van Rossum wrote:

> So, what's next? In Python 2.4a3 (to be released this Thursday),
> everything remains as currently in CVS. For 2.4b1, I will consider a
> change of @ to some other single character, even though I think that @
> has the advantage of being the same character used by a similar
> feature in Java. It's been argued that it's not quite the same, since
> @ in Java is used for attributes that don't change semantics. But
> Python's dynamic nature makes that its syntactic elements never mean
> quite the same thing as similar constructs in other languages, and
> there is definitely significant overlap.

One issue with the '@' character choice is that in the context of
Jython things can get rather confusing, and I mean beyond the fact
that needing an "annotation" to get a static method will seem
rather bizarre to Java people. It somewhat puts the burden on Jython
to try to do the obvious thing:

Consider this Java annotation definition (concretely, these get
compiled to interfaces):

public @interface Author {
     String value() default "";
}

Now this potential Jython code:

import Author

class A: # not inheriting from a Java class

   # this is also the exact legal Java syntax for this
   @Author("batman")
   def method(self):
      pass

In the past and at the moment interfaces are not callable, but here
we would like to produce a nice error or warning: we are not inheriting
from a Java class, so there is no way to attach the annotation.

But in this case:

import java
import Author

class A(java.lang.Runnable):

   @Author("batman")
   def run(self): # this one
      pass

here we potentially could attach the annotation to the exposed method.

My point is basically that '@' will likely generate more user questions 
(which are time consuming) and expectations than a different character 
choice in Jython context.

Samuele


From skip at pobox.com  Wed Sep  1 17:41:59 2004
From: skip at pobox.com (Skip Montanaro)
Date: Wed Sep  1 17:42:14 2004
Subject: [Python-Dev] 
	Re: [Python-checkins] python/nondist/peps pep-0318.txt, 1.30, 1.31
In-Reply-To: <E1C2Wd0-0008BX-2I@sc8-pr-cvs1.sourceforge.net>
References: <E1C2Wd0-0008BX-2I@sc8-pr-cvs1.sourceforge.net>
Message-ID: <16693.60871.999004.146879@montanaro.dyndns.org>


    anthony> (I'm not sure if the "Community Consensus" section should be
    anthony> trimmed down radically now - it's a lot of words for a rejected
    anthony> form, and the case for the form is still available on the web
    anthony> and in the mailing list archives... opinions, anyone?)

I'd just refer to the wiki and Robert Brewer's J2 proposal.

Skip
From fdrake at acm.org  Wed Sep  1 17:45:26 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed Sep  1 17:45:34 2004
Subject: [Python-Dev] 
	Re: [Python-checkins] python/nondist/peps pep-0318.txt, 1.30, 1.31
In-Reply-To: <E1C2Wd0-0008BX-2I@sc8-pr-cvs1.sourceforge.net>
References: <E1C2Wd0-0008BX-2I@sc8-pr-cvs1.sourceforge.net>
Message-ID: <200409011145.26772.fdrake@acm.org>

On Wednesday 01 September 2004 11:02 am, anthonybaxter@users.sourceforge.net 
wrote:
 > (I'm not sure if the "Community Consensus" section should be trimmed
 > down radically now - it's a lot of words for a rejected form, and the
 > case for the form is still available on the web and in the mailing
 > list archives... opinions, anyone?)

I'm for leaving the text in; wikis are fragile, and this is a valuable bit of 
Python history.  A reader can ignore it if that's not interesting to them.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From gvanrossum at gmail.com  Wed Sep  1 17:52:58 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  1 17:53:05 2004
Subject: [Python-Dev] 
	Re: [Python-checkins] python/nondist/peps pep-0318.txt, 1.30, 1.31
In-Reply-To: <200409011145.26772.fdrake@acm.org>
References: <E1C2Wd0-0008BX-2I@sc8-pr-cvs1.sourceforge.net>
	<200409011145.26772.fdrake@acm.org>
Message-ID: <ca471dc2040901085265643020@mail.gmail.com>

>  > (I'm not sure if the "Community Consensus" section should be trimmed
>  > down radically now - it's a lot of words for a rejected form, and the
>  > case for the form is still available on the web and in the mailing
>  > list archives... opinions, anyone?)
> 
> I'm for leaving the text in; wikis are fragile, and this is a valuable bit of
> Python history.  A reader can ignore it if that's not interesting to them.

+1

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gvanrossum at gmail.com  Wed Sep  1 17:55:52 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  1 17:55:56 2004
Subject: [Python-Dev] @ character choice and Jython (was: Rejecting
	the J2 decorators proposal)
In-Reply-To: <4135B9E9.9020906@bluewin.ch>
References: <ca471dc20408312159440c689b@mail.gmail.com>
	<4135B9E9.9020906@bluewin.ch>
Message-ID: <ca471dc204090108556b33ca22@mail.gmail.com>

> One issue with the '@' character choice is that in the context of
> Jython things can get rather confusing and I mean beyond the fact
> that the need of an "annotation" to get a static method will seem
> rather bizarre to Java people. It somewhat put the burden on Jython
> to try to do the obvious thing:

[snipped example showing that Jython can do the right thing, at least
for Java-derived classes, with Java annotation interfaces]

> My point is basically that '@' will likely generate more user
> questions (which are time consuming) and expectations than a
> different character choice in Jython context.

Have you gotten cynical?  This should be counted as an argument *for*
the @ character.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From anthony at interlink.com.au  Wed Sep  1 17:37:48 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed Sep  1 18:33:36 2004
Subject: [Python-Dev] (my) revisions to PEP318 finally done.
In-Reply-To: <4133382B.7070308@interlink.com.au>
References: <41332A02.1040902@interlink.com.au>
	<4133382B.7070308@interlink.com.au>
Message-ID: <4135ECCC.4080704@interlink.com.au>

I've now updated the PEP to the current state of play,
which is pretty much done. If there's no significant
feedback, I'll post this to c.l.py tomorrow.
-------------- next part --------------
PEP: 318
Title: Decorators for Functions and Methods
Version: $Revision: 1.31 $
Last-Modified: $Date: 2004/09/01 15:02:22 $
Author: Kevin D. Smith, Jim Jewett, Skip Montanaro, Anthony Baxter
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Jun-2003
Python-Version: 2.4
Post-History: 09-Jun-2003, 10-Jun-2003, 27-Feb-2004, 23-Mar-2004, 30-Aug-2004,
2-Sep-2004


WarningWarningWarning
=====================

This document is meant to describe the decorator syntax and the
process that resulted in the decisions that were made.  It does not 
attempt to cover the huge number of potential alternative syntaxes, 
nor is it an attempt to exhaustively list all the positives and 
negatives of each form.


Abstract
========

The current method for transforming functions and methods (for instance,
declaring them as a class or static method) is awkward and can lead to
code that is difficult to understand.  Ideally, these transformations
should be made at the same point in the code where the declaration
itself is made.  This PEP introduces new syntax for transformations of a
function or method declaration.


Motivation
==========

The current method of applying a transformation to a function or method
places the actual transformation after the function body.  For large
functions this separates a key component of the function's behavior from
the definition of the rest of the function's external interface.  For
example::

    def foo(self):
        pass  # perform method operation
    foo = classmethod(foo)

This becomes less readable with longer methods.  It also seems less
than pythonic to name the function three times for what is conceptually
a single declaration.  A solution to this problem is to move the
transformation of the method closer to the method's own declaration.
While the new syntax is not yet final, the intent is to replace::

    def foo(cls):
        pass
    foo = synchronized(lock)(foo)
    foo = classmethod(foo)

with an alternative that places the decoration in the function's
declaration::

    @classmethod
    @synchronized(lock)
    def foo(cls):
        pass
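
As a rough illustration of the equivalence described above, the sketch
below applies the same two transformations in both spellings.  The
``synchronized`` factory is not defined anywhere in this PEP; the
version here is a hypothetical stand-in that simply holds a lock around
the call, and the class names ``Old`` and ``New`` are invented for the
example.

```python
import threading


def synchronized(lock):
    """Hypothetical decorator factory: call the function while holding `lock`."""
    def decorate(func):
        def wrapper(*args, **kwargs):
            lock.acquire()
            try:
                return func(*args, **kwargs)
            finally:
                lock.release()
        return wrapper
    return decorate


lock = threading.Lock()


class Old(object):
    # Current spelling: the transformations follow the body, innermost first.
    def foo(cls):
        return cls.__name__
    foo = synchronized(lock)(foo)
    foo = classmethod(foo)


class New(object):
    # Decorator spelling: the decorator nearest the def is applied first.
    @classmethod
    @synchronized(lock)
    def foo(cls):
        return cls.__name__
```

Both ``Old.foo()`` and ``New.foo()`` return the name of their own class,
which suggests the two forms build the same wrapper chain: the decorator
closest to the ``def`` is applied first, and ``classmethod`` wraps the
result.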

Modifying classes in this fashion is also possible, though the benefits
are not as immediately apparent.  Almost certainly, anything which could
be done with class decorators could be done using metaclasses, but
using metaclasses is sufficiently obscure that there is some attraction
to having an easier way to make simple modifications to classes.  For
Python 2.4, only function/method decorators are being added.
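
To make the comparison concrete, here is a minimal sketch of one simple
class modification done two ways.  Both ``register`` and ``Registering``
are hypothetical names made up for this example.  The function is
applied by hand after the class body, since class decorator syntax is
not part of this proposal, and the metaclass is called directly to avoid
depending on any particular class-statement spelling.

```python
registry = []


def register(cls):
    """Hypothetical class transformer: record the class name, return the class."""
    registry.append(cls.__name__)
    return cls


class Registering(type):
    """Hypothetical metaclass that does the same job at class-creation time."""
    def __init__(cls, name, bases, namespace):
        super(Registering, cls).__init__(name, bases, namespace)
        registry.append(name)


# A plain function applied after the class body.
class A(object):
    pass
A = register(A)

# The metaclass route, invoked directly as a class factory.
B = Registering("B", (object,), {})
```

After this runs, ``registry`` holds both names.  The function version
reads as an ordinary call, while the metaclass version requires knowing
how class creation works, which is the obscurity the paragraph above
refers to.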


Why Is This So Hard?
--------------------

Two decorators (``classmethod()`` and ``staticmethod()``) have been
available in Python since version 2.2.  It's been assumed since
approximately that time that some syntactic support for them would
eventually be added to the language.  Given this assumption, one might
wonder why it's been so difficult to arrive at a consensus.  Discussions
have raged off-and-on at times in both comp.lang.python and the
python-dev mailing list about how best to implement function decorators.
There is no one clear reason why this should be so, but a few problems
seem to be most divisive.

* Disagreement about where the "declaration of intent" belongs.
  Almost everyone agrees that decorating/transforming a function at the
  end of its definition is suboptimal.  Beyond that there seems to be no
  clear consensus where to place this information.

* Syntactic constraints.  Python is a syntactically simple language
  with fairly strong constraints on what can and can't be done without
  "messing things up" (both visually and with regards to the language
  parser).  There's no obvious way to structure this information so
  that people new to the concept will think, "Oh yeah, I know what
  you're doing."  The best that seems possible is to keep new users from
  creating a wildly incorrect mental model of what the syntax means.

* Overall unfamiliarity with the concept.  For people who have a
  passing acquaintance with algebra (or even basic arithmetic) or have
  used at least one other programming language, much of Python is
  intuitive.  Very few people will have had any experience with the
  decorator concept before encountering it in Python.  There's just no
  strong preexisting meme that captures the concept.

* Syntax discussions in general appear to cause more contention than
  almost anything else. Readers are pointed to the ternary operator
  discussions that were associated with PEP 308 for another example of
  this.


Background
==========

There is general agreement that syntactic support is preferable to
the current state of affairs.  Guido mentioned `syntactic support
for decorators`_ in his DevDay keynote presentation at the `10th
Python Conference`_, though `he later said`_ it was only one of
several extensions he proposed there "semi-jokingly".  `Michael Hudson
raised the topic`_ on ``python-dev`` shortly after the conference,
attributing the initial bracketed syntax to an earlier proposal on
``comp.lang.python`` by `Gareth McCaughan`_.

.. _syntactic support for decorators:
   http://www.python.org/doc/essays/ppt/python10/py10keynote.pdf
.. _10th python conference:
   http://www.python.org/workshops/2002-02/
.. _michael hudson raised the topic:
   http://mail.python.org/pipermail/python-dev/2002-February/020005.html
.. _he later said:
   http://mail.python.org/pipermail/python-dev/2002-February/020017.html
.. _gareth mccaughan:
   http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=slrna40k88.2h9o.Gareth.McCaughan%40g.local

Class decorations seem like an obvious next step because class
definition and function definition are syntactically similar;
however, Guido remains unconvinced, and class decorators will almost
certainly not be in Python 2.4.

The discussion continued on and off on python-dev from February
2002 through July 2004.  Hundreds and hundreds of posts were made,
with people proposing many possible syntax variations.  Guido took
a list of proposals to `EuroPython 2004`_, where a discussion took
place.  Subsequent to this, he decided that we'd have the `Java-style`_ 
@decorator syntax, and this appeared for the first time in 2.4a2.  
Barry Warsaw named this the 'pie-decorator' syntax, in honor of the
Pie-thon Parrot shootout which occurred around the same time as
the decorator syntax, and because the @ looks a little like a pie.  
Guido `outlined his case`_ on Python-dev, including `this piece`_ 
on some of the (many) rejected forms.

.. _EuroPython 2004:
    http://www.python.org/doc/essays/ppt/euro2004/euro2004.pdf
.. _outlined his case:
    http://mail.python.org/pipermail/python-dev/2004-August/author.html
.. _this piece:
    http://mail.python.org/pipermail/python-dev/2004-August/046672.html
..  _Java-style:
    http://java.sun.com/j2se/1.5.0/docs/guide/language/annotations.html


On the name 'Decorator'
=======================

There have been a number of complaints about the choice of the name
'decorator' for this feature.  The major one is that the name is not
consistent with its use in the `GoF book`_.  The name 'decorator'
probably owes more to its use in the compiler area -- a syntax tree is
walked and annotated.  It's quite possible that a better name may turn
up.

.. _GoF book:
    http://patterndigest.com/patterns/Decorator.html


Design Goals
============

The new syntax should

* work for arbitrary wrappers, including user-defined callables and
  the existing builtins ``classmethod()`` and ``staticmethod()``.  This
  requirement also means that a decorator syntax must support passing
  arguments to the wrapper constructor

* work with multiple wrappers per definition

* make it obvious what is happening; at the very least it should be
  obvious that new users can safely ignore it when writing their own
  code

* be a syntax "that ... [is] easy to remember once explained"

* not make future extensions more difficult

* be easy to type; programs that use it are expected to use it very
  frequently

* not make it more difficult to scan through code quickly.  It should
  still be easy to search for all definitions, a particular definition,
  or the arguments that a function accepts

* not needlessly complicate secondary support tools such as
  language-sensitive editors and other "`toy parser tools out
  there`_"

* allow future compilers to optimize for decorators.  With the hope of
  a JIT compiler for Python coming into existence at some point this
  tends to require the syntax for decorators to come before the function
  definition

* move from the end of the function, where it's currently hidden, to
  the front where it is more `in your face`_

Andrew Kuchling has links to a bunch of the discussions about
motivations and use cases `in his blog`_.  Particularly notable is `Jim
Hugunin's list of use cases`_.

.. _toy parser tools out there:
   http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=mailman.1010809396.32158.python-list%40python.org
.. _in your face:
    http://mail.python.org/pipermail/python-dev/2004-August/047112.html
.. _in his blog:
    http://www.amk.ca/diary/archives/cat_python.html#003255
.. _Jim Hugunin's list of use cases:
    http://mail.python.org/pipermail/python-dev/2004-April/044132.html


Current Syntax
==============

The current syntax for function decorators as implemented in Python
2.4a2 is::

    @dec2
    @dec1
    def func(arg1, arg2, ...):
        pass

This is equivalent to::

    def func(arg1, arg2, ...):
        pass
    func = dec2(dec1(func))

without the intermediate assignment to the variable ``func``.  The
decorators are near the function declaration.  The @ sign makes it clear
that something new is going on here.
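
As a minimal sketch of the equivalence above, the following uses two
hypothetical decorators that record the moment they are applied.  The
names ``tag`` and ``applied`` are invented for this illustration.  The
decorator closest to the ``def`` is applied first.

```python
applied = []

def tag(label):
    """Return a decorator that records its label when it is applied."""
    def decorator(f):
        applied.append(label)
        return f
    return decorator

dec1 = tag("dec1")
dec2 = tag("dec2")

@dec2
@dec1
def func(arg1, arg2):
    pass

# The same effect spelled out by hand.
def func2(arg1, arg2):
    pass
func2 = dec2(dec1(func2))
```

In both spellings ``dec1`` runs before ``dec2``.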

The decorator statement is limited in what it can accept -- arbitrary
expressions will not work.  Guido preferred this because of a `gut
feeling`_.

.. _gut feeling:
    http://mail.python.org/pipermail/python-dev/2004-August/046711.html
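
A rough illustration of that restriction, assuming a grammar in which a
decorator is a dotted name optionally followed by a single call.  The
helper names (``registry``, ``Namespace``, ``pick``, ``chosen``) are
invented for this sketch.  An arbitrary expression can still be used by
binding it to a name first.

```python
registry = []

def register(f):
    """Record the name of each decorated function."""
    registry.append(f.__name__)
    return f

class Namespace(object):
    """Hypothetical holder so that a dotted name can be shown."""
    register = staticmethod(register)

def pick(flag):
    """Hypothetical decorator factory: returns a decorator."""
    if flag:
        return register
    return lambda f: f

@register               # plain name
def a():
    pass

@Namespace.register     # dotted name
def b():
    pass

@pick(True)             # dotted name followed by one call
def c():
    pass

# A subscript or other arbitrary expression after the @ falls outside
# the restricted form described above, so bind it to a name first.
chosen = [register][0]

@chosen
def d():
    pass
```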


Syntax Alternatives
===================

There have been `a large number`_ of different syntaxes proposed --
rather than attempting to work through these individual syntaxes, it's
worthwhile to break the syntax discussion down into a number of areas.
Attempting to discuss `each possible syntax`_ individually would be an
act of madness, and produce a completely unwieldy PEP.

.. _a large number:
    http://www.python.org/moin/PythonDecorators
.. _each possible syntax:
    http://ucsu.colorado.edu/~bethard/py/decorators-output.py


Decorator Location
------------------

The first syntax point is the location of the decorators.  For the
following examples, we use the @syntax used in 2.4a2.

Decorators before the def statement are the first alternative, and the
syntax used in 2.4a2::

    @classmethod
    def foo(arg1,arg2):
        pass

    @accepts(int,int)
    @returns(float)
    def bar(low,high):
        pass

There have been a number of objections raised to this location -- the
primary one is that it's the first real Python case where a line of code
has an effect on a following line.  The syntax available in 2.4a3
requires one decorator per line (in a2, multiple decorators could be
specified on the same line).

People also complained that the syntax got unwieldy quickly when
multiple decorators were used.  The point was made, though, that the
chances of a large number of decorators being used on a single function
were small and thus this was not a large worry.

Some of the advantages of this form are that the decorators live outside
the method body -- they are obviously executed at the time the function
is defined.
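
The definition-time behaviour can be sketched as follows.  The names
``events`` and ``announce`` are made up for this example; the point is
only that the decorator runs while the class body is being executed, not
when the method is later called.

```python
events = []

def announce(f):
    """Record that decoration happened, then return f unchanged."""
    events.append("decorating " + f.__name__)
    return f

class Example(object):
    events.append("class body starts")

    @announce
    def method(self):
        events.append("method called")

    events.append("class body ends")

# Snapshot taken before any call to method().
events_at_definition = list(events)

Example().method()
```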

Another advantage is that a prefix to the function definition fits
the idea of knowing about a change to the semantics of the code before
reading the code itself.  The reader knows how to interpret the code's
semantics from the start, without having to go back and revise initial
perceptions, as would be needed if the decorators came after the
function definition.

Guido decided `he preferred`_ having the decorators on the line before
the 'def', because it was felt that a long argument list would mean that
the decorators would be 'hidden'.

.. _he preferred:
    http://mail.python.org/pipermail/python-dev/2004-March/043756.html

The second form is the decorators between the def and the function name,
or the function name and the argument list::

    def @classmethod foo(arg1,arg2):
        pass

    def @accepts(int,int),@returns(float) bar(low,high):
        pass

    def foo @classmethod (arg1,arg2):
        pass

    def bar @accepts(int,int),@returns(float) (low,high):
        pass

There are a couple of objections to this form.  The first is that it
breaks easy 'greppability' of the source -- you can no longer search
for 'def foo(' and find the definition of the function.  The second,
more serious, objection is that in the case of multiple decorators, the
syntax would be extremely unwieldy.

The next form, which has had a number of strong proponents, is to have
the decorators between the argument list and the trailing ``:`` in the
'def' line::

    def foo(arg1,arg2) @classmethod:
        pass

    def bar(low,high) @accepts(int,int),@returns(float):
        pass

Guido `summarized the arguments`_ against this form (many of which also
apply to the previous form) as:

- it hides crucial information (e.g. that it is a static method)
  after the signature, where it is easily missed

- it's easy to miss the transition between a long argument list and a
  long decorator list

- it's cumbersome to cut and paste a decorator list for reuse, because
  it starts and ends in the middle of a line

.. _summarized the arguments:
    http://mail.python.org/pipermail/python-dev/2004-August/047112.html

The next form is that the decorator syntax go inside the method body at
the start, in the same place that docstrings currently live::

    def foo(arg1,arg2):
        @classmethod
        pass

    def bar(low,high):
        @accepts(int,int)
        @returns(float)
        pass

The primary objection to this form is that it requires "peeking inside"
the method body to determine the decorators.  In addition, even though
the code is inside the method body, it is not executed when the method
is run.  Guido felt that docstrings were not a good counter-example, and
that it was quite possible that a 'docstring' decorator could help move
the docstring to outside the function body.

The final form is a new block that encloses the method's code.  For this
example, we'll use a 'decorate' keyword, as it makes no sense with the
@syntax. ::

    decorate:
        classmethod
        def foo(arg1,arg2):
            pass

    decorate:
        accepts(int,int)
        returns(float)
        def bar(low,high):
            pass

This form would result in inconsistent indentation for decorated and
undecorated methods.  In addition, a decorated method's body would start
three indent levels in.


Syntax forms
------------

* ``@decorator``::

    @classmethod
    def foo(arg1,arg2):
        pass

    @accepts(int,int)
    @returns(float)
    def bar(low,high):
        pass

  The major objections against this syntax are that the @ symbol is
  not currently used in Python (and is used in both IPython and Leo),
  and that the @ symbol is not meaningful. Another objection is that
  this "wastes" a currently unused character (from a limited set) on
  something that is not perceived as a major use.

* ``|decorator``::

    |classmethod
    def foo(arg1,arg2):
        pass

    |accepts(int,int)
    |returns(float)
    def bar(low,high):
        pass

  This is a variant on the @decorator syntax -- it has the advantage
  that it does not break IPython and Leo.  Its major disadvantage
  compared to the @syntax is that the | symbol looks like both a capital
  I and a lowercase l.

* list syntax::

    [classmethod]
    def foo(arg1,arg2):
        pass

    [accepts(int,int), returns(float)] 
    def bar(low,high): 
        pass 

  The major objection to the list syntax is that it's currently
  meaningful (when used in the form before the method).  It's also
  lacking any indication that the expression is a decorator.

* list syntax using other brackets (``<...>``, ``[[...]]``, ...)::

    <classmethod>
    def foo(arg1,arg2):
        pass

    <accepts(int,int), returns(float)>
    def bar(low,high): 
        pass 

  None of these alternatives gained much traction. The alternatives
  which involve square brackets only serve to make it obvious that the
  decorator construct is not a list. They do nothing to make parsing any
  easier. The '<...>' alternative presents parsing problems because '<'
  and '>' already parse as un-paired. They present a further parsing
  ambiguity because a right angle bracket might be a greater than symbol
  instead of a closer for the decorators.

* ``decorate()``

  The ``decorate()`` proposal was that no new syntax be implemented
  -- instead a magic function that used introspection to manipulate
  the following function.  Both Jp Calderone and Philip Eby produced
  implementations of functions that did this.  Guido was pretty firmly
  against this -- with no new syntax, the magicness of a function like
  this is extremely high:

    Using functions with "action-at-a-distance" through sys.settraceback
    may be okay for an obscure feature that can't be had any other
    way yet doesn't merit changes to the language, but that's not
    the situation for decorators.  The widely held view here is that
    decorators need to be added as a syntactic feature to avoid the
    problems with the postfix notation used in 2.2 and 2.3.  Decorators
    are slated to be an important new language feature and their
    design needs to be forward-looking, not constrained by what can be
    implemented in 2.3.

* _`new keyword (and block)`

  This idea was the consensus alternate from comp.lang.python (more
  on this in `Community Consensus`_ below.)  Robert Brewer wrote up a
  detailed `J2 proposal`_ document outlining the arguments in favor of
  this form.  The initial issues with this form are:

  - It requires a new keyword, and therefore a ``from __future__
    import decorators`` statement.

  - The choice of keyword is contentious.  However ``using`` emerged
    as the consensus choice, and is used in the proposal and
    implementation.

  - The keyword/block form produces something that looks like a normal
    code block, but isn't.  Attempts to use statements in this block
    will cause a syntax error, which may confuse users.

  A few days later, Guido `rejected the proposal`_ on two main grounds,
  firstly:

    ... the syntactic form of an indented block strongly
    suggests that its contents should be a sequence of statements, but
    in fact it is not -- only expressions are allowed, and there is an
    implicit "collecting" of these expressions going on until they can
    be applied to the subsequent function definition. ...

  and secondly:

    ... the keyword starting the line that heads a block
    draws a lot of attention to it. This is true for "if", "while",
    "for", "try", "def" and "class". But the "using" keyword (or any
    other keyword in its place) doesn't deserve that attention; the
    emphasis should be on the decorator or decorators inside the suite,
    since those are the important modifiers to the function definition
    that follows. ...

  Readers are invited to read `the full response`_.

  .. _J2 proposal:
     http://www.aminus.org/rbre/python/pydec.html

  .. _rejected the proposal:
     http://mail.python.org/pipermail/python-dev/2004-September/048518.html

  .. _the full response:
     http://mail.python.org/pipermail/python-dev/2004-September/048518.html

* Other forms 

  There are plenty of other variants and proposals on `the wiki page`_.

.. _the wiki page:
    http://www.python.org/moin/PythonDecorators


Why @?
------

There is some history in Java using @ initially as a marker in `Javadoc
comments`_ and later in Java 1.5 for `annotations`_, which are similar
to Python decorators.  The fact that @ was previously unused as a token
in Python also means it's clear there is no possibility of such code
being parsed by an earlier version of Python, leading to possibly subtle
semantic bugs.  It also means that ambiguity of what is a decorator
and what isn't is removed.  That said, @ is still a fairly arbitrary
choice.  Some have suggested using | instead.

For syntax options which use a list-like syntax (no matter where it
appears) to specify the decorators a few alternatives were proposed:
``[|...|]``, ``*[...]*``, and ``<...>``.

.. _Javadoc comments:
    http://java.sun.com/j2se/javadoc/writingdoccomments/
.. _annotations:
    http://java.sun.com/j2se/1.5.0/docs/guide/language/annotations.html


Current Implementation, History
===============================

Guido asked for a volunteer to implement his preferred syntax, and Mark
Russell stepped up and posted a `patch`_ to SF.  This new syntax was 
available in 2.4a2. ::

    @dec2
    @dec1
    def func(arg1, arg2, ...):
        pass

This is equivalent to::

    def func(arg1, arg2, ...):
        pass
    func = dec2(dec1(func))

though without the intermediate creation of a variable named ``func``.

The version implemented in 2.4a2 allowed multiple ``@decorator`` clauses
on a single line. In 2.4a3, this was tightened up to only allowing one
decorator per line.

A `previous patch`_ from Michael Hudson which implements the
list-after-def syntax is also still kicking around.

.. _patch: http://www.python.org/sf/979728
.. _previous patch: http://starship.python.net/crew/mwh/hacks/meth-syntax-sugar-3.diff

After 2.4a2 was released, in response to community reaction, Guido
stated that he'd re-examine a community proposal, if the community
could come up with a community consensus, a decent proposal, and an
implementation.  After an amazing number of posts, collecting a vast
number of alternatives in the `Python wiki`_, a community consensus
emerged (below).  Guido `subsequently rejected`_ this alternate form,
but added:

    In Python 2.4a3 (to be released this Thursday), everything remains
    as currently in CVS.  For 2.4b1, I will consider a change of @ to
    some other single character, even though I think that @ has the
    advantage of being the same character used by a similar feature
    in Java.  It's been argued that it's not quite the same, since @
    in Java is used for attributes that don't change semantics.  But
    Python's dynamic nature makes that its syntactic elements never mean
    quite the same thing as similar constructs in other languages, and
    there is definitely significant overlap.  Regarding the impact on
    3rd party tools: IPython's author doesn't think there's going to be
    much impact; Leo's author has said that Leo will survive (although
    it will cause him and his users some transitional pain).  I actually
    expect that picking a character that's already used elsewhere in
    Python's syntax might be harder for external tools to adapt to,
    since parsing will have to be more subtle in that case.  But I'm
    frankly undecided, so there's some wiggle room here.  I don't want
    to consider further syntactic alternatives at this point: the buck
    has to stop at some point, everyone has had their say, and the show
    must go on.

.. _Python wiki:
    http://www.python.org/moin/PythonDecorators
.. _subsequently rejected:
     http://mail.python.org/pipermail/python-dev/2004-September/048518.html


Community Consensus
-------------------

[editor's note: should this section be removed now?]

The consensus that emerged on comp.lang.python was the proposed J2
syntax (the "J2" was how it was referenced on the PythonDecorators wiki
page): the new keyword ``using`` prefixing a block of decorators before
the ``def`` statement.  For example::

    using:
        classmethod
        synchronized(lock)
    def func(cls):
        pass

The main arguments for this syntax fall under the "readability counts"
doctrine.  In brief, they are:

* A suite is better than multiple @lines.  The ``using`` keyword and
  block transforms the single-block ``def`` statement into a
  multiple-block compound construct, akin to try/finally and others.

* A keyword is better than punctuation for a new token.  A keyword
  matches the existing use of tokens.  No new token category is
  necessary.  A keyword distinguishes Python decorators from Java
  annotations and .Net attributes, which are significantly different
  beasts.

Robert Brewer wrote a `detailed proposal`_ for this form, and Michael
Sparks produced `a patch`_.

.. _detailed proposal:
    http://www.aminus.org/rbre/python/pydec.html
.. _a patch: 
    http://www.python.org/sf/1013835

As noted previously, Guido rejected this form, outlining his problems
with it in `a message`_ to python-dev and comp.lang.python.

.. _a message:
     http://mail.python.org/pipermail/python-dev/2004-September/048518.html


Examples
========

Much of the discussion on ``comp.lang.python`` and the ``python-dev``
mailing list focuses on the use of decorators as a cleaner way to use
the ``staticmethod()`` and ``classmethod()`` builtins.  This capability
is much more powerful than that.  This section presents some examples of
use.

1. Define a function to be executed at exit.  Note that the function
   isn't actually "wrapped" in the usual sense. ::

       def onexit(f):
           import atexit
           atexit.register(f)
           return f

       @onexit
       def func():
           ...

   Note that this example is probably not suitable for real usage, but
   is for example purposes only.

2. Define a class with a singleton instance.  Note that once the class
   disappears enterprising programmers would have to be more creative to
   create more instances.  (From Shane Hathaway on ``python-dev``.) ::

       def singleton(cls):
           instances = {}
           def getinstance():
               if cls not in instances:
                   instances[cls] = cls()
               return instances[cls]
           return getinstance

       @singleton
       class MyClass:
           ...

3. Add attributes to a function.  (Based on an example posted by
   Anders Munch on ``python-dev``.) ::

       def attrs(**kwds):
           def decorate(f):
               for k in kwds:
                   setattr(f, k, kwds[k])
               return f
           return decorate

       @attrs(versionadded="2.2",
              author="Guido van Rossum")
       def mymethod(f):
           ...

4. Enforce function argument and return types.  Note that this 
   copies the func_name attribute from the old to the new function.
   func_name was made writable in Python 2.4a3::

       def accepts(*types):
           def check_accepts(f):
               assert len(types) == f.func_code.co_argcount
               def new_f(*args, **kwds):
                   for (a, t) in zip(args, types):
                       assert isinstance(a, t), \
                              "arg %r does not match %s" % (a,t)
                   return f(*args, **kwds)
               new_f.func_name = f.func_name
               return new_f
           return check_accepts

       def returns(rtype):
           def check_returns(f):
               def new_f(*args, **kwds):
                   result = f(*args, **kwds)
                   assert isinstance(result, rtype), \
                          "return value %r does not match %s" % (result,rtype)
                   return result
               new_f.func_name = f.func_name
               return new_f
           return check_returns

       @accepts(int, (int,float))
       @returns((int,float))
       def func(arg1, arg2):
           return arg1 * arg2

5. Declare that a class implements a particular (set of) interface(s).
   This is from a posting by Bob Ippolito on ``python-dev`` based on
   experience with `PyProtocols`_. ::

       def provides(*interfaces):
           """
           An actual, working, implementation of provides for
           the current implementation of PyProtocols.  Not
           particularly important for the PEP text.
           """
           def provides(typ):
               declareImplementation(typ, instancesProvide=interfaces)
               return typ
           return provides

       class IBar(Interface):
           """Declare something about IBar here"""

       @provides(IBar)
       class Foo(object):
           """Implement something here..."""

   .. _PyProtocols: http://peak.telecommunity.com/PyProtocols.html
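
Example 4 copies ``func_name`` onto the wrapper by hand.  As a hedged
aside that is not part of the PEP text: later Python releases provide
``functools.wraps`` in the standard library, which copies the name,
docstring and some other attributes in one step.  A sketch of the
``returns`` decorator written that way (``func`` and its docstring are
illustrative):

```python
import functools

def returns(rtype):
    def check_returns(f):
        @functools.wraps(f)   # copies __name__, __doc__ and more onto new_f
        def new_f(*args, **kwds):
            result = f(*args, **kwds)
            assert isinstance(result, rtype), \
                "return value %r does not match %s" % (result, rtype)
            return result
        return new_f
    return check_returns

@returns((int, float))
def func(arg1, arg2):
    """Multiply two numbers."""
    return arg1 * arg2
```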

Of course, all these examples are possible today, though without
syntactic support.
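
For comparison, a sketch of what "without syntactic support" looks like
in practice, reusing the ``attrs`` decorator from example 3.  The class
``C`` and its ``make`` method are hypothetical.  The wrapper has to be
applied after the function body, which is the postfix notation the new
syntax is meant to replace.

```python
def attrs(**kwds):
    def decorate(f):
        for k in kwds:
            setattr(f, k, kwds[k])
        return f
    return decorate

def mymethod(f):
    pass
# Applied after the body, where it is easy to overlook.
mymethod = attrs(versionadded="2.2",
                 author="Guido van Rossum")(mymethod)

class C(object):
    def make(cls):
        return cls()
    # The function name has to be repeated three times.
    make = classmethod(make)
```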


Open Issues
===========

1. It's not yet certain that class decorators will be incorporated
   into the language at a future point.  Guido expressed skepticism about
   the concept, but various people have made some `strong arguments`_
   (search for ``PEP 318 -- posting draft``) on their behalf in
   ``python-dev``.  It's exceedingly unlikely that class decorators 
   will be in Python 2.4.

   .. _strong arguments:
      http://mail.python.org/pipermail/python-dev/2004-March/thread.html

2. The choice of the ``@`` character will be re-examined before 
   Python 2.4b1.

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
From FBatista at uniFON.com.ar  Wed Sep  1 19:11:41 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Wed Sep  1 19:16:07 2004
Subject: [Python-Dev] Decimal for Py2.3 (was: about presicion)
Message-ID: <A128D751272CD411BC9200508BC2194D053C78C8@escpl.tcp.com.ar>

[Alex Martelli]

#- I think that's an excellent policy -- Python 2.3 will no doubt remain
#- widely used for a long time to come.  I think it would be nice if
#- Decimal was packaged up with its own docs for easy download and install
#- into an existing 2.3 installation, then... make life as easy as possible
#- for 2.3 users who need to do some decimal arithmetic!

Don't know what's common usage, so I ask.

Should I prepare a "decimal package" with the module and docs and whatever,
in form of a tgz, rpm, .exe, etc, to let people "install" decimal in their
Py2.3?

Or is it better to tell the user somewhere that, if he/she wants to use
Decimal in Py2.3, he/she should follow these simple n steps (with the detail
of the steps, of course ;)?

Considering that it's only a file, and the docs could be accessed through
Py2.4 documentation, I'll go for the latter.

.	Facundo
From vishalvkapoor at gmail.com  Wed Sep  1 20:51:05 2004
From: vishalvkapoor at gmail.com (Vishal Kapoor)
Date: Wed Sep  1 20:51:09 2004
Subject: [Python-Dev] Installing python-dev
Message-ID: <189a54cc0409011151216ab01@mail.gmail.com>

Hi, 
I am trying to install Zope and it requires python2.3-dev. 
I downloaded Python and installed it, how do i install python-dev ??

Thank you

Vishal Kapoor
From allison at sumeru.stanford.EDU  Wed Sep  1 20:59:20 2004
From: allison at sumeru.stanford.EDU (Dennis Allison)
Date: Wed Sep  1 20:59:27 2004
Subject: [Python-Dev] Installing python-dev
In-Reply-To: <189a54cc0409011151216ab01@mail.gmail.com>
Message-ID: <Pine.LNX.4.10.10409011156370.8342-100000@sumeru.stanford.EDU>

If you are on an rpm based system (RH, etc) you need to download the
development RPM as well as the binary as that's where all the includes are
that are needed for C extensions.  But, the better installation is to
download the source and compile locally using the usual sequence

	./configure
	make
	su
	make install

(but read the documents first).  In that case the necessary files will be
automatically installed.

On Wed, 1 Sep 2004, Vishal Kapoor wrote:

> Hi, 
> I am trying to install Zope and it requires python2.3-dev. 
> I downloaded Python and installed it, how do i install python-dev ??
> 
> Thank you
> 
> Vishal Kapoor
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40sumeru.stanford.edu
> 

From aahz at pythoncraft.com  Wed Sep  1 21:49:57 2004
From: aahz at pythoncraft.com (Aahz)
Date: Wed Sep  1 21:50:02 2004
Subject: [Python-Dev] Installing python-dev
In-Reply-To: <189a54cc0409011151216ab01@mail.gmail.com>
References: <189a54cc0409011151216ab01@mail.gmail.com>
Message-ID: <20040901194957.GB25565@panix.com>

On Wed, Sep 01, 2004, Vishal Kapoor wrote:
>
> I am trying to install Zope and it requires python2.3-dev. 
> I downloaded Python and installed it, how do i install python-dev ??

Sorry, this post is off-topic for python-dev.  Please use
comp.lang.python or a Zope list for help.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"To me vi is Zen.  To use vi is to practice zen.  Every command is a
koan.  Profound to the user, unintelligible to the uninitiated.  You
discover truth everytime you use it."  --reddy@lion.austin.ibm.com
From martin at v.loewis.de  Wed Sep  1 22:32:57 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  1 22:33:03 2004
Subject: [Python-Dev] Installing python-dev
In-Reply-To: <20040901194957.GB25565@panix.com>
References: <189a54cc0409011151216ab01@mail.gmail.com>
	<20040901194957.GB25565@panix.com>
Message-ID: <413631F9.4070507@v.loewis.de>

Aahz wrote:
> On Wed, Sep 01, 2004, Vishal Kapoor wrote:
> 
>>I am trying to install Zope and it requires python2.3-dev. 
>>I downloaded Python and installed it, how do i install python-dev ??
> 
> 
> Sorry, this post is off-topic for python-dev.  

Although it is far from obvious that a list called python-dev is
not about python-dev :-)

Martin

From gvanrossum at gmail.com  Thu Sep  2 02:42:56 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 02:42:59 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
Message-ID: <ca471dc204090117421a813d4e@mail.gmail.com>

On Wed, 01 Sep 2004 15:31:26 -0700, mhammond@users.sourceforge.net
<mhammond@users.sourceforge.net> wrote:
> Update of /cvsroot/python/python/dist/src/Modules
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv22421/Modules
> 
> Modified Files:
>       Tag: release23-maint
>         threadmodule.c
> Log Message:
> Backport [ 1010677 ] thread Module Breaks PyGILState_Ensure()
> to the 2.3 maint branch.

As long as we're backporting C APIs to 2.3, can I request that the new
datetime API be backported to 2.3? Anthony Tuininga (the cx_Oracle
author) would be interested in using this and might be willing to help
out with the work. (And yes, I'm encouraging this because I could use
this myself.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From tim.peters at gmail.com  Thu Sep  2 03:43:34 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  2 03:43:38 2004
Subject: [Python-Dev] Coernic Desktop Search versus shutil.rmtree
Message-ID: <1f7befae040901184316a8ebf6@mail.gmail.com>

It took a while to make this connection!  Last night I downloaded this
new free (beer) app for Windows:

    http://www.copernic.com/en/products/desktop-search/index.html

Just a note to say that it's fantastic.  It builds index files for all
the documents on your drive, including PDFs, HTMLs, and Outlook email
stores.  Then you can do seriously fast (sub-second) Boolean searches.
It's indexed about 10GB of data on my drive with about a million
keywords, with 0 errors and 0 glitches, and the search quality is very
good.

Anyway, today I saw a really weird failure in the Zope X3 test suite,
shutil.rmtree() complaining that it couldn't remove a directory. 
Studying the test didn't turn up any plausible cause for this. 
Tonight I was running the Python CVS test suite, and it failed once in
the same mysterious way.  Then it failed again that way, but in
another test.  I eventually reduced it to this:

"""
import os
import shutil

LOCALEDIR = os.path.join('xx', 'LC_MESSAGES')
MOFILE  = os.path.join(LOCALEDIR, 'gettext.mo')
UMOFILE = os.path.join(LOCALEDIR, 'ugettext.mo')
MMOFILE = os.path.join(LOCALEDIR, 'metadata.mo')


class Drive:
    def setUp(self):
        if os.path.isdir(LOCALEDIR):
            shutil.rmtree(os.path.split(LOCALEDIR)[0])
        os.makedirs(LOCALEDIR)
        fp = open(MOFILE, 'wb');  fp.write('a'); fp.close()
        fp = open(UMOFILE, 'wb'); fp.write('b'); fp.close()
        fp = open(MMOFILE, 'wb'); fp.write('c'); fp.close()
        shutil.rmtree(os.path.split(LOCALEDIR)[0])

d = Drive()
while True:
    d.setUp()
    print '.',
"""

That failed every time, after printing from 0 to 100 dots, while
trying to rmdir xx/LC_MESSAGES.

The cause:  Windows has low-level hooks for apps that want to monitor
changes to the filesystem.  For example, virus scanners use those
heavily.  Copernic also uses them, to reindex changed files in the
background.  So it can keep a file open beyond the time Python thinks
it deleted it, and then trying to rmdir its parent directory fails
(because the directory isn't really empty yet).

Stopping the Desktop Search process makes these problems go away.  It
also appears to cure a range of incomprehensible complaints from large
CVS updates that started showing up last night <wink>.  Ah, there's
an option to keep the search app running but to turn off the
filesystem hooking -- that cures it too.
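
For code that cannot rely on the monitoring app being switched off, one
possible workaround (a sketch only, not something proposed in this
message) is to retry the removal a few times so the other process has a
chance to close its handle.  The helper name ``rmtree_with_retry`` and
its ``attempts`` and ``delay`` parameters are hypothetical.

```python
import os
import shutil
import tempfile
import time

def rmtree_with_retry(path, attempts=5, delay=0.1):
    """Retry shutil.rmtree a few times; return the number of tries used."""
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return attempt + 1
        except OSError:
            if not os.path.exists(path):
                # The tree vanished between tries; treat that as success.
                return attempt + 1
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

# Build and remove a small tree shaped like the one in the test above.
root = tempfile.mkdtemp()
localedir = os.path.join(root, 'xx', 'LC_MESSAGES')
os.makedirs(localedir)
with open(os.path.join(localedir, 'gettext.mo'), 'wb') as fp:
    fp.write(b'a')
tries = rmtree_with_retry(os.path.join(root, 'xx'))
os.rmdir(root)
```

A retry loop only papers over the race; it does not remove it.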

Anyway, this is worth sharing because this has got to be the next PC
Killer App genre:  finding info on a 120GB disk has become impossible,
and I switched most of my email to a gmail account because I can't
even find "important" email from last week using Outlook anymore.  If
you don't run an app like Copernic yet, you will soon <0.5 wink>.
From anthony at interlink.com.au  Thu Sep  2 07:19:10 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu Sep  2 07:19:35 2004
Subject: [Python-Dev] Re: [Python-checkins]
	python/dist/src/Modules	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <ca471dc204090117421a813d4e@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
Message-ID: <4136AD4E.30503@interlink.com.au>

Guido van Rossum wrote:
> As long as we're backporting C APIs to 2.3, can I request that the new
> datetime API be backported to 2.3? Anthony Tuininga (the cx_Oracle
> author) would be interested in using this and might be willing to help
> out with the work. (And yes, I'm encouraging this because I could use
> this myself.)

Erm - this particular fix was a bug fix. I'm deeply uncomfortable about
adding the C version of datetime to 2.3 at this very late stage of 2.3's
life cycle.


-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From aleaxit at yahoo.com  Thu Sep  2 09:32:51 2004
From: aleaxit at yahoo.com (Alex Martelli)
Date: Thu Sep  2 09:32:50 2004
Subject: [Python-Dev] Re: Decimal for Py2.3 (was: about presicion)
In-Reply-To: <A128D751272CD411BC9200508BC2194D053C78C8@escpl.tcp.com.ar>
References: <A128D751272CD411BC9200508BC2194D053C78C8@escpl.tcp.com.ar>
Message-ID: <4BB8FEFE-FCB2-11D8-A6A1-000A95EFAE9E@yahoo.com>


On 2004 Sep 01, at 19:11, Batista, Facundo wrote:

> [Alex Martelli]
>
> #- I think that's an excellent policy -- Python 2.3 will no doubt remain
> #- widely used for a long time to come.  I think it would be nice if
> #- Decimal was packaged up with its own docs for easy download and install
> #- into an existing 2.3 installation, then... make life as easy as possible
> #- for 2.3 users who need to do some decimal arithmetic!
>
> Don't know what's common usage, so I ask.
>
> Should I prepare a "decimal package" with the module and docs and whatever,
> in the form of a tgz, rpm, .exe, etc, to let people "install" decimal in
> their Py2.3?
>
> Or is it better to tell the user somewhere that if he/she wants to use
> Decimal in Py2.3, he/she should follow these simple n steps (and the detail
> of the steps, of course ;)?
>
> Considering that it's only a file, and the docs could be accessed through
> the Py2.4 documentation, I'll go for the latter.

Based on little direct experience, I think that the former course would 
more than double the usage of Decimal among people still sticking with 
Python 2.3 -- having to get files out of a different release, including 
a piece of the docs, feels way scarier and more hassle to most people 
than just downloading the appropriate package and double clicking (or 
unpacking and running python setup.py install, of course).  I can't 
blame them, most particularly when we're talking about people whose 
familiarity with the stuff that installers do to their computers is 
hazy and imprecise -- and I see no reason why such people shouldn't be 
eager Decimal users as well as people with more system administration 
nous.


Alex

From revol at free.fr  Thu Sep  2 12:24:07 2004
From: revol at free.fr (=?windows-1252?q?Fran=E7ois?= Revol)
Date: Thu Sep  2 12:31:02 2004
Subject: [Python-Dev] problem with pymalloc on the BeOS port.
In-Reply-To: <1f7befae04082420104f9158df@mail.gmail.com>
Message-ID: <1540805831-BeMail@taz>

> [François Revol]
> > Now, I don't see why malloc itself would give such a result, it's
> > pyMalloc which places those marks, so the thing malloc does wouldn't
> > place them within 4 bytes of each other for no reason, or report 0
> > bytes where 4 are allocated.
> 
> I think you're fooling yourself if you believe 4 *were* allocated.
> The memory dump shows nothing but gibberish, with 4 blocks of fbfbfbfb
> not a one of which makes sense in context (the numbers before and
> after them make no sense as "# of bytes allocated" or as "serial
> number" values, so these forbidden-byte blocks don't make sense as
> either end of an active pymalloc block).
> 
> You should at least try to get a C traceback at this point, on the
> chance that the routine passing the pointer is a clue.  We don't even
> know here yet whether the complaint came from a free() or realloc()
> call.

I finally found out what was making python throw up when using
pymalloc (and possibly why I'm getting MemoryErrors without it).
It's caused by the BeOS exec(), which copies the path to argv[0]
without telling anyone.
I had noticed it was overwriting argv[0] in the execed process, but
didn't think it was doing that before actually doing the syscall.
So this results in a double-free if exec fails.

posix_fork()
posix_fork() 0
posix_fork() 637
posix_execv1
posix_execv2: path @ 0x80010fb8 ='./gcc'
posix_execv3
posix_execv4
posix_execv5: argvlist @ 0x8014f8e0
posix_execv5: argv[0] @ 0x80010fa0 = gcc
posix_execv5: argv[1] @ 0x80010e20 = -O0
posix_execv5: argv[2] @ 0x801504c0 = -g
posix_execv5: argv[3] @ 0x800193c0 = -fno-strict-aliasing
posix_execv5: argv[4] @ 0x80150400 = -I.
posix_execv5: argv[5] @ 0x80160f08 = -I/boot/home/Python-2.3.4/./Include
posix_execv5: argv[6] @ 0x8015c028 = -I/boot/home/config/include
posix_execv5: argv[7] @ 0x80160f40 = -I/boot/home/Python-2.3.4/Include
posix_execv5: argv[8] @ 0x8015c0e8 = -I/boot/home/Python-2.3.4
posix_execv5: argv[9] @ 0x80150460 = -c
posix_execv5: argv[10] @ 0x8015e3a8 = /boot/home/Python-2.3.4/Modules/structmodule.c
posix_execv5: argv[11] @ 0x801503b8 = -o
posix_execv5: argv[12] @ 0x8015e068 = build/temp.beos-5.1-BePC-2.3/structmodule.o
posix_execv6
execv: No such file or directory
posix_execv7: path @ 0x80010fb8 ='./gcc'
posix_execv7: argvlist @ 0x8014f8e0
posix_execv7: argv[0] @ 0x80010fb8 = ./gcc <<<<<< that's the problem !
posix_execv7: argv[1] @ 0x80010e20 = -O0
posix_execv7: argv[2] @ 0x801504c0 = -g
posix_execv7: argv[3] @ 0x800193c0 = -fno-strict-aliasing
posix_execv7: argv[4] @ 0x80150400 = -I.
posix_execv7: argv[5] @ 0x80160f08 = -I/boot/home/Python-2.3.4/./Include
posix_execv7: argv[6] @ 0x8015c028 = -I/boot/home/config/include
posix_execv7: argv[7] @ 0x80160f40 = -I/boot/home/Python-2.3.4/Include
posix_execv7: argv[8] @ 0x8015c0e8 = -I/boot/home/Python-2.3.4
posix_execv7: argv[9] @ 0x80150460 = -c
posix_execv7: argv[10] @ 0x8015e3a8 = /boot/home/Python-2.3.4/Modules/structmodule.c
posix_execv7: argv[11] @ 0x801503b8 = -o
posix_execv7: argv[12] @ 0x8015e068 = build/temp.beos-5.1-BePC-2.3/structmodule.o
Debug memory block at address p=0x80010fb8:
    0 bytes originally requested
    The 4 pad bytes at p-4 are FORBIDDENBYTE, as expected.
    The 4 pad bytes at tail=0x80010fb8 are not all FORBIDDENBYTE (0xfb):
        at tail+0: 0xdb *** OUCH
        at tail+1: 0xdb *** OUCH
        at tail+2: 0xdb *** OUCH
        at tail+3: 0xdb *** OUCH
    The block was made by call #3688627195 to debug malloc/realloc.
Fatal Python error: bad trailing pad byte
error: Bad thread ID
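
To spell out the aliasing in miniature, here is a toy model in Python
rather than C.  ToyHeap and failed_exec_cleanup are names made up for
this sketch, and the order of the frees (argv entries first, then the
path) is an assumption, not something read off the posixmodule.c
source.  It only shows why replacing argv[0] with the path pointer
turns an ordinary cleanup into a double free.

```python
class ToyHeap:
    """Tiny stand-in for an allocator that detects double frees."""

    def __init__(self):
        self.live = set()
        self.next_addr = 0x1000

    def malloc(self):
        addr = self.next_addr
        self.next_addr += 0x10
        self.live.add(addr)
        return addr

    def free(self, addr):
        if addr not in self.live:
            raise RuntimeError("double free of 0x%x" % addr)
        self.live.remove(addr)


def failed_exec_cleanup(heap, exec_overwrites_argv0):
    """Model the cleanup after a failed exec; return leaked addresses."""
    path = heap.malloc()
    argv = [heap.malloc() for _ in range(3)]
    if exec_overwrites_argv0:
        # Assumed behaviour: exec() stores the path pointer in argv[0].
        argv[0] = path
    for arg in argv:
        heap.free(arg)      # frees path once if argv[0] was overwritten
    heap.free(path)         # ... so this is the second free of path
    return sorted(heap.live)
```

Without the overwrite the cleanup leaves nothing behind; with it the
second free of the path trips the check before the function returns
(and the original argv[0] buffer would never be freed at all).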

From aahz at pythoncraft.com  Thu Sep  2 15:20:02 2004
From: aahz at pythoncraft.com (Aahz)
Date: Thu Sep  2 15:20:16 2004
Subject: [Python-Dev] Coernic Desktop Search versus shutil.rmtree
In-Reply-To: <1f7befae040901184316a8ebf6@mail.gmail.com>
References: <1f7befae040901184316a8ebf6@mail.gmail.com>
Message-ID: <20040902132002.GA13089@panix.com>

On Wed, Sep 01, 2004, Tim Peters wrote:
>
> The cause:  Windows has low-level hooks for apps that want to monitor
> changes to the filesystem.  For example, virus scanners use those
> heavily.  Copernic also uses them, to reindex changed files in the
> background.  So it can keep a file open beyond the time Python thinks
> it deleted it, and then trying to rmdir its parent directory fails
> (because the directory isn't really empty yet).

What happens when you use Windows Exploder to delete the folder?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"To me vi is Zen.  To use vi is to practice zen.  Every command is a
koan.  Profound to the user, unintelligible to the uninitiated.  You
discover truth everytime you use it."  --reddy@lion.austin.ibm.com
From tim.peters at gmail.com  Thu Sep  2 16:37:03 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  2 16:37:05 2004
Subject: [Python-Dev] Coernic Desktop Search versus shutil.rmtree
In-Reply-To: <20040902132002.GA13089@panix.com>
References: <1f7befae040901184316a8ebf6@mail.gmail.com>
	<20040902132002.GA13089@panix.com>
Message-ID: <1f7befae04090207371f5b2142@mail.gmail.com>

[Tim]
>> The cause:  Windows has low-level hooks for apps that want to
>> monitor changes to the filesystem.  For example, virus scanners
>> use those heavily.  Copernic also uses them, to reindex changed
>> files in the background.  So it can keep a file open beyond the time
>> Python thinks it deleted it, and then trying to rmdir its parent
>> directory fails (because the directory isn't really empty yet).

[Aahz]
> What happens when you use Windows Exploder to delete the folder?

I didn't try Explorer specifically.  Since I was in a DOS box anyway,
I used rmdir/s to clean it out.  I'm sure using Explorer would have
worked too.

This is a timing problem.  By the time I can click on the folder to
delete it in Explorer, or by the time I can type "rmdir/s xx",
Copernic is long done reindexing the files, so there's no problem
nuking the directory then.  shutil.rmtree issues the rmdir at machine
speed.
From tim.peters at gmail.com  Thu Sep  2 16:59:58 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  2 17:00:00 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <4136AD4E.30503@interlink.com.au>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
	<4136AD4E.30503@interlink.com.au>
Message-ID: <1f7befae0409020759266a45cb@mail.gmail.com>

[Anthony Baxter]
> Erm - this particular fix was a bug fix. I'm deeply uncomfortable about
> adding the C version of datetime to 2.3 at this very late stage of 2.3's
> life cycle.

It's quite arguably a bugfix, since datetime.h in 2.3.4 exposes things
that can't possibly be used outside of datetimemodule.c (the datetime
type objects are referenced in the header, but not exported in a
usable way).  Anthony Tuininga's patch to *finish* (not really add)
the datetime C API is a low-risk change regardless:  it doesn't change
any existing functionality, it just finishes the job of exposing it to
C coders, and adds some new macros for convenience.

Now if some platform header file has macros with names like

    PyDateTime_FromTimestamp
or
    PyDelta_FromDSU

then adding these macros to datetime.h could cause new problems.  But
platform header files don't have macros with names like those (if they
did, we would have bumped into it while developing 2.4).
From gvanrossum at gmail.com  Thu Sep  2 17:42:55 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 17:42:58 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <4136AD4E.30503@interlink.com.au>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
	<4136AD4E.30503@interlink.com.au>
Message-ID: <ca471dc204090208425915a41@mail.gmail.com>

> > As long as we're backporting C APIs to 2.3, can I request that the new
> > datetime API be backported to 2.3? Anthony Tuininga (the cx_Oracle
> > author) would be interested in using this and might be willing to help
> > out with the work. (And yes, I'm encouraging this because I could use
> > this myself.)
> 
> Erm - this particular fix was a bug fix. I'm deeply uncomfortable about
> adding the C version of datetime to 2.3 at this very late stage of 2.3's
> life cycle.

Fair enough. Let's drop the idea.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gvanrossum at gmail.com  Thu Sep  2 17:45:34 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 17:45:40 2004
Subject: [Python-Dev] Re: [Python-checkins]
	python/dist/src/Modules threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <1f7befae0409020759266a45cb@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
Message-ID: <ca471dc204090208452448a73c@mail.gmail.com>

> It's quite arguably a bugfix, since datetime.h in 2.3.4 exposes things
> that can't possibly be used outside of datetimemodule.c (the datetime
> type objects are referenced in the header, but not exported in a
> usable way).  Anthony Tuininga's patch to *finish* (not really add)
> the datetime C API is a low-risk change regardless:  it doesn't change
> any existing functionality, it just finishes the job of exposing it to
> C coders, and adds some new macros for convenience.
> 
> Now if some platform header file has macros with names like
> 
>     PyDateTime_FromTimestamp
> or
>     PyDelta_FromDSU
> 
> then adding these macros to datetime.h could cause new problems.  But
> platform header files don't have macros with names like those (if they
> did, we would have bumped into it while developing 2.4).

Hm, Anthony, what do you think now? (Disregard my previous mail, I was
confused by multiple logical threads mixed into the same
conversation.)

--Guido
From anthony at interlink.com.au  Thu Sep  2 18:26:12 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu Sep  2 18:26:50 2004
Subject: [Python-Dev] Re: [Python-checkins]
	python/dist/src/Modules	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <1f7befae0409020759266a45cb@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
Message-ID: <413749A4.5020902@interlink.com.au>

Tim Peters wrote:
> [Anthony Baxter]
> 
>>Erm - this particular fix was a bug fix. I'm deeply uncomfortable about
>>adding the C version of datetime to 2.3 at this very late stage of 2.3's
>>life cycle.
> 
> 
> It's quite arguably a bugfix, since datetime.h in 2.3.4 exposes things
> that can't possibly be used outside of datetimemodule.c 

Ah - I misunderstood, and thought that 2.3 had no version of datetime.c
at all, and Guido was proposing that we add it. So, to get this
straight, what _are_ we talking about, exactly? Is there an SF
bug/patch with the trunk change?


-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From fredrik at pythonware.com  Thu Sep  2 18:34:23 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Sep  2 18:32:37 2004
Subject: [Python-Dev] Re: Coernic Desktop Search versus shutil.rmtree
References: <1f7befae040901184316a8ebf6@mail.gmail.com><20040902132002.GA13089@panix.com>
	<1f7befae04090207371f5b2142@mail.gmail.com>
Message-ID: <ch7hv0$6on$1@sea.gmane.org>

Tim Peters wrote:

> This is a timing problem.  By the time I can click on the folder to
> delete it in Explorer, or by the time I can type "rmdir/s xx",
> Copernic is long done reindexing the files, so there's no problem
> nuking the directory then.  shutil.rmtree issues the rmdir at machine
> speed.

so a possible robustification would be to add

import errno, os, sys, time

def _rmdir(path):
    try:
        os.rmdir(path)
    except OSError as v:
        # retry once on Windows if the directory isn't empty yet
        if sys.platform == "win32" and v.errno == errno.ENOTEMPTY:
            time.sleep(0.1)
            os.rmdir(path)
        else:
            raise

and use _rmdir instead of os.rmdir in _build_cmdtuple...
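
A sketch of the same idea with a bounded number of retries, in case a
single 0.1s sleep isn't enough.  The name rmdir_retry and the
attempts/delay/rmdir/sleep parameters are made up for illustration and
are not part of shutil.  Unlike the version above, it has no platform
check.  Only a "directory not empty" error is retried; anything else is
re-raised at once.

```python
import errno
import os
import time


def rmdir_retry(path, attempts=5, delay=0.1, rmdir=os.rmdir, sleep=time.sleep):
    """Remove an empty directory, retrying briefly on ENOTEMPTY.

    Hypothetical helper: the name and the defaults are illustrative
    assumptions, not an existing standard-library API.
    """
    attempts = max(1, attempts)
    for attempt in range(attempts):
        try:
            rmdir(path)
            return
        except OSError as exc:
            # Only "directory not empty" is worth retrying; any other
            # error (permissions, directory in use, ...) propagates.
            last_try = attempt == attempts - 1
            if exc.errno != errno.ENOTEMPTY or last_try:
                raise
            sleep(delay)
```

The rmdir and sleep arguments are there only so the retry logic can be
exercised without a real background indexer racing against it.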

</F> 


From gvanrossum at gmail.com  Thu Sep  2 18:47:45 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 18:47:50 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <413749A4.5020902@interlink.com.au>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
	<413749A4.5020902@interlink.com.au>
Message-ID: <ca471dc204090209476b20ae26@mail.gmail.com>

Anthony (the other one), can you explain it?

On Fri, 03 Sep 2004 02:26:12 +1000, Anthony Baxter
<anthony@interlink.com.au> wrote:
> Tim Peters wrote:
> > [Anthony Baxter]
> >
> >>Erm - this particular fix was a bug fix. I'm deeply uncomfortable about
> >>adding the C version of datetime to 2.3 at this very late stage of 2.3's
> >>life cycle.
> >
> >
> > It's quite arguably a bugfix, since datetime.h in 2.3.4 exposes things
> > that can't possibly be used outside of datetimemodule.c
> 
> Ah - I misunderstood, and thought that 2.3 had no version of datetime.c
> at all, and Guido was proposing that we add it. So, to get this
> straight, what _are_ we talking about, exactly? Is there an SF
> bug/patch with the trunk change?
> 
> 
> --
> Anthony Baxter     <anthony@interlink.com.au>
> It's never too late to have a happy childhood.
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
> 


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From jcarlson at uci.edu  Thu Sep  2 18:59:19 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu Sep  2 19:05:59 2004
Subject: [Python-Dev] Re: Coernic Desktop Search versus shutil.rmtree
In-Reply-To: <ch7hv0$6on$1@sea.gmane.org>
References: <1f7befae04090207371f5b2142@mail.gmail.com>
	<ch7hv0$6on$1@sea.gmane.org>
Message-ID: <20040902095224.C10B.JCARLSON@uci.edu>


> so a possible robustification would be to add
> 
> import errno, os, sys, time
>
> def _rmdir(path):
>     try:
>         os.rmdir(path)
>     except OSError as v:
>         # retry once on Windows if the directory isn't empty yet
>         if sys.platform == "win32" and v.errno == errno.ENOTEMPTY:
>             time.sleep(0.1)
>             os.rmdir(path)
>         else:
>             raise
> 
> and use _rmdir instead of os.rmdir in _build_cmdtuple...


Only for this test.

In the general case, there could be other reasons why that deletion
failed.  One that I run into relatively often is...

Shell 1:
curpath: <drive>:\arbitrary\path\name

Shell 2:
curpath: <drive>:\arbitrary\path
command: python -c 'import os;os.remove("name")'

In this case, the OSError is the correct thing, and shouldn't be hidden
with a 'sleep'.


 - Josiah

From martin at v.loewis.de  Thu Sep  2 19:36:43 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep  2 19:36:45 2004
Subject: [Python-Dev] Re: [Python-checkins]	python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <ca471dc204090208452448a73c@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
Message-ID: <41375A2B.2020401@v.loewis.de>

Guido van Rossum wrote:
>>Now if some platform header file has macros with names like
>>
>>    PyDateTime_FromTimestamp
>>or
>>    PyDelta_FromDSU
>>
>>then adding these macros to datetime.h could cause new problems.  But
>>platform header files don't have macros with names like those (if they
>>did, we would have bumped into it while developing 2.4).
> 
> 
> Hm, Anthony, what do you think now?

I'm not Anthony (neither, actually), but I do think this is a new
feature, not a bug fix - assuming we are talking about the changes
between datetime.h in 2.3 and 2.4.

This introduces datetime.datetime_CAPI, which is a C object
allowing cross-module datetime calls at the C level.

This change is very unlikely to break existing code, as existing
code just won't use that new API. This is good for a backport.

At the same time, this also clearly shows it is a new feature:
only new code can use it.

Channelling Anthony (Baxter), this cannot be accepted for 2.3.
It would allow for code that works on 2.3.5, but fails on 2.3.4.
What's worse, the extension module can be built on 2.3.5, and
the binary module will fail when run on 2.3.4, as importing the
CAPI object would fail.

People who rely on that feature should get a compile time
error on 2.3.x, instead of compilation succeeding for some x.
People who need to support 2.3 as well should use the Python
API to the datetime module, not the C API.

Regards,
Martin


From anthony at interlink.com.au  Thu Sep  2 19:45:43 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu Sep  2 19:46:17 2004
Subject: [Python-Dev] Re: [Python-checkins]	python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <41375A2B.2020401@v.loewis.de>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
Message-ID: <41375C47.2020608@interlink.com.au>

Martin v. Löwis wrote:
> Channelling Anthony (Baxter), this cannot be accepted for 2.3.
> It would allow for code that works on 2.3.5, but fails on 2.3.4.
> What's worse, the extension module can be built on 2.3.5, and
> the binary module will fail when run on 2.3.4, as importing the
> CAPI object would fail.

Ugh. Thanks for the clarification. I really don't think that this is
something we want to add to 2.3.5.

From gvanrossum at gmail.com  Thu Sep  2 20:05:11 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 20:05:15 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <41375A2B.2020401@v.loewis.de>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
Message-ID: <ca471dc20409021105afbfbc@mail.gmail.com>

> People who rely on that feature should get a compile time
> error on 2.3.x, instead of compilation succeeding for some x.
> People who need to support 2.3 as well should use the Python
> API to the datetime module, not the C API.

Given that it's a CObject, code could easily be written (and I'm sure
cx_Oracle will do this) that attempts to import the CObject and uses a
fallback if that fails. I expect that cx_Oracle will be just about the
only customer of this API.
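
Roughly, in Python terms (the real thing would be C code in the
extension's init function; this is only a sketch of the control flow).
The attribute name datetime_CAPI is taken from Martin's mail, and
pick_datetime_backend is a made-up name for illustration.

```python
import datetime


def pick_datetime_backend(module=datetime):
    """Return "capi" if the module exposes datetime_CAPI, else "python".

    Hypothetical helper; a real extension would do the equivalent
    lookup in C when it is imported.
    """
    if getattr(module, "datetime_CAPI", None) is not None:
        return "capi"      # fast path: C-level API is available
    return "python"        # fallback: go through the Python-level API
```

The point is that the check happens at run time, so the same binary
picks the fallback on an interpreter that lacks the object instead of
failing to import.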

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gvanrossum at gmail.com  Thu Sep  2 20:07:22 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 20:07:25 2004
Subject: [Python-Dev] Re: Coernic Desktop Search versus shutil.rmtree
In-Reply-To: <20040902095224.C10B.JCARLSON@uci.edu>
References: <1f7befae04090207371f5b2142@mail.gmail.com>
	<ch7hv0$6on$1@sea.gmane.org> <20040902095224.C10B.JCARLSON@uci.edu>
Message-ID: <ca471dc2040902110776ed9a2@mail.gmail.com>

On Thu, 02 Sep 2004 09:59:19 -0700, Josiah Carlson <jcarlson@uci.edu> wrote:
> 
> > so a possible robustification would be to add
> >
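> > import errno, os, sys, time
> >
> > def _rmdir(path):
> >     try:
> >         os.rmdir(path)
> >     except OSError as v:
> >         # retry once on Windows if the directory isn't empty yet
> >         if sys.platform == "win32" and v.errno == errno.ENOTEMPTY:
> >             time.sleep(0.1)
> >             os.rmdir(path)
> >         else:
> >             raise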
> >
> > and use _rmdir instead of os.rmdir in _build_cmdtuple...
> 
> 
> Only for this test.
> 
> In the general case, there could be other reasons why that deletion
> failed.  One that I run into relatively often is...
> 
> Shell 1:
> curpath: <drive>:\arbitrary\path\name
> 
> Shell 2:
> curpath: <drive>:\arbitrary\path
> command: python -c 'import os;os.remove("name")'
> 
> In this case, the OSError is the correct thing, and shouldn't be hidden
> with a 'sleep'.
> 
> 
>  - Josiah

I surely hope Fredrik was being facetious.


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From fredrik at pythonware.com  Thu Sep  2 20:17:45 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Sep  2 20:15:58 2004
Subject: [Python-Dev] Re: Re: Coernic Desktop Search versus shutil.rmtree
References: <1f7befae04090207371f5b2142@mail.gmail.com><ch7hv0$6on$1@sea.gmane.org>
	<20040902095224.C10B.JCARLSON@uci.edu>
	<ca471dc2040902110776ed9a2@mail.gmail.com>
Message-ID: <ch7o0q$nqb$1@sea.gmane.org>

Guido van Rossum wrote

> I surely hope Fredrik was being facetious.

not necessarily. The rmtree function already takes a couple of flags; I wouldn't
mind seeing a "try harder" option for platforms like windows.

(but a better solution would probably be a way to "override" the functions
used to remove files and directories.  I've had to copy and tweak the rmtree
code quite a few times, usually to deal with cases where the tree might
contain read-only files...)
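
something along these lines, as a rough sketch only: rmtree_with and
force_remove are hypothetical names, and this is not the signature
shutil.rmtree actually has.  The caller passes in the functions used to
delete files and directories, so the read-only case (or a "try harder"
rmdir) becomes a matter of plugging in a different remover.

```python
import os
import stat


def force_remove(path):
    """Clear the read-only bit, then delete the file."""
    os.chmod(path, stat.S_IWRITE)
    os.remove(path)


def rmtree_with(top, remove=os.remove, rmdir=os.rmdir):
    """Delete the tree rooted at `top` using caller-supplied removers.

    Hypothetical helper for illustration; this is not shutil.rmtree's
    signature.
    """
    # Walk bottom-up so every directory is empty by the time we reach it.
    for dirpath, dirnames, filenames in os.walk(top, topdown=False):
        for name in filenames:
            remove(os.path.join(dirpath, name))
        for name in dirnames:
            full = os.path.join(dirpath, name)
            # Symlinks to directories show up in dirnames but are not
            # descended into; unlink them rather than calling rmdir.
            if os.path.islink(full):
                os.remove(full)
            else:
                rmdir(full)
    rmdir(top)
```

usage would be rmtree_with(path, remove=force_remove) for trees that
may hold read-only files.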

</F> 


From jhylton at gmail.com  Thu Sep  2 20:26:55 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Thu Sep  2 20:27:05 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <41375A2B.2020401@v.loewis.de>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
Message-ID: <e8bf7a5304090211262e2b3975@mail.gmail.com>

On Thu, 02 Sep 2004 19:36:43 +0200, "Martin v. Löwis"
<martin@v.loewis.de> wrote:
> I'm not Anthony (neither, actually), but I do think this is a new
> feature, not a bug fix - assuming we are talking about the changes
> between datetime.h in 2.3 and 2.4.
> 
> This introduces datetime.datetime_CAPI, which is a C object
> allowing cross-module datetime calls at the C level.
> 
> This change is very unlikely to break existing code, as existing
> code just won't use that new API. This is good for a backport.
> 
> At the same time, this also clearly shows it is a new feature:
> only new code can use it.

Of late, I've found the True / False introduction in later 2.2
releases to be a pain.  I'm writing code on a machine that has 2.2.2,
but I occasionally run into machines with earlier versions of 2.2 and
then my code fails.  It would be easier if it didn't work on any 2.2
release, then I wouldn't be lulled into thinking it will work.
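
One way to get that "fails everywhere or nowhere" behaviour by hand is
an explicit version check at import time.  A minimal sketch follows;
require_python is a made-up helper, and the (2, 2, 1) in the usage note
assumes True/False first appeared in 2.2.1.

```python
import sys


def require_python(minimum, version_info=sys.version_info):
    """Raise RuntimeError unless the interpreter is at least `minimum`."""
    current = tuple(version_info[:3])
    if current < tuple(minimum):
        raise RuntimeError(
            "need Python %s or later, running %s"
            % (".".join(map(str, minimum)), ".".join(map(str, current)))
        )
```

Calling require_python((2, 2, 1)) at the top of a module would turn a
confusing NameError deep in the code into an immediate, readable
failure on older interpreters.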

Jeremy
From martin at v.loewis.de  Thu Sep  2 20:28:40 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep  2 20:28:41 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <ca471dc20409021105afbfbc@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<ca471dc20409021105afbfbc@mail.gmail.com>
Message-ID: <41376658.6020502@v.loewis.de>

Guido van Rossum wrote:
> Given that it's a CObject, code could easily be written (and I'm sure
> cx_Oracle will do this) that attempts to import the CObject and uses a
> fallback if that fails. I expect that cx_Oracle will be just about the
> only customer of this API.

If there is a fallback already, why do you want the backport? Just
use the fallback.

Regards,
Martin
From gvanrossum at gmail.com  Thu Sep  2 21:03:01 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 21:03:04 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <41376658.6020502@v.loewis.de>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<ca471dc20409021105afbfbc@mail.gmail.com>
	<41376658.6020502@v.loewis.de>
Message-ID: <ca471dc204090212035455e8a2@mail.gmail.com>

> If there is a fallback already, why do you want the backport? Just
> use the fallback.

Because the fallback is slower?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From anthony at interlink.com.au  Thu Sep  2 21:37:47 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu Sep  2 21:38:22 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <ca471dc204090212035455e8a2@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<ca471dc20409021105afbfbc@mail.gmail.com>
	<41376658.6020502@v.loewis.de>
	<ca471dc204090212035455e8a2@mail.gmail.com>
Message-ID: <4137768B.9000800@interlink.com.au>

Guido van Rossum wrote:
>>If there is a fallback already, why do you want the backport? Just
>>use the fallback.
> 
> 
> Because the fallback is slower?

This, to me, is a poor reason to break the backwards/forwards
compatibility of binary modules. Yes, modules _could_ be written
to do the right thing, and cx_Oracle might. But then someone
else comes along and uses it, and notices that it works on 2.3.5,
so makes a 2.3 binary package. And people on older 2.3's get a
broken package.

I'm really really unconvinced that this is a good idea.

Anthony
From aahz at pythoncraft.com  Thu Sep  2 22:02:06 2004
From: aahz at pythoncraft.com (Aahz)
Date: Thu Sep  2 22:02:09 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <e8bf7a5304090211262e2b3975@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<e8bf7a5304090211262e2b3975@mail.gmail.com>
Message-ID: <20040902200205.GA23600@panix.com>

On Thu, Sep 02, 2004, Jeremy Hylton wrote:
>
> Of late, I've found the True / False introduction in later 2.2
> releases to be a pain.  I'm writing code on a machine that has 2.2.2,
> but I occasionally run into machines with earlier versions of 2.2 and
> then my code fails.  It would be easier if it didn't work on any 2.2
> release, then I wouldn't be lulled into thinking it will work.

+1
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"To me vi is Zen.  To use vi is to practice zen.  Every command is a
koan.  Profound to the user, unintelligible to the uninitiated.  You
discover truth everytime you use it."  --reddy@lion.austin.ibm.com
From skip at pobox.com  Thu Sep  2 22:06:21 2004
From: skip at pobox.com (Skip Montanaro)
Date: Thu Sep  2 22:07:15 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <4137768B.9000800@interlink.com.au>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<ca471dc20409021105afbfbc@mail.gmail.com>
	<41376658.6020502@v.loewis.de>
	<ca471dc204090212035455e8a2@mail.gmail.com>
	<4137768B.9000800@interlink.com.au>
Message-ID: <16695.32061.372611.9265@montanaro.dyndns.org>


>>>>> "Anthony" == Anthony Baxter <anthony@interlink.com.au> writes:

    >> Because the fallback is slower?

    Anthony> This, to me, is a poor reason to break the backwards/forwards
    Anthony> compatibility of binary modules. 

+100

At my new job we maintain "stable" and "unstable" versions (*) of our
current project.  I frequently hold up Python's policy of "only bug fixes
are allowed in stable versions" as a shining example of how things should be
done.  We've violated that policy on occasion.  When that happens it
generally comes back to bite us, and we only have three users down the hall,
not users scattered all over the planet.

Skip

(*) I use those terms *very* loosely.
From gvanrossum at gmail.com  Thu Sep  2 22:23:54 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  2 22:24:00 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <16695.32061.372611.9265@montanaro.dyndns.org>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<ca471dc20409021105afbfbc@mail.gmail.com>
	<41376658.6020502@v.loewis.de>
	<ca471dc204090212035455e8a2@mail.gmail.com>
	<4137768B.9000800@interlink.com.au>
	<16695.32061.372611.9265@montanaro.dyndns.org>
Message-ID: <ca471dc2040902132375eefdb7@mail.gmail.com>

OK, I withdraw my request. Never mind. :-)

--Guido
From martin at v.loewis.de  Thu Sep  2 22:25:18 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep  2 22:25:18 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <ca471dc204090212035455e8a2@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<ca471dc20409021105afbfbc@mail.gmail.com>
	<41376658.6020502@v.loewis.de>
	<ca471dc204090212035455e8a2@mail.gmail.com>
Message-ID: <413781AE.8000204@v.loewis.de>

Guido van Rossum wrote:
>>If there is a fallback already, why do you want the backport? Just
>>use the fallback.
> 
> 
> Because the fallback is slower?

I see. However, people with existing installations will have to suffer
from the slow-down anyway; people will need to upgrade in order to
see the speed improvement. If they need the speed advantage (which
is exactly how much?), they should consider upgrading to 2.4.

That an extension module runs slower in 2.3 than it does in 2.4 is
not a bug in 2.3 - a lot of things run slower in 2.3, yet we don't
backport all performance changes to 2.3, especially if code has to
be adapted to make use of it.

Regards,
Martin
From barry at python.org  Thu Sep  2 22:27:14 2004
From: barry at python.org (Barry Warsaw)
Date: Thu Sep  2 22:27:22 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <e8bf7a5304090211262e2b3975@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>
	<ca471dc204090117421a813d4e@mail.gmail.com>
	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
	<e8bf7a5304090211262e2b3975@mail.gmail.com>
Message-ID: <1094156834.8722.21.camel@geddy.wooz.org>

On Thu, 2004-09-02 at 14:26, Jeremy Hylton wrote:

> Of late, I've found the True / False introduction in later 2.2
> releases to be a pain.  I'm writing code on a machine that has 2.2.2,
> but I occasionally run into machines with earlier versions of 2.2 and
> then my code fails.  It would be easier if it didn't work on any 2.2
> release, then I wouldn't be lulled into thinking it will work.

Just to add: while this can be worked around in code, it's extremely
tedious both to add those workarounds, and to remove them when they're
no longer necessary.  I think it's generally not a good idea to do it.

-Barry

From Scott.Daniels at Acm.Org  Thu Sep  2 22:51:15 2004
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Thu Sep  2 22:50:14 2004
Subject: [Python-Dev] Re: Alternative placeholder delimiters for PEP 292
In-Reply-To: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
References: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
Message-ID: <ch811q$i7k$1@sea.gmane.org>

Andrew Durdin wrote:
> A Yet Simpler Proposal, modifying that of PEP 292 ...
>     ... placeholders are delimited by braces {}.
Do you know about the technique I use?  It works w/o a new library:

Surround-style delimiting, using a single (specifiable) character.

     def subst(template, _sep='$', **kwds):
         if '' not in kwds:
             kwds[''] = _sep    # Allow doubled _sep for _sep.
         parts = template.split(_sep)
         parts[1::2] = [kwds[element] for element in parts[1::2]]
         return template[0:0].join(parts)

For 2.4, use a generator expression, not a list comprehension:

     def subst(template, _sep='$', **kwds):
         if '' not in kwds:
             kwds[''] = _sep    # Allow doubled _sep for _sep.
         parts = template.split(_sep)
         parts[1::2] = (kwds[element] for element in parts[1::2])
         return template[0:0].join(parts)

Then you can use:

     subst('What I $mean$ is $$5.00', mean='really mean')
or  subst(u'What I $mean$ is $$5.00', mean=u'really mean')
or  subst('What I $mean$ is $$5.00', mean='really mean', **locals())
or ...
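
To make the doubled-separator rule concrete, here is a minimal
self-contained sketch of the list-comprehension version above with its
result checked. The function and the sample string are the ones from
this message; the second call (a template with an unclosed separator)
is an added illustration, not something claimed in the original.

```python
def subst(template, _sep='$', **kwds):
    # An empty name between two separators stands for the separator itself.
    if '' not in kwds:
        kwds[''] = _sep
    parts = template.split(_sep)
    # Odd-indexed parts are placeholder names; even-indexed parts are literal text.
    parts[1::2] = [kwds[element] for element in parts[1::2]]
    # template[0:0] is an empty string of the same type as the template.
    return template[0:0].join(parts)


result = subst('What I $mean$ is $$5.00', mean='really mean')
print(result)  # What I really mean is $5.00

# Illustration only: an unclosed separator is not detected, so the
# trailing text is still treated as a placeholder name.
unclosed = subst('cost: $x', x='5')
print(unclosed)  # cost: 5
```

Because the split is purely positional, there is no error for an odd
number of separators; whether that matters depends on how much the
templates are trusted.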

-- Scott David Daniels
Scott.Daniels@Acm.Org

From Scott.Daniels at Acm.Org  Thu Sep  2 23:49:02 2004
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Thu Sep  2 23:47:54 2004
Subject: [Python-Dev] Re: PEP 309 updated slightly
In-Reply-To: <36f88922040831075133a98188@mail.gmail.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F72@UKDCX001.uk.int.atosorigin.com>
	<36f88922040831075133a98188@mail.gmail.com>
Message-ID: <ch84e5$l91$1@sea.gmane.org>

Alex Naanou wrote:

>        if isinstance(func, LCurry) or isinstance(func, RCurry):
^^ is better written as:
          if isinstance(func, (LCurry, RCurry)):
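
A minimal sketch showing that the two spellings agree. LCurry and
RCurry here are empty stand-in classes defined only for this example
(they are not the PEP 309 implementations), and the two helper names
are made up for illustration.

```python
class LCurry(object):
    """Stand-in for a left-currying wrapper (hypothetical, empty)."""


class RCurry(object):
    """Stand-in for a right-currying wrapper (hypothetical, empty)."""


def is_curried_chained(func):
    # The original spelling: two separate isinstance() calls.
    return isinstance(func, LCurry) or isinstance(func, RCurry)


def is_curried_tuple(func):
    # The tuple spelling: one call, true if func is an instance of any member.
    return isinstance(func, (LCurry, RCurry))


for candidate in (LCurry(), RCurry(), len, 42):
    assert is_curried_chained(candidate) == is_curried_tuple(candidate)
```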

-- 
-- Scott David Daniels
Scott.Daniels@Acm.Org

From Scott.Daniels at Acm.Org  Fri Sep  3 01:50:54 2004
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Fri Sep  3 01:49:48 2004
Subject: [Python-Dev] Re: FW: [Python-checkins] python/nondist/peps
 pep-0318.txt, 1.25, 1.26
In-Reply-To: <wtznu1x7.fsf@python.net>
References: <004b01c48a29$2b6d6ba0$04f9cc97@oemcomputer>
	<wtznu1x7.fsf@python.net>
Message-ID: <ch8bin$52m$1@sea.gmane.org>

Thomas Heller wrote:

> "Raymond Hettinger" <python@rcn.com> writes:
>>.... everyone knew they [metaclasses] were powerful when they were
>> put in, but no one knew how they would be used or whether they were
>>necessary.  In fact, two versions later, we still don't know those
>>answers.
> 
> Sorry I have to say this, but I don't think you know what you're talking
> about in this paragraph.

I would suggest we don't know the practical range of application yet.
It is clear that some black magicians are happy, but metaclasses are
not yet as well understood as list comprehensions in the sense of,
"this is when you use them; this is an abuse."  Metaclasses are more
structural and less linguistic; such things take longer to absorb as
design structure elements.  This is all by way of saying, "nope,
he has a point."

-- Scott David Daniels
Scott.Daniels@Acm.Org

From alex.nanou at gmail.com  Fri Sep  3 01:59:01 2004
From: alex.nanou at gmail.com (Alex Naanou)
Date: Fri Sep  3 01:59:06 2004
Subject: [Python-Dev] Re: PEP 309 updated slightly
In-Reply-To: <ch84e5$l91$1@sea.gmane.org>
References: <16E1010E4581B049ABC51D4975CEDB8803060F72@UKDCX001.uk.int.atosorigin.com>
	<36f88922040831075133a98188@mail.gmail.com>
	<ch84e5$l91$1@sea.gmane.org>
Message-ID: <36f88922040902165989cd456@mail.gmail.com>

On Thu, 02 Sep 2004 14:49:02 -0700, Scott David Daniels
<scott.daniels@acm.org> wrote:
> Alex Naanou wrote:
> 
> >        if isinstance(func, LCurry) or isinstance(func, RCurry):
> ^^ is better written as:
>          if isinstance(func, (LCurry, RCurry)):

I know! :)
 ...and it is also faster!
But that particular code was a) written quite a while back, and b) at my
course at MSU the students seem to have a LOT less trouble
understanding the existing version of the code.... (I did try both...)

Yes, this is a bit of a trade-off.... I need to think about it a bit more!

Though this is off-topic here, it would indeed be interesting to
know if anyone has a different opinion....


Thanks! ^_^

---
Alex.
From aahz at pythoncraft.com  Fri Sep  3 02:11:01 2004
From: aahz at pythoncraft.com (Aahz)
Date: Fri Sep  3 02:11:04 2004
Subject: [Python-Dev] Coernic Desktop Search versus shutil.rmtree
In-Reply-To: <1f7befae04090207371f5b2142@mail.gmail.com>
References: <1f7befae040901184316a8ebf6@mail.gmail.com>
	<20040902132002.GA13089@panix.com>
	<1f7befae04090207371f5b2142@mail.gmail.com>
Message-ID: <20040903001101.GA16770@panix.com>

On Thu, Sep 02, 2004, Tim Peters wrote:
>
> [Tim]
>>> The cause:  Windows has low-level hooks for apps that want to
>>> monitor changes to the filesystem.  For example, virus scanners
>>> use those heavily.  Copernic also uses them, to reindex changed
>>> files in the background.  So it can keep a file open beyond the time
>>> Python thinks it deleted it, and then trying to rmdir its parent
>>> directory fails (because the directory isn't really empty yet).
> 
> [Aahz]
>> What happens when you use Windows Exploder to delete the folder?
> 
> I didn't try Explorer specifically.  Since I was in a DOS box anyway,
> I used rmdir/s to clean it out.  I'm sure using Explorer would have
> worked too.
> 
> This is a timing problem.  By the time I can click on the folder to
> delete it in Explorer, or by the time I can type "rmdir/s xx",
> Copernic is long done reindexing the files, so there's no problem
> nuking the directory then.  shutil.rmtree issues the rmdir at machine
> speed.

Question is, what happens when you use Explorer while Copernic is busy
inside a folder?  If it barfs, then I think it's reasonable for rmtree()
to barf.  Or are you saying that it's not possible to make that test?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes
From anthony at computronix.com  Thu Sep  2 17:59:10 2004
From: anthony at computronix.com (Anthony Tuininga)
Date: Fri Sep  3 02:58:47 2004
Subject: [SPAM-heur] Re: Re: Re: [Python-Dev] Re: [Python-checkins]
	python/dist/src/Modules threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <ca471dc204090208452448a73c@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>
	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
Message-ID: <4137434E.4080505@computronix.com>

Well, I find the argument convincing enough and it is quite safe. I am 
willing to make the necessary patches and it would be quite convenient 
to be able to use the C API in Python 2.3 as well. So I'm in favor but 
I'll bow to the greater wisdom of the Python development community since 
I really am not significantly involved. :-)

Guido van Rossum wrote:
>>It's quite arguably a bugfix, since datetime.h in 2.3.4 exposes things
>>that can't possibly be used outside of datetimemodule.c (the datetime
>>type objects are referenced in the header, but not exported in a
>>usable way).  Anthony Tuininga's patch to *finish* (not really add)
>>the datetime C API is a low-risk change regardless:  it doesn't change
>>any existing functionality, it just finishes the job of exposing it to
>>C coders, and adds some new macros for convenience.
>>
>>Now if some platform header file has macros with names like
>>
>>    PyDateTime_FromTimestamp
>>or
>>    PyDelta_FromDSU
>>
>>then adding these macros to datetime.h could cause new problems.  But
>>platform header files don't have macros with names like those (if they
>>did, we would have bumped into it while developing 2.4).
> 
> 
> Hm, Anthony, what do you think now? (Disregard my previous mail, I was
> confused by multiple logical threads mixed into the same
> conversation.)
> 
> --Guido

-- 
Anthony Tuininga
anthony@computronix.com

Computronix
Distinctive Software. Real People.
Suite 200, 10216 - 124 Street NW
Edmonton, AB, Canada  T5N 4A3
Phone:	(780) 454-3700
Fax:	(780) 454-3838
http://www.computronix.com

From anthony at computronix.com  Thu Sep  2 19:15:12 2004
From: anthony at computronix.com (Anthony Tuininga)
Date: Fri Sep  3 02:58:48 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <ca471dc204090209476b20ae26@mail.gmail.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<413749A4.5020902@interlink.com.au>
	<ca471dc204090209476b20ae26@mail.gmail.com>
Message-ID: <41375520.8060503@computronix.com>

Yes, there are patches. There happen to be two entries because I missed
some documentation changes the first time around. The links are:

https://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=876130
https://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=986010

The files changed are
dist/src/Modules/datetimemodule.c
dist/src/Include/datetime.h
dist/src/Doc/api/concrete.tex

To summarize, an attribute named "datetime_CAPI" is added to the 
datetime module. A macro PyDateTime_IMPORT is used to access this 
attribute and then additional macros are available for manipulating 
datetime instances. If you want to look at an actual implementation you 
can take a look at cx_Oracle 4.1 beta 1 available at 
http://starship.python.net/crew/atuining

If you have further questions, let me know and I'll try to answer them.

Guido van Rossum wrote:
> Anthony (the other one), can you explain it?
> 
> On Fri, 03 Sep 2004 02:26:12 +1000, Anthony Baxter
> <anthony@interlink.com.au> wrote:
> 
>>Tim Peters wrote:
>>
>>>[Anthony Baxter]
>>>
>>>
>>>>Erm - this particular fix was a bug fix. I'm deeply uncomfortable about
>>>>adding the C version of datetime to 2.3 at this very late stage of 2.3's
>>>>life cycle.
>>>
>>>
>>>It's quite arguably a bugfix, since datetime.h in 2.3.4 exposes things
>>>that can't possibly be used outside of datetimemodule.c
>>
>>Ah - I misunderstood, and thought that 2.3 had no version of datetime.c
>>at all, and Guido was proposing that we add it. So, to get this
>>straight, what _are_ we talking about, exactly? Is there an SF
>>bug/patch with the trunk change?
>>
>>
>>--
>>Anthony Baxter     <anthony@interlink.com.au>
>>It's never too late to have a happy childhood.
>>
>>
>>
> 
> 
> 

-- 
Anthony Tuininga
anthony@computronix.com


From anthony at computronix.com  Thu Sep  2 19:42:58 2004
From: anthony at computronix.com (Anthony Tuininga)
Date: Fri Sep  3 02:58:49 2004
Subject: [SPAM-heur] Re: [Python-Dev] Re: [Python-checkins]
	python/dist/src/Modules threadmodule.c,2.56, 2.56.8.1
In-Reply-To: <41375A2B.2020401@v.loewis.de>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
Message-ID: <41375BA2.5070700@computronix.com>

I won't presume to dictate policy on this matter but since this is C API 
you do have to go through some effort in order to use it. I already have 
the following in cx_Oracle

#if (PY_VERSION_HEX >= 0x02040000)
....do stuff....
#endif

I am assuming that I could (if this patch is accepted) simply change it to

#if (PY_VERSION_HEX >= 0x02030500)
....do stuff....
#endif
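
The constants in those guards use the same layout as sys.hexversion on
the Python side: one byte each for major, minor and micro, then a
nibble for the release level and a nibble for the serial. A small
sketch that decodes them; the helper name decode_hexversion is made up
for illustration.

```python
import sys


def decode_hexversion(value):
    """Split a PY_VERSION_HEX / sys.hexversion style integer into its fields."""
    major = (value >> 24) & 0xFF
    minor = (value >> 16) & 0xFF
    micro = (value >> 8) & 0xFF
    level = (value >> 4) & 0xF   # 0xA alpha, 0xB beta, 0xC candidate, 0xF final
    serial = value & 0xF
    return major, minor, micro, level, serial


print(decode_hexversion(0x02040000))  # (2, 4, 0, 0, 0)
print(decode_hexversion(0x02030500))  # (2, 3, 5, 0, 0)

# The running interpreter's own value decodes to its version_info.
assert decode_hexversion(sys.hexversion)[:3] == tuple(sys.version_info[:3])
```

A release level of 0 sorts below every real alpha, beta or final, so a
test such as ">= 0x02030500" accepts any 2.3.5 build.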

Whether or not this makes it acceptable or not I leave that to the 
release manager to decide....

Martin v. Löwis wrote:
> Guido van Rossum wrote:
> 
>>> Now if some platform header file has macros with names like
>>>
>>>    PyDateTime_FromTimestamp
>>> or
>>>    PyDelta_FromDSU
>>>
>>> then adding these macros to datetime.h could cause new problems.  But
>>> platform header files don't have macros with names like those (if they
>>> did, we would have bumped into it while developing 2.4).
>>
>>
>>
>> Hm, Anthony, what do you think now?
> 
> 
> I'm not Anthony (neither, actually), but I do think this is a new
> feature, not a bug fix - assuming we are talking about the changes
> between datetime.h in 2.3 and 2.4.
> 
> This introduces datetime.datetime_CAPI, which is a C object
> allowing cross-module datetime calls at the C level.
> 
> This change is very unlikely to break existing code, as existing
> code just won't use that new API. This is good for a backport.
> 
> At the same time, this also clearly shows it is a new feature:
> only new code can use it.
> 
> Channelling Anthony (Baxter), this cannot be accepted for 2.3.
> It would allow for code that works on 2.3.5, but fails on 2.3.4.
> What's worse, the extension module can be built on 2.3.5, and
> the binary module will fail when run on 2.3.4, as importing the
> CAPI object would fail.
> 
> People who rely on that feature should get a compile time
> error on 2.3.x, instead of compilation succeeding for some x.
> People who need to support 2.3 as well should use the Python
> API to the datetime module, not the C API.
> 
> Regards,
> Martin
> 

-- 
Anthony Tuininga
anthony@computronix.com


From tim.peters at gmail.com  Fri Sep  3 03:30:36 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Fri Sep  3 03:30:51 2004
Subject: [Python-Dev] Coernic Desktop Search versus shutil.rmtree
In-Reply-To: <20040903001101.GA16770@panix.com>
References: <1f7befae040901184316a8ebf6@mail.gmail.com>
	<20040902132002.GA13089@panix.com>
	<1f7befae04090207371f5b2142@mail.gmail.com>
	<20040903001101.GA16770@panix.com>
Message-ID: <1f7befae0409021830219eb2fe@mail.gmail.com>

[Aahz]
> Question is, what happens when you use Explorer while Copernic is busy
> inside a folder?  If it barfs, then I think it's reasonable for rmtree()
> to barf.  Or are you saying that it's not possible to make that test?

I didn't claim it was unreasonable for shutil.rmtree to barf, and I
have no interest in making that test.  As mentioned before, Copernic's
use of the filesystem hooks drives CVS crazy too.  It's a new app, and
using the filesystem hooks transparently is a subtle undertaking. 
They'll fix it eventually.
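
For code that has to cope with such a race in the meantime, one
possible workaround is to retry the removal after a short pause. This
is only a sketch under that assumption, not something proposed in this
thread; the function name and the attempts/delay defaults are invented
for the example.

```python
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=0.1):
    """Call shutil.rmtree(path), retrying if it fails with OSError.

    A background indexer or scanner may hold a file open for a moment
    after it was deleted, so the final rmdir can fail even though a
    second try a fraction of a second later would succeed.
    """
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except OSError:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```

The retry only hides short-lived locks; a path that is genuinely
undeletable still raises after the last attempt.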
From tim.peters at gmail.com  Fri Sep  3 05:52:52 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Fri Sep  3 05:52:54 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	threadmodule.c, 2.56, 2.56.8.1
In-Reply-To: <41375A2B.2020401@v.loewis.de>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>
	<ca471dc204090208452448a73c@mail.gmail.com>
	<41375A2B.2020401@v.loewis.de>
Message-ID: <1f7befae04090220522da07284@mail.gmail.com>

[Martin v. Löwis]
> ...
> Channelling Anthony (Baxter), this cannot be accepted for 2.3.
> It would allow for code that works on 2.3.5, but fails on 2.3.4.
> What's worse, the extension module can be built on 2.3.5, and
> the binary module will fail when run on 2.3.4, as importing the
> CAPI object would fail.

That is a strong argument, and you're right that "the rules" don't
allow it.  OTOH, unlike Jeremy's True/False example, this is an
obscure piece of C with only one known user in the world (Anthony
wrote the datetime C API patch, and Anthony wrote the Oracle wrapper
which is the datetime C API's only known user).  So an opposing
"practicality beats purity" argument *could* apply too.  I'm not going
to make it myself, because I personally have no use for the C datetime
API <wink>.
From anthony at interlink.com.au  Fri Sep  3 07:46:36 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri Sep  3 07:47:05 2004
Subject: [SPAM-heur] Re: [Python-Dev] Re:
	[Python-checkins]	python/dist/src/Modules
	threadmodule.c,2.56, 2.56.8.1
In-Reply-To: <41375BA2.5070700@computronix.com>
References: <E1C2ddW-0005qV-As@sc8-pr-cvs1.sourceforge.net>	<ca471dc204090117421a813d4e@mail.gmail.com>	<4136AD4E.30503@interlink.com.au>	<1f7befae0409020759266a45cb@mail.gmail.com>	<ca471dc204090208452448a73c@mail.gmail.com>	<41375A2B.2020401@v.loewis.de>
	<41375BA2.5070700@computronix.com>
Message-ID: <4138053C.2070501@interlink.com.au>

Anthony Tuininga wrote:
[snip]
> Whether or not this makes it acceptable or not I leave that to the 
> release manager to decide....

I can understand why it would be convenient for this to be in 2.3.5,
but I really don't want to see this in the release23-maint branch. The
advantages (it allows cx_Oracle to be faster) are nowhere near strong
enough to outweigh the disadvantages (breaking binary compatibility
between bugfix releases).

 From the feedback I've received since I started the current run of
bugfix releases, one of the strongest messages I've received is that
people _really_ _really_ like the no-new-features rule, because it
makes it much easier to justify rolling out a bugfix release. I'm
not saying that this rule must never be broken, only that it would
need an extremely good reason to do so.

This case is even worse, as it is both a new feature _and_ a binary
incompatibility.

If you wanted to, you could produce a package with a patched datetime
module, and instructions for allowing users to install it into their
existing installation. This is entirely up to them, then.

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From anthony at python.org  Fri Sep  3 10:36:16 2004
From: anthony at python.org (Anthony Baxter)
Date: Fri Sep  3 10:36:31 2004
Subject: [Python-Dev] RELEASED Python 2.4, alpha 3
Message-ID: <41382D00.70906@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On behalf of the Python development team and the Python community, I'm
happy to announce the third alpha of Python 2.4.

Python 2.4a3 is an alpha release.  We'd greatly appreciate it if you
could download it, kick the tires and let us know of any problems you
find, but it is not suitable for production usage.

~    http://www.python.org/2.4

In this release we have PEP-292 string templates, a new syntax for
multi-line imports, and a large number of other bug fixes and
improvements.  See either the highlights, the What's New in
Python 2.4, or the detailed NEWS file -- all available from the
Python 2.4 webpage.

This will hopefully be the last alpha in the Python 2.4 cycle -
a first beta will follow in a few weeks. Once the first beta is
out, we're in feature-freeze mode - so if you've got new things
you want in, make sure you hurry!

Please log any problems you have with this release in the SourceForge
bug tracker (noting that you're using 2.4a3):

~    http://sourceforge.net/bugs/?group_id=5470

Enjoy the new release,
Anthony

Anthony Baxter
anthony@python.org
Python Release Manager
(on behalf of the entire python-dev team)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBOCz9Dt3F8mpFyBYRAmgqAJ42drhwIe3QLSx6WyUxOUPewUtX4QCgt5Wv
mP4MfJRsXy6t0IcS6fY8Mmc=
=efD5
-----END PGP SIGNATURE-----
From anthony at interlink.com.au  Fri Sep  3 11:30:16 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri Sep  3 11:30:31 2004
Subject: please keep trunk frozen for a bit (was Re: [Python-Dev] RELEASED
	Python 2.4, alpha 3)
In-Reply-To: <41382D00.70906@python.org>
References: <41382D00.70906@python.org>
Message-ID: <413839A8.9040700@interlink.com.au>

If people could keep the trunk frozen for about 5 or 6
hours (in case of a need for a brown-paper-bag release)
I'd appreciate it.

Say, until 2004-09-03 14:00 UTC or so...

-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From jim at zope.com  Fri Sep  3 13:02:10 2004
From: jim at zope.com (Jim Fulton)
Date: Fri Sep  3 13:02:15 2004
Subject: [Python-Dev] Want to make dictproxy objects creatable from Python
Message-ID: <41384F32.10801@zope.com>


New-style classes use dict proxies to protect their dictionaries
from direct manipulation. I would like to be able to use these in
other situations, but the dictproxy class doesn't let me create new
instances.  I propose to give the class __new__ and __init__ methods
so that it is callable from Python.
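
As an illustration of the behaviour being described, the sketch below
shows a class dictionary rejecting direct assignment. It also uses
types.MappingProxyType, which exists in Python 3.3 and later and so
postdates this message; it is shown only as an example of what a
Python-creatable proxy looks like, and the Point class and settings
dict are made up for the example.

```python
import types


class Point(object):
    dims = 2


# The class __dict__ is a read-only proxy, not a plain dict.
try:
    Point.__dict__['dims'] = 3
except TypeError as exc:
    print('rejected:', exc)

# Python 3.3+ only: wrap an ordinary dict in a read-only view.
settings = {'debug': False}
view = types.MappingProxyType(settings)

try:
    view['debug'] = True
except TypeError as exc:
    print('rejected:', exc)

# The view is live: changes made through the underlying dict show up.
settings['debug'] = True
print(view['debug'])  # True
```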

Any objections?

May I do this for 2.4b1?

Jim


-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
From adurdin at gmail.com  Fri Sep  3 15:55:25 2004
From: adurdin at gmail.com (Andrew Durdin)
Date: Fri Sep  3 15:55:34 2004
Subject: [Python-Dev] Re: Alternative placeholder delimiters for PEP 292
In-Reply-To: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
References: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
Message-ID: <59e9fd3a0409030655395c213a@mail.gmail.com>

On Mon, 30 Aug 2004 16:11:34 +1000, Andrew Durdin <adurdin@gmail.com> wrote:
> A Yet Simpler Proposal, modifying that of PEP 292
> 
>     I propose that the Template module not use $ to set off
>     placeholders; instead, placeholders are delimited by braces {}.

Barry, would you care to comment on my proposal, particularly my
points in the rationale for it?

I've just taken the 2.4a3 Template class and modified it to fit this
proposal. The result is below. I've also got a modified unit test and
tex file to account for the changes at
http://andy.durdin.net/test_pep292_braces.py and
http://andy.durdin.net/bracetmpl.text -- I'd make a complete patch,
but I'm not sure what tools to use (I'm running Win2k): can someone
point me in the right direction?

####################################################################
import re as _re

class Template(unicode):
    """A string class for supporting {}-substitutions."""
    __slots__ = []

    # Search for {{, }}, {identifier}, and any bare {'s or }'s
    pattern = _re.compile(r"""
      (?P<escapedlt>\{{2})|              # Escape sequence of two { braces
      (?P<escapedrt>\}{2})|              # Escape sequence of two } braces
      \{(?P<braced>[_a-z][_a-z0-9]*)\}|  # A brace-delimited identifier
      (?P<bogus>\{|\})                   # Other ill-formed { or } expressions
    """, _re.IGNORECASE | _re.VERBOSE)

    def __mod__(self, mapping):
        def convert(mo):
            if mo.group('escapedlt') is not None:
                return '{'
            if mo.group('escapedrt') is not None:
                return '}'
            if mo.group('bogus') is not None:
                raise ValueError('Invalid placeholder at index %d' %
                                 mo.start('bogus'))
            val = mapping[mo.group('braced')]
            return unicode(val)
        return self.pattern.sub(convert, self)


class SafeTemplate(Template):
    """A string class for supporting {}-substitutions.

    This class is 'safe' in the sense that you will never get KeyErrors if
    there are placeholders missing from the interpolation dictionary.  In that
    case, you will get the original placeholder in the value string.
    """
    __slots__ = []

    def __mod__(self, mapping):
        def convert(mo):
            if mo.group('escapedlt') is not None:
                return '{'
            if mo.group('escapedrt') is not None:
                return '}'
            if mo.group('bogus') is not None:
                raise ValueError('Invalid placeholder at index %d' %
                                 mo.start('bogus'))
            braced = mo.group('braced')
            try:
                return unicode(mapping[braced])
            except KeyError:
                return '{' + braced + '}'
        return self.pattern.sub(convert, self)

del _re
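
For readers who want to try the substitution rules without the unicode
subclass, here is a minimal function-based sketch of the same pattern,
adapted to run on Python 3. The names BRACE_PATTERN and
brace_substitute are made up for this example, and str() replaces
unicode(); the escaping and error rules follow the class above.

```python
import re

# Same four alternatives as the class above: {{, }}, {identifier},
# and any other stray brace.
BRACE_PATTERN = re.compile(r"""
    (?P<escapedlt>\{{2})|                # Escape sequence of two { braces
    (?P<escapedrt>\}{2})|                # Escape sequence of two } braces
    \{(?P<braced>[_a-z][_a-z0-9]*)\}|    # A brace-delimited identifier
    (?P<bogus>\{|\})                     # Any other { or } is ill-formed
""", re.IGNORECASE | re.VERBOSE)


def brace_substitute(template, mapping):
    """Replace {name} placeholders in template with values from mapping."""
    def convert(mo):
        if mo.group('escapedlt') is not None:
            return '{'
        if mo.group('escapedrt') is not None:
            return '}'
        if mo.group('bogus') is not None:
            raise ValueError('Invalid placeholder at index %d' %
                             mo.start('bogus'))
        return str(mapping[mo.group('braced')])
    return BRACE_PATTERN.sub(convert, template)


print(brace_substitute('{name} owes {{{amount}}}',
                       {'name': 'Guido', 'amount': 5}))
# Guido owes {5}
```

The alternatives are tried left to right, so in "{{{amount}}}" the
first two braces are consumed as an escape before the placeholder is
matched.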
From barry at python.org  Fri Sep  3 17:30:24 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep  3 17:30:29 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292:
	Simple String Substitutions
In-Reply-To: <uk6vicaxo.fsf@yahoo.co.uk>
References: <20040827233958.GA5560@panix.com>
	<000501c48d34$19ce9000$e841fea9@oemcomputer>
	<uk6vicaxo.fsf@yahoo.co.uk>
Message-ID: <1094225424.8811.71.camel@geddy.wooz.org>

Skipped content of type multipart/mixed
From barry at python.org  Fri Sep  3 17:38:52 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep  3 17:38:55 2004
Subject: [Python-Dev] Alternative Implementation for PEP 292:
	SimpleString Substitutions
In-Reply-To: <003301c48e55$04942ac0$e841fea9@oemcomputer>
References: <003301c48e55$04942ac0$e841fea9@oemcomputer>
Message-ID: <1094225932.8788.85.camel@geddy.wooz.org>

On Mon, 2004-08-30 at 01:48, Raymond Hettinger wrote:

> By not inheriting from unicode, the bug can be fixed while retaining a
> class implementation (see sandbox\curry292.py for an example).
> 
> But, be clear, it *is* a bug.
> 
> If all the inputs are strings, Unicode should not magically appear.  See
> all the other string methods as an example.  

But the Template classes aren't string methods, so I don't think the
analogy is quite right.  Because the template string itself is by
definition a Unicode, it actually makes more sense that everything its
mod operator returns is also a Unicode.  So I still don't think it's a
bug.

> Someday, all will be
> Unicode, until then, some apps choose to remain Unicode free.  Also,
> there is a build option to not even compile Unicode support -- it would
> be a bummer to have the $ templates fail as a result.

Maybe.  Like the doctor says, well, don't do that!  (i.e. use Templates
and disable unicode).

-Barry

From Paul.Moore at atosorigin.com  Fri Sep  3 17:49:59 2004
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Fri Sep  3 17:50:04 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292:Simple
	String Substitutions
Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>

From: Barry Warsaw
> Attached is a demo using the 2.4a3 implementation of string.Template. 
> Note that the only change in the Template subclass is the pattern, and
> there, it's just that the 'named' and 'braced' groups got a '.' in the
> second character class.

Ah, I follow. The lookup logic is in the mapping class rather than in
the template.

Would it be useful to factor out the "identifier syntax" bit of the
pattern? The "escaped" and "bogus" groups are less likely to need
changing than what constitutes an identifier.

Hmm, you'd have to get fancy then, as the "obvious" approach is a
class attribute

    id = "[_a-z][_a-z0-9]*"

but then computing pattern while keeping it as a class attribute is
harder than I can work out right now.
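
One way this could be done is sketched below: keep the identifier
syntax in a class attribute and rebuild the compiled pattern whenever
a subclass is created. The sketch relies on __init_subclass__, which
needs Python 3.6 or later and was not available when this was written
(a metaclass would be the contemporary equivalent). MiniTemplate,
DottedTemplate and _pattern_template are invented names, and this is
not the string.Template implementation.

```python
import re


class MiniTemplate:
    # Subclasses override only this piece of the grammar.
    idpattern = r"[_a-z][_a-z0-9]*"

    _pattern_template = r"""
        \$(?:
          (?P<escaped>\$)         |   # $$ is an escaped dollar sign
          (?P<named>%(id)s)       |   # $identifier
          \{(?P<braced>%(id)s)\}  |   # ${identifier}
          (?P<bogus>)                 # anything else after $ is an error
        )
    """

    def __init_subclass__(cls, **kwargs):
        # Recompute the pattern from the subclass's own idpattern.
        super().__init_subclass__(**kwargs)
        cls.pattern = cls._compile_pattern()

    @classmethod
    def _compile_pattern(cls):
        return re.compile(cls._pattern_template % {"id": cls.idpattern},
                          re.IGNORECASE | re.VERBOSE)

    def __init__(self, template):
        self.template = template

    def substitute(self, mapping):
        def convert(mo):
            if mo.group("escaped") is not None:
                return "$"
            if mo.group("bogus") is not None:
                raise ValueError("Invalid placeholder at index %d"
                                 % mo.start("bogus"))
            return str(mapping[mo.group("named") or mo.group("braced")])
        return self.pattern.sub(convert, self.template)


# The base class is not its own subclass, so build its pattern once here.
MiniTemplate.pattern = MiniTemplate._compile_pattern()


class DottedTemplate(MiniTemplate):
    # Only the identifier syntax changes: dots are allowed after the first char.
    idpattern = r"[_a-z][_a-z0-9.]*"


print(MiniTemplate("Hello $user.name").substitute({"user": "Paul"}))
# Hello Paul.name
print(DottedTemplate("Hello $user.name").substitute({"user.name": "Paul"}))
# Hello Paul
```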

Forget it - let's keep it simple until someone shows a real need.

Thanks for the sample,
Paul.


From bob at redivi.com  Fri Sep  3 18:58:46 2004
From: bob at redivi.com (Bob Ippolito)
Date: Fri Sep  3 18:58:52 2004
Subject: [Python-Dev] Coernic Desktop Search versus shutil.rmtree
In-Reply-To: <1f7befae0409021830219eb2fe@mail.gmail.com>
References: <1f7befae040901184316a8ebf6@mail.gmail.com>
	<20040902132002.GA13089@panix.com>
	<1f7befae04090207371f5b2142@mail.gmail.com>
	<20040903001101.GA16770@panix.com>
	<1f7befae0409021830219eb2fe@mail.gmail.com>
Message-ID: <84EC7C1B-FDCA-11D8-95A7-000A95686CD8@redivi.com>


On Sep 2, 2004, at 9:30 PM, Tim Peters wrote:

> [Aahz]
>> Question is, what happens when you use Explorer while Coernic is busy
>> inside a folder?  If it barfs, then I think it's reasonable for 
>> rmtree()
>> to barf.  Or are you saying that it's not possible to make that test?
>
> I didn't claim it was unreasonable for shutil.rmtree to barf, and I
> have no interest in making that test.  As mentioned before, Copernic's
> use of the filesystem hooks drives CVS crazy too.  It's a new app, and
> using the filesystem hooks transparently is a subtle undertaking.
> They'll fix it eventually.

It could very well be a bug in Windows, too.  I think I ran across one 
last night.

Sometimes win32's os.stat(...) raises an exception for folders that 
*do* exist.  The only case I ran into was an iPod though.  The iPod is 
a 3rd generation 15gb (read: not a new click wheel), FAT32 formatted 
(fresh restore from the win32 iPod Updater) managed by iTunes with 
"Enable disk use" on, and is plugged into a laptop with a "low speed" 
USB port (I think 1.1, whatever was before 2.0).  It is mounted as 
'F:\\' and it has several folders on the root (IIRC ['Notes', 
'Calendars', 'Contacts', 'iPod_Control']).  Nothing special about any 
of the folders, except iPod_Control which is attrib +h.

os.stat fails on EVERY folder except 'F:\\iPod_Control'.  
os.listdir('F:\\') shows them all.  win32api.GetFileAttributes works on 
all of these folders and returns just FILE_DIRECTORY (or whatever it's 
called, I don't have a win32 machine where I'm at right now; the 
constant is 16).  iPod_Control probably returns a slightly different 
set of flags, I don't remember trying it.

os.stat('F:\\Notes\\Instructions') succeeds (Instructions is a file).
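
A rough way to reproduce the mismatch is to list a directory and try to
stat every entry it reports. This is only a sketch: stat_failures is a
hypothetical helper, not something from the thread, and the 'F:\\' root
from the report would replace "." on the affected machine.

```python
import os


def stat_failures(root):
    """Return (path, exception) pairs for entries that os.listdir()
    reports but os.stat() fails on."""
    failures = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        try:
            os.stat(path)
        except OSError as exc:
            failures.append((path, exc))
    return failures


# Example run against the current directory.
for path, exc in stat_failures("."):
    print(path, "->", exc)
```

On a healthy filesystem the list should normally be empty; in the case
described above it would presumably contain every root folder except
iPod_Control.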

I believe this may be a bug in _wstat64i (name may be slightly wrong.. 
from memory) or something?  I tried it on Windows XP (not SP2, but 
should be otherwise patched up) with the python.org distributions of 
Python 2.3.0 and Python 2.3.4 and got the same results both times.

-bob
From raymond.hettinger at verizon.net  Fri Sep  3 20:51:52 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri Sep  3 20:52:37 2004
Subject: [Python-Dev] Alternative Implementation for PEP 292:SimpleString
	Substitutions
In-Reply-To: <1094225932.8788.85.camel@geddy.wooz.org>
Message-ID: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>

> > By not inheriting from unicode, the bug can be fixed while retaining a
> > class implementation (see sandbox\curry292.py for an example).
> >
> > But, be clear, it *is* a bug.
> >
> > If all the inputs are strings, Unicode should not magically appear. See
> > all the other string methods as an example.
> 
> But the Template classes aren't string methods, so I don't think the
> analogy is quite right.  Because the template string itself is by
> definition a Unicode, it actually makes more sense that everything its
> mod operator returns is also a Unicode.  So I still don't think it's a
> bug.

Templates are not Unicode by definition.  That is an arbitrary
implementation quirk and a design flaw.

The '%(key)s' forms do not behave this way.  They return str unless one
of the inputs is unicode.

People should be able to use Python and not have to deal with Unicode
unless that is an intentional part of their design.

Unless there is some compelling advantage to going beyond the PEP and
changing all the rules, it is a bug.


Raymond

From skip at pobox.com  Fri Sep  3 21:21:18 2004
From: skip at pobox.com (Skip Montanaro)
Date: Fri Sep  3 21:21:31 2004
Subject: [Python-Dev] Isn't the trunk still frozen?
Message-ID: <16696.50222.258426.648284@montanaro.dyndns.org>

Anthony gave me a mild virtual rap on the knuckles for checking in a doc
change yesterday while the trunk was frozen for the release.  Later on he
asked if we could keep it frozen for a bit more.  I just saw a bunch of
checkin messages float by, but haven't seen an all-clear from Anthony.
Isn't the trunk still frozen?

Skip
From bac at OCF.Berkeley.EDU  Fri Sep  3 21:30:48 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Sep  3 21:31:58 2004
Subject: [Python-Dev] Isn't the trunk still frozen?
In-Reply-To: <16696.50222.258426.648284@montanaro.dyndns.org>
References: <16696.50222.258426.648284@montanaro.dyndns.org>
Message-ID: <4138C668.3080009@ocf.berkeley.edu>

Skip Montanaro wrote:

> Anthony gave me a mild virtual rap on the knuckles for checking in a doc
> change yesterday while the trunk was frozen for the release.  Later on he
> asked if we could keep it frozen for a bit more.  I just saw a bunch of
> checkin messages float by, but haven't seen an all-clear from Anthony.
> Isn't the trunk still frozen?
> 

Although Anthony has not given the explicit go-ahead for checkins he 
said it would be okay after 14:00 UTC today (which is 9:00 CDT) in 
another email to the list so it should be okay.

Personally I am still waiting for Anthony to say it is okay to check in 
again.

-Brett
From mal at egenix.com  Fri Sep  3 22:37:54 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri Sep  3 22:37:59 2004
Subject: [Python-Dev] Alternative Implementation for PEP 292:SimpleString
	Substitutions
In-Reply-To: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
Message-ID: <4138D622.6050807@egenix.com>

Raymond Hettinger wrote:
>>>By not inheriting from unicode, the bug can be fixed while retaining a
>>>class implementation (see sandbox\curry292.py for an example).
>>>
>>>But, be clear, it *is* a bug.
>>>
>>>If all the inputs are strings, Unicode should not magically appear. See
>>>all the other string methods as an example.
>>
>>But the Template classes aren't string methods, so I don't think the
>>analogy is quite right.  Because the template string itself is by
>>definition a Unicode, it actually makes more sense that everything its
>>mod operator returns is also a Unicode.  So I still don't think it's a
>>bug.
> 
> 
> Templates are not Unicode by definition.  That is an arbitrary
> implementation quirk and a design flaw.
> 
> The '%(key)s' forms do not behave this way.   They return str unless one
> of the inputs are unicode.
> 
> People should be able to use Python and not have to deal with Unicode
> unless that is an intentional part of their design.
> 
> Unless there is some compelling advantage to going beyond the PEP and
> changing all the rules, it is a bug.

I think Barry needs some backup here.

First, please be aware that normal use of Templates is for formatting
*text* data. Second, it is good design and good practice to store text
data in Unicode objects, because that's what they were designed for,
while string objects have always been an abstract container for storing
bytes with varying meanings and interpretations. The latter is
a design flaw that needs to get fixed, not the choice of Unicode
as Template base class.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 03 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From bac at OCF.Berkeley.EDU  Fri Sep  3 22:43:35 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Sep  3 22:43:50 2004
Subject: [Python-Dev] Making custom patterns for string.Template easier
 (was: Alternative Implementation for PEP 292)
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
Message-ID: <4138D777.3090009@ocf.berkeley.edu>

Moore, Paul wrote:

[SNIP]
> Hmm, you'd have to get fancy then, as the "obvious" approach is a
> class attribute
> 
>     id = "[_a-z][_a-z0-9]*"
> 
> but then computing pattern while keeping it as a class attribute is
> harder than I can work out right now.
> 
> Forget it - let's keep it simple until someone shows a real need.
> 

OK, but it isn't *that* bad.  I already have it so that the parts of the 
pattern can at least be separate and it leaves the class alone::

 >>> test = string.Template('This has a ${dotted.thing} in it')
[26527 refs]
 >>> test.braced = "\${(?P<braced>[_a-z][_a-z0-9]*(\.[_a-z0-9]+)?)}"
[26532 refs]
 >>> test % {'dotted.thing': "dotted name"}
u'This has a dotted name in it'
 >>> string.Template('This has a ${dotted.thing} in it') % {'dotted.thing': "dotted named"}
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/Users/drifty/Code/CVS/python/dist/src/Lib/string.py", line 123, in __mod__
     return self.pattern.sub(convert, self)
   File "/Users/drifty/Code/CVS/python/dist/src/Lib/string.py", line 119, in convert
     raise ValueError('Invalid placeholder at index %d' %
ValueError: Invalid placeholder at index 11


Making it so that one doesn't have to specify the extra stuff (such as 
braces, $, group name, etc.) would not be hard but could take away from 
the power of it all.  But it does not in any way mess with the class and 
the class' regex is still compiled at class creation time so slowdown 
from anything only happens if someone changes something (did rip out the 
empty __slots__ value, though).

But once again I don't know how useful it would be.  The only thing 
coming off the top of my head is Raymond's Cheetah example of making the 
rules looser for bogus $s.  With this you just need to substitute 
self.bogus.  Nice thing about that is if we change the rules for the 
other pattern groups later on, the past code will get that benefit instead 
of being locked into the pattern they probably copied at the time of 
writing and pasted in with their minor tweak.  Also won't lead to errors 
down the road if we add another group to the pattern.

Anyway, I did this partially as an exercise so not a huge deal to me if 
it doesn't make it in, so +0 from me for adding the functionality.

-Brett
From anthony at interlink.com.au  Fri Sep  3 22:51:13 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri Sep  3 22:52:09 2004
Subject: [Python-Dev] Isn't the trunk still frozen?
In-Reply-To: <4138C668.3080009@ocf.berkeley.edu>
References: <16696.50222.258426.648284@montanaro.dyndns.org>
	<4138C668.3080009@ocf.berkeley.edu>
Message-ID: <4138D941.4050608@interlink.com.au>

Brett C. wrote:
> Although Anthony has not given the explicit go-ahead for checkins he 
> said it would  be okay after 14:00 UTC today (which is 9:00 CST) in 
> another email to the list so it should be okay.
> 
> Personally I am still waiting for Anthony to say it is okay to check in 
> again.

I'm very concerned by this bug report:

http://www.python.org/sf/1022010

My Windows box has a dodgy RAID card, so I can't check it myself, but
the -n -e is definitely in Tools/msi/msi.py. I have no idea whether
they're meant to be - the msi stuff is a complete unknown to me.

Martin?

Anthony


-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From tim.peters at gmail.com  Fri Sep  3 23:01:08 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Fri Sep  3 23:01:10 2004
Subject: [Python-Dev] Isn't the trunk still frozen?
In-Reply-To: <4138D941.4050608@interlink.com.au>
References: <16696.50222.258426.648284@montanaro.dyndns.org>
	<4138C668.3080009@ocf.berkeley.edu>
	<4138D941.4050608@interlink.com.au>
Message-ID: <1f7befae040903140179f8919@mail.gmail.com>

[Anthony Baxter]
> I'm very concerned by this bugreport:
> 
> http://www.python.org/sf/1022010
> 
> My windows box has a dodgy RAID card, so I can't check it myself,
> but the -n -e is definately in Tools/msi/msi.py. I have no ideas
> whether they're meant to be - the msi stuff is a complete unknown
> to me.
> 
> Martin?

I think the bug report got it right:  due to a copy-'n-paste error, the
associations for .py (etc) files were mistakenly given IDLE-specific
arguments.  So they don't work.  Everyone on c.l.py who noticed this
(the bug report isn't unique) quickly figured out how to fix it
themselves (by editing the file associations created on their box to get
rid of the inappropriate arguments).

Presumably Martin can fix that by building a new MSI installer after
repairing the MSI setup.  If he does that before doing a cvs up, it's
"just" a matter of cutting a new Windows installer.
From amk at amk.ca  Sat Sep  4 00:13:39 2004
From: amk at amk.ca (A.M. Kuchling)
Date: Sat Sep  4 00:13:44 2004
Subject: [Python-Dev] Isn't the trunk still frozen?
In-Reply-To: <1f7befae040903140179f8919@mail.gmail.com>
References: <16696.50222.258426.648284@montanaro.dyndns.org>
	<4138C668.3080009@ocf.berkeley.edu>
	<4138D941.4050608@interlink.com.au>
	<1f7befae040903140179f8919@mail.gmail.com>
Message-ID: <20040903221339.GA5870@rogue.amk.ca>

On Fri, Sep 03, 2004 at 05:01:08PM -0400, Tim Peters wrote:
> Presumably Martin can fix that by building a new MSI installer after
> repairing the MSI setup.  If he does that before doing a cvs up, it's
> "just" a matter of cutting a new Windows installer.

Should Raymond's last random.py bugfix also be included, if there's
going to be a quick 2.4a4?

--amk

From raymond.hettinger at verizon.net  Sat Sep  4 00:36:28 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat Sep  4 00:37:15 2004
Subject: [Python-Dev] Isn't the trunk still frozen?
In-Reply-To: <20040903221339.GA5870@rogue.amk.ca>
Message-ID: <001201c49206$74505fa0$e841fea9@oemcomputer>

> > Presumably Martin can fix that by building a new MSI installer after
> > repairing the MSI setup.  If he does that before doing a cvs up, it's
> > "just" a matter of cutting a new Windows installer.
> 
> Should Raymond's last random.py bugfix also be included, if there's
> going to be a quick 2.4a4?

That's up to Anthony.
It is a critical fix.


Raymond

From raymond.hettinger at verizon.net  Sat Sep  4 01:15:17 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat Sep  4 01:16:06 2004
Subject: [Python-Dev] Alternative Implementation for PEP 292:SimpleString
	Substitutions
In-Reply-To: <4138D622.6050807@egenix.com>
Message-ID: <001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>

[MAL]
> Second, it is good design and good practice to store text
> data in Unicode objects, because that's what they were designed for,
> while string objects have always been an abstract container for storing
> bytes with varying meanings and interpretations.

IMO, it is subversive to start taking new string functions/methods and
coercing their results to Unicode.  Someday we may be there, Py3.0
perhaps, but str is not yet deprecated.  Until then, a user should
reasonably expect SISO (str in, str out).  This is doubly true when the
rest of python makes active efforts to avoid SIUO (str in, unicode out;
see % formatting and ''.join() for example).

Someday Guido may get wild and turn all text uses of str into unicode.
Most likely, it will need a PEP so that all the issues get thought
through and everything gets changed at once.  Slipping this into the
third alpha as if it were part of PEP292 is not a good idea.

The PEP was about simplification.  Tossing in unnecessary unicode
coercions is not in line with that goal.

Does anyone else think this is a crummy idea?
Is everyone ready for unicode coercions to start sprouting everywhere?


Raymond

From aahz at pythoncraft.com  Sat Sep  4 01:49:45 2004
From: aahz at pythoncraft.com (Aahz)
Date: Sat Sep  4 01:49:47 2004
Subject: [Python-Dev] Alternative Implementation for PEP 292:SimpleString
	Substitutions
In-Reply-To: <001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>
References: <4138D622.6050807@egenix.com>
	<001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>
Message-ID: <20040903234945.GA5856@panix.com>

On Fri, Sep 03, 2004, Raymond Hettinger wrote:
>
> The PEP was about simplification.  Tossing in unnecessary unicode
> coercions is not in line with that goal.
> 
> Does anyone else think this is a crummy idea?
> Is everyone ready for unicode coercions to start sprouting everywhere?

+0 (agreeing with Raymond)

Correct me if I'm wrong, but there are a couple of issues here:

* First of all, I believe that unicode strings are interoperable (down
to hashing) with 8-bit strings, as long as there are no non-7-bit ASCII
characters.  Where things get icky is with encoded 8-bit strings making
use of e.g. Latin-1.  So the question is whether we need full
interoperability.

* Unicode strings take four bytes per character (not counting decomposed
characters).  Is it fair at this point in Python's evolution to force
this kind of change in performance metric, essentially silently?

The PEP and docs do make the issue of Unicode fairly clear up-front, so
anyone choosing to use template strings knows what zie is getting into.
But what about someone grabbing a module that uses template strings
internally?....

OTOH, I'm not up for making a big issue out of this.  If Raymond really
is the only person who feels strongly about it, it probably isn't going
to be a big deal in practice.  In addition, I think it's the kind of
change that could be easily fixed in the next release.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes
From martin at v.loewis.de  Sat Sep  4 03:46:24 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat Sep  4 03:46:21 2004
Subject: [Python-Dev] Isn't the trunk still frozen?
In-Reply-To: <4138D941.4050608@interlink.com.au>
References: <16696.50222.258426.648284@montanaro.dyndns.org>
	<4138C668.3080009@ocf.berkeley.edu>
	<4138D941.4050608@interlink.com.au>
Message-ID: <41391E70.6000204@v.loewis.de>

> My windows box has a dodgy RAID card, so I can't check it myself, but
> the -n -e is definately in Tools/msi/msi.py. I have no ideas whether
> they're meant to be - the msi stuff is a complete unknown to me.

I've now corrected the MSI, and put a new version on

http://www.dcl.hpi.uni-potsdam.de/home/loewis/python-2.4a3.2.msi
8e84ce2308613955b54673bbfb47697f python-2.4a3.2.msi

This has the very same binaries as the previous package - just
the packaging itself has changed (also bringing back "Edit with
IDLE" in the context menu).

Regards,
Martin
From fredrik at pythonware.com  Sat Sep  4 08:23:26 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep  4 08:21:44 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP
	292:SimpleStringSubstitutions
References: <4138D622.6050807@egenix.com>
	<001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>
Message-ID: <chbmte$380$1@sea.gmane.org>

Raymond Hettinger wrote:
>
> The PEP was about simplification.  Tossing in unnecessary unicode
> coercions is not in line with that goal.
>
> Does anyone else think this is a crummy idea?

Yes.  Whatever MAL and Barry think, Python's current model is 8+8=8,
U+U=U, and 8+U=U for ascii U.  That's an advantage, not a bug.

> Is everyone ready for unicode coercions to start sprouting everywhere?

No.

And when that time comes, storing everything as 32-bit characters is not the
right answer either.

</F> 


From mal at egenix.com  Sat Sep  4 13:51:22 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat Sep  4 13:51:40 2004
Subject: [Python-Dev] Alternative Implementation for PEP 292:SimpleString
	Substitutions
In-Reply-To: <001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>
References: <001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>
Message-ID: <4139AC3A.8050501@egenix.com>

Raymond Hettinger wrote:
> [MAL]
> 
>>Second, it is good design and good practice to store text
>>data in Unicode objects, because that's what they were designed for,
>>while string objects have always been an abstract container for
>>storing bytes with varying meanings and interpretations.

Hmm, I wonder why you cut away the first part: "First, please be
aware that normal use of Templates is for formatting *text* data."

This is the most important argument for making Template
a Unicode-subclass. Coercion to Unicode then is a logical
consequence and fully in line with what Python has been
doing since version 1.6, i.e. U=U+U and U=U+8 (to use /F's
notation).

> IMO, it is subversive to start taking new string functions/methods and
> coercing their results to Unicode. 

I don't understand... there's nothing subversive here. If strings
meet Unicode the result gets coerced to Unicode. Nothing
surprising here.

Why are you guys putting so much effort into fighting
Unicode ? I often get the impression that you are considering
Unicode a nightmare rather than a blessing.

-- 
Marc-Andre Lemburg
eGenix.com

From mal at egenix.com  Sat Sep  4 14:05:32 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat Sep  4 14:05:39 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP	292:SimpleStringSubstitutions
In-Reply-To: <chbmte$380$1@sea.gmane.org>
References: <4138D622.6050807@egenix.com>	<001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>
	<chbmte$380$1@sea.gmane.org>
Message-ID: <4139AF8C.9040907@egenix.com>

Fredrik Lundh wrote:
> Raymond Hettinger wrote:
> 
>>The PEP was about simplification.  Tossing in unnecessary unicode
>>coercions is not in line with that goal.
>>
>>Does anyone else think this is a crummy idea?
> 
> 
> Yes.  Whatever MAL and Barry thinks, Python's current model is 8+8=8,
> U+U=U, and 8+U=U for ascii U.  That's an advantage, not a bug.

Indeed, but I don't see how that's different from what the PEP
is saying.

>>Is everyone ready for unicode coercions to start sprouting everywhere?
> 
> No.
> 
> And when that time comes, storing everything as 32-bit characters is not the
> right answer either.

I'll leave that for the libc designers to decide :-)

If you look at performance, there's not much difference between
8-bit strings and Unicode, so the only argument against using
Unicode for storing text data is memory usage.

-- 
Marc-Andre Lemburg
eGenix.com

From martin at v.loewis.de  Sat Sep  4 15:12:10 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat Sep  4 15:12:06 2004
Subject: [Python-Dev] random.py still broken wrt. urandom
Message-ID: <4139BF2A.6000907@v.loewis.de>

I consider the random module still broken in its current form (1.66).
It tries to invoke os.urandom(1) in order to find out whether
urandom works. Instead, it should defer that determination until
urandom is actually used; i.e. instead of

             if _urandom is None:
                 import time
                 a = long(time.time() * 256) # use fractional seconds
             else:
                 a = long(_hexlify(_urandom(16)), 16)

it should read

             try:
                 a = long(_hexlify(os.urandom(16)), 16)
             except NotImplementedError:
                 import time
                 a = long(time.time() * 256) # use fractional seconds

IMO the patch to random.py should not have been applied without a
review.

Regards,
Martin
From fredrik at pythonware.com  Sat Sep  4 15:20:55 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep  4 15:19:08 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation
	forPEP	292:SimpleStringSubstitutions
References: <4138D622.6050807@egenix.com>	<001f01c4920b$e0ac4ba0$e841fea9@oemcomputer><chbmte$380$1@sea.gmane.org>
	<4139AF8C.9040907@egenix.com>
Message-ID: <chcfc7$deb$1@sea.gmane.org>

M.-A. Lemburg wrote:

>> Yes.  Whatever MAL and Barry thinks, Python's current model is 8+8=8,
>> U+U=U, and 8+U=U for ascii U.  That's an advantage, not a bug.
>
> Indeed, but I don't see how that's different from what the PEP
> is saying.

the current implementation is

     T(8) % 8 = U.

which violates the 8+8=8 rule.

>> And when that time comes, storing everything as 32-bit characters is not the
>> right answer either.
>
> I'll leave that for the libc designers to decide :-)
>
> If you look at performance, there's not much difference between
> 8-bit strings and Unicode, so the only argument against using
> Unicode for storing text data is memory usage.

I used to make that argument, but these days, I no longer think that you can
talk about performance without taking memory usage into account.

</F> 


From tim.peters at gmail.com  Sat Sep  4 16:40:55 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sat Sep  4 16:40:58 2004
Subject: [Python-Dev] random.py still broken wrt. urandom
In-Reply-To: <4139BF2A.6000907@v.loewis.de>
References: <4139BF2A.6000907@v.loewis.de>
Message-ID: <1f7befae040904074040c1ba4e@mail.gmail.com>

[Martin v. Löwis]
> I consider the random module still broken in its current form (1.66).
> It tries to invoke random.urandom(1) in order to find out whether
> urandom works. Instead, it should defer that determination until
> urandom is actually used;

Why?

> i.e. instead of
> 
>             if _urandom is None:
>                 import time
>                 a = long(time.time() * 256) # use fractional seconds
>             else:
>                 a = long(_hexlify(_urandom(16)), 16)
> 
> it should read
> 
>             try:
>                 a = long(_hexlify(os.urandom(16)), 16)
>             except NotImplementedError:
>                 import time
>                 a = long(time.time() * 256) # use fractional seconds

Why?  I like it better the way it is, in part because this kind of
determination is made at least 4 times in random.py, and the "_urandom
is None" spelling is quite clear.  The

    from binascii import hexlify as _hexlify

import certainly doesn't belong in the try/except block setting that up, though.

> IMO the patch to random.py should not have been applied without a
> review.

I think that falls under the "expert rule":  Raymond has done more
work on random.py than everyone else combined over the last year or
two, and he had no reason to suspect this change would be
controversial.  To the contrary, I specifically suggested (on
python-dev) that using urandom in seed() methods, when available,
would be a significant improvement over time.time()-based seeding.

Now that you've made your objection, I confess I still have no idea
why you're objecting (see "why?" <wink>).  I did review the patch
(after the fact) for numeric correctness (which did lead to changing
the code, due to a subtle numeric flaw in the original
HardwareRandom.random).
From martin at v.loewis.de  Sat Sep  4 17:55:15 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat Sep  4 17:55:12 2004
Subject: [Python-Dev] random.py still broken wrt. urandom
In-Reply-To: <1f7befae040904074040c1ba4e@mail.gmail.com>
References: <4139BF2A.6000907@v.loewis.de>
	<1f7befae040904074040c1ba4e@mail.gmail.com>
Message-ID: <4139E563.8010506@v.loewis.de>

Tim Peters wrote:
>>I consider the random module still broken in its current form (1.66).
>>It tries to invoke random.urandom(1) in order to find out whether
>>urandom works. Instead, it should defer that determination until
>>urandom is actually used;
> 
> 
> Why?

Invoking urandom() causes /dev/urandom to be opened on Unix (if
available). I really would prefer if merely importing the random
module would not open files (and keep them open for the entire
Python run). Operations like this contribute to startup time.

Now, since the random module also creates a Random object, importing
random would still open /dev/urandom even if the logic dealing with
its absence was somewhat deferred - unless seeding the RNG would
also be deferred until it is first used.

Still, consuming randomness just to determine whether it is
available seems wrong. Importing a module should not affect
system state unless absolutely necessary.
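
To make the deferred check concrete, here is a small sketch in Python 3
syntax. It is not the random.py code: default_seed and its urandom
parameter are names invented for illustration, and the parameter exists
only so the fallback path can be exercised.

```python
import os
import time
from binascii import hexlify


def default_seed(urandom=os.urandom):
    """Return an integer seed, asking the OS for entropy only when called."""
    try:
        # Nothing is opened or consumed until a seed is actually needed.
        return int(hexlify(urandom(16)), 16)
    except NotImplementedError:
        # No OS entropy source on this platform: fall back to the clock.
        return int(time.time() * 256)  # use fractional seconds


seed = default_seed()
```

Written this way, importing the module has no side effects; the cost is
paid the first time a seed is requested.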

> I think that falls under the "expert rule":  Raymond has done more
> work on random.py than everyone else combined over the last year or
> two, and he had no reason to suspect this change would be
> controversial.  To the contrary, I specifically suggested (on
> python-dev) that using urandom in seed() methods, when available,
> would be a significant improvement over time.time()-based seeding.

I have no problem with that aspect of the change. However, I wish
somebody had noticed that unavailability of urandom is expressed
through a NotImplementedError, not through absence of the function
itself. It is bad luck that this state of the code was released
as 2.4a3.

Regards,
Martin
From barry at python.org  Sat Sep  4 18:25:38 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Sep  4 18:25:46 2004
Subject: [Python-Dev] Alternative Implementation for PEP
	292:SimpleString Substitutions
In-Reply-To: <4138D622.6050807@egenix.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com>
Message-ID: <1094315138.8696.36.camel@geddy.wooz.org>

On Fri, 2004-09-03 at 16:37, M.-A. Lemburg wrote:

> I think Barry needs some backup here.

Thanks MAL!

I'll point out that Template was very deliberately subclassed from
unicode, so Template instances /are/ unicode objects.  From the
standpoint of type conversion, using /F's notation, T(8) == U, thus
because U % 8 == U, T(8) % 8 == U.

Other than .encode() are there any other methods of unicode objects that
return 8-bit strings?  I don't think so, so it seems completely natural
that T % 8 returns U.

Raymond is against the class-based implementation of PEP 292, but if you
accept the class implementation of 292 (which I still believe is the
right choice), then the fact that the mod operator always returns a
unicode makes perfect sense.

-Barry

From barry at python.org  Sat Sep  4 18:30:45 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Sep  4 18:30:50 2004
Subject: [Python-Dev] Alternative Implementation for PEP
	292:SimpleString Substitutions
In-Reply-To: <4139AC3A.8050501@egenix.com>
References: <001f01c4920b$e0ac4ba0$e841fea9@oemcomputer>
	<4139AC3A.8050501@egenix.com>
Message-ID: <1094315445.8693.40.camel@geddy.wooz.org>

On Sat, 2004-09-04 at 07:51, M.-A. Lemburg wrote:

> Why are you guys putting so much effort into fighting
> Unicode ? I often get the impression that you are considering
> Unicode a nightmare rather than a blessing.

Indeed.  For example, the only way to maintain your sanity in an i18n'd
application is to convert all text[1] to unicode as early as possible,
deal with only unicode internally, and encode to 8bit strings as late as
possible, if ever.

-Barry

[1] "text" defined as "strings intended for human consumption".
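
A minimal sketch of the "decode early, encode late" rule described
above, written for Python 3 (where str is the unicode type and bytes is
the 8bit type).  The function name shout and the default encoding are
assumptions made for illustration only:

```python
def shout(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Upper-case some encoded text, touching bytes only at the edges."""
    text = raw.decode(encoding)    # decode as early as possible
    text = text.upper()            # all internal work is done on unicode text
    return text.encode(encoding)   # encode as late as possible


if __name__ == "__main__":
    print(shout("café".encode("utf-8")).decode("utf-8"))
```

Doing the upper() call on the decoded text is what lets the accented
character be handled correctly regardless of the byte encoding in use.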

From barry at python.org  Sat Sep  4 18:32:39 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Sep  4 18:32:42 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292:Simple
	String Substitutions
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
Message-ID: <1094315559.8696.42.camel@geddy.wooz.org>

On Fri, 2004-09-03 at 11:49, Moore, Paul wrote:

> Would it be useful to factor out the "identifier syntax" bit of the
> pattern? The "escaped" and "bogus" groups are less likely to need
> changing than what constitutes an identifier.

And if they did, you'd want to change them both at the same time.  Do
you have any ideas for an efficient, easily documented implementation?

-Barry

From barry at python.org  Sat Sep  4 18:33:36 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Sep  4 18:33:39 2004
Subject: [Python-Dev] Alternative placeholder delimiters for PEP 292
In-Reply-To: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
References: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
Message-ID: <1094315616.8721.44.camel@geddy.wooz.org>

On Mon, 2004-08-30 at 02:11, Andrew Durdin wrote:

>     I propose that the Template module not use $ to set off
>     placeholders; instead, placeholders are delimited by braces {}.
>     The following rules for {}-placeholders apply:

The PEP 292 rules were specifically chosen for their similarity to
placeholder syntaxes in many other languages.

-Barry

From barry at python.org  Sat Sep  4 18:54:18 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Sep  4 18:54:22 2004
Subject: [Python-Dev] Re: Making custom patterns for string.Template easier
	(was: Alternative Implementation for PEP 292)
In-Reply-To: <4138D777.3090009@ocf.berkeley.edu>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
	<4138D777.3090009@ocf.berkeley.edu>
Message-ID: <1094316858.8696.58.camel@geddy.wooz.org>

On Fri, 2004-09-03 at 16:43, Brett C. wrote:

> Anyway, I did this partially as an exercise so not a huge deal to me if 
> it doesn't make it in, so +0 from me for adding the functionality.

So, where's the code?! :)

-Barry

From raymond.hettinger at verizon.net  Sat Sep  4 22:03:25 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat Sep  4 22:04:12 2004
Subject: [Python-Dev] Alternative Implementation for PEP292:SimpleString
	Substitutions
In-Reply-To: <1094315138.8696.36.camel@geddy.wooz.org>
Message-ID: <006a01c492ba$3d90f920$e841fea9@oemcomputer>

> Other than .encode() are there any other methods of unicode objects
> that return 8bit strings?

That misses the point.  Templates do not have to be unicode objects.
Template can be their own class rather than a subclass of unicode.  The
application does not demand that unicode be mentioned at all.

There seems to be a strong "just live with it" argument but no
advantages are offered other than it matching your personal approach to
text handling.  Why force it when you don't have to.  At least three of
your users (me, Aahz, and Fred) do not want unicode output when we have
str inputs. 


> Raymond is against the class-based implementation of PEP 292,

That choice is independent of the decision of whether to always coerce
to unicode.

Also, it is more accurate to say that I think the __mod__ operator is
not ideal.  If you want to stay with classes, Guido's __call__ syntax is
also fine.  It avoids the issues with %, makes it possible to have
keyword arguments, and lets you take advantage of polymorphism.

The % operator has several issues:
* it is mnemonic for %(name)s substitution not $ formatting.
* it is hard to find in the docs
* it does not accept tuple/scalar arguments like % formatting
* its precedence is more appropriate for int.__mod__
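
A rough sketch of what a __call__-based interface with keyword
arguments could look like, written for current Python 3.  The class
name CallableTemplate is hypothetical; it is layered on the
string.Template class that current Python ships, whose substitute()
method does the actual work:

```python
from string import Template


class CallableTemplate(Template):
    """Hypothetical Template variant that substitutes when called."""

    def __call__(self, mapping=None, **kwds):
        # Accept a mapping, keyword arguments, or both.
        if mapping is None:
            return self.substitute(**kwds)
        return self.substitute(mapping, **kwds)


if __name__ == "__main__":
    t = CallableTemplate("$who likes $what")
    print(t(who="tim", what="spam"))
    print(t({"who": "tim"}, what="eggs"))
```

Calling the template sidesteps the operator-precedence and
tuple-versus-scalar questions listed above, since ordinary function
call rules apply.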


Raymond

From adurdin at gmail.com  Sun Sep  5 00:23:49 2004
From: adurdin at gmail.com (Andrew Durdin)
Date: Sun Sep  5 00:23:52 2004
Subject: [Python-Dev] Alternative placeholder delimiters for PEP 292
In-Reply-To: <59e9fd3a0409041520114d0604@mail.gmail.com>
References: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
	<1094315616.8721.44.camel@geddy.wooz.org>
	<59e9fd3a0409041520114d0604@mail.gmail.com>
Message-ID: <59e9fd3a04090415236f9bc0ac@mail.gmail.com>

On Sat, 04 Sep 2004 12:33:36 -0400, Barry Warsaw <barry@python.org> wrote:
> On Mon, 2004-08-30 at 02:11, Andrew Durdin wrote:
>
> >     I propose that the Template module not use $ to set off
> >     placeholders; instead, placeholders are delimited by braces {}.
> >     The following rules for {}-placeholders apply:
>
> The PEP 292 rules were specifically chosen for their similarity to
> placeholder syntaxes in many other languages.

Sure. But just because many other languages do it that way doesn't
mean that it's the best way for Python. There are significant
advantages to using paired delimiters instead of a single prefix
delimiter.

The "Rationale" section of PEP 292 says only that the desire was for
something simpler than the built-in % substitution; if the similarity
to many other languages is also an important part of the rationale,
then the PEP should be modified to take that into account, should it
not?
From bac at OCF.Berkeley.EDU  Sun Sep  5 01:22:26 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sun Sep  5 01:22:30 2004
Subject: [Python-Dev] Re: Making custom patterns for string.Template easier
	(was: Alternative Implementation for PEP 292)
In-Reply-To: <1094316858.8696.58.camel@geddy.wooz.org>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>	<4138D777.3090009@ocf.berkeley.edu>
	<1094316858.8696.58.camel@geddy.wooz.org>
Message-ID: <413A4E32.2010104@ocf.berkeley.edu>

Barry Warsaw wrote:
> On Fri, 2004-09-03 at 16:43, Brett C. wrote:
> 
> 
>>Anyway, I did this partially as an exercise so not a huge deal to me if 
>>it doesn't make it in, so +0 from me for adding the functionality.
> 
> 
> So, where's the code?! :)
> 

Not on SF since the bloody thing is down!  I will stick it up on a 
patch, assign to you, and report the tracker # here as soon as it is 
back up.

-Brett
From tim.peters at gmail.com  Sun Sep  5 06:32:33 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sun Sep  5 06:32:37 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292:Simple
	String Substitutions
In-Reply-To: <1094315559.8696.42.camel@geddy.wooz.org>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
	<1094315559.8696.42.camel@geddy.wooz.org>
Message-ID: <1f7befae040904213277ffa84@mail.gmail.com>

[Paul Moore]
>> Would it be useful to factor out the "identifier syntax" bit of the
>> pattern? The "escaped" and "bogus" groups are less likely to need
>> changing than what constitutes an identifier.
 
[Barry Warsaw]
> And if they did, you'd want to change them both at the same time.  Do
> you have any ideas for an efficient, easily documented implementation?

You'll rarely hear me say this <wink>, but fiddling classes at class
creation time is exactly what metaclasses are for.  For example,
suppose you said a Template subclass could define a class variable
`idpat`, containing a regexp matching that subclass's idea of "an
identifier".

Then we could define a metaclass once-and-for-all, like so:

class _TemplateFiddler(type):

    pattern = r"""
        (?P<escaped>\${2})|     # Escape sequence of two $ signs
        \$(?P<named>%s)|        # $ and a Python identifier
        \${(?P<braced>%s)}|     # $ and a brace delimited identifier
        (?P<bogus>\$)           # Other ill-formed $ expressions
    """

    def __init__(cls, name, bases, dct):
        super(_TemplateFiddler, cls).__init__(name, bases, dct)
        idpat = cls.idpat
        cls.pattern = _re.compile(_TemplateFiddler.pattern % (idpat, idpat),
                                  _re.IGNORECASE | _re.VERBOSE)

That substitutes the idpat regexp into the base pattern in two spots,
compiles it, and attaches the result as the `pattern` attribute of the
class being defined.

The definition of Template changes like so:

class Template(unicode): # same
    """A string class for supporting $-substitutions.""" # same

    __metaclass__ = _TemplateFiddler  # this is new
    __slots__ = [] # same

    idpat = r'[_a-z][_a-z0-9]*'  # this replaces the current `pattern`

    # The rest is the same.

While the implementation relies on understanding metaclasses, users
don't have to know about that.  The docs are easy ("define a class
vrbl `idpat`"), and it's as efficient as if subclasses had compiled
the full regexp themselves.  Indeed, you can do any amount of
computation once in the metaclass __init__, and cache the results in
attributes of the class.
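
For readers who want to try the idea on current Python 3, here is a
self-contained adaptation.  It is a sketch, not the code under
discussion: the Template class below is a plain object rather than a
unicode subclass, and the names base_pattern, names() and
DottedTemplate are assumptions added for illustration.

```python
import re


class _TemplateFiddler(type):
    """Metaclass that compiles a per-class `pattern` from `idpat`."""

    # Base pattern with two %s slots for the identifier regexp.
    base_pattern = r"""
        (?P<escaped>\${2})|     # Escape sequence of two $ signs
        \$(?P<named>%s)|        # $ and an identifier
        \${(?P<braced>%s)}|     # $ and a brace delimited identifier
        (?P<bogus>\$)           # Other ill-formed $ expressions
    """

    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        idpat = cls.idpat
        # Runs once per class definition, so subclasses pay no extra cost.
        cls.pattern = re.compile(
            _TemplateFiddler.base_pattern % (idpat, idpat),
            re.IGNORECASE | re.VERBOSE,
        )


class Template(metaclass=_TemplateFiddler):
    idpat = r'[_a-z][_a-z0-9]*'

    def __init__(self, template):
        self.template = template

    def names(self):
        """Return the placeholder names found in the template."""
        found = []
        for m in self.pattern.finditer(self.template):
            name = m.group('named') or m.group('braced')
            if name:
                found.append(name)
        return found


class DottedTemplate(Template):
    # Hypothetical subclass: identifiers may contain dots.
    idpat = r'[_a-z][_a-z0-9]*(?:\.[_a-z][_a-z0-9]*)*'


if __name__ == "__main__":
    print(Template("$user.name owes ${amount} $$").names())
    print(DottedTemplate("$user.name owes ${amount} $$").names())
```

The subclass only states its idea of an identifier; the metaclass
rebuilds and compiles the full pattern when the class is created.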
From tim.peters at gmail.com  Sun Sep  5 07:02:35 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sun Sep  5 07:02:38 2004
Subject: [Python-Dev] Dangerous exceptions (was Re: Another test_compiler
	mystery)
In-Reply-To: <20040816112916.GA19969@vicky.ecs.soton.ac.uk>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
Message-ID: <1f7befae04090422024afaee58@mail.gmail.com>

[Armin Rigo]
> ... Here is a patch attempting to do what I described:
> http://www.python.org/sf/1009929
>
> It's an extension of the asynchronous exception mechanism used to signal
> between threads.  PyErr_Clear() can send some exceptions to its own thread
> using this mechanism.  (So it is thread-safe.)

I'm sorry that I haven't had time to look at this.  But since I didn't
and don't, let's try to complicate it <wink>.

Some exceptions should never be suppressed unless named explicitly,
and a real bitch is that some user-defined exceptions can fit in that
category too.  The ones that give me (and my employer) the most grief
are the tree of exceptions deriving from ZODB's ConflictError. 
ConflictError is a serious thing:  it essentially means the current
transaction cannot succeed, and the app should give up (and maybe
retry the current transaction from its start).  Suppressing
ConflictError by accident-- even inside a hasattr() call! --can
grossly reduce efficiency, and has a long history too of provoking
subtle, catastrophic, database corruption bugs.

I would like to see Python's exception hierarchy grow more
sophisticated in this respect.  MemoryError, SystemExit, and
KeyboardInterrupt are things that should not be caught by "except
Exception:", neither by a bare "except:", nor by hasattr() or C-level
dict lookup.  ZODB's ConflictError is another of that ilk.  I'd like
to see "except Exception:" become synonymous with bare "except:", and
move the "dangerous exceptions" to subclass off a new branch of the
exception hierarchy.  It could be that something like your patch is
the only practical way to make this work in the C implementation, so
I'm keen on it.
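
As a point of comparison for readers on later Python versions: since
PEP 352, KeyboardInterrupt and SystemExit derive from BaseException
rather than Exception, while MemoryError is still an Exception
subclass.  The sketch below probes that on Python 3.  ConflictError
here is a hypothetical stand-in defined for the example, not ZODB's
class, and the helper name is likewise made up.

```python
class ConflictError(BaseException):
    """Hypothetical 'dangerous' application exception."""


def caught_by_except_exception(exc_type):
    """Report whether `except Exception:` would swallow exc_type."""
    try:
        raise exc_type()
    except Exception:
        return True
    except BaseException:
        return False


if __name__ == "__main__":
    for exc in (ValueError, MemoryError, KeyboardInterrupt,
                SystemExit, ConflictError):
        print(exc.__name__, caught_by_except_exception(exc))
```

Deriving an application exception from BaseException keeps it out of
reach of "except Exception:", though a bare "except:" still catches it.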
From raymond.hettinger at verizon.net  Sun Sep  5 07:07:35 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun Sep  5 07:08:24 2004
Subject: [Python-Dev] decorator support
Message-ID: <001f01c49306$41f86060$e841fea9@oemcomputer>

In my experiments with decorators, it is common to wrap the original
function with a new function.

After creating the new function, there are efforts to make it look like
the old:

  newf.__doc__  = oldf.__doc__        # copy the docstring
  newf.__dict__.update(oldf.__dict__) # copy attributes
  newf.__name__ = oldf.__name__       # keep the name (new in Py2.4)

All is well and good except the argspec.  Running help() on the new
function gives:

     funcname(*args, **kwds)
        The original docstring

Running help() on the original function gives:

     funcname(arg1, arg2)
        The original docstring

So, it would be nice if there were some support for carrying forward the
argspec to inform help(), calltips(), and inspect().

FWIW, I do know that with sufficient gyrations a decorator could do this
on its own, but it is way too difficult for general use.
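
For comparison, a sketch of how the same copying can be done on
current Python 3 with functools.wraps, which copies the name and
docstring, updates __dict__, and records the original function as
__wrapped__ so that inspect.signature() can report the original
argspec.  The decorator and function names below are made up for the
example.

```python
import functools
import inspect


def trace(func):
    """Wrap func in a new function that looks like the original."""
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        return func(*args, **kwds)
    return wrapper


@trace
def add(arg1, arg2):
    """The original docstring"""
    return arg1 + arg2


if __name__ == "__main__":
    print(add.__name__, inspect.signature(add))
    print(add.__doc__)
```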


Raymond

From anthony at interlink.com.au  Sun Sep  5 09:51:07 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sun Sep  5 09:51:55 2004
Subject: [Python-Dev] random.py fixage
Message-ID: <413AC56B.5060308@interlink.com.au>

I've made a patch (from CVS) for the random.py breakage and linked it
from the top of the 2.4 page. I'll be sending an email out shortly
with this and the new windows installer availability.

I thought about cutting a 2.4a4, but decided against it. For one thing,
it'd mean Martin would need to cut more Windows installers, if we don't
want to end up with mismatched windows installers and tarballs. For
another, it's an _alpha_ release, and so it's not as vital as if it'd
been something like a 2.3.4 bug, or a 2.4 final.

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From fredrik at pythonware.com  Sun Sep  5 10:26:28 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Sep  5 10:24:45 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP292:SimpleString
	Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>
	<1094315138.8696.36.camel@geddy.wooz.org>
Message-ID: <cheig3$ki8$1@sea.gmane.org>

Barry wrote:

> I'll point out that Template was very deliberately subclassed from
> unicode, so Template instances /are/ unicode objects.  From the
> standpoint of type conversion, using /F's notation, T(8) == U, thus
> because U % 8 == U, T(8) % 8 == U.

from a user perspective, there's no reason to make templates a sub-
class of unicode, so the rest of your argument is irrelevant.

instead of looking at use patterns, you're stuck defending the existing
code.  that's not a good way to design usable code.

</F> 


From anthony at python.org  Sun Sep  5 09:58:15 2004
From: anthony at python.org (Anthony Baxter)
Date: Sun Sep  5 10:34:09 2004
Subject: [Python-Dev] UPDATE: New 2.4a3 Windows installer and random.py
	patch available
Message-ID: <413AC717.5010707@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Due to a goof in the packaging scripts, the Windows installer
that was released on Friday for 2.4a3 broke file associations
for .py files. There's a fixed installer (python-2.4a3.2.msi)
available from the Python 2.4 web page.

There's also a patch for the breakage for random.py on systems
that don't have support for the new os.urandom() call. This is
also available from the Python 2.4 web page.

~    http://www.python.org/2.4/

We apologise to those affected by these bugs, and I'd like to
thank the folks who downloaded the release and let us know about
the problems so promptly.

Anthony Baxter
anthony@python.org
Python Release Manager
(on behalf of the entire python-dev team)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBOscVDt3F8mpFyBYRAk2xAJ9KcxQf5JOTmD6dpOiBShe/8jnWSwCgnL/S
hN44eBC0mqygEhltFK2W9Ig=
=W445
-----END PGP SIGNATURE-----
From fredrik at pythonware.com  Sun Sep  5 10:36:51 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Sep  5 10:35:08 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP292:SimpleStringSubstitutions
References: <1094315138.8696.36.camel@geddy.wooz.org>
	<006a01c492ba$3d90f920$e841fea9@oemcomputer>
Message-ID: <chej3i$lj2$1@sea.gmane.org>

Raymond Hettinger wrote:

> There seems to be a strong "just live with it" argument but no
> advantages are offered other than it matching your personal approach to
> text handling.  Why force it when you don't have to.  At least three of
> your users (me, Aahz, and Fred) do not want unicode output when we have
> str inputs.

one of which wrote the original unicode implementation, and the mixed-type
regular expression engine used to implement templates, and a very popular
XML library that successfully uses mixed-type text to handle text faster and
using less memory than all other Python XML libraries.

I've shown over and over again that Unicode-aware text handling in Python
doesn't have to be slow and bloated; I'd prefer if we kept it that way.

</F> 


From martin at v.loewis.de  Sun Sep  5 11:16:45 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun Sep  5 11:16:41 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <001f01c49306$41f86060$e841fea9@oemcomputer>
References: <001f01c49306$41f86060$e841fea9@oemcomputer>
Message-ID: <413AD97D.9060301@v.loewis.de>

Raymond Hettinger wrote:
> So, it would be nice if there were some support for carrying forward the
> argspec to inform help(), calltips(), and inspect().

What were you thinking of? I could imagine a predefined class, such as

class copyfuncattrs:
     def __init__(self, f):
         self.f = f

     def __call__(self, func):
         res = self.f(func)
         res.__name__ = func.__name__
         res.__doc__ = func.__doc__
         res.__dict__.update(func.__dict__)
         return res

This could be used to define a decorator

@copyfuncattrs
def trace(f):
     def do_trace(*args):
         print "invoking", f.__name__, args
         return f(*args)
     return do_trace

which in turn could be used to decorate a function

@trace
def hello():
     "Print a nice greeting"
     print "Hello, world"

which in turn could be called and inspected

hello()
print hello.__doc__

Then, the question is where copyfuncattrs should live, and I would
object to it being yet another builtin.

Regards,
Martin
From raymond.hettinger at verizon.net  Sun Sep  5 11:24:13 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun Sep  5 11:25:08 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <413AD97D.9060301@v.loewis.de>
Message-ID: <000d01c4932a$1c912400$e841fea9@oemcomputer>

[Martin]
> Then, the question is where copyfuncattrs should live

I imagine that a number of useful recipes like this will emerge over the
next few months and need to be collected in a module.  For now, a wiki
might be a good idea.


Raymond

From anthony at interlink.com.au  Sun Sep  5 11:50:09 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sun Sep  5 11:50:31 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <413AD97D.9060301@v.loewis.de>
References: <001f01c49306$41f86060$e841fea9@oemcomputer>
	<413AD97D.9060301@v.loewis.de>
Message-ID: <413AE151.9090309@interlink.com.au>

Martin v. Löwis wrote:
> Then, the question is where copyfuncattrs should live, and I would
> object that to be yet another builtin.

I think it's very likely that in 2.5 we'll have some sort of
'decorators' module that captures these sorts of things. I
don't think it's likely we'll know enough about the various ins
and outs of decorators to want to put something in 2.4.

Anthony

-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From jacobs at theopalgroup.com  Sun Sep  5 15:13:41 2004
From: jacobs at theopalgroup.com (Kevin Jacobs)
Date: Sun Sep  5 15:13:48 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <001f01c49306$41f86060$e841fea9@oemcomputer>
References: <001f01c49306$41f86060$e841fea9@oemcomputer>
Message-ID: <413B1105.4070305@theopalgroup.com>

Raymond Hettinger wrote:

>In my experiments with decorators, it is common to wrap the original
>function with a new function.
>[...]
>So, it would be nice if there were some support for carrying forward the
>argspec to inform help(), calltips(), and inspect().
>
>FWIW, I do know that with sufficient gyrations a decorator could do this
>on its own, but it is way too difficult for general use.
>  
>
The way I look at it, this is a situation analogous to symbolic links
at the filesystem level.  My first intuition is to add a function attribute,
say func.__proxyfor__ that indicates that a function is a proxy for
another.  That way, introspection and reflection tools that understand
that attribute can extract information from both the proxy and the
proxied functions without too much behind the scenes black magic.

Also note that this is not just an issue with decorators -- I run
into it frequently when writing metaclasses.  I tend not to copy
the proxied function's dictionary, attributes, etc.  Rather, I
re-introduce the proxied function into the class namespace under
a different name.  e.g., a Synchronized metaclass, which
implements thread synchronization, replaces each method
with a proxy that performs the locking gymnastics.  However,
the original methods are accessible via '_<name>_unlocked'.
This allows documentation and introspection tools to find
and output information on the true function signatures.  All
that is missing is the '__proxyfor__' attribute to link them
and tools that interpret it.
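
A small sketch of how such a link might work, in Python 3.  The
__proxyfor__ attribute is the hypothetical name proposed above; the
helpers proxy_for and resolve, and the example functions, are
assumptions made for illustration.

```python
import inspect


def proxy_for(original):
    """Mark the decorated function as a proxy for `original`."""
    def mark(proxy):
        proxy.__proxyfor__ = original   # hypothetical link attribute
        return proxy
    return mark


def resolve(func):
    """Follow __proxyfor__ links to the innermost proxied function."""
    seen = set()
    while hasattr(func, '__proxyfor__') and id(func) not in seen:
        seen.add(id(func))              # guard against accidental cycles
        func = func.__proxyfor__
    return func


def greet(name, punctuation='!'):
    """Return a greeting."""
    return 'Hello, ' + name + punctuation


@proxy_for(greet)
def locked_greet(*args, **kwds):
    # A real proxy would acquire and release a lock around this call.
    return greet(*args, **kwds)


if __name__ == "__main__":
    target = resolve(locked_greet)
    print(target.__name__, inspect.signature(target))
```

An introspection tool that understands the attribute can then report
the signature and docstring of the proxied function while the proxy
keeps its own identity.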

I realize that decorators are limited to dealing with one name
binding at a time without resorting to _getframe (unlike
metaclasses, which can rummage through the entire class
with impunity).  Thus, I'd like to keep brainstorming on
how to address this issue.  Once we have something good,
I'm even up for contributing some of the necessary code.


-Kevin

From edcjones at erols.com  Sun Sep  5 15:57:36 2004
From: edcjones at erols.com (Edward C. Jones)
Date: Sun Sep  5 16:03:57 2004
Subject: [Python-Dev] Re: Alternative placeholder delimiters for PEP 292
In-Reply-To: <20040905082502.9CEB41E4007@bag.python.org>
References: <20040905082502.9CEB41E4007@bag.python.org>
Message-ID: <413B1B50.9010803@erols.com>

On Sat, 04 Sep 2004, Barry Warsaw wrote:

>On Mon, 2004-08-30 at 02:11, Andrew Durdin wrote:
>
>>   I propose that the Template module not use $ to set off
>>    placeholders; instead, placeholders are delimited by braces {}.
>>    The following rules for {}-placeholders apply:
>>    
>>
>
>The PEP 292 rules were specifically chosen for their similarity to
>placeholder syntaxes in many other languages.
>  
>
Just because other languages use "$" does not mean that Python should. 
IMHO, lines of code with "$"s in them do not read with a smooth flow. 
They start to look like regex or Perl. I suggest "<@...@>" which I think 
reads more smoothly. If "$" is too locked in to change, please allow 
users to change the default.
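
For reference, in the string.Template that current Python 3 ships, the
delimiter is a class attribute that a subclass can override.  A short
sketch, where the subclass name AtTemplate and the choice of "@" are
only an example:

```python
from string import Template


class AtTemplate(Template):
    # Use "@" instead of "$" to introduce placeholders.
    delimiter = '@'


if __name__ == "__main__":
    t = AtTemplate('Hello, @name! Total: $5')
    print(t.substitute(name='world'))
```

With the delimiter changed, a literal "$" in the text no longer needs
escaping.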


From barry at python.org  Sun Sep  5 17:25:28 2004
From: barry at python.org (Barry Warsaw)
Date: Sun Sep  5 17:25:37 2004
Subject: [Python-Dev] Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <006a01c492ba$3d90f920$e841fea9@oemcomputer>
References: <006a01c492ba$3d90f920$e841fea9@oemcomputer>
Message-ID: <1094397928.8145.35.camel@geddy.wooz.org>

On Sat, 2004-09-04 at 16:03, Raymond Hettinger wrote:
> > Other than .encode() are there any other methods of unicode objects
> > that return 8bit strings?
> 
> That misses the point.  Templates do not have to be unicode objects.

But it's damn convenient for them to be.  Please read the
Internationalization section of the PEP.  In addition to being able to
use them directly as gettext catalog keys, I think there will be /a lot/
of scenarios where you won't want to care whether you have a Template or
a unicode -- you will just want to treat everything as a unicode string
without having to do tedious type checking.

> There seems to be a strong "just live with it" argument but no
> advantages are offered other than it matching your personal approach to
> text handling.  

> Why force it when you don't have to.  At least three of
> your users (me, Aahz, and Fred) do not want unicode output when we have
> str inputs. 

<deep_breath>

PEP 292 was a direct outgrowth of my experience in trying to
internationalize an application and make it (much) easier for my
translators to contribute.  Many of them are not Python gurus and the
existing % syntax is clearly a common tripping point.

I'm convinced that the current design of PEP 292 is right for the use
cases I originally designed it for.  To be generous, if the three of you
disagree, then it's because you have other requirements.  That's fine;
maybe they're just incompatible with mine.  Maybe I did a poor job of
explaining how my use cases lead to the design of PEP 292.

If all that's true, then PEP 292 can't be made general enough and should
be rejected, and the code should be ripped out of the standard library.
Let applications use whatever is appropriate for their own use cases.
Because PEP 292 is a library addition, Python itself won't suffer in the
least.  The implementations you proposed won't be of any use to me. 
Fortunately, the archives will be replete with all the alternatives for
future software archaeologists.

-Barry

From barry at python.org  Sun Sep  5 17:26:49 2004
From: barry at python.org (Barry Warsaw)
Date: Sun Sep  5 17:26:52 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <cheig3$ki8$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com> <1094315138.8696.36.camel@geddy.wooz.org>
	<cheig3$ki8$1@sea.gmane.org>
Message-ID: <1094398009.8144.37.camel@geddy.wooz.org>

On Sun, 2004-09-05 at 04:26, Fredrik Lundh wrote:

> from a user perspective, there's no reason to make templates a sub-
> class of unicode, so the rest of your argument is irrelevant.

Not true.  I had a very specific reason for making Templates subclasses
of unicode.  Read the Internationalization section of the PEP.

-Barry

From barry at python.org  Sun Sep  5 17:30:56 2004
From: barry at python.org (Barry Warsaw)
Date: Sun Sep  5 17:31:00 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292:Simple
	String Substitutions
In-Reply-To: <1f7befae040904213277ffa84@mail.gmail.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
	<1094315559.8696.42.camel@geddy.wooz.org>
	<1f7befae040904213277ffa84@mail.gmail.com>
Message-ID: <1094398256.8140.41.camel@geddy.wooz.org>

On Sun, 2004-09-05 at 00:32, Tim Peters wrote:

> You'll rarely hear me say this <wink>, but fiddling classes at class
> creation time is exactly what metaclasses are for.  For example,
> suppose you said a Template subclass could define a class variable
> `idpat`, containing a regexp matching that subclass's idea of "an
> identifier".

Very cool idea, thanks Tim!  I like that much more than forcing users to
specify the entire pattern.

-Barry

From fredrik at pythonware.com  Sun Sep  5 17:58:14 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Sep  5 17:56:36 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>
	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>
	<1094398009.8144.37.camel@geddy.wooz.org>
Message-ID: <chfcv6$6f5$1@sea.gmane.org>

Barry wrote:

> Not true.  I had a very specific reason for making Templates subclasses
> of unicode.  Read the Internationalization section of the PEP.

this section?

    The implementation supports internationalization magic by keeping
    the original string value intact.  In fact, all the work of the
    special substitution rules are implemented by overriding the
    __mod__() operator.  However the string value of a Template (or
    SafeTemplate) is the string that was passed to its constructor.

    This approach allows a gettext-based internationalized program to
    use the Template instance as a lookup into the catalog; in fact
    gettext doesn't care that the catalog key is a Template.  Because
    the value of the Template is the original $-string, translators
    also never need to use %-strings.  The right thing will happen at
    run-time.

I don't follow: if you're passing a template to gettext, do you really
get a template back?

if that's really the case, is being able to write "_(Template(x))" really
that much of an advantage over writing "Template(_(x))" ?

if that's really the case, you can still get the same effect from a template
factory function, or a trivial modification of gettext.

</F> 


From pf_moore at yahoo.co.uk  Sun Sep  5 18:13:57 2004
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Sun Sep  5 18:14:02 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292:Simple
	String Substitutions
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
	<1094315559.8696.42.camel@geddy.wooz.org>
	<1f7befae040904213277ffa84@mail.gmail.com>
Message-ID: <upt50wv5m.fsf@yahoo.co.uk>

Tim Peters <tim.peters@gmail.com> writes:

> [Paul Moore]
>>> Would it be useful to factor out the "identifier syntax" bit of the
>>> pattern? The "escaped" and "bogus" groups are less likely to need
>>> changing than what constitutes an identifier.
>  
> [Barry Warsaw]
>> And if they did, you'd want to change them both at the same time.  Do
>> you have any ideas for an efficient, easily documented implementation?
>
> You'll rarely hear me say this <wink>, but fiddling classes at class
> creation time is exactly what metaclasses are for.  For example,
> suppose you said a Template subclass could define a class variable
> `idpat`, containing a regexp matching that subclass's idea of "an
> identifier".
>
> Then we could define a metaclass once-and-for-all, like so:

[...]

That's exactly the type of implementation I was thinking about, but
didn't have the knowledge to implement myself! Thanks for doing my
thinking for me, Tim :-)
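
For readers without the elided code at hand, the idea can be sketched roughly as follows. This is not Tim's code: the class names, the `delimiter` attribute and the exact pattern layout are assumptions made for illustration, and the syntax is that of a current Python 3 (2.4 would spell the metaclass hook as `__metaclass__`).

```python
import re

class _TemplateMeta(type):
    # Pattern skeleton shared by all subclasses; only %(id)s and %(delim)s vary.
    pattern_template = r"""
    %(delim)s(?:
      (?P<escaped>%(delim)s)  |   # doubled delimiter, e.g. $$
      (?P<named>%(id)s)       |   # $identifier
      \{(?P<braced>%(id)s)\}  |   # ${identifier}
      (?P<bogus>)                 # anything else after the delimiter
    )
    """

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Rebuild the compiled pattern whenever a class is created, so a
        # subclass only has to override idpat.
        source = cls.pattern_template % {
            "delim": re.escape(cls.delimiter),
            "id": cls.idpat,
        }
        cls.pattern = re.compile(source, re.VERBOSE | re.IGNORECASE)

class Template(metaclass=_TemplateMeta):
    delimiter = "$"
    idpat = r"[_a-z][_a-z0-9]*"

    def __init__(self, template):
        self.template = template

    def substitute(self, mapping):
        def convert(mo):
            if mo.group("escaped") is not None:
                return self.delimiter
            if mo.group("bogus") is not None:
                raise ValueError("invalid placeholder at offset %d" % mo.start("bogus"))
            key = mo.group("named") or mo.group("braced")
            return str(mapping[key])
        return self.pattern.sub(convert, self.template)

class DottedTemplate(Template):
    # A subclass's own idea of "an identifier": dotted names are allowed.
    idpat = r"[_a-z][_a-z0-9]*(?:\.[_a-z][_a-z0-9]*)*"
```

The point of the design is that the "escaped" and "bogus" groups live in one place, while what counts as an identifier is a one-line override per subclass.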

Paul.
-- 
The major difference between a thing that might go wrong and a thing
that cannot possibly go wrong is that when a thing that cannot
possibly go wrong goes wrong it usually turns out to be impossible to
get at or repair. -- Douglas Adams

From mal at egenix.com  Sun Sep  5 18:48:55 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun Sep  5 18:49:04 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for PEP 292:
	Simple String Substitutions
In-Reply-To: <chcfc7$deb$1@sea.gmane.org>
References: <4138D622.6050807@egenix.com>	<001f01c4920b$e0ac4ba0$e841fea9@oemcomputer><chbmte$380$1@sea.gmane.org>	<4139AF8C.9040907@egenix.com>
	<chcfc7$deb$1@sea.gmane.org>
Message-ID: <413B4377.6060307@egenix.com>

Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
> 
>>>Yes.  Whatever MAL and Barry thinks, Python's current model is 8+8=8,
>>>U+U=U, and 8+U=U for ascii U.  That's an advantage, not a bug.
>>
>>Indeed, but I don't see how that's different from what the PEP
>>is saying.
> 
> 
> the current implementation is
> 
>      T(8) % 8 = U.
> 
> which violates the 8+8=8 rule.

T is a sub-class of Unicode, so you have:

	U % 8 = U

which is just fine.

>>>And when that time comes, storing everything as 32-bit characters is not the
>>>right answer either.
>>
>>I'll leave that for the libc designers to decide :-)
>>
>>If you look at performance, there's not much difference between
>>8-bit strings and Unicode, so the only argument against using
>>Unicode for storing text data is memory usage.
> 
> I used to make that argument, but these days, I no longer think that you can
> talk about performance without taking memory usage into account.

You always have to take both into account. I was just saying
that 8-bit strings don't buy you much in terms of performance
over Unicode these days, so the only argument against using
Unicode would be doubled memory usage. Of course, this is
a rather mild argument given the problems you face when trying
to localize applications - which I see as the main use case
for templates.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 05 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From edloper at gradient.cis.upenn.edu  Sun Sep  5 20:17:31 2004
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Sun Sep  5 20:17:21 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <20040905100006.8B0BB1E403A@bag.python.org>
References: <20040905100006.8B0BB1E403A@bag.python.org>
Message-ID: <D9F6AE6A-FF67-11D8-BC8C-000393C78C88@gradient.cis.upenn.edu>

Raymond wrote:
> So, it would be nice if there were some support for carrying forward the
> argspec to inform help(), calltips(), and inspect().

+1.  I've already gotten one complaint about decorators confusing 
epydoc (since the user wasn't copying the function's __name__), and I 
expect to get many more.  One thing that's always bothered me about 
classmethod and staticmethod is the fact that you can't inspect them.  
Compare that to instancemethod, which has the im_func property.

One way to carry forward the argspec would be to recommend that 
decorators add a pointer back to the undecorated function (similar to 
instancemethod's im_func):

   newf.__doc__  = oldf.__doc__        # copy the docstring
   newf.__dict__.update(oldf.__dict__) # copy attributes
   newf.__name__ = oldf.__name__       # keep the name (new in Py2.4)
   newf.__undecorated__ = oldf         # [XX NEW] ptr to undecorated obj

Then tools like pydoc/epydoc could be written to check this property 
for the real argspec.  (Note that there's precedent for showing the 
*undecorated* signature in tools like pydoc/epydoc, since that's what 
they both do with instancemethods, classmethods, and staticmethods).

We would want to pick a standard name.  __undecorated__ seems 
reasonable to me, but I'd be ok with other names.
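
A rough sketch of how this convention and a tool that honours it might look. The helper names `copyattrs` and `real_argspec` are hypothetical, `__undecorated__` is only the name proposed here, and the code is written for a current Python 3 interpreter rather than 2.4.

```python
import inspect

def copyattrs(oldf):
    """Return a helper that copies oldf's metadata onto a replacement function."""
    def apply(newf):
        newf.__doc__ = oldf.__doc__          # copy the docstring
        newf.__dict__.update(oldf.__dict__)  # copy attributes
        newf.__name__ = oldf.__name__        # keep the name
        newf.__undecorated__ = oldf          # pointer to the undecorated object
        return newf
    return apply

log = []

def trace(f):
    @copyattrs(f)
    def do_trace(*args):
        log.append((f.__name__, args))
        return f(*args)
    return do_trace

@trace
def add(a, b):
    """Return a + b."""
    return a + b

def real_argspec(func):
    """What a pydoc-like tool could do: follow __undecorated__ to the original."""
    while hasattr(func, "__undecorated__"):
        func = func.__undecorated__
    return inspect.getfullargspec(func).args
```

Here `real_argspec(add)` reports the arguments of the original `add` rather than the wrapper's `*args`.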

Martin v. Lowis wrote:
> What were you thinking of? I could imagine a predefined class, such as
> class copyfuncattrs: [...]

This function could take care of adding the __undecorated__ attribute.  
But I'm not sure copyfuncattrs is the right name; note that it works 
just as well on class decorators (we did decide that we're allowing 
those, right?).  Perhaps just copyattrs?

Martin v. Lowis wrote:
> Then, the question is where copyfuncattrs should live, and I would
> object that to be yet another builtin.

Having trouble parsing this sentence -- do you mean that you object to 
making it a builtin, or did you mean "I would expect that to be..."?

Anthony Baxter wrote:
> I think it's very likely that in 2.5 we'll have some sort of
> 'decorators' module that captures these sorts of things. I
> don't think it's likely we'll know enough about the various ins
> and outs of decorators to want to put something in 2.4.

I disagree.  We may not know much about the ins and outs of decorators, 
but I feel very confident that I'll want to be able to inspect them, 
whatever they are.  I would very much like for one of the following to 
happen *before* we release 2.4:

   - Add a prominent note in the docs on decorators that decorators should
     generally copy the original object's __doc__, __name__, and
     attributes, unless there's a good reason not to.  (Also, create an
     __undecorated__ property, but I'll wait to see what others say about
     that idea first.)

   - Add copyfuncattrs to the standard library (or builtins), and add a
     prominent note to the docs that you should use it unless you have a
     good reason not to.

(I can write up an appropriate patch for the docs if no one objects; 
for copyfuncattrs, we'd need to decide what to name it and where it 
should live, first.)

-Edward

From bac at OCF.Berkeley.EDU  Sun Sep  5 21:16:50 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sun Sep  5 21:16:58 2004
Subject: [Python-Dev] Re: Making custom patterns for string.Template easier
In-Reply-To: <413A4E32.2010104@ocf.berkeley.edu>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>	<4138D777.3090009@ocf.berkeley.edu>
	<1094316858.8696.58.camel@geddy.wooz.org>
	<413A4E32.2010104@ocf.berkeley.edu>
Message-ID: <413B6622.7030308@ocf.berkeley.edu>

Brett C. wrote:
> Barry Warsaw wrote:
> 
>> On Fri, 2004-09-03 at 16:43, Brett C. wrote:
>>
>>
>>> Anyway, I did this partially as an exercise so not a huge deal to me 
>>> if it doesn't make it in, so +0 from me for adding the functionality.
>>
>>
>>
>> So, where's the code?! :)
>>
> 
> Not on SF since the bloody thing is down!  I will stick it up on a 
> patch, assign to you, and report the tracker # here as soon as it is 
> back up.
> 

OK, http://www.python.org/sf/1022698 has the patch.  But  with Tim's 
metaclass solution I don't know how much people will care about this 
one.  =)

I guess the question becomes whether people will prefer to have to 
define a new class to override the regex or do it per instance (I 
suspect the former, although my code does the latter).

If Tim's solution is used I would like to suggest that we also allow for 
overloading the bogus group since that was the last thing in contention 
about the implementation (sans the whole unicode subclass deal).

-Brett
From bac at OCF.Berkeley.EDU  Sun Sep  5 22:02:27 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sun Sep  5 22:02:38 2004
Subject: [Python-Dev] random.py fixage
In-Reply-To: <413AC56B.5060308@interlink.com.au>
References: <413AC56B.5060308@interlink.com.au>
Message-ID: <413B70D3.6010107@ocf.berkeley.edu>

Anthony Baxter wrote:
[SNIP]
> I thought about cutting a 2.4a4, but decided against it. For one thing,
> it'd mean Martin would need to cut more Windows installers, if we don't
> want to end up with mismatched windows installers and tarballs. For
> another, it's an _alpha_ release, and so it's not as vital as if it'd
> been something like a 2.3.4 bug, or a 2.4 final.
> 

Does this mean that CVS is open to major checkins again?  I assume so, 
but Misc/NEWS has not been given a new section for 2.4b1 so I thought I 
would double-check before doing a major checkin.

-Brett
From martin at v.loewis.de  Sun Sep  5 23:40:27 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Sep  5 23:40:23 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <D9F6AE6A-FF67-11D8-BC8C-000393C78C88@gradient.cis.upenn.edu>
References: <20040905100006.8B0BB1E403A@bag.python.org>
	<D9F6AE6A-FF67-11D8-BC8C-000393C78C88@gradient.cis.upenn.edu>
Message-ID: <413B87CB.6080701@v.loewis.de>

Edward Loper wrote:
>> Then, the question is where copyfuncattrs should live, and I would
>> object that to be yet another builtin.
> 
> 
> Having trouble parsing this sentence -- do you mean that you object to 
> making it a builtin, or did you mean "I would expect that to be..."?

The former. No more builtins.

Regards,
Martin
From martin at v.loewis.de  Sun Sep  5 23:41:48 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Sep  5 23:41:43 2004
Subject: [Python-Dev] random.py fixage
In-Reply-To: <413B70D3.6010107@ocf.berkeley.edu>
References: <413AC56B.5060308@interlink.com.au>
	<413B70D3.6010107@ocf.berkeley.edu>
Message-ID: <413B881C.20806@v.loewis.de>

Brett C. wrote:
> Does this mean that CVS is open to major checkins again?  I assume so, 
> but Misc/NEWS has not been given a new section for 2.4b1 so I thought I 
> would double-check before doing a major checkin.

I think the tradition is that this is created by whoever makes the first
NEWS-worthy change.

Regards,
Martin
From jcarlson at uci.edu  Mon Sep  6 00:42:56 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon Sep  6 00:50:08 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <D9F6AE6A-FF67-11D8-BC8C-000393C78C88@gradient.cis.upenn.edu>
References: <20040905100006.8B0BB1E403A@bag.python.org>
	<D9F6AE6A-FF67-11D8-BC8C-000393C78C88@gradient.cis.upenn.edu>
Message-ID: <20040905151944.90BC.JCARLSON@uci.edu>


>    newf.__doc__  = oldf.__doc__        # copy the docstring
>    newf.__dict__.update(oldf.__dict__) # copy attributes
>    newf.__name__ = oldf.__name__       # keep the name (new in Py2.4)
>    newf.__undecorated__ = oldf         # [XX NEW] ptr to undecorated obj

If one were to consider function decorators as a type of 'subclass' of a
particular function, that is, post-decoration a function inherits from
the pre-decoration version of the function, we could, with a single
attribute (like __undecorated__ or __proxyfor__ as suggested) and proper
introspection tools, do iterative attribute lookups similar to the way
that it is already done with classes (without diamond-inheritance). That
is, one wouldn't need to copy __doc__, __name__, and __dict__ from a
decorated function, one would get access to them automatically.

In current Python, what I am saying would be equivalent to...


def foo(arg):
    pass

t = foo
foo = arbitrary_decorator(foo)
foo.__proxyfor__ = t
del t

It would be some 'behind-the-scenes-magic', but I think it may be the
right amount and kind of magic.
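
A sketch of what the introspection side of that lookup might look like, done by hand rather than behind the scenes. `proxy_getattr` is a hypothetical helper, `__proxyfor__` is only the attribute name suggested here, and `arbitrary_decorator` is filled in with a trivial wrapper so the example runs.

```python
def proxy_getattr(func, name):
    """Return the first non-None value of `name` along the __proxyfor__ chain.

    Assumes the chain has no cycles.
    """
    while func is not None:
        value = getattr(func, name, None)
        if value is not None:
            return value
        func = getattr(func, "__proxyfor__", None)
    raise AttributeError(name)

def arbitrary_decorator(f):
    def wrapper(*args):
        return f(*args)
    return wrapper

def foo(arg):
    """Documented on the original only."""
    return arg
foo.author = "josiah"

t = foo
foo = arbitrary_decorator(foo)
foo.__proxyfor__ = t
del t
```

One limit of this simple version: the wrapper always has its own `__name__`, so the lookup stops there and returns "wrapper"; only attributes the wrapper lacks (or has as None, like `__doc__`) fall through to the original.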

 - Josiah

From adurdin at gmail.com  Mon Sep  6 01:24:05 2004
From: adurdin at gmail.com (Andrew Durdin)
Date: Mon Sep  6 01:24:16 2004
Subject: [Python-Dev] Alternative placeholder delimiters for PEP 292
In-Reply-To: <59e9fd3a040905152243efcb2@mail.gmail.com>
References: <59e9fd3a040829231141cd3fe4@mail.gmail.com>
	<1094315616.8721.44.camel@geddy.wooz.org>
	<59e9fd3a0409041520114d0604@mail.gmail.com>
	<1094398089.8144.39.camel@geddy.wooz.org>
	<59e9fd3a040905152243efcb2@mail.gmail.com>
Message-ID: <59e9fd3a0409051624705fe938@mail.gmail.com>

(oops -- I accidentally sent this reply to python-list instead of python-dev)

On Sun, 05 Sep 2004 11:28:09 -0400, Barry Warsaw <barry@python.org> wrote:
>
> It's explained in the section of the PEP titled "Why $ and Braces?".

I quote that section entirely: """The BDFL said it best: The $ means
"substitution" in so many languages besides Perl that I wonder where
you've been. [...] We're copying this from the shell."""

That states that the $-substitution is common in other languages, and
was copied from the shell. It does not provide a case for the merits
of $ substitution (as opposed to using other delimiters), nor a
rationale why copying from another language is a good idea in this
instance.
From eppstein at ics.uci.edu  Mon Sep  6 03:04:19 2004
From: eppstein at ics.uci.edu (David Eppstein)
Date: Mon Sep  6 03:04:25 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re: Another
	test_compiler mystery)
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
Message-ID: <eppstein-8D6D98.18041805092004@sea.gmane.org>

In article <1f7befae04090422024afaee58@mail.gmail.com>,
 Tim Peters <tim.peters@gmail.com> wrote:

> Some exceptions should never be suppressed unless named explicitly,
> and a real bitch is that some user-defined exceptions can fit in that
> category too.  The ones that give me (and my employer) the most grief
> are the tree of exceptions deriving from ZODB's ConflictError. 
> ConflictError is a serious thing:  it essentially means the current
> transaction cannot succeed, and the app should give up (and maybe
> retry the current transaction from its start).  Suppressing
> ConflictError by accident-- even inside a hasattr() call! --can
> grossly reduce efficiency, and has a long history too of provoking
> subtle, catastrophic, database corruption bugs.
> 
> I would like to see Python's exception hierarchy grow more
> sophisticated in this respect.  MemoryError, SystemExit, and
> KeyboardInterrupt are things that should not be caught by "except
> Exception:", neither by a bare "except:", nor by hasattr() or C-level
> dict lookup.  ZODB's ConflictError is another of that ilk.  I'd like
> to see "except Exception:" become synonymous with bare "except:", and
> move the "dangerous exceptions" to subclass off a new branch of the
> exception hierarchy.  It could be that something like your patch is
> the only practical way to make this work in the C implementation, so
> I'm keen on it.

It's not really the same subject, but the exception that gives me the 
most grief is StopIteration.  I have to keep remembering to never call 
.next() without catching it; if I forget, I get bugs where some loop 
several levels back in the call tree mysteriously exits.

-- 
David Eppstein
Computer Science Dept., Univ. of California, Irvine
http://www.ics.uci.edu/~eppstein/

From jhylton at gmail.com  Mon Sep  6 03:42:44 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Mon Sep  6 03:42:50 2004
Subject: [Python-Dev] Dangerous exceptions (was Re: Another test_compiler
	mystery)
In-Reply-To: <1f7befae04090422024afaee58@mail.gmail.com>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
Message-ID: <e8bf7a5304090518425dc3ebec@mail.gmail.com>

On Sun, 5 Sep 2004 01:02:35 -0400, Tim Peters <tim.peters@gmail.com> wrote:
> I would like to see Python's exception hierarchy grow more
> sophisticated in this respect.  MemoryError, SystemExit, and
> KeyboardInterrupt are things that should not be caught by "except
> Exception:", neither by a bare "except:", nor by hasattr() or C-level
> dict lookup.  ZODB's ConflictError is another of that ilk.  I'd like
> to see "except Exception:" become synonymous with bare "except:", and
> move the "dangerous exceptions" to subclass off a new branch of the
> exception hierarchy.  It could be that something like your patch is
> the only practical way to make this work in the C implementation, so
> I'm keen on it.

The current exception hierarchy isn't too far from what you suggest. 
We just got the names wrong.  That is, there is a base class,
StandardException, that captures most exceptions other than
MemoryError, SystemError, and KeyboardInterrupt.  If we renamed that
Exception, then we'd be 90% of the way there.  You could also change
your code, right now, to say "except StandardError:" and avoid the
problem entirely.  Make sure ConflictError does not inherit from
StandardError, of course.  And make sure you're happy that ImportError
is not a StandardError either.

I'm not sure what I think of the change to "except:"  It's often the
case that someone who has written "except:" really means "except
Something:", but I expect that very often Something != StandardError
and issubclass(Something, StandardError).  In that case, the change
doesn't really help them.  The code is still wrong.

Jeremy
From xavier.combelle at free.fr  Sat Sep 18 02:57:36 2004
From: xavier.combelle at free.fr (Xavier Combelle)
Date: Mon Sep  6 08:46:53 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <413AD97D.9060301@v.loewis.de>
References: <001f01c49306$41f86060$e841fea9@oemcomputer>
	<413AD97D.9060301@v.loewis.de>
Message-ID: <414B8800.1080204@free.fr>


>
> This could be used to define a decorator
>
> @copyfuncattrs
> def trace(f):
>     def do_trace(*args):
>         print "invoking", f.__name__, args
>         return f(*args)
>     return do_trace

I am quite new to Python, and this is my first suggestion. (I have to
begin some time.)

If there is general agreement about what to do to wrap a decorator,
why not use the following syntax?

@decorator
def trace(f):
    def do_trace(*args):
        print "invoking", f.__name__, args
        return f(*args)
    return do_trace

I would prefer it because it explains what the function does
instead of how it is implemented.
Generally speaking, I believe decorators should
be very expressive, because even if they are not
part of the language, they are a kind of new syntax.
For the same reason, Python should incorporate
just a reduced set of decorators.

From fredrik at pythonware.com  Mon Sep  6 09:44:12 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Sep  6 09:42:22 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re: Another
	test_compiler mystery)
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer><1f7befae040812091754035bcb@mail.gmail.com><20040812185521.GA2277@vicky.ecs.soton.ac.uk><1f7befae04081212414007274f@mail.gmail.com><20040812204431.GA31884@vicky.ecs.soton.ac.uk><1f7befae0408151950361f0cb4@mail.gmail.com><20040816112916.GA19969@vicky.ecs.soton.ac.uk><1f7befae04090422024afaee58@mail.gmail.com>
	<e8bf7a5304090518425dc3ebec@mail.gmail.com>
Message-ID: <chh4cr$qre$1@sea.gmane.org>

Jeremy Hylton wrote:

> The current exception hierarchy isn't too far from what you suggest.
> We just got the names wrong.  That is, there is a base class,
> StandardException, that captures most exceptions other than
> MemoryError, SystemError, and KeyboardInterrupt.  If we renamed that
> Exception, then we'd be 90% of the way there.  You could also change
> your code, right now, to say "except StandardError:" and avoid the
> problem entirely.

when was that changed?

Python 2.4a3 (#1, Sep  3 2004, 11:32:03)
>>> StandardException
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'StandardException' is not defined
>>> StandardError
<class exceptions.StandardError at 0x401a329c>
>>> issubclass(MemoryError, StandardError)
True
>>> issubclass(KeyboardInterrupt, StandardError)
True
>>> issubclass(SystemExit, StandardError)
False
>>> issubclass(StopIteration, StandardError)
False
>>> issubclass(ImportError, StandardError)
True

</F> 


From aahz at pythoncraft.com  Mon Sep  6 19:42:12 2004
From: aahz at pythoncraft.com (Aahz)
Date: Mon Sep  6 19:42:14 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for PEP 292:
	Simple String Substitutions
In-Reply-To: <413B4377.6060307@egenix.com>
References: <4138D622.6050807@egenix.com> <4139AF8C.9040907@egenix.com>
	<chcfc7$deb$1@sea.gmane.org> <413B4377.6060307@egenix.com>
Message-ID: <20040906174212.GA7423@panix.com>

On Sun, Sep 05, 2004, M.-A. Lemburg wrote:
>
> You always have to take both into account. I was just saying that
> 8-bit strings don't buy you much in terms of performance over Unicode
> these days, so the only argument against using Unicode would be
> doubled memory usage.

Only if one sticks with the 2-byte Unicode implementation; you can
compile Python with 4-byte Unicode (and I seem to recall that at least
one standard distribution does exactly that).

> Of course, this is a rather mild argument given the problems you face
> when trying to localize applications - which I see as the main use
> case for templates.

If I18N is intended to be the primary/only use case of templates, then
the PEP needs to be updated.  It would also explain some of the
disagreement about the implementation.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes
From barry at python.org  Mon Sep  6 19:57:54 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep  6 19:57:57 2004
Subject: [Python-Dev] Re: Making custom patterns for string.Template easier
In-Reply-To: <413B6622.7030308@ocf.berkeley.edu>
References: <16E1010E4581B049ABC51D4975CEDB8803060F7E@UKDCX001.uk.int.atosorigin.com>
	<4138D777.3090009@ocf.berkeley.edu>
	<1094316858.8696.58.camel@geddy.wooz.org>
	<413A4E32.2010104@ocf.berkeley.edu> <413B6622.7030308@ocf.berkeley.edu>
Message-ID: <1094493474.8144.109.camel@geddy.wooz.org>

On Sun, 2004-09-05 at 15:16, Brett C. wrote:

> If Tim's solution is used I would like to suggest that we also allow for 
> overloading the bogus group since that was the last thing in contention 
> about the implementation (sans the whole unicode subclass deal).

+1.  I'm going to pursue Tim's approach, but thanks for the patch!

-Barry

From martin at v.loewis.de  Mon Sep  6 22:00:59 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Sep  6 22:00:54 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <414B8800.1080204@free.fr>
References: <001f01c49306$41f86060$e841fea9@oemcomputer>
	<413AD97D.9060301@v.loewis.de> <414B8800.1080204@free.fr>
Message-ID: <413CC1FB.7060903@v.loewis.de>

> If there is general agreement about what to do to wrap a decorator,
> why not use the following syntax?
> 
> @decorator
> def trace(f):

I've thought of that, and it is tempting. However, it does not give
you any clue what the decorator actually *does*, that's why I don't
like it. People would declare any decorator using @decorator, without
thinking whether they actually need to make that declaration. By
design, any function (or, any callable for that matter) can serve
as a decorator, so having a declaration for it might actually add
confusion.
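
To make the trade-off concrete, here is a sketch of what such a declaration could do. The `decorator` helper is hypothetical (it is one possible reading of the @decorator / copyfuncattrs idea, not an agreed design), and the code targets a current Python 3.

```python
calls = []

def decorator(deco):
    """Hypothetical declaration: make `deco` carry metadata forward."""
    def declared(f):
        newf = deco(f)
        newf.__name__ = f.__name__
        newf.__doc__ = f.__doc__
        newf.__dict__.update(f.__dict__)
        return newf
    declared.__name__ = deco.__name__
    declared.__doc__ = deco.__doc__
    return declared

@decorator
def trace(f):
    def do_trace(*args):
        calls.append((f.__name__, args))
        return f(*args)
    return do_trace

def plain_trace(f):
    # Undeclared, yet still a perfectly valid decorator: any callable works.
    def do_trace(*args):
        return f(*args)
    return do_trace

@trace
def add(a, b):
    return a + b

@plain_trace
def sub(a, b):
    return a - b
```

Both functions behave correctly when called; the only visible difference is that `add.__name__` is still "add" while `sub.__name__` has become "do_trace".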

Regards,
Martin
From niemeyer at conectiva.com  Mon Sep  6 22:24:21 2004
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Mon Sep  6 22:35:33 2004
Subject: [Python-Dev] Re: FW: [Python-checkins] python/dist/src/Lib
	sre_parse.py, 1.62, 1.63
In-Reply-To: <000b01c492d9$f9e66140$e841fea9@oemcomputer>
References: <000b01c492d9$f9e66140$e841fea9@oemcomputer>
Message-ID: <20040906202421.GA4909@burma.localdomain>

> FYI, with today's checkins, test_re.py fails.

Not here. Do you have any extra information about it?

Is it failing for anyone else?

-- 
Gustavo Niemeyer
http://niemeyer.net
From raymond.hettinger at verizon.net  Mon Sep  6 22:44:13 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon Sep  6 22:45:04 2004
Subject: [Python-Dev] FW: FW: [Python-checkins] python/dist/src/Lib
	sre_parse.py, 1.62, 1.63
Message-ID: <002e01c49452$452b20e0$e841fea9@oemcomputer>

> > FYI, with today's checkins, test_re.py fails.
> 
> Not here. Do you have any extra information about it?
> 
> Is it failing for anyone else?

Another one of your checkins arrived later and fixed it.  All is well.


Raymond

From xavier.combelle at free.fr  Sun Sep 19 02:30:40 2004
From: xavier.combelle at free.fr (Xavier Combelle)
Date: Tue Sep  7 02:25:43 2004
Subject: [Python-Dev] decorator support
In-Reply-To: <413CC1FB.7060903@v.loewis.de>
References: <001f01c49306$41f86060$e841fea9@oemcomputer>
	<413AD97D.9060301@v.loewis.de> <414B8800.1080204@free.fr>
	<413CC1FB.7060903@v.loewis.de>
Message-ID: <414CD330.2000500@free.fr>


>  However, it does not give
> you any clue what the decorator actually *does*, that's why I don't
> like it. People would declare any decorator using @decorator, without
> thinking whether they actually need to make that declaration. 


In my opinion, it doesn't matter what happens behind the scenes. If the
@decorator syntax transforms the function into a decorator, all is good.

> By
> design, any function (or, any callable for that matter) can serve
> as a decorator, so having a declaration for it might actually add
> confusion.
>
Not every function can act as a useful decorator, in my opinion. It
should do something useful around the concept of a callable object.
That's why it seems to me more like syntactic sugar than an algorithmic
construct.

>

From Vladimir.Marangozov at t-online.de  Sat Sep  4 04:33:02 2004
From: Vladimir.Marangozov at t-online.de (Vladimir Marangozov)
Date: Tue Sep  7 03:29:22 2004
Subject: [Python-Dev] Re: Call for defense of @decorators
Message-ID: <000001c49227$85069f00$6c02a8c0@ESII9100>


Hi,

Having read the PEP, most of the py-dev discussion and the EuroPython
slides just now (sorry, maybe I'm late by a couple of months :-), the
call for defense is a tough call for me.

I am -0.  There is no new functionality and actually my impression is
that there is potential for code readability to get hurt.

The argumentation for the change in the PEP seems to be weak.  It is
explicitly stated that successive @decos are equivalent to chained
function calls without the temp vars at definition time.  This is indeed
a key point.  No new functionality is actually added (or a good amount
of typed characters saved for that matter).
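
The equivalence can be checked directly. In this small sketch the decorators `exclaim` and `upper` are made-up examples; the two definitions of the greeting function are meant to behave identically.

```python
def exclaim(f):
    def inner(*args):
        return f(*args) + "!"
    return inner

def upper(f):
    def inner(*args):
        return f(*args).upper()
    return inner

# Decorator syntax: the decorator nearest the def is applied first.
@exclaim
@upper
def greet(name):
    return "hello " + name

# The same thing spelled as chained function calls after the definition.
def greet_chained(name):
    return "hello " + name
greet_chained = exclaim(upper(greet_chained))
```

Calling either function with the same argument gives the same string, which is the sense in which the syntax adds no new functionality.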

The staticmethod() readability problem is a code readability problem,
and as such it can simply be addressed the "classic" way via comments.
(BTW, has anyone suggested changing the @deco syntax to #deco? :-)
It works for me <0.5 wink>.

I do not necessarily perceive the foo = staticmethod(foo) transformation
statements as evil either.  They are explicit, normal def-time function
calls, and we certainly don't need a new way for the job.  We might
introduce one indeed (I saw the syntax becoming a fact of life already),
but we don't really need it to get the job done.  IOW, everything the
@decos can bring will remain doable the obvious way via function calls.
And that's good enough.

As for the arguments about future code tools knowing better about the
annotated functions / classes via the @decos at definition time, well,
it's common practice really (cf. #pragma and all sorts of special
comments for compile-time processing) which does not involve new syntax.
I perceive the extravaganza in doing one or the other as comparable.

That said, looking at what's being done here, and all things being equal,
caller list of a function before the function itself, but I personally
fail to foresee its goodness for Python on most counts.

These chained @decos look like a call stack crash dump to me anyway :-)

I will happily embrace genexps and multi-line imports though.  Thanks!

Cheers,
Vladimir


From gvanrossum at gmail.com  Tue Sep  7 03:46:59 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Sep  7 03:47:03 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re: Another
	test_compiler mystery)
In-Reply-To: <eppstein-8D6D98.18041805092004@sea.gmane.org>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
	<eppstein-8D6D98.18041805092004@sea.gmane.org>
Message-ID: <ca471dc2040906184648d95e55@mail.gmail.com>

> It's not really the same subject, but the exception that gives me the
> most grief is StopIteration.  I have to keep remembering to never call
> .next() without catching it; if I forget, I get bugs where some loop
> several levels back in the call tree mysteriously exits.

Are you sure? This sounds like superstition to me, since that's not
how loops work. Raising StopIteration in the middle of a loop does not
break out of the loop -- only raising StopIteration from a next()
breaks a loop.

Or are you talking about nested next() calls? That's the only case
where the behavior you are citing occurs.
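
The nested case can be reproduced with a small iterator class. This is a sketch, with `Pairs` a made-up example, written for a current Python 3 where the iterator method is `__next__` (in 2.4 it is `next`). A generator function would not show the same effect on recent Python 3 versions, where a StopIteration escaping a generator body is turned into a RuntimeError.

```python
class Pairs:
    """Iterator yielding consecutive pairs from an iterable."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        first = next(self._it)
        # If the input has an odd length, this inner next() raises
        # StopIteration, which escapes from __next__ and is taken by the
        # caller's loop as "Pairs is exhausted".
        second = next(self._it)
        return (first, second)

result = list(Pairs([1, 2, 3]))
```

The loop ends quietly after the first pair and the trailing 3 is dropped without any error, which is the kind of mysterious early exit described above.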

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From jhylton at gmail.com  Tue Sep  7 14:09:49 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Sep  7 14:09:51 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re: Another
	test_compiler mystery)
In-Reply-To: <chh4cr$qre$1@sea.gmane.org>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
	<e8bf7a5304090518425dc3ebec@mail.gmail.com>
	<chh4cr$qre$1@sea.gmane.org>
Message-ID: <e8bf7a5304090705092ee4daa7@mail.gmail.com>

On Mon, 6 Sep 2004 09:44:12 +0200, Fredrik Lundh <fredrik@pythonware.com> wrote:
> Jeremy Hylton wrote:
> 
> > The current exception hierarchy isn't too far from what you suggest.
> > We just got the names wrong.  That is, there is a base class,
> > StandardException, that captures most exceptions other than
> > MemoryError, SystemError, and KeyboardInterrupt.  If we renamed that
> > Exception, then we'd be 90% of the way there.  You could also change
> > your code, right now, to say "except StandardError:" and avoid the
> > problem entirely.
> 
> when was that changed?

I misread the very long output of pydoc exceptions :-).  I don't
understand the hierarchy, either, or I would have noticed that the
results don't make sense.  Why isn't StopIteration a StandardError?

I'll second Tim's suggestion that some errors -- like SystemError,
MemoryError, and KeyboardInterrupt -- belong in a different category.  I
think it would be easier in principle to put them at a different place
in the class hierarchy than to make them some special kind of
uncatchable exception.

Jeremy
From jim at zope.com  Tue Sep  7 15:43:53 2004
From: jim at zope.com (Jim Fulton)
Date: Tue Sep  7 15:43:57 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re:
	Another	test_compilermystery)
In-Reply-To: <e8bf7a5304090705092ee4daa7@mail.gmail.com>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>	<1f7befae040812091754035bcb@mail.gmail.com>	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>	<1f7befae04081212414007274f@mail.gmail.com>	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>	<1f7befae0408151950361f0cb4@mail.gmail.com>	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>	<1f7befae04090422024afaee58@mail.gmail.com>	<e8bf7a5304090518425dc3ebec@mail.gmail.com>	<chh4cr$qre$1@sea.gmane.org>
	<e8bf7a5304090705092ee4daa7@mail.gmail.com>
Message-ID: <413DBB19.40602@zope.com>

Jeremy Hylton wrote:

...

>  I
> think it would be easier in principle to put them at a different place
> in the class hierarchy than to make them some special kind of
> uncatchable exception.

Note that we don't want uncatchable exceptions. Rather, we want
exceptions that aren't caught by bare excepts or very broad
excepts.  In many cases, we want certain knowledgeable code to be able
to catch these exceptions.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
From jhylton at gmail.com  Tue Sep  7 16:07:25 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Sep  7 16:07:28 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re: Another
	test_compilermystery)
In-Reply-To: <413DBB19.40602@zope.com>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
	<e8bf7a5304090518425dc3ebec@mail.gmail.com>
	<chh4cr$qre$1@sea.gmane.org>
	<e8bf7a5304090705092ee4daa7@mail.gmail.com> <413DBB19.40602@zope.com>
Message-ID: <e8bf7a53040907070718fad12f@mail.gmail.com>

On Tue, 07 Sep 2004 09:43:53 -0400, Jim Fulton <jim@zope.com> wrote:
> Jeremy Hylton wrote:
> 
> ...
> 
> >  I
> > think it would be easier in principle to put them at a different place
> > in the class hierarchy than to make them some special kind of
> > uncatchable exception.
> 
> Note that we don't want uncatchable exceptions. Rather, we want
> exceptions that aren't caught by bare excepts or very broad
> excepts.  In many cases, we want certain knowledgeable code to be able
> to catch these exceptions.

I agree with half the cause.  There ought to be a decent organization
of the exception class hierarchy so that exceptions like
KeyboardInterrupt are in a special category.  Then an "except
NormalError:" <wink> would catch only the normal errors and not the
special ones.  I don't think bare except should change its
meaning; you just shouldn't use it unless it's *really* what you mean.
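
A minimal sketch of that kind of split, assuming present-day Python 3
(where every exception must derive from BaseException).  The class names
NormalError, CriticalError, Interrupt and BadInput are hypothetical and
exist only for illustration; they are not a proposal from this thread.

```python
class CriticalError(BaseException):
    """Hypothetical root for the 'special' exceptions."""


class NormalError(Exception):
    """Hypothetical root for ordinary, recoverable errors."""


class Interrupt(CriticalError):
    """Stands in for something like KeyboardInterrupt."""


class BadInput(NormalError):
    """Stands in for an everyday application error."""


def classify(exc):
    """Raise exc and report which kind of handler catches it."""
    try:
        raise exc
    except NormalError:
        # A broad-but-named handler: sees only the normal errors.
        return "normal"
    except:
        # Bare except keeps its meaning and still sees everything else.
        return "special"
```

With this layout "except NormalError:" never sees Interrupt, while
knowledgeable code can still name CriticalError explicitly.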

I think backwards compatibility is a really hard issue for any of
these changes.  It's probably hard to re-arrange the class hierarchy,
but I don't know what practical solution there is to these problems
that doesn't involve breaking some code.  It's even harder to change
bare except, but I don't think that's necessary.

Jeremy
From barry at python.org  Tue Sep  7 16:13:51 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Sep  7 16:13:58 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re:
	Another	test_compilermystery)
In-Reply-To: <413DBB19.40602@zope.com>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
	<e8bf7a5304090518425dc3ebec@mail.gmail.com>	<chh4cr$qre$1@sea.gmane.org>
	<e8bf7a5304090705092ee4daa7@mail.gmail.com> <413DBB19.40602@zope.com>
Message-ID: <1094566431.8341.25.camel@geddy.wooz.org>

On Tue, 2004-09-07 at 09:43, Jim Fulton wrote:

> Note that we don't want uncatchable exceptions. Rather, we want
> exceptions that aren't caught by bare excepts or very broad
> excepts.  In many cases, we want certain knowledgeable code to be able
> to catch these exceptions.

I don't agree about having exceptions that pass bare excepts.  A typical
/valid/ use of bare excepts is in frameworks such as transaction
processing, where you need to do some extra work when an exception
occurs, then re-raise the original exception, e.g.:

try:
    do_something()
except:
    database.rollback()
    raise
else:
    database.commit()

Even exceptions like SystemError, MemoryError, or KeyboardInterrupt want
to adhere to this simple idiom.  Bare except should continue to catch
all exceptions.  Code that wanted to do otherwise should /not/ use a
bare except, and +1 on some form of exception hierarchy restructuring
that would make that clearer.
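
A runnable sketch of the idiom above, assuming a hypothetical
FakeDatabase stand-in (no real database API is implied).  It shows that
a KeyboardInterrupt still triggers the rollback and then propagates
unchanged.

```python
class FakeDatabase:
    """Hypothetical stand-in for a transactional resource."""

    def __init__(self):
        self.log = []

    def rollback(self):
        self.log.append("rollback")

    def commit(self):
        self.log.append("commit")


def run_transaction(database, work):
    """Commit if work() succeeds; otherwise roll back and re-raise."""
    try:
        work()
    except:  # deliberately bare: every exception must trigger rollback
        database.rollback()
        raise  # re-raise the original exception unchanged
    else:
        database.commit()


def interrupted():
    raise KeyboardInterrupt


db = FakeDatabase()
try:
    run_transaction(db, interrupted)
except KeyboardInterrupt:
    db.log.append("propagated")
```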

-Barry

From jhylton at gmail.com  Tue Sep  7 16:26:34 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Sep  7 16:26:43 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test
	test_compiler.py, 1.5, 1.6 test_decimal.py, 1.13, 1.14
In-Reply-To: <E1C3gqZ-00038j-Sf@sc8-pr-cvs1.sourceforge.net>
References: <E1C3gqZ-00038j-Sf@sc8-pr-cvs1.sourceforge.net>
Message-ID: <e8bf7a5304090707264180bba@mail.gmail.com>

On Sat, 04 Sep 2004 13:09:15 -0700, rhettinger@users.sourceforge.net
<rhettinger@users.sourceforge.net> wrote:
> Update of /cvsroot/python/python/dist/src/Lib/test
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv11509
> 
> Modified Files:
>         test_compiler.py test_decimal.py
> Log Message:
> Change the strategy for coping with time intensive tests from
> "all or none" to "all or some".
> 
> This provides much greater test coverage without eating much time.
> It also makes it more likely that routine regression testing will
> unearth bugs.

If the time is really a problem, I'd rather select a certain set of
files to test all the time.  I'm interested in making sure the
compiler is run against the widest possible range of language
constructs, which plain old random testing isn't likely to do.

I'd prefer, even, an option to enable random sampling for these tests.
When I'm testing the compiler package, I really want to run the
entire test, not a small part of it.

Jeremy
From tim.peters at gmail.com  Tue Sep  7 16:43:25 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Tue Sep  7 16:43:39 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test
	test_compiler.py, 1.5, 1.6 test_decimal.py, 1.13, 1.14
In-Reply-To: <e8bf7a5304090707264180bba@mail.gmail.com>
References: <E1C3gqZ-00038j-Sf@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5304090707264180bba@mail.gmail.com>
Message-ID: <1f7befae04090707437ce4aff1@mail.gmail.com>

>> <rhettinger@users.sourceforge.net> wrote:
>> Modified Files:
>>         test_compiler.py test_decimal.py
>> Log Message:
>> Change the strategy for coping with time intensive tests from
>> "all or none" to "all or some".
>>
>> This provides much greater test coverage without eating much time.
>> It also makes it more likely that routine regression testing will
>> unearth bugs.

[Jeremy Hylton]
> If the time is really a problem, I'd rather select a certain set of
> files to test all the times.  I'm interested in making sure the
> compiler is run against the widest possible range of language
> constructs, which plain old random testing isn't likely to do.
>
> I'd prefer, even, an option to enable random sampling for these tests.
> When I'm testing the compiler package, I really want to run the
> entire test, not a small part of it.

If you enable the compiler resource, the entire test is still run. 
Nothing changed there.  What changed is what happens if you don't
enable that resource:  before, test_compiler was skipped entirely;
after, test_compiler tries a small, random subset of files.
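
A rough sketch of the "all or some" strategy, not the actual
test_compiler code.  The function name select_files, the full_run flag
and the sample_size default are all assumptions made for illustration.

```python
import random


def select_files(files, full_run, sample_size=3, rng=random):
    """Return every file when the resource is enabled, else a random subset."""
    files = list(files)
    if full_run:
        # Resource enabled: nothing is skipped.
        return files
    # Resource not enabled: try a small random sample instead of skipping.
    return rng.sample(files, min(sample_size, len(files)))
```

Passing a seeded random.Random instance as rng would make the sampled
subset reproducible when a failure needs to be rerun.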
From jim at zope.com  Tue Sep  7 16:44:17 2004
From: jim at zope.com (Jim Fulton)
Date: Tue Sep  7 16:44:21 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was
	Re:	Another	test_compilermystery)
In-Reply-To: <1094566431.8341.25.camel@geddy.wooz.org>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>	
	<1f7befae040812091754035bcb@mail.gmail.com>	
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>	
	<1f7befae04081212414007274f@mail.gmail.com>	
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>	
	<1f7befae0408151950361f0cb4@mail.gmail.com>	
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>	
	<1f7befae04090422024afaee58@mail.gmail.com>	
	<e8bf7a5304090518425dc3ebec@mail.gmail.com>	<chh4cr$qre$1@sea.gmane.org>	
	<e8bf7a5304090705092ee4daa7@mail.gmail.com>
	<413DBB19.40602@zope.com> <1094566431.8341.25.camel@geddy.wooz.org>
Message-ID: <413DC941.60600@zope.com>

Barry Warsaw wrote:
> On Tue, 2004-09-07 at 09:43, Jim Fulton wrote:
> 
> 
>>Note that we don't want uncatchable exceptions. Rather, we want
>>exceptions that aren't caught by bare excepts or very broad
>>excepts.  In many cases, we want certain knowledgeable code to be able
>>to catch these exceptions.
> 
> 
> I don't agree about having exceptions that pass bare excepts.  A typical
> /valid/ use of bare excepts is in frameworks such as transaction
> processing, where you need to do some extra work when an exception
> occurs, then re-raise the original exception, e.g.:
> 
> try:
>     do_something()
> except:
>     database.rollback()
>     raise
> else:
>     database.commit()
> 
> Even exceptions like SystemError, MemoryError, or KeyboardInterrupt want
> to adhere to this simple idiom.  Bare except should continue to catch
> all exceptions.  Code that wanted to do otherwise should /not/ use a
> bare except, and +1 on some form of exception hierarchy restructuring
> that would make that clearer.

Fair enough.

I also agree with Jeremy's points re backward compatibility.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
From jacobs at theopalgroup.com  Tue Sep  7 17:11:57 2004
From: jacobs at theopalgroup.com (Kevin Jacobs)
Date: Tue Sep  7 17:12:01 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was
	Re:	Another	test_compilermystery)
In-Reply-To: <1094566431.8341.25.camel@geddy.wooz.org>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>	<1f7befae040812091754035bcb@mail.gmail.com>	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>	<1f7befae04081212414007274f@mail.gmail.com>	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>	<1f7befae0408151950361f0cb4@mail.gmail.com>	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>	<1f7befae04090422024afaee58@mail.gmail.com>	<e8bf7a5304090518425dc3ebec@mail.gmail.com>	<chh4cr$qre$1@sea.gmane.org>	<e8bf7a5304090705092ee4daa7@mail.gmail.com>
	<413DBB19.40602@zope.com> <1094566431.8341.25.camel@geddy.wooz.org>
Message-ID: <413DCFBD.7010306@theopalgroup.com>

Barry Warsaw wrote:

>I don't agree about having exceptions that pass bare excepts.  A typical
>/valid/ use of bare excepts is in frameworks such as transaction
>processing, where you need to do some extra work when an exception
>occurs, then re-raise the original exception, [...]
>
My policy for bare excepts is that without significant justification
they _must_ either re-raise the original exception or raise another
exception.  There are very few circumstances where I have allowed
my team to write pure bare excepts.  I haven't checked, but a warning
for violations of this rule may be a nice addition to pychecker or pylint.
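
A minimal sketch of such a check, using the standard ast module rather
than pychecker or pylint (whose internals are not shown here).  The
function name silent_bare_excepts is hypothetical, and the check is
deliberately coarse: any raise statement anywhere inside the handler
counts as compliant.

```python
import ast


def silent_bare_excepts(source):
    """Return line numbers of bare 'except:' clauses that never raise."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        # A bare except is an ExceptHandler with no exception type.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            raises = any(
                isinstance(child, ast.Raise)
                for statement in node.body
                for child in ast.walk(statement)
            )
            if not raises:
                offenders.append(node.lineno)
    return sorted(offenders)


sample = (
    "try:\n"
    "    f()\n"
    "except:\n"
    "    pass\n"
    "try:\n"
    "    g()\n"
    "except:\n"
    "    cleanup()\n"
    "    raise\n"
)
violations = silent_bare_excepts(sample)
```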

-Kevin

From jhylton at gmail.com  Tue Sep  7 17:12:40 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Sep  7 17:12:46 2004
Subject: [Python-Dev] assert failure on obmalloc
Message-ID: <e8bf7a5304090708123c619f4@mail.gmail.com>

Failure running the test suite today with -u compiler enabled on Windows XP.

test_logging
Assertion failed: bp != NULL, file
\code\python\dist\src\Objects\obmalloc.c, line 604

The debugger says the error is here:
 	msvcr71d.dll!_assert(const char * expr=0x1e22bcc0, const char *
filename=0x1e22bc94, unsigned int lineno=604)  Line 306	C
 	python24_d.dll!PyObject_Malloc(unsigned int nbytes=100)  Line 604 + 0x1b	C
 	python24_d.dll!_PyObject_DebugMalloc(unsigned int nbytes=84)  Line
1014 + 0x9	C
 	python24_d.dll!PyThreadState_New(_is * interp=0x00951028)  Line 136 + 0x7	C
 	python24_d.dll!PyGILState_Ensure()  Line 430 + 0xc	C
 	python24_d.dll!t_bootstrap(void * boot_raw=0x02801d48)  Line 431 + 0x5	C
 	python24_d.dll!bootstrap(void * call=0x04f0d264)  Line 166 + 0x7	C
 	msvcr71d.dll!_threadstart(void * ptd=0x026a2320)  Line 196 + 0xd	C

I've been seeing this sort of error on-and-off for at least a year
with my Python 2.3 install.  It's the usual reason my spambayes
popproxy dies.  I can't recall seeing it before on Windows or while
running the test suite.

Jeremy
From jhylton at gmail.com  Tue Sep  7 17:14:40 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Tue Sep  7 17:14:42 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test
	test_compiler.py, 1.5, 1.6 test_decimal.py, 1.13, 1.14
In-Reply-To: <1f7befae04090707437ce4aff1@mail.gmail.com>
References: <E1C3gqZ-00038j-Sf@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5304090707264180bba@mail.gmail.com>
	<1f7befae04090707437ce4aff1@mail.gmail.com>
Message-ID: <e8bf7a5304090708141c81b057@mail.gmail.com>

On Tue, 7 Sep 2004 10:43:25 -0400, Tim Peters <tim.peters@gmail.com> wrote:
> If you enable the compiler resource, the entire test is still run.
> Nothing changed there.  What changed is what happens if you don't
> enable that resource:  before, test_compiler was skipped entirely;
> after, test_compiler tries a small, random subset of files.

Thanks for clarifying.  The checkin comment obviously says that, but
somehow I didn't get it when reading the diff for test_compiler.

Jeremy
From mwh at python.net  Tue Sep  7 17:25:28 2004
From: mwh at python.net (Michael Hudson)
Date: Tue Sep  7 17:25:29 2004
Subject: [Python-Dev] assert failure on obmalloc
In-Reply-To: <e8bf7a5304090708123c619f4@mail.gmail.com> (Jeremy Hylton's
	message of "Tue, 7 Sep 2004 11:12:40 -0400")
References: <e8bf7a5304090708123c619f4@mail.gmail.com>
Message-ID: <2msm9u14pj.fsf@starship.python.net>

Jeremy Hylton <jhylton@gmail.com> writes:

> Failure running the test suite today with -u compiler enabled on Windows XP.
>
> test_logging
> Assertion failed: bp != NULL, file
> \code\python\dist\src\Objects\obmalloc.c, line 604
>
> The debugger says the error is here:
>  	msvcr71d.dll!_assert(const char * expr=0x1e22bcc0, const char *
> filename=0x1e22bc94, unsigned int lineno=604)  Line 306	C
>  	python24_d.dll!PyObject_Malloc(unsigned int nbytes=100)  Line 604 + 0x1b	C
>  	python24_d.dll!_PyObject_DebugMalloc(unsigned int nbytes=84)  Line
> 1014 + 0x9	C
>  	python24_d.dll!PyThreadState_New(_is * interp=0x00951028)  Line 136 + 0x7	C
>  	python24_d.dll!PyGILState_Ensure()  Line 430 + 0xc	C
>  	python24_d.dll!t_bootstrap(void * boot_raw=0x02801d48)  Line 431 + 0x5	C
>  	python24_d.dll!bootstrap(void * call=0x04f0d264)  Line 166 + 0x7	C
>  	msvcr71d.dll!_threadstart(void * ptd=0x026a2320)  Line 196 + 0xd	C
>
> I've been seeing this sort of error on-and-off for at least a year
> with my Python 2.3 install.  It's the usual reason my spambayes
> popproxy dies.  I can't recall seeing it before on Windows or while
> running the test suite.

Don't debug builds route all PyMem_ calls through PyMalloc?  Doesn't
pymalloc rely on the GIL being held when it's called?  If both of
these are true, there's an obvious problem here, because the call to
PyMem_NEW in PyThreadState_New certainly isn't called with the GIL
held...

This would only be a problem in a debug build, though.

Cheers,
mwh

-- 
  Never meddle in the affairs of NT. It is slow to boot and quick to
  crash.                                             -- Stephen Harris
               -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html
From eppstein at ics.uci.edu  Tue Sep  7 17:31:41 2004
From: eppstein at ics.uci.edu (David Eppstein)
Date: Tue Sep  7 17:31:58 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re: Another
	test_compiler mystery)
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
	<eppstein-8D6D98.18041805092004@sea.gmane.org>
	<ca471dc2040906184648d95e55@mail.gmail.com>
Message-ID: <eppstein-86F633.08314107092004@sea.gmane.org>

In article <ca471dc2040906184648d95e55@mail.gmail.com>,
 Guido van Rossum <gvanrossum@gmail.com> wrote:

> > It's not really the same subject, but the exception that gives me the
> > most grief is StopIteration.  I have to keep remembering to never call
> > .next() without catching it; if I forget, I get bugs where some loop
> > several levels back in the call tree mysteriously exits.
> 
> Are you sure? This sounds like superstition to me, since that's not
> how loops work. Raising StopIteration in the middle of a loop does not
> break out of the loop -- only raising StopIteration from a next()
> breaks a loop.
> 
> Or are you talking about nested next() calls? That's the only case
> where the behavior you are citing occurs.

I don't remember, it could have been nested next()s.
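
A small sketch of the nested next() case, written with the Python 3
spelling (__next__ and the next() builtin) rather than the .next()
method of the time.  The Pairs class is hypothetical.

```python
class Pairs:
    """Iterator yielding consecutive pairs from an underlying iterable."""

    def __init__(self, items):
        self._items = iter(items)

    def __iter__(self):
        return self

    def __next__(self):
        first = next(self._items)
        # Nested next(): if the input has an odd length, StopIteration
        # escapes from here and the caller's loop simply ends, silently
        # dropping 'first'.
        second = next(self._items)
        return (first, second)


even = list(Pairs([1, 2, 3, 4]))
odd = list(Pairs([1, 2, 3]))
```

Here odd ends up as [(1, 2)]: the trailing 3 disappears without any
error, which is the kind of mysterious loop exit described above.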

-- 
David Eppstein
Computer Science Dept., Univ. of California, Irvine
http://www.ics.uci.edu/~eppstein/

From tim.peters at gmail.com  Tue Sep  7 17:44:48 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Tue Sep  7 17:44:51 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test
	test_compiler.py, 1.5, 1.6 test_decimal.py, 1.13, 1.14
In-Reply-To: <e8bf7a5304090708141c81b057@mail.gmail.com>
References: <E1C3gqZ-00038j-Sf@sc8-pr-cvs1.sourceforge.net>
	<e8bf7a5304090707264180bba@mail.gmail.com>
	<1f7befae04090707437ce4aff1@mail.gmail.com>
	<e8bf7a5304090708141c81b057@mail.gmail.com>
Message-ID: <1f7befae04090708446feb59c7@mail.gmail.com>

[Tim Peters]
>> If you enable the compiler resource, the entire test is still run.
>> Nothing changed there.  What changed is what happens if you don't
>> enable that resource:  before, test_compiler was skipped entirely;
>> after, test_compiler tries a small, random subset of files.

[Jeremy Hylton]
> Thanks for clarifying.  The checkin comment obviously says that, but
> somehow I didn't get it when reading the diff for test_compiler.

Me neither!  I misunderstood it until I brought up test_compiler in an
editor, and then it was obvious.  Let's blame it on diff <wink>.
From barry at python.org  Tue Sep  7 18:01:08 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Sep  7 18:01:13 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was
	Re:	Another	test_compilermystery)
In-Reply-To: <413DCFBD.7010306@theopalgroup.com>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<1f7befae040812091754035bcb@mail.gmail.com>
	<20040812185521.GA2277@vicky.ecs.soton.ac.uk>
	<1f7befae04081212414007274f@mail.gmail.com>
	<20040812204431.GA31884@vicky.ecs.soton.ac.uk>
	<1f7befae0408151950361f0cb4@mail.gmail.com>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
	<e8bf7a5304090518425dc3ebec@mail.gmail.com>	<chh4cr$qre$1@sea.gmane.org>
	<e8bf7a5304090705092ee4daa7@mail.gmail.com> <413DBB19.40602@zope.com>
	<1094566431.8341.25.camel@geddy.wooz.org>
	<413DCFBD.7010306@theopalgroup.com>
Message-ID: <1094572868.8342.43.camel@geddy.wooz.org>

On Tue, 2004-09-07 at 11:11, Kevin Jacobs wrote:

> My policy for bare excepts is that without significant justification
> they _must_ either re-raise the original exception or raise another
> exception.  There are very few circumstances where I have allowed
> my team to write pure bare excepts.  I haven't checked, but a warning
> for violations of this rule may be a nice addition to pychecker or pylint.

The other case I've seen is command-shell-like loops, where you
might print the exception in the bare except, but not re-raise the
exception.  Think about the main interactive interpreter loop.
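
A minimal sketch of such a loop, assuming a hypothetical run_commands
helper that executes source strings one at a time.  Note that the bare
except here also swallows KeyboardInterrupt and SystemExit, which is
exactly the trade-off under discussion.

```python
import traceback


def run_commands(commands, namespace):
    """Execute each command string; report failures and keep going."""
    failures = 0
    for source in commands:
        try:
            exec(source, namespace)
        except:  # deliberately bare: the loop must survive any exception
            traceback.print_exc()
            failures += 1
    return failures


namespace = {}
failed = run_commands(["x = 1", "1 / 0", "y = x + 1"], namespace)
```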

But yeah I agree, you want strong justification for any use of bare
except.

-Barry

From trentm at ActiveState.com  Tue Sep  7 19:54:57 2004
From: trentm at ActiveState.com (Trent Mick)
Date: Tue Sep  7 20:01:19 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	socketmodule.c, 1.304, 1.305
In-Reply-To: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>;
	from tmick@users.sourceforge.net on Tue, Sep 07, 2004 at
	10:48:29AM -0700
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
Message-ID: <20040907105457.C24597@ActiveState.com>


The log message for that was supposed to be:

    Apply patch from http://python.org/sf/728330 to fix socket module
    compilation on Solaris 2.6, HP-UX 11, AIX 5.1 and (possibly) some
    IRIX versions.

but "cvs" surprised me with its wonderful and clear UI for specifying
the log message. Can the cvs logs be updated after the fact?

Trent


[tmick@users.sourceforge.net wrote]
> Update of /cvsroot/python/python/dist/src/Modules
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10162
> 
> Modified Files:
> 	socketmodule.c 
> Log Message:
> 
> 
> Index: socketmodule.c
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Modules/socketmodule.c,v
> retrieving revision 1.304
> retrieving revision 1.305
> diff -u -d -r1.304 -r1.305
> --- socketmodule.c	26 Aug 2004 00:51:16 -0000	1.304
> +++ socketmodule.c	7 Sep 2004 17:48:26 -0000	1.305
> @@ -257,7 +257,19 @@
>  # define O_NONBLOCK O_NDELAY
>  #endif
>  
> -#include "addrinfo.h"
> +/* include Python's addrinfo.h unless it causes trouble */
> +#if defined(__sgi) && _COMPILER_VERSION>700 && defined(_SS_ALIGNSIZE)
> +  /* Do not include addinfo.h on some newer IRIX versions.
> +   * _SS_ALIGNSIZE is defined in sys/socket.h by 6.5.21,
> +   * for example, but not by 6.5.10.
> +   */
> +#elif defined(_MSC_VER) && _MSC_VER>1200
> +  /* Do not include addrinfo.h for MSVC7 or greater. 'addrinfo' and
> +   * EAI_* constants are defined in (the already included) ws2tcpip.h.
> +   */
> +#else
> +#  include "addrinfo.h"
> +#endif
>  
>  #ifndef HAVE_INET_PTON
>  int inet_pton(int af, const char *src, void *dst);
> 
> _______________________________________________
> Python-checkins mailing list
> Python-checkins@python.org
> http://mail.python.org/mailman/listinfo/python-checkins

-- 
Trent Mick
TrentM@ActiveState.com
From fdrake at acm.org  Tue Sep  7 20:09:33 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Sep  7 20:09:47 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	socketmodule.c, 1.304, 1.305
In-Reply-To: <20040907105457.C24597@ActiveState.com>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<20040907105457.C24597@ActiveState.com>
Message-ID: <200409071409.33867.fdrake@acm.org>

On Tuesday 07 September 2004 01:54 pm, Trent Mick wrote:
 > but "cvs" surprised me with its wonderful and clear UI for specifying
 > the log message. Can the cvs logs be updated after the fact?

Yes; run "cvs -H admin" to learn how.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From bac at OCF.Berkeley.EDU  Tue Sep  7 21:26:36 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Sep  7 21:26:55 2004
Subject: [Python-Dev] Re: [Python-checkins]
	python/dist/src/Modules	socketmodule.c, 1.304, 1.305
In-Reply-To: <200409071409.33867.fdrake@acm.org>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>	<20040907105457.C24597@ActiveState.com>
	<200409071409.33867.fdrake@acm.org>
Message-ID: <413E0B6C.8020103@ocf.berkeley.edu>

Fred L. Drake, Jr. wrote:

> On Tuesday 07 September 2004 01:54 pm, Trent Mick wrote:
>  > but "cvs" surprised me with its wonderful and clear UI for specifying
>  > the log message. Can the cvs logs be updated after the fact?
> 
> Yes; run "cvs -H admin" to learn how.
> 

You can also read the dev FAQ; it has an entry on just how to do this; 
http://www.python.org/dev/devfaq.html#how-can-i-fix-a-log-message-from-a-previous-checkin

And just so other people know, if you come across something you have to 
do with CVS that is not clear and think it warrants an entry in the FAQ, 
let me know.  I am trying to make it rather thorough so that no 
developers have to look up the CVS docs again.

-Brett
From noamr at myrealbox.com  Tue Sep  7 21:34:11 2004
From: noamr at myrealbox.com (Noam Raphael)
Date: Tue Sep  7 21:35:28 2004
Subject: [Python-Dev] Missing arguments in RE functions
Message-ID: <413E0D33.7030703@myrealbox.com>

Hello,

I've now finished teaching Python to a group of people, and regular 
expressions was a part of the course. I have encountered a few missing 
features (that is, optional arguments) in RE functions. I've checked, 
and it seems to me that they can be added very easily.

The first missing feature is the "flags" argument in the findall and 
finditer functions. Searching for all occurrences of an RE is, of course, 
a legitimate action, and I had to use (?s) in my RE, instead of adding 
re.DOTALL, which, in my opinion, is a lot clearer.
The solution is simple: the functions sub, subn, split, findall and 
finditer all first compile the given RE, with the flags argument set to 
0, and then run the appropriate method. As far as I can see, they could 
all get an additional optional argument, flags=0, and compile the RE 
with it.
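
For comparison, a short sketch of the two spellings that already give
the same result: the inline (?s) flag in the pattern, and compiling the
pattern with re.DOTALL before calling the method.  The sample text is
made up.

```python
import re

text = "<p>first\nline</p><p>second</p>"

# Inline flag: the workaround described above.
inline = re.findall(r"(?s)<p>(.*?)</p>", text)

# Explicit flag: compile first, then call the method on the pattern.
compiled = re.compile(r"<p>(.*?)</p>", re.DOTALL).findall(text)

# Without either flag, "." does not match the newline.
plain = re.compile(r"<p>(.*?)</p>").findall(text)
```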

The second missing feature is the ability to specify start and end 
indices when doing matches and searches. This feature is available when 
using a compiled RE, but isn't mentioned at all in any of the 
straightforward functions (That's why I didn't even know it was 
possible, until I now checked - I naturally assumed that all the 
functionality is available when using the functions).
I think these should be added to the functions match, search, findall 
and finditer. This feature isn't documented for the findall and finditer 
methods, but I checked, and it seems to work fine.
(In case you are interested in the use case: the exercise was to parse 
an XML file. It was done by first matching the beginning of a tag, then 
trying to match attributes, and so on - each match starts from where the 
previous successful match ended. Since I didn't know of this feature, 
it was done by replacing the original string with a substring after 
every match, which is terribly inefficient.)
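
A minimal sketch of that use case with compiled patterns, whose match()
method accepts a start position.  The patterns and the parse_open_tag
helper are hypothetical and far too simple for real XML; they only show
each match starting where the previous one ended, without slicing the
string.

```python
import re

TAG_OPEN = re.compile(r"<(\w+)")
ATTRIBUTE = re.compile(r'\s+(\w+)="([^"]*)"')


def parse_open_tag(text, pos=0):
    """Parse '<name attr="value" ...' at pos; return (name, attrs, end)."""
    tag = TAG_OPEN.match(text, pos)
    if tag is None:
        return None
    name = tag.group(1)
    pos = tag.end()
    attrs = {}
    while True:
        # Continue from the end of the previous successful match.
        attribute = ATTRIBUTE.match(text, pos)
        if attribute is None:
            break
        attrs[attribute.group(1)] = attribute.group(2)
        pos = attribute.end()
    return name, attrs, pos


result = parse_open_tag('<a href="x" id="y">')
```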

If you approve, I can create a patch in a few minutes and send it.

Have a good day,
Noam Raphael

From fdrake at acm.org  Tue Sep  7 21:46:26 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Sep  7 21:46:54 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	socketmodule.c, 1.304, 1.305
In-Reply-To: <413E0B6C.8020103@ocf.berkeley.edu>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071409.33867.fdrake@acm.org>
	<413E0B6C.8020103@ocf.berkeley.edu>
Message-ID: <200409071546.26539.fdrake@acm.org>

On Tuesday 07 September 2004 03:26 pm, Brett C. wrote:
 > I am trying to make it rather thorough so that no
 > developers have to look up the CVS docs again.

What's the point of this?  Reading the CVS docs is good, if only because it 
makes one realize how fragile the whole thing is.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From bac at OCF.Berkeley.EDU  Tue Sep  7 21:56:05 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Sep  7 21:56:19 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	socketmodule.c, 1.304, 1.305
In-Reply-To: <200409071546.26539.fdrake@acm.org>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071409.33867.fdrake@acm.org>
	<413E0B6C.8020103@ocf.berkeley.edu>
	<200409071546.26539.fdrake@acm.org>
Message-ID: <413E1255.9050102@ocf.berkeley.edu>

Fred L. Drake, Jr. wrote:

> On Tuesday 07 September 2004 03:26 pm, Brett C. wrote:
>  > I am trying to make it rather thorough so that no
>  > developers have to look up the CVS docs again.
> 
> What's the point of this?  Reading the CVS docs is good, if only because it 
> makes one realize how fragile the whole thing is.
> 

Well, I know when I was starting out CVS was a hurdle to deal with, and 
what seemed like it should be a simple thing was not so simple to 
extrapolate from the docs.  Figured there was no need for anyone else to 
suffer as well.

-Brett
From aahz at pythoncraft.com  Tue Sep  7 22:15:42 2004
From: aahz at pythoncraft.com (Aahz)
Date: Tue Sep  7 22:15:44 2004
Subject: [Python-Dev] Missing arguments in RE functions
In-Reply-To: <413E0D33.7030703@myrealbox.com>
References: <413E0D33.7030703@myrealbox.com>
Message-ID: <20040907201541.GA1083@panix.com>

On Tue, Sep 07, 2004, Noam Raphael wrote:
>
> If you approve, I can create a patch in a few minutes and send it.

Go ahead and create the patch -- it's unlikely that you'll get formal
approval without it.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes
From aahz at pythoncraft.com  Tue Sep  7 22:17:20 2004
From: aahz at pythoncraft.com (Aahz)
Date: Tue Sep  7 22:17:22 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	socketmodule.c, 1.304, 1.305
In-Reply-To: <200409071546.26539.fdrake@acm.org>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071409.33867.fdrake@acm.org>
	<413E0B6C.8020103@ocf.berkeley.edu>
	<200409071546.26539.fdrake@acm.org>
Message-ID: <20040907201720.GB1083@panix.com>

On Tue, Sep 07, 2004, Fred L. Drake, Jr. wrote:
> On Tuesday 07 September 2004 03:26 pm, Brett C. wrote:
>>
>> I am trying to make it rather thorough so that no
>> developers have to look up the CVS docs again.
> 
> What's the point of this?  Reading the CVS docs is good, if only because it 
> makes one realize how fragile the whole thing is.

Heh.  My take on it is that we should minimize the learning curve for
new developers whenever possible -- I don't think we should require
everyone to become CVS experts.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes
From fdrake at acm.org  Tue Sep  7 22:23:05 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Sep  7 22:23:16 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules
	socketmodule.c, 1.304, 1.305
In-Reply-To: <20040907201720.GB1083@panix.com>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071546.26539.fdrake@acm.org>
	<20040907201720.GB1083@panix.com>
Message-ID: <200409071623.05499.fdrake@acm.org>

On Tuesday 07 September 2004 04:17 pm, Aahz wrote:
 > Heh.  My take on it is that we should minimize the learning curve for
 > new developers whenever possible -- I don't think we should require
 > everyone to become CVS experts.

Agreed.  But we do want them to scream out "Why are we still using this piece 
of junk???"

Seriously, I'm not slamming CVS for being evil or bad; it has served well.  
But there are better options now.

(I'm really looking forward to Subversion 1.1; all the advantage of 
Subversion, without the disadvantage of Berkeley DB...!)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From martin at v.loewis.de  Tue Sep  7 22:51:28 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue Sep  7 22:51:22 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <200409071623.05499.fdrake@acm.org>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>	<200409071546.26539.fdrake@acm.org>	<20040907201720.GB1083@panix.com>
	<200409071623.05499.fdrake@acm.org>
Message-ID: <413E1F50.90709@v.loewis.de>

Fred L. Drake, Jr. wrote:
> (I'm really looking forward to Subversion 1.1; all the advantage of 
> Subversion, without the disadvantage of Berkeley DB...!)

What *is* the disadvantage of Berkeley DB that the file storage of
svn 1.1 will remove? One of the things that you could do in CVS that
you can't easily do because of the DB approach is to ultimately
remove a file, along with its entire history (by removing the ,v file).
Along with that goes the option of moving part of a repository into
another repository.

Neither is easy with svn because of the DB thing. However, I
understand that it won't become simpler with the file storage, either,
as the files being created don't directly correlate to files of the
versioned file system. So you still can't delete a single file with all
of its history, nor can you move just a part of the repository.

Of course, you can do both with a dump/filter/load cycle
(svnadmin dump | svndumpfilter | svnadmin load), so that
is not a serious obstacle.

One problem that I had with svn+bsddb is that the DB files are
tied to a DB version, so you can't easily upgrade to a newer DB
version (without dump/load cycle). But
a) don't do that, then, and
b) for the last 3 or so bsddb updates (since 4.0), it wasn't
    that bad - the SVN repositories would have continued to work,
    as the bsddb format didn't change in a relevant way.

Regards,
Martin
From exarkun at divmod.com  Tue Sep  7 22:59:17 2004
From: exarkun at divmod.com (Jp Calderone)
Date: Tue Sep  7 22:59:22 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <413E1F50.90709@v.loewis.de>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>	<200409071546.26539.fdrake@acm.org>	<20040907201720.GB1083@panix.com>	<200409071623.05499.fdrake@acm.org>
	<413E1F50.90709@v.loewis.de>
Message-ID: <413E2125.1020206@divmod.com>


(This is somewhat off-topic for python-dev, so I won't post more than
one message unless people really want me to)

Martin v. Löwis wrote:
> Fred L. Drake, Jr. wrote:
>
>> (I'm really looking forward to Subversion 1.1; all the advantage of
>> Subversion, without the disadvantage of Berkeley DB...!)
>
>
> What *is* the disadvantage of Berkeley DB that the file storage of
> svn 1.1 will remove? One of the things that you could do in CVS that
> you can't easily do because of the DB approach is to ultimately
> remove a file, along with its entire history (by removing the ,v file).
> Along with that goes the option of moving part of a repository into
> another repository.

   Files are, by and large, big blobs of opaque bytes.  They don't
belong in a database.  The subversion developers made a mistake by
putting *everything* into bdbs.  They should have put metadata into the
database and files into the filesystem.  I doubt this is the
disadvantage perceived by the svn user community at large (they mostly
complain about umask problems), but it is the real problem with using
bdb in the way svn uses it.

> [snip]
>
> One problem that I had with svn+bsddb is that the DB files are
> tied to a DB version, so you can't easily upgrade to a newer DB
> version (without dump/load cycle). But

   This is half-true.  You don't have to dump/load to move between
incompatible database versions, you just have to run the
sleepycat-supplied upgrade tool.  Not to say that dump/load doesn't work...

   Jp
From bob at redivi.com  Tue Sep  7 23:02:55 2004
From: bob at redivi.com (Bob Ippolito)
Date: Tue Sep  7 23:03:26 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <413E1F50.90709@v.loewis.de>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>	<200409071546.26539.fdrake@acm.org>	<20040907201720.GB1083@panix.com>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
Message-ID: <4A27E67A-0111-11D9-9892-000A95686CD8@redivi.com>


On Sep 7, 2004, at 4:51 PM, Martin v. Löwis wrote:

> Fred L. Drake, Jr. wrote:
>> (I'm really looking forward to Subversion 1.1; all the advantage of 
>> Subversion, without the disadvantage of Berkeley DB...!)
>
> What *is* the disadvantage of Berkeley DB that the file storage of
> svn 1.1 will remove? One of the things that you could do in CVS that
> you can't easily do because of the DB approach is to ultimately
> remove a file, along with its entire history (by removing the ,v file).
> Along with that goes the option of moving part of a repository into
> another repository.

The biggest complaint I've heard, and I believe the reason for the 
optional alternative database implementation in 1.1,  is that the 
Berkeley DB must be on a single local volume.

-bob
From fdrake at acm.org  Tue Sep  7 23:06:59 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Sep  7 23:07:07 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <413E1F50.90709@v.loewis.de>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
Message-ID: <200409071706.59370.fdrake@acm.org>

On Tuesday 07 September 2004 04:51 pm, Martin v. Löwis wrote:
 > What *is* the disadvantage of Berkeley DB that the file storage of
 > svn 1.1 will remove? One of the things that you could do in CVS that
 > you can't easily do because of the DB approach is to ultimately
 > remove a file, along with its entire history (by removing the ,v file).
 > Along with that goes the option of moving part of a repository into
 > another repository.

I'm not concerned with people deliberately hosing their repositories; they 
shouldn't do that.

The advantage I see is that we won't have to deal with hosed databases having 
to be "recovered" to make the Subversion server useful again.

I certainly agree with Jp's comments about how databases are used, but as long 
as the server is working, that's less of an issue for me.

 > Neither is easy with svn because of the DB thing. However, I
 > understand that it won't become simpler with the file storage, either,
 > as the files being created don't directly correlate to files of the
 > versioned file system. So you still can't delete a single file with all
 > of its history, nor can you move just a part of the repository.

Again, that's not my desire.  I'm happy to not manipulate the content of the 
repository directly.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From barry at python.org  Tue Sep  7 23:30:00 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Sep  7 23:30:05 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <413E2125.1020206@divmod.com>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071546.26539.fdrake@acm.org>	<20040907201720.GB1083@panix.com>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
	<413E2125.1020206@divmod.com>
Message-ID: <1094592600.8339.93.camel@geddy.wooz.org>

On Tue, 2004-09-07 at 16:59, Jp Calderone wrote:

>    Files are, by and large, big blobs of opaque bytes.  They don't
> belong in a database.  The subversion developers made a mistake by
> putting *everything* into bdbs.  They should have put metadata into the
> database and files into the filesystem.

Right, and Berkeley splits big blobs up into overflow pages, which
aren't as efficient as having all the data for a key fit on one page.

-Barry

From barry at python.org  Tue Sep  7 23:32:53 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Sep  7 23:32:58 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <4A27E67A-0111-11D9-9892-000A95686CD8@redivi.com>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071546.26539.fdrake@acm.org>	<20040907201720.GB1083@panix.com>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
	<4A27E67A-0111-11D9-9892-000A95686CD8@redivi.com>
Message-ID: <1094592773.8346.97.camel@geddy.wooz.org>

On Tue, 2004-09-07 at 17:02, Bob Ippolito wrote:

> The biggest complaint I've heard, and I believe the reason for the 
> optional alternative database implementation in 1.1,  is that the 
> Berkeley DB must be on a single local volume.

Having nothing to do with svn's choice of bdb, my biggest complaint
about subversion is its lack of a mature merging algorithm.  Still, it's
not worse than cvs and there are plenty of other advantages.  We also
had some horrendous stability problems early on, but it looks like the
newer versions are pretty stable.  We haven't lost any data or seen
mysterious re-appearances in months <wink>.

-Barry

From pyth at devel.trillke.net  Tue Sep  7 23:50:53 2004
From: pyth at devel.trillke.net (Holger Krekel)
Date: Tue Sep  7 23:51:09 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <4A27E67A-0111-11D9-9892-000A95686CD8@redivi.com>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071546.26539.fdrake@acm.org>
	<20040907201720.GB1083@panix.com>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
	<4A27E67A-0111-11D9-9892-000A95686CD8@redivi.com>
Message-ID: <20040907215053.GF5208@solar.trillke>

Bob Ippolito wrote:
> 
> On Sep 7, 2004, at 4:51 PM, Martin v. Löwis wrote:
> 
> >Fred L. Drake, Jr. wrote:
> >>(I'm really looking forward to Subversion 1.1; all the advantage of 
> >>Subversion, without the disadvantage of Berkeley DB...!)
> >
> >What *is* the disadvantage of Berkeley DB that the file storage of
> >svn 1.1 will remove? One of the things that you could do in CVS that
> >you can't easily do because of the DB approach is to ultimately
> >remove a file, along with its entire history (by removing the ,v file).
> >Along with that goes the option of moving part of a repository into
> >another repository.
> 
> The biggest complaint I've heard, and I believe the reason for the 
> optional alternative database implementation in 1.1,  is that the 
> Berkeley DB must be on a single local volume.

The primary reason was more to be able to have a local svn
repository on SMB or NFS network storage.

Apparently there are a lot of commercial environments that used
cvs this way, and the svn developers responded to this pressure
with the new "FSFS" backend. See

    http://subversion.tigris.org/svn_1.1_releasenotes.html

for more info. 

cheers, 

    holger
From raymond.hettinger at verizon.net  Wed Sep  8 00:01:30 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed Sep  8 00:02:20 2004
Subject: [Python-Dev] Missing arguments in RE functions
In-Reply-To: <413E0D33.7030703@myrealbox.com>
Message-ID: <006301c49526$3b46ad40$e841fea9@oemcomputer>

> The first missing feature is the "flags" argument in the findall and
> finditer functions. 
 . . .
> The second missing feature is the ability to specify start and end
> indices when doing matches and searches. 

+1

I've needed both of these more than once.

Are you up to crafting the code?


Raymond
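
For reference, a minimal sketch of the second request, assuming a
present-day Python 3 interpreter rather than the 2.4 tree under
discussion. Compiled pattern objects already accept start and end
indices; the module-level functions are what lack them. The sample
text and variable names are illustrative only.

```python
import re

text = "alpha beta gamma"
words = re.compile(r"\w+")

# pos=6, endpos=10 restricts the scan to text[6:10], which is "beta".
in_window = words.findall(text, 6, 10)

# pos is not the same as slicing: '^' still anchors at the real start
# of the string, so it cannot match at index 6 ...
anchored = re.compile(r"^beta")
no_match = anchored.search(text, 6)

# ... whereas slicing builds a new string whose start is the old index 6.
sliced_match = anchored.search(text[6:])

print(in_window, no_match, sliced_match is not None)
```

The '^' difference is the reason pos cannot simply be emulated by
slicing the string before calling the module-level function.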

From noamr at myrealbox.com  Wed Sep  8 00:23:35 2004
From: noamr at myrealbox.com (Noam Raphael)
Date: Wed Sep  8 00:24:54 2004
Subject: [Python-Dev] Missing arguments in RE functions
In-Reply-To: <006301c49526$3b46ad40$e841fea9@oemcomputer>
References: <006301c49526$3b46ad40$e841fea9@oemcomputer>
Message-ID: <413E34E7.1030409@myrealbox.com>

Raymond Hettinger wrote:

>+1
>
>I've needed both of these more than once.
>
>Are you up to crafting the code?
>
>  
>
Thanks!
Are these diffs ok?

Noam
-------------- next part --------------
*** /home/noam/python/old/sre.py	Tue Sep  7 23:36:36 2004
--- sre.py	Tue Sep  7 23:40:53 2004
***************
*** 123,147 ****
  # --------------------------------------------------------------------
  # public interface
  
! def match(pattern, string, flags=0):
      """Try to apply the pattern at the start of the string, returning
      a match object, or None if no match was found."""
!     return _compile(pattern, flags).match(string)
  
! def search(pattern, string, flags=0):
      """Scan through string looking for a match to the pattern, returning
      a match object, or None if no match was found."""
!     return _compile(pattern, flags).search(string)
  
! def sub(pattern, repl, string, count=0):
      """Return the string obtained by replacing the leftmost
      non-overlapping occurrences of the pattern in string by the
      replacement repl.  repl can be either a string or a callable;
      if a callable, it's passed the match object and must return
      a replacement string to be used."""
!     return _compile(pattern, 0).sub(repl, string, count)
  
! def subn(pattern, repl, string, count=0):
      """Return a 2-tuple containing (new_string, number).
      new_string is the string obtained by replacing the leftmost
      non-overlapping occurrences of the pattern in the source
--- 123,147 ----
  # --------------------------------------------------------------------
  # public interface
  
! def match(pattern, string, flags=0, pos=0, endpos=sys.maxint):
      """Try to apply the pattern at the start of the string, returning
      a match object, or None if no match was found."""
!     return _compile(pattern, flags).match(string, pos, endpos)
  
! def search(pattern, string, flags=0, pos=0, endpos=sys.maxint):
      """Scan through string looking for a match to the pattern, returning
      a match object, or None if no match was found."""
!     return _compile(pattern, flags).search(string, pos, endpos)
  
! def sub(pattern, repl, string, count=0, flags=0):
      """Return the string obtained by replacing the leftmost
      non-overlapping occurrences of the pattern in string by the
      replacement repl.  repl can be either a string or a callable;
      if a callable, it's passed the match object and must return
      a replacement string to be used."""
!     return _compile(pattern, flags).sub(repl, string, count)
  
! def subn(pattern, repl, string, count=0, flags=0):
      """Return a 2-tuple containing (new_string, number).
      new_string is the string obtained by replacing the leftmost
      non-overlapping occurrences of the pattern in the source
***************
*** 149,162 ****
      substitutions that were made. repl can be either a string or a
      callable; if a callable, it's passed the match object and must
      return a replacement string to be used."""
!     return _compile(pattern, 0).subn(repl, string, count)
  
! def split(pattern, string, maxsplit=0):
      """Split the source string by the occurrences of the pattern,
      returning a list containing the resulting substrings."""
!     return _compile(pattern, 0).split(string, maxsplit)
  
! def findall(pattern, string):
      """Return a list of all non-overlapping matches in the string.
  
      If one or more groups are present in the pattern, return a
--- 149,162 ----
      substitutions that were made. repl can be either a string or a
      callable; if a callable, it's passed the match object and must
      return a replacement string to be used."""
!     return _compile(pattern, flags).subn(repl, string, count)
  
! def split(pattern, string, maxsplit=0, flags=0):
      """Split the source string by the occurrences of the pattern,
      returning a list containing the resulting substrings."""
!     return _compile(pattern, flags).split(string, maxsplit)
  
! def findall(pattern, string, flags=0, pos=0, endpos=sys.maxint):
      """Return a list of all non-overlapping matches in the string.
  
      If one or more groups are present in the pattern, return a
***************
*** 164,179 ****
      has more than one group.
  
      Empty matches are included in the result."""
!     return _compile(pattern, 0).findall(string)
  
  if sys.hexversion >= 0x02020000:
      __all__.append("finditer")
!     def finditer(pattern, string):
          """Return an iterator over all non-overlapping matches in the
          string.  For each match, the iterator returns a match object.
  
          Empty matches are included in the result."""
!         return _compile(pattern, 0).finditer(string)
  
  def compile(pattern, flags=0):
      "Compile a regular expression pattern, returning a pattern object."
--- 164,179 ----
      has more than one group.
  
      Empty matches are included in the result."""
!     return _compile(pattern, flags).findall(string, pos, endpos)
  
  if sys.hexversion >= 0x02020000:
      __all__.append("finditer")
!     def finditer(pattern, string, flags=0, pos=0, endpos=sys.maxint):
          """Return an iterator over all non-overlapping matches in the
          string.  For each match, the iterator returns a match object.
  
          Empty matches are included in the result."""
!         return _compile(pattern, flags).finditer(string, pos, endpos)
  
  def compile(pattern, flags=0):
      "Compile a regular expression pattern, returning a pattern object."
-------------- next part --------------
*** libre.tex	Wed Sep  8 01:09:55 2004
--- /home/noam/python/old/libre.tex	Wed Sep  8 01:04:53 2004
***************
*** 508,530 ****
  \end{datadesc}
  
  
! \begin{funcdesc}{search}{pattern, string\optional{, 
!                          flags\optional{, pos\optional{, endpos}}}}
    Scan through \var{string} looking for a location where the regular
    expression \var{pattern} produces a match, and return a
    corresponding \class{MatchObject} instance.
    Return \code{None} if no
    position in the string matches the pattern; note that this is
    different from finding a zero-length match at some point in the string.
- 
-   The optional \var{pos} and \var{endpos} parameters have the same
-   meaning as for the \function{match()} function.
-   \versionchanged[Added the optional parameters
-                   \var{pos} and \var{endpos}]{2.4}
  \end{funcdesc}
  
! \begin{funcdesc}{match}{pattern, string\optional{, flags\optional{,
!                         pos\optional{, endpos}}}}
    If zero or more characters at the beginning of \var{string} match
    the regular expression \var{pattern}, return a corresponding
    \class{MatchObject} instance.  Return \code{None} if the string does not
--- 508,523 ----
  \end{datadesc}
  
  
! \begin{funcdesc}{search}{pattern, string\optional{, flags}}
    Scan through \var{string} looking for a location where the regular
    expression \var{pattern} produces a match, and return a
    corresponding \class{MatchObject} instance.
    Return \code{None} if no
    position in the string matches the pattern; note that this is
    different from finding a zero-length match at some point in the string.
  \end{funcdesc}
  
! \begin{funcdesc}{match}{pattern, string\optional{, flags}}
    If zero or more characters at the beginning of \var{string} match
    the regular expression \var{pattern}, return a corresponding
    \class{MatchObject} instance.  Return \code{None} if the string does not
***************
*** 533,561 ****
  
    \note{If you want to locate a match anywhere in
    \var{string}, use \method{search()} instead.}
- 
-   The optional parameter \var{pos} gives an index in the string
-   where the search is to start; it defaults to \code{0}.  This is not
-   completely equivalent to slicing the string; the
-   \code{'\textasciicircum'} pattern
-   character matches at the real beginning of the string and at positions
-   just after a newline, but not necessarily at the index where the search
-   is to start.
- 
-   The optional parameter \var{endpos} limits how far the string will
-   be searched; it will be as if the string is \var{endpos} characters
-   long, so only the characters from \var{pos} to \code{\var{endpos} -
-   1} will be searched for a match.  If \var{endpos} is less than
-   \var{pos}, no match will be found, otherwise,
-   \code{re.match(\var{string}, \var{pos}=0, \var{endpos}=50)} is
-   equivalent to \code{re.match(\var{string}[:50], \var{pos}=0)}.
- 
-   \versionchanged[Added the optional parameters
-                   \var{pos} and \var{endpos}]{2.4}
  \end{funcdesc}
  
! \begin{funcdesc}{split}{pattern, string\optional{,
!                         maxsplit\code{ = 0}\optional{, flags}}}
    Split \var{string} by the occurrences of \var{pattern}.  If
    capturing parentheses are used in \var{pattern}, then the text of all
    groups in the pattern are also returned as part of the resulting list.
--- 526,534 ----
  
    \note{If you want to locate a match anywhere in
    \var{string}, use \method{search()} instead.}
  \end{funcdesc}
  
! \begin{funcdesc}{split}{pattern, string\optional{, maxsplit\code{ = 0}}}
    Split \var{string} by the occurrences of \var{pattern}.  If
    capturing parentheses are used in \var{pattern}, then the text of all
    groups in the pattern are also returned as part of the resulting list.
***************
*** 576,613 ****
  
    This function combines and extends the functionality of
    the old \function{regsub.split()} and \function{regsub.splitx()}.
-   \versionchanged[Added the optional parameter \var{flags}]{2.4}
  \end{funcdesc}
  
! \begin{funcdesc}{findall}{pattern, string\optional{,
!                           flags\optional{, pos\optional{, endpos}}}}
    Return a list of all non-overlapping matches of \var{pattern} in
    \var{string}.  If one or more groups are present in the pattern,
    return a list of groups; this will be a list of tuples if the
    pattern has more than one group.  Empty matches are included in the
    result unless they touch the beginning of another match.
- 
-   The optional parameters \var{pos} and \var{endpos} limit the search to
-   a part of the string, just like they do in the \function{match()}
-   function.
    \versionadded{1.5.2}
-   \versionchanged[Added the optional parameters
-                   \var{flags}, \var{pos} and \var{endpos}]{2.4}
  \end{funcdesc}
  
! \begin{funcdesc}{finditer}{pattern, string\optional{,
!                           flags\optional{, pos\optional{, endpos}}}}
    Return an iterator over all non-overlapping matches for the RE
    \var{pattern} in \var{string}.  For each match, the iterator returns
    a match object.  Empty matches are included in the result unless they
    touch the beginning of another match.
    \versionadded{2.2}
-   \versionchanged[Added the optional parameters
-                   \var{flags}, \var{pos} and \var{endpos}]{2.4}
  \end{funcdesc}
  
! \begin{funcdesc}{sub}{pattern, repl, string\optional{,
!                       count\optional{, flags}}}
    Return the string obtained by replacing the leftmost non-overlapping
    occurrences of \var{pattern} in \var{string} by the replacement
    \var{repl}.  If the pattern isn't found, \var{string} is returned
--- 549,574 ----
  
    This function combines and extends the functionality of
    the old \function{regsub.split()} and \function{regsub.splitx()}.
  \end{funcdesc}
  
! \begin{funcdesc}{findall}{pattern, string}
    Return a list of all non-overlapping matches of \var{pattern} in
    \var{string}.  If one or more groups are present in the pattern,
    return a list of groups; this will be a list of tuples if the
    pattern has more than one group.  Empty matches are included in the
    result unless they touch the beginning of another match.
    \versionadded{1.5.2}
  \end{funcdesc}
  
! \begin{funcdesc}{finditer}{pattern, string}
    Return an iterator over all non-overlapping matches for the RE
    \var{pattern} in \var{string}.  For each match, the iterator returns
    a match object.  Empty matches are included in the result unless they
    touch the beginning of another match.
    \versionadded{2.2}
  \end{funcdesc}
  
! \begin{funcdesc}{sub}{pattern, repl, string\optional{, count}}
    Return the string obtained by replacing the leftmost non-overlapping
    occurrences of \var{pattern} in \var{string} by the replacement
    \var{repl}.  If the pattern isn't found, \var{string} is returned
***************
*** 660,674 ****
    group 2 followed by the literal character \character{0}.  The
    backreference \samp{\e g<0>} substitutes in the entire substring
    matched by the RE.
- 
-   \versionchanged[Added the optional parameter \var{flags}]{2.4}
  \end{funcdesc}
  
! \begin{funcdesc}{subn}{pattern, repl, string\optional{,
!                        count\optional{, flags}}}
    Perform the same operation as \function{sub()}, but return a tuple
    \code{(\var{new_string}, \var{number_of_subs_made})}.
-   \versionchanged[Added the optional parameter \var{flags}]{2.4}
  \end{funcdesc}
  
  \begin{funcdesc}{escape}{string}
--- 621,631 ----
    group 2 followed by the literal character \character{0}.  The
    backreference \samp{\e g<0>} substitutes in the entire substring
    matched by the RE.
  \end{funcdesc}
  
! \begin{funcdesc}{subn}{pattern, repl, string\optional{, count}}
    Perform the same operation as \function{sub()}, but return a tuple
    \code{(\var{new_string}, \var{number_of_subs_made})}.
  \end{funcdesc}
  
  \begin{funcdesc}{escape}{string}
***************
*** 737,749 ****
  Identical to the \function{split()} function, using the compiled pattern.
  \end{methoddesc}
  
! \begin{methoddesc}[RegexObject]{findall}{string\optional{,
!                                          pos\optional{, endpos}}}
  Identical to the \function{findall()} function, using the compiled pattern.
  \end{methoddesc}
  
! \begin{methoddesc}[RegexObject]{finditer}{string\optional{,
!                                           pos\optional{, endpos}}}
  Identical to the \function{finditer()} function, using the compiled pattern.
  \end{methoddesc}
  
--- 694,704 ----
  Identical to the \function{split()} function, using the compiled pattern.
  \end{methoddesc}
  
! \begin{methoddesc}[RegexObject]{findall}{string}
  Identical to the \function{findall()} function, using the compiled pattern.
  \end{methoddesc}
  
! \begin{methoddesc}[RegexObject]{finditer}{string}
  Identical to the \function{finditer()} function, using the compiled pattern.
  \end{methoddesc}
  
From greg at electricrain.com  Wed Sep  8 01:37:18 2004
From: greg at electricrain.com (Gregory P. Smith)
Date: Wed Sep  8 01:37:31 2004
Subject: [Python-Dev] Subversion, Codeville
In-Reply-To: <200409071706.59370.fdrake@acm.org>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
	<200409071706.59370.fdrake@acm.org>
Message-ID: <20040907233718.GC10869@zot.electricrain.com>

On Tue, Sep 07, 2004 at 05:06:59PM -0400, Fred L. Drake, Jr. wrote:
> On Tuesday 07 September 2004 04:51 pm, Martin v. Löwis wrote:
>  > What *is* the disadvantage of Berkeley DB that the file storage of
>  > svn 1.1 will remove? One of the things that you could do in CVS that
>  > you can't easily do because of the DB approach is to ultimately
>  > remove a file, along with its entire history (by removing the ,v file).
>  > Along with that goes the option of moving part of a repository into
>  > another repository.
> 
> I'm not concerned with people deliberately hosing their repositories; they 
> shouldn't do that.
> 
> The advantage I see is that we won't have to deal with hosed databases having 
> to be "recovered" to make the Subversion server useful again.
> 
> I certainly agree with Jp's comments about how databases are used, but as long 
> as the server is working, that's less of an issue for me.

agreed, if someone else makes it work i don't care so much how.  I was
pretty shocked at svn's use of berkeleydb for the reasons others have
already hashed out here.

to fuel a fire:  given that it's written in python i'd suggest codeville
as a cvs replacement.  It's in very early development but i'll bet by the
time anyone actually bothers to take the plunge away from tried-and-true
cvs rather than just talk about it, it won't be.  i expect to be shot
down for this suggestion. ;)

>  > Neither is easy with svn because of the DB thing. However, I
>  > understand that it won't become simpler with the file storage, either,
>  > as the files being created don't directly correlate to files of the
>  > versioned file system. So you still can't delete a single file with all
>  > of its history, nor can you move just a part of the repository.

There should -never- be a reason to remove the entire proof of a file's
past existence from a repository (unless you live in 1984).  disk space
is effectively free.

-g

From raymond.hettinger at verizon.net  Wed Sep  8 01:50:44 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed Sep  8 01:52:30 2004
Subject: [Python-Dev] Missing arguments in RE functions
In-Reply-To: <006301c49526$3b46ad40$e841fea9@oemcomputer>
Message-ID: <000f01c49535$9ec914c0$e841fea9@oemcomputer>

> > The first missing feature is the "flags" argument in the findall and
> > finditer functions.
>  . . .
> > The second missing feature is the ability to specify start and end
> > indices when doing matches and searches.
> 
> +1
> 
> I've needed both of these more than once.
> 
> Are you up to crafting the code?

Noam has posted a patch:
    www.python.org/sf/1024041

After adding the unittests, does anyone see any reason that this should
not be in Py2.4?  


Raymond Hettinger
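
As a rough illustration of the first request (flags on the
module-level functions), here is a short sketch assuming a current
Python 3 interpreter, where findall, finditer, sub and split all
accept a flags keyword. It does not reproduce the patch itself, and
the sample strings are made up for the example.

```python
import re

text = "Spam spam SPAM eggs"

# flags passed straight to the module-level functions, no compile() step
found = re.findall(r"spam", text, flags=re.IGNORECASE)
starts = [m.start() for m in re.finditer(r"spam", text, flags=re.IGNORECASE)]
replaced = re.sub(r"spam", "ham", text, flags=re.IGNORECASE)
parts = re.split(r"x", "aXbxc", flags=re.IGNORECASE)

print(found, starts, replaced, parts)
```

Passing flags by keyword avoids confusing it with the positional
count and maxsplit arguments of sub and split.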

From cce at clarkevans.com  Wed Sep  8 03:48:45 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Sep  8 03:48:49 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
Message-ID: <20040908014845.GA52384@prometheusresearch.com>

I've packaged up the idea of a coroutine facility using iterators and an
exception, SuspendIteration.  This would require some rather deep
changes to how generators are implemented; however, it seems backwards
compatible, implementable with the JVM or CLR, and would make most of my
database/web development work far more pleasant.

http://www.python.org/peps/pep-0334.html

Cheers!

Clark

...

PEP: 334
Title: Simple Coroutines via SuspendIteration
Version: $Revision: 1.1 $
Last-Modified: $Date: 2004/09/08 00:11:18 $
Author: Clark C. Evans <info@clarkevans.com>
Status: Draft
Type: Standards Track
Python-Version: 3.0
Content-Type: text/x-rst
Created: 26-Aug-2004
Post-History: 


Abstract
========

Asynchronous application frameworks such as Twisted [1]_ and Peak
[2]_ are based on cooperative multitasking via event queues or
deferred execution.  While this approach to application development
does not involve threads and thus avoids a whole class of problems
[3]_, it creates a different sort of programming challenge.  When an
I/O operation would block, a user request must suspend so that other
requests can proceed.  The concept of a coroutine [4]_ promises to
help the application developer grapple with this state management
difficulty.

This PEP proposes a limited approach to coroutines based on an
extension to the iterator protocol [5]_.  Currently, an iterator may
raise a StopIteration exception to indicate that it is done producing
values.  This proposal adds another exception to this protocol,
SuspendIteration, which indicates that the given iterator may have
more values to produce, but is unable to do so at this time.


Rationale
=========

There are two current approaches to bringing coroutines to Python.
Christian Tismer's Stackless [6]_ involves a ground-up restructuring
of Python's execution model by hacking the 'C' stack.  While this
approach works, its operation is hard to describe and keep portable. A
related approach is to compile Python code to Parrot [7]_, a
register-based virtual machine, which has coroutines.  Unfortunately,
neither of these solutions is portable to IronPython (CLR) or Jython
(JavaVM).

It is thought that a more limited approach, based on iterators, could
provide a coroutine facility to application programmers and still be
portable across runtimes.

* Iterators keep their state in local variables that are not on the
  "C" stack.  Iterators can be viewed as classes, with state stored in
  member variables that are persistent across calls to its next()
  method.

* While an uncaught exception may terminate a function's execution, an
  uncaught exception need not invalidate an iterator.  The proposed
  exception, SuspendIteration, uses this feature.  In other words,
  just because one call to next() results in an exception does not
  necessarily need to imply that the iterator itself is no longer
  capable of producing values.

This new exception has an impact in four places:

* The simple generator [8]_ mechanism could be extended to safely
  'catch' this SuspendIteration exception, stuff away its current
  state, and pass the exception on to the caller.

* Various iterator filters [9]_ in the standard library, such as
  itertools.izip, should be made aware of this exception so that they
  can transparently propagate SuspendIteration.

* Iterators generated from I/O operations, such as a file or socket
  reader, could be modified to have a non-blocking variety.  This
  option would raise a subclass of SuspendIteration if the requested
  operation would block.

* The asyncore library could be updated to provide a basic 'runner'
  that pulls from an iterator; if the SuspendIteration exception is
  caught, then it moves on to the next iterator in its runlist [10]_.
  External frameworks like Twisted would provide alternative
  implementations, perhaps based on FreeBSD's kqueue or Linux's epoll.

While these may seem like dramatic changes, they amount to very little
work compared with the utility provided by continuations.
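
The 'runner' mentioned in the last bullet could look roughly like the
sketch below.  This is an illustration under stated assumptions, not
asyncore code: ``SuspendIteration``, ``Flaky`` and ``run_all`` are
hypothetical names introduced here, and the sketch targets a current
Python 3 interpreter, so the iterator method is spelled ``__next__``
rather than ``next``.

```python
from collections import deque


class SuspendIteration(Exception):
    """Hypothetical exception from this PEP; not a Python builtin."""


class Flaky:
    """Class-based iterator that suspends once before each value."""

    def __init__(self, values):
        self._values = deque(values)
        self._ready = False

    def __iter__(self):
        return self

    def __next__(self):
        if not self._values:
            raise StopIteration
        if not self._ready:
            self._ready = True          # pretend the data arrives meanwhile
            raise SuspendIteration()
        self._ready = False
        return self._values.popleft()


def run_all(iterators):
    """Round-robin over iterators, collecting (index, value) pairs."""
    runlist = deque(enumerate(iter(it) for it in iterators))
    results = []
    while runlist:
        index, it = runlist.popleft()
        try:
            results.append((index, next(it)))
        except SuspendIteration:
            pass                        # not ready: fall through and requeue
        except StopIteration:
            continue                    # finished: drop from the runlist
        runlist.append((index, it))
    return results


results = run_all([Flaky("ab"), Flaky("xy")])
```

A suspended iterator is simply put back at the end of the runlist, so
the other iterator gets a turn before it is asked again.  A real
reactor would wait on file descriptors instead of polling.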


Semantics
=========

This section explains, at a high level, how iterators would behave
once the new SuspendIteration exception is introduced.


Simple Iterators
----------------

The current functionality of iterators is best seen with a simple
example which produces two values 'one' and 'two'. ::

    class States:

        def __iter__(self):
            self._next = self.state_one
            return self

        def next(self):
            return self._next()

        def state_one(self):
            self._next = self.state_two
            return "one"

        def state_two(self):
            self._next = self.state_stop
            return "two"

        def state_stop(self):
            raise StopIteration

    print list(States())

An equivalent iteration could, of course, be created by the
following generator::

    def States():
        yield 'one'
        yield 'two'

    print list(States())


Introducing SuspendIteration
----------------------------

Suppose that between producing 'one' and 'two', the generator above
could block on a socket read.  In this case, we would want to raise
SuspendIteration to signal that the iterator is not done producing,
but is unable to provide a value at the current moment. ::

    from random import randint
    from time import sleep

    class SuspendIteration(Exception):
        pass

    class NonBlockingResource:

        """Randomly unable to produce the second value"""

        def __iter__(self):
            self._next = self.state_one
            return self

        def next(self):
            return self._next()

        def state_one(self):
            self._next = self.state_suspend
            return "one"

        def state_suspend(self):
            rand = randint(1,10)
            if 2 == rand:
                self._next = self.state_two
                return self.state_two()
            raise SuspendIteration()

        def state_two(self):
            self._next = self.state_stop
            return "two"

        def state_stop(self):
            raise StopIteration

    def sleeplist(iterator, timeout = .1):
        """
        Do other things (e.g. sleep) while resource is
        unable to provide the next value
        """
        it = iter(iterator)
        retval = []
        while True:
            try:
                retval.append(it.next())
            except SuspendIteration:
                sleep(timeout)
                continue
            except StopIteration:
                break
        return retval

    print sleeplist(NonBlockingResource())

In a real-world situation, the NonBlockingResource would be a file
iterator, socket handle, or other I/O based producer.  The sleeplist
would instead be an async reactor, such as those found in asyncore or
Twisted.  The non-blocking resource could, of course, be written as a
generator::

    def NonBlockingResource():
        yield "one"
        while True:
            rand = randint(1,10)
            if 2 == rand:
                break
            raise SuspendIteration()
        yield "two"

It is not necessary to add a keyword, 'suspend', since most real
content generators will not be in application code; they will be in
low-level I/O based operations.  Since most programmers need not be
exposed to the SuspendIteration() mechanism, a keyword is not needed.


Application Iterators
---------------------

The previous example is rather contrived; a more 'real-world' example
would be a web page generator which yields HTML content, and pulls
from a database.  Note that this is an example of neither the
'producer' nor the 'consumer', but rather of a filter. ::

    def ListAlbums(cursor):
        cursor.execute("SELECT title, artist FROM album")
        yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
        for (title, artist) in cursor:
            yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
        yield '</table></body></html>'

The problem, of course, is that the database may block for some time
before any rows are returned, and that during execution, rows may be
returned in blocks of 10 or 100 at a time. Ideally, if the database
blocks for the next set of rows, another user connection could be
serviced.  Note the complete absence of SuspendIteration in the above
code.  If done correctly, application developers would be able to
focus on functionality rather than concurrency issues.

The iterator created by the above generator should do the magic
necessary to maintain state, yet pass the exception through to a
lower-level async framework.  Here is an example of what the
corresponding iterator would look like if coded up as a class::

    class ListAlbums:

        def __init__(self, cursor):
            self.cursor = cursor

        def __iter__(self):
            self.cursor.execute("SELECT title, artist FROM album")
            self._iter = iter(self.cursor)
            self._next = self.state_head
            return self

        def next(self):
            return self._next()

        def state_head(self):
            self._next = self.state_cursor
            return ("<html><body><table><tr><td>Title</td>"
                    "<td>Artist</td></tr>")

        def state_tail(self):
            self._next = self.state_stop
            return "</table></body></html>"

        def state_cursor(self):
            try:
                (title,artist) = self._iter.next()
                return '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
            except StopIteration:
                self._next = self.state_tail
                return self.next()
            except SuspendIteration:
                # just pass-through
                raise

        def state_stop(self):
            raise StopIteration
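
To see the pass-through behaviour in action, the class can be driven
by a cursor that suspends part-way through.  The sketch below is a
trimmed port to a current Python 3 interpreter (``__next__`` instead
of ``next``); ``FakeCursor`` and the shortened HTML strings are
assumptions made for the illustration, not part of any database API.

```python
class SuspendIteration(Exception):
    """Hypothetical exception from this PEP; not a Python builtin."""


class FakeCursor:
    """Stand-in for a DB cursor; suspends once before the second row."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0
        self._stalled = False

    def execute(self, sql):
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos == 1 and not self._stalled:
            self._stalled = True
            raise SuspendIteration()    # "would block" on the next batch
        if self._pos >= len(self._rows):
            raise StopIteration
        row = self._rows[self._pos]
        self._pos += 1
        return row


class ListAlbums:
    """Python 3 port of the class above, trimmed to the essentials."""

    def __init__(self, cursor):
        self.cursor = cursor

    def __iter__(self):
        self.cursor.execute("SELECT title, artist FROM album")
        self._iter = iter(self.cursor)
        self._next = self.state_head
        return self

    def __next__(self):
        return self._next()

    def state_head(self):
        self._next = self.state_cursor
        return "<table>"

    def state_cursor(self):
        try:
            title, artist = next(self._iter)
        except StopIteration:
            self._next = self.state_tail
            return next(self)
        # SuspendIteration is not caught: it passes through, state intact.
        return "<tr><td>%s</td><td>%s</td></tr>" % (title, artist)

    def state_tail(self):
        self._next = self.state_stop
        return "</table>"

    def state_stop(self):
        raise StopIteration


page = iter(ListAlbums(FakeCursor([("A", "x"), ("B", "y")])))
chunks, suspensions = [], 0
while True:
    try:
        chunks.append(next(page))
    except SuspendIteration:
        suspensions += 1                # a reactor would serve other users here
    except StopIteration:
        break
```

The suspension leaves ``_next`` pointing at ``state_cursor``, so the
next call picks up at the second row and no output is lost.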


Complicating Factors
--------------------

While the above example is straightforward, things are a bit more
complicated if the intermediate generator 'condenses' values, that is,
it pulls in two or more values for each value it produces. For
example, ::

    def pair(iterLeft,iterRight):
        rhs = iter(iterRight)
        lhs = iter(iterLeft)
        while True:
            yield (rhs.next(), lhs.next())

In this case, the corresponding iterator behavior has to be a bit more
subtle to handle the case of either the right or left iterator raising
SuspendIteration.  It seems to be a matter of decomposing the
generator to recognize intermediate states where a SuspendIteration
exception from the producing context could happen. ::

    class pair:

        def __init__(self, iterLeft, iterRight):
            self.iterLeft = iterLeft
            self.iterRight = iterRight

        def __iter__(self):
            self.rhs = iter(self.iterRight)
            self.lhs = iter(self.iterLeft)
            self._temp_rhs = None
            self._temp_lhs = None
            self._next = self.state_rhs
            return self

        def next(self):
            return self._next()

        def state_rhs(self):
            self._temp_rhs = self.rhs.next()
            self._next = self.state_lhs
            return self.next()

        def state_lhs(self):
            self._temp_lhs = self.lhs.next()
            self._next = self.state_pair
            return self.next()

        def state_pair(self):
            self._next = self.state_rhs
            return (self._temp_rhs, self._temp_lhs)

This proposal assumes that a corresponding iterator written using
this class-based method is possible for existing generators.  The
challenge seems to be the identification of distinct states within
the generator where suspension could occur.
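
As a rough check of that assumption, the class-based ``pair`` can be
run against sources that suspend before every value.  The sketch below
is a port to a current Python 3 interpreter; ``Stutter`` is a
hypothetical helper invented for the test, and ``SuspendIteration`` is
defined locally since it does not exist in Python.

```python
class SuspendIteration(Exception):
    """Hypothetical exception from this PEP; not a Python builtin."""


class Stutter:
    """Produces the given values, suspending once before each one."""

    def __init__(self, values):
        self._values = list(values)
        self._ready = False

    def __iter__(self):
        return self

    def __next__(self):
        if not self._values:
            raise StopIteration
        if not self._ready:
            self._ready = True
            raise SuspendIteration()
        self._ready = False
        return self._values.pop(0)


class Pair:
    """Python 3 port of the class-based 'pair' iterator above."""

    def __init__(self, iterLeft, iterRight):
        self.iterLeft = iterLeft
        self.iterRight = iterRight

    def __iter__(self):
        self.rhs = iter(self.iterRight)
        self.lhs = iter(self.iterLeft)
        self._next = self.state_rhs
        return self

    def __next__(self):
        return self._next()

    def state_rhs(self):
        self._temp_rhs = next(self.rhs)     # may raise SuspendIteration
        self._next = self.state_lhs         # reached only on success
        return next(self)

    def state_lhs(self):
        self._temp_lhs = next(self.lhs)     # _temp_rhs survives a suspend here
        self._next = self.state_pair
        return next(self)

    def state_pair(self):
        self._next = self.state_rhs
        return (self._temp_rhs, self._temp_lhs)


pairs, suspensions = [], 0
it = iter(Pair(Stutter([1, 2]), Stutter("ab")))
while True:
    try:
        pairs.append(next(it))
    except SuspendIteration:
        suspensions += 1
    except StopIteration:
        break
```

Because each state advances ``_next`` only after its own ``next()``
call succeeds, a suspension on the left iterator does not discard the
value already pulled from the right one.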


Resource Cleanup
----------------

The current generator mechanism has a strange interaction with
exceptions where a 'yield' statement is not allowed within a
try/finally block.  The SuspendIteration exception raises a
similar issue.  The impact of this issue is not clear.  However, it
may be that rewriting the generator into a state machine, as the
previous section did, could resolve it, leaving the situation no
worse than it is today and perhaps even removing the yield/finally
restriction.  More investigation is needed in this area.


API and Limitations
-------------------

This proposal only covers 'suspending' a chain of iterators, and does
not cover (of course) suspending general functions, methods, or "C"
extension functions.  While there could be no direct support for
creating generators in "C" code, native "C" iterators which comply
with the SuspendIteration semantics are certainly possible.


Low-Level Implementation
========================

The author of the PEP is not yet familiar enough with the Python
execution model to comment in this area.


References
==========

.. [1] Twisted
   (http://twistedmatrix.com)

.. [2] Peak
   (http://peak.telecommunity.com)

.. [3] C10K
   (http://www.kegel.com/c10k.html)

.. [4] Coroutines
   (http://c2.com/cgi/wiki?CallWithCurrentContinuation)

.. [5] PEP 234, Iterators
   (http://www.python.org/peps/pep-0234.html)

.. [6] Stackless Python
   (http://stackless.com)

.. [7] Parrot /w coroutines
   (http://www.sidhe.org/~dan/blog/archives/000178.html)

.. [8] PEP 255, Simple Generators
   (http://www.python.org/peps/pep-0255.html)

.. [9] itertools - Functions creating iterators
   (http://docs.python.org/lib/module-itertools.html)

.. [10] Microthreads in Python, David Mertz
   (http://www-106.ibm.com/developerworks/linux/library/l-pythrd.html)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From stephen at xemacs.org  Wed Sep  8 04:18:59 2004
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed Sep  8 04:19:14 2004
Subject: [Python-Dev] Python 3.0 list of goals
In-Reply-To: <Pine.LNX.4.10.10408172232250.10675-100000@sumeru.stanford.EDU>
	(Dennis
	Allison's message of "Tue, 17 Aug 2004 22:33:54 -0700 (PDT)")
References: <Pine.LNX.4.10.10408172232250.10675-100000@sumeru.stanford.EDU>
Message-ID: <87fz5ty030.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Dennis" == Dennis Allison <allison@sumeru.stanford.EDU> writes:

    Dennis> Ahhhh...  Zeno's paradox again.

Nah, you're thinking of Archimedes's "Sand Reckoner".


-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
From tim.peters at gmail.com  Wed Sep  8 05:14:49 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Sep  8 05:14:51 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was Re: Another
	test_compilermystery)
In-Reply-To: <1094572868.8342.43.camel@geddy.wooz.org>
References: <002d01c48083$9a89a6c0$5229c797@oemcomputer>
	<20040816112916.GA19969@vicky.ecs.soton.ac.uk>
	<1f7befae04090422024afaee58@mail.gmail.com>
	<e8bf7a5304090518425dc3ebec@mail.gmail.com>
	<chh4cr$qre$1@sea.gmane.org>
	<e8bf7a5304090705092ee4daa7@mail.gmail.com> <413DBB19.40602@zope.com>
	<1094566431.8341.25.camel@geddy.wooz.org>
	<413DCFBD.7010306@theopalgroup.com>
	<1094572868.8342.43.camel@geddy.wooz.org>
Message-ID: <1f7befae040907201426c33006@mail.gmail.com>

After a bit more thought (and it's hard to measure how little), I'd
like to see "bare except" deprecated.  That doesn't mean no way to
catch all exceptions, it means being explicit about intent.  Only a
few of the bare excepts I've seen in my Python life did what was
actually intended, and there's something off in the design when the
easiest thing to say usually does a wrong thing.

I think Java has a saner model in this particular respect:

Throwable
    Exception
    Error

Java's distinction between "checked" and "unchecked" exceptions is a
distinct layer of complication on top of that.  All exceptions derive
from Throwable.  A "catch" clause requires specifying a class (there's
no "bare except").  "An Error is a subclass of Throwable that
indicates serious problems that a reasonable application should not
try to catch".  That includes AssertionError and VirtualMachineError. 
Those are exceptions that should never occur.  It also includes
ThreadDeath, which is expected to occur, but

    The class ThreadDeath is specifically a subclass of Error rather
    than Exception, even though it is a "normal occurrence", because
    many applications catch all occurrences of Exception and then
    discard the exception.

and it's necessary for ThreadDeath to reach the top level else the
thread never really dies.

In that respect, it's interesting that SystemExit and
KeyboardInterrupt are *intended* to "reach the top level" too, but
can't be relied on to do so because of ubiquitous bare excepts and
even pseudo-careful "except Exception:"s now.  If people changed those
to "except StandardError:", SystemExit would make it to the top but
KeyboardInterrupt still wouldn't.

Raisable
    Exception
    Stubborn
        ControlFlow
            KeyboardInterrupt
            StopIteration
            SystemExit
        MemoryError

introduces a class of stubborn exceptions, those that wouldn't be
caught by "except Exception:", and with the intent that there's no way
you should get the effect of "except Raisable" without explicitly
saying just that (once bare except's deprecation is complete).
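
A rough sketch of how such a tree would behave, for anyone who wants
to poke at it.  Every class here is a stand-in defined for
illustration: PlainException plays the role of Exception, Interrupt
that of KeyboardInterrupt, and the root derives from BaseException
only so that a current Python 3 interpreter will let it be raised.

```python
class Raisable(BaseException):
    """Hypothetical root of the hierarchy; nothing here is a real builtin."""


class PlainException(Raisable):
    """Plays the role of 'Exception' in the tree above."""


class Stubborn(Raisable):
    """Exceptions a catch-all for ordinary errors should not swallow."""


class ControlFlow(Stubborn):
    pass


class Interrupt(ControlFlow):
    """Plays the role of KeyboardInterrupt."""


def careless(action):
    """Catch 'everything ordinary', as 'except Exception:' is meant to."""
    try:
        action()
    except PlainException:
        return "swallowed"
    return "completed"


def raiser(exc_type):
    """Build a zero-argument callable that raises exc_type."""
    def action():
        raise exc_type()
    return action


swallowed = careless(raiser(PlainException))

try:
    careless(raiser(Interrupt))
    escaped = False
except Stubborn:
    escaped = True                      # the stubborn one reached the caller
```

The ordinary error is swallowed, while the stubborn one gets past the
careless handler and is only caught by code that names it.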

Oh well.  We should elect a benevolent dictator for Python!
From cce at clarkevans.com  Wed Sep  8 05:29:25 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Sep  8 05:29:28 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040908014845.GA52384@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>
Message-ID: <20040908032925.GA28079@prometheusresearch.com>

Josiah Carlson kindly pointed out (off list) that my use of
SuspendIteration violates the standard idiom of exceptions
terminating the current function. This got past me, because
I think of a generator not as a function, but rather as a shortcut
to creating iterators.  The offending code is,

|     def NonBlockingResource():
|         yield "one"
|         while True:
|             rand = randint(1,10)
|             if 2 == rand:
|                 break
|             raise SuspendIteration()
|         yield "two"

There are two solutions:
  (a) introduce a new keyword 'suspend'; or,
  (b) don't do that.
  
It is not essential to the proposal that the generator syntax produce
iterators that can SuspendIteration, it is only essential that the
implementation of generators pass-through this exception. Most
non-blocking resources will be low-level components from an async
database or socket library; they can make iterators the old way.
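
For the record, here is a small sketch of what the offending pattern
does with generators as they stand, written for a current Python 3
interpreter.  SuspendIteration is defined locally (it is only a
proposal) and `resource` is a cut-down, hypothetical version of
NonBlockingResource without the random retry.

```python
class SuspendIteration(Exception):
    """Hypothetical exception from this PEP; not a Python builtin."""


def resource():
    yield "one"
    raise SuspendIteration()    # unwinds the generator frame for good
    yield "two"                 # never reached


gen = resource()
first = next(gen)

try:
    next(gen)
    suspended = False
except SuspendIteration:
    suspended = True

try:
    next(gen)
    finished = False
except StopIteration:
    finished = True             # "two" is lost: the generator is exhausted
```

The exception does reach the caller, but the generator cannot be
resumed afterwards, which is exactly Josiah's objection.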

Cheers,

Clark
From tim.peters at gmail.com  Wed Sep  8 05:40:20 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Sep  8 05:40:25 2004
Subject: [Python-Dev] assert failure on obmalloc
In-Reply-To: <2msm9u14pj.fsf@starship.python.net>
References: <e8bf7a5304090708123c619f4@mail.gmail.com>
	<2msm9u14pj.fsf@starship.python.net>
Message-ID: <1f7befae04090720401c415b32@mail.gmail.com>

[Michael Hudson]
> Don't debug builds route all PyMem_ calls through PyMalloc?

Indeed they do.

> Doesn't pymalloc rely on the GIL being held when it's called?

Indeed it does.

> If both of these are true, there's an obvious problem here, because the call to
> PyMem_NEW in PyThreadState_New certainly isn't called with the GIL
> held...

Indeed that sucks.

> This would only be a problem in a debug build, though.

So it's Jeremy's fault, just as we suspected all along.

There are lock macros in obmalloc, which currently expand to nothing. 
They could be changed to "do something" in a debug build, but I'd
rather not -- the debug capabilities of obmalloc are more useful the
nastier a memory corruption problem is, and few things make problems
nastier than throwing threads into the mix.

A cheap trick is to ensure that all code that may be called without
the GIL calls the platform malloc()/free() directly.  Alas, I haven't
been able to reproduce Jeremy's symptom.
From martin at v.loewis.de  Wed Sep  8 06:41:28 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  8 06:41:21 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
Message-ID: <413E8D78.2030302@v.loewis.de>

I recently looked into properly implementing the "Register Extensions"
feature in the installer; in 2.4a3, not selecting that doesn't really
work. The problem is that MSI only supports installing either both
the "extension server" (the .exe) and the extension, or neither. So
you can choose not to install word.exe, and it won't install the .doc
extension; if you install word.exe, it will associate .doc with it.

For Python, this leaves us with three options:
1. Don't make registration of extensions optional; always associate
    .py, .pyc, .pyw, .pyo.
2. Don't support installation-on-demand for extensions. This means
    to not use the MSI extension machinery at all, but to directly
    write the registry keys that build the extension. Installing
    these keys can then be made optional.
3. Provide another binary that is the "extension server", and
    install that independently of python.exe, and pythonw.exe.
    In CVS, I have implemented this approach to see whether it
    works (it does), and called this binary "launcher.exe". It
    is a Windows app which supports a -console argument which also
    makes it a console app. This is the binary that gets
    associated with all four extensions, for the "open" verb.

Currently, I'm in favour of using option 3, but I'd like to hear
whether people would prefer something else instead.

Regards,
Martin


From gvanrossum at gmail.com  Wed Sep  8 06:53:52 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  8 06:53:55 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <413E8D78.2030302@v.loewis.de>
References: <413E8D78.2030302@v.loewis.de>
Message-ID: <ca471dc204090721531f02145a@mail.gmail.com>

On Wed, 08 Sep 2004 06:41:28 +0200, Martin v. Löwis <martin@v.loewis.de> wrote:
> I recently looked into properly implementing the "Register Extensions"
> feature in the installer; in 2.4a3, not selecting that doesn't really
> work. The problem is that MSI only supports installing either both
> the "extension server" (the .exe) and the extension, or neither. So
> you can choose not to install word.exe, and it won't install the .doc
> extension; if you install word.exe, it will associate .doc with it.
> 
> For Python, this leaves us with three options:
> 1. Don't make registration of extensions optional; always associate
>     .py, .pyc, .pyw, .pyo.
> 2. Don't support installation-on-demand for extensions. This means
>     to not use the MSI extension machinery at all, but to directly
>     write the registry keys that build the extension. Installing
>     these keys can then be made optional.
> 3. Provide another binary that is the "extension server", and
>     install that independently of python.exe, and pythonw.exe.
>     In CVS, I have implemented this approach to see whether it
>     works (it does), and called this binary "launcher.exe". It
>     is a Windows app which supports a -console argument which also
>     makes it a console app. This is the binary that gets
>     associated with all four extensions, for the "open" verb.
> 
> Currently, I'm in favour of using option 3, but I'd like to hear
> whether people would prefer something else instead.
> 
> Regards,
> Martin

I frequently use the extension feature in a console context; when I am
in a directory full of .py files, I can run any one of them by simply
typing its name (and possibly command line arguments). The script will
then interact through the existing console window. Will this work?
From your description I fear that this would start the script without
console I/O possibility or in a separate window, both of which would
make this a no-no. If you can confirm that this works as expected, I
think the separate driver is fine, since pretty much by definition you
can't pass any command line arguments to Python (although I would hope
that the environment variables would still work).


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From erik at heneryd.com  Wed Sep  8 09:15:16 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Wed Sep  8 09:15:23 2004
Subject: [Python-Dev] Missing arguments in RE functions
In-Reply-To: <000f01c49535$9ec914c0$e841fea9@oemcomputer>
References: <000f01c49535$9ec914c0$e841fea9@oemcomputer>
Message-ID: <413EB184.9030604@heneryd.com>

Raymond Hettinger wrote:
>>>The first missing feature is the "flags" argument in the findall and
>>>finditer functions.
>>
>> . . .
>>
>>>The second missing feature is the ability to specify start and end
>>>indices when doing matches and searches.
>>
>>+1
>>
>>I've needed both of these more than once.
>>
>>Are you up to crafting the code?
> 
> 
> Noam has posted a patch:
>     www.python.org/sf/1024041
> 
> After adding the unittests, does anyone see any reason that this should
> not be in Py2.4?  
> 

+0

I rarely use the functions, but rather precompile the pattern myself, 
even when it's a one-shot throw-away.  It happens once in a while, and I
know I've been puzzled by this a few times when I've used the functions 
for a change.
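
For anyone following along, the precompiled route looks roughly like
this.  It is a minimal sketch on a current Python 3 interpreter; the
sample text and variable names are made up for the example.  Flags go
to re.compile(), and the compiled pattern's findall/finditer methods
take optional pos and endpos arguments.

```python
import re

text = "Spam spam SPAM eggs spam"

# Flags are given at compile time; pos/endpos are per-call arguments
# of the compiled pattern's methods.
pattern = re.compile(r"spam", re.IGNORECASE)

everywhere = pattern.findall(text)
windowed = pattern.findall(text, 5, 14)        # search text[5:14] only
spans = [m.span() for m in pattern.finditer(text, 5, 14)]
```

Note that the spans are reported relative to the whole string, not to
the pos/endpos window.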


Erik
From Paul.Moore at atosorigin.com  Wed Sep  8 10:21:47 2004
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Wed Sep  8 10:21:52 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>

From: "Martin v. Löwis"
> 3. Provide another binary that is the "extension server", and
>    install that independently of python.exe, and pythonw.exe.
>    In CVS, I have implemented this approach to see whether it
>    works (it does), and called this binary "launcher.exe". It
>    is a Windows app which supports a -console argument which also
>    makes it a console app. This is the binary that gets
>    associated with all four extensions, for the "open" verb.
>
> Currently, I'm in favour of using option 3, but I'd like to hear
> whether people would prefer something else instead.

With option (3), what happens if you run "launcher -console" from a
command prompt? Does it produce output in the same console window,
or does it launch a new console?

The reason I ask is that cmd.exe uses the association of the .py
extension to treat .py files as executable. If the association is
to a Windows program, cmd.exe will not wait for the command to
finish, but will return a prompt immediately, and the command
output will appear in a separate console.

If this is the case, I'm -1 on option (3).

If it's not, I'd like to see how you coded console.exe, as I've
often needed this sort of behaviour, and never been able to achieve
it correctly!

Paul.


From exarkun at divmod.com  Wed Sep  8 15:07:59 2004
From: exarkun at divmod.com (Jp Calderone)
Date: Wed Sep  8 15:08:06 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040908032925.GA28079@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>
	<20040908032925.GA28079@prometheusresearch.com>
Message-ID: <413F042F.10803@divmod.com>

Clark C. Evans wrote:
> Josiah Carlson kindly pointed out (off list), that my use of
> SuspendIteration violates the standard idiom of exceptions 
> terminating the current function. This got past me, because
> I think of a generator not as a function, but rather as a shortcut
> to creating iterators.  The offending code is,
> 
> |     def NonBlockingResource():
> |         yield "one"
> |         while True:
> |             rand = randint(1,10)
> |             if 2 == rand:
> |                 break
> |             raise SuspendIteration()
> |         yield "two"
> 
> There are two solutions:
>   (a) introduce a new keyword 'suspend'; or,
>   (b) don't do that.
>   
> It is not essential to the proposal that the generator syntax produce
> iterators that can SuspendIteration, it is only essential that the
> implementation of generators pass-through this exception. Most
> non-blocking resources will be low-level components from an async
> database or socket library; they can make iterators the old way.
> 

   What about this?

     def somefunc():
         raise SuspendIteration()
         return 'foo'

     def genfunc():
         yield somefunc()

   Jp
From cce at clarkevans.com  Wed Sep  8 15:26:03 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Sep  8 15:26:08 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <413F042F.10803@divmod.com>
References: <20040908014845.GA52384@prometheusresearch.com>
	<20040908032925.GA28079@prometheusresearch.com>
	<413F042F.10803@divmod.com>
Message-ID: <20040908132603.GB66159@prometheusresearch.com>

On Wed, Sep 08, 2004 at 09:07:59AM -0400, Jp Calderone wrote:
| 
|     def somefunc():
|         raise SuspendIteration()
|         return 'foo'
| 
|     def genfunc():
|         yield somefunc()

Interesting, but:
 - somefunc is a function, thus SuspendIteration() should
   terminate the function; raising an exception
 - somefunc is not a generator, so it cannot be yielded.

However, perhaps something like...

   def suspend(*args,**kwargs):
       raise SuspendIteration(*args,**kwargs)
       # never ever returns

   def myProducer():
       yield "one"
       suspend()
       yield "two"

Regardless, this is a side point.  The iterators that raise
SuspendIteration() will be low-level code, like a next()
which reads the next block from a socket or row from a database
query.  In these cases, the class style iterator is sufficient.

The real point is that user-level generators, such as this example
from the PEP (which is detailed as a class-based iterator), should
transparently handle SuspendIteration() by passing it up the generator
chain without killing the current scope.

|     def ListAlbums(cursor):
|         cursor.execute("SELECT title, artist FROM album")
|         yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
|         for (title, artist) in cursor:
|             yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
|         yield '</table></body></html>'

For those who say that this iterator should be invalidated when
cursor.next() raises SuspendIteration(), I point out that it is not
invalidated when cursor.next() raises StopIteration().

Kind Regards,

Clark


From cce at clarkevans.com  Wed Sep  8 15:32:15 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Sep  8 15:32:17 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040908132603.GB66159@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>
	<20040908032925.GA28079@prometheusresearch.com>
	<413F042F.10803@divmod.com>
	<20040908132603.GB66159@prometheusresearch.com>
Message-ID: <20040908133215.GA86633@prometheusresearch.com>

On Wed, Sep 08, 2004 at 09:26:03AM -0400, Clark C. Evans wrote:
| On Wed, Sep 08, 2004 at 09:07:59AM -0400, Jp Calderone wrote:
| | 
| |     def somefunc():
| |         raise SuspendIteration()
| |         return 'foo'
| | 
| |     def genfunc():
| |         yield somefunc()
| 
| Interesting, but:
|  - somefunc is a function, thus SuspendIteration() should
|    terminate the function; raising an exception
|  - somefunc is not a generator, so it cannot be yielded.

It's too early for me to be posting; scrap the nonsense in this second
point.  I don't think this changes the suggestion below though.

| 
| However, perhaps something like...
| 
|    def suspend(*args,**kwargs):
|        raise SuspendIteration(*args,**kwargs)
|        # never ever returns
| 
|    def myProducer():
|        yield "one"
|        suspend()
|        yield "two"
| 
| Regardless, this is a side point.  The authors of iterators that
| raise a SuspendIteration() will be low-level code, like a next()
| which reads the next block from a socket or row from a database
| query.  In these cases, the class style iterator is sufficient.
| 
| The real point is that user-level generators, such as this example
| from the PEP (which is detailed as a class-based iterator), should
| transparently handle SuspendIteration() by passing it up the generator
| chain without killing the current scope.
| 
| |     def ListAlbums(cursor):
| |         cursor.execute("SELECT title, artist FROM album")
| |         yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
| |         for (title, artist) in cursor:
| |             yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
| |         yield '</table></body></html>'
| 
| For those who say that this iterator should be invalidated when
| cursor.next() raises SuspendIteration(), I point out that it is not
| invalidated when cursor.next() raises StopIteration().
| 
| Kind Regards,
| 
| Clark
| 
| 

From mal at egenix.com  Wed Sep  8 16:47:35 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Sep  8 16:47:40 2004
Subject: [Python-Dev] PEP 328 - Relative Imports
Message-ID: <413F1B87.90301@egenix.com>

Hi there,

I know that this has been discussed a few times in the past,
but the more I have to deal with building applications using
third-party libs or packages, the more I get the feeling that
the choice of making "import module" absolute is the wrong
path to follow.

The typical scenario goes like this:

* you build an application that uses various third-party
   packages and has to maintain them inside another package,
   e.g. ThirdPartyCode

* you don't have access to the (third-party) package source code or
   it's not feasible to make changes to it for maintenance reasons

Another common case is that you have to deal with third-party
code that is not properly packaged as a Python package, but comes
as a set of top-level modules.

In this scenario you typically put all those files into a
newly created Python package directory and access the modules
in that directory using the package name.

In Python 2.3 and 2.4 (as well as all previous versions), both
scenarios can easily be implemented without having to change
the third-party code.

The PEP however suggests that starting with 2.5, the interpreter
will issue a warning and 2.6 should default to absolute paths.

I'd like to request that the latter change be postponed to
Python 3k, or that some other way of supporting the above
scenarios is provided that can be enabled in the application.

Please remember that changes to application code are perfectly
possible. What's not possible is making changes to the
packaged third-party code.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 08 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From mal at egenix.com  Wed Sep  8 16:56:28 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Sep  8 16:56:31 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <cheig3$ki8$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org>
	<cheig3$ki8$1@sea.gmane.org>
Message-ID: <413F1D9C.20209@egenix.com>

Fredrik Lundh wrote:
> Barry wrote:
> 
>>I'll point out that Template was very deliberately subclassed from
>>unicode, so Template instances /are/ unicode objects.  From the
>>standpoint of type conversion, using /F's notation, T(8) == U, thus
>>because U % 8 == U, T(8) % 8 == U.
> 
> from a user perspective, there's no reason to make templates a sub-
> class of unicode, so the rest of your argument is irrelevant.

Templates are meant to template *text* data, so Unicode is
the right choice of baseclass from a design perspective.

> instead of looking at use patterns, you're stuck defending the existing
> code.  that's not a good way to design usable code.

Perhaps I'm missing something, but where would you use Templates
for templating binary data (where strings or bytes would be a more
appropriate design choice) ?

-- 
Marc-Andre Lemburg
eGenix.com

From gvanrossum at gmail.com  Wed Sep  8 17:03:39 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  8 17:03:42 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <20040907215053.GF5208@solar.trillke>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071546.26539.fdrake@acm.org> <20040907201720.GB1083@panix.com>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
	<4A27E67A-0111-11D9-9892-000A95686CD8@redivi.com>
	<20040907215053.GF5208@solar.trillke>
Message-ID: <ca471dc204090808035309607a@mail.gmail.com>

Somebody in this thread said that files don't belong in a db, and
proposed to only put the metadata in the db. That argument seems
misguided to me: when recovering the db and filesystem after a host
crash, you'd have to jump through extra hoops to make sure the metadata
matches the filesystem data. Note that Perforce puts everything in a
database and it's rock solid. The main problems I had with svn's use
of Berkeley DB were packaging issues (but then, I was using a pre-1.0
beta of svn) and poor management of the Berkeley DB by the svn code,
requiring frequent db "recovery" actions; also a bug that caused
accesses by different users to hose the database in a subtle way. All
of that seems fixable.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gvanrossum at gmail.com  Wed Sep  8 17:08:07 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  8 17:08:10 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <413F1D9C.20209@egenix.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com>
	<1094315138.8696.36.camel@geddy.wooz.org> <cheig3$ki8$1@sea.gmane.org>
	<413F1D9C.20209@egenix.com>
Message-ID: <ca471dc2040908080861941ab2@mail.gmail.com>

> Templates are meant to template *text* data, so Unicode is
> the right choice of baseclass from a design perspective.

Only in Python 3.0.

But even so, deriving from Unicode (or str) means the template class
inherits a lot of unwanted operations. While I can see that
concatenating templates probably works, slicing them or converting to
lowercase etc. make no sense. IMO the standard Template class should
implement a "narrow" interface, i.e. *only* the template expansion
method (__mod__ or something else), so it's clear that other
compatible template classes shouldn't have to implement anything
besides that. This avoids the issues we have with the mapping
protocol: when does an object implement enough of the mapping API to
be usable? That's currently ill-defined; sometimes, __getitem__ is all
you need, sometimes __contains__ is required, sometimes keys, rarely
setdefault.
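
A minimal sketch of what I mean by "narrow" (the class names
NarrowTemplate and Environment are invented for illustration, and the
expansion is delegated to string.Template.substitute as it exists in
released Python; none of this is the actual PEP 292 code):

```python
from string import Template


class NarrowTemplate:
    """Hypothetical template type exposing *only* expansion via %."""

    def __init__(self, text):
        self._template = Template(text)

    def __mod__(self, mapping):
        # Expansion only ever calls mapping[key].
        return self._template.substitute(mapping)


class Environment:
    """Mapping-like object that implements nothing but __getitem__."""

    def __init__(self, **values):
        self._values = values

    def __getitem__(self, key):
        return self._values[key]


greeting = NarrowTemplate("Hello, $name!") % Environment(name="world")
print(greeting)
```

Because the template has no string methods, a compatible template class
has exactly one operation to implement, and the mapping it is applied to
only needs __getitem__.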

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gvanrossum at gmail.com  Wed Sep  8 17:11:28 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  8 17:11:31 2004
Subject: [Python-Dev] PEP 328 - Relative Imports
In-Reply-To: <413F1B87.90301@egenix.com>
References: <413F1B87.90301@egenix.com>
Message-ID: <ca471dc204090808119668687@mail.gmail.com>

> I know that this has been discussed a few times in the past,
> but the more I have to deal with building applications using
> third-party libs or packages, the more I get the feeling that
> the choice of making "import module" absolute is the wrong
> path to follow.
> 
> The typical scenario goes like this:
> 
> * you build an application that uses various third-party
>    packages and has to maintain them inside another package,
>    e.g. ThirdPartyCode
> 
> * you don't have access to the (third-party) package source code or
>    it's not feasible to make changes to it for maintenance reasons
> 
> Another common case is that you have to deal with third-party
> code that is not properly packaged as a Python package, but comes
> as a set of top-level modules.
> 
> In this scenario you typically put all those files into a
> newly created Python package directory and access the modules
> in that directory using the package name.
> 
> In Python 2.3 and 2.4 (as well as all previous versions), both
> scenarios can easily be implemented without having to change
> the third-party code.
> 
> The PEP however suggests that starting with 2.5, the interpreter
> will issue a warning and 2.6 should default to absolute paths.
> 
> I'd like to request that the latter change be postponed to
> Python 3k, or that some other way of supporting the above
> scenarios is provided that can be enabled in the application.
> 
> Please remember that changes to application code are well
> possible. What's not possible is making changes to the
> packaged third-party code.

As long as it's clear that this is a compatibility requirement only I
think it's a good idea to support this way of developing apps (even
though I think that clever sys.path manipulation can probably get
around it, it's not worth breaking existing approaches). All new apps
should however use relative imports to reference their own code, so
the problem won't be repeated in the future.
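
For new code, a sketch of what that looks like, using the leading-dot
spelling from PEP 328.  The package layout is built in a temporary
directory purely so the example is self-contained; ThirdPartyCode,
helper and tool are placeholder names:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout, created on the fly for the sake of the example:
#   ThirdPartyCode/__init__.py
#   ThirdPartyCode/helper.py
#   ThirdPartyCode/tool.py     (refers to its sibling module 'helper')
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "ThirdPartyCode")
os.mkdir(package_dir)

files = {
    "__init__.py": "",
    "helper.py": "def greet():\n    return 'hello from helper'\n",
    # Explicit relative import: 'helper' is looked up inside the package,
    # never as a top-level module.
    "tool.py": "from . import helper\n\ndef run():\n    return helper.greet()\n",
}
for name, body in files.items():
    with open(os.path.join(package_dir, name), "w") as handle:
        handle.write(body)

sys.path.insert(0, root)
importlib.invalidate_caches()
tool = importlib.import_module("ThirdPartyCode.tool")
result = tool.run()
print(result)
```

The module says where its sibling lives, so the lookup no longer
depends on whether "import helper" is treated as relative or absolute.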

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gvanrossum at gmail.com  Wed Sep  8 17:12:56 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep  8 17:13:04 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>
Message-ID: <ca471dc20409080812183b735a@mail.gmail.com>

One more thing. I'd like the launcher app's name to begin with "Py".
Maybe PyLaunch.exe or PyStart.exe?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From mal at egenix.com  Wed Sep  8 17:23:12 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Sep  8 17:23:17 2004
Subject: [Python-Dev] Re: Alternative Implementation
	for	PEP292:SimpleString Substitutions
In-Reply-To: <ca471dc2040908080861941ab2@mail.gmail.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>	<4138D622.6050807@egenix.com><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com>
	<ca471dc2040908080861941ab2@mail.gmail.com>
Message-ID: <413F23E0.2090908@egenix.com>

Guido van Rossum wrote:
>>Templates are meant to template *text* data, so Unicode is
>>the right choice of baseclass from a design perspective.
> 
> Only in Python 3.0.

We better start early to ever reach the point of making
a clear distinction between text and binary data in P3k.

> But even so, deriving from Unicode (or str) means the template class
> inherits a lot of unwanted operations. While I can see that
> concatenating templates probably works, slicing them or converting to
> lowercase etc. make no sense. IMO the standard Template class should
> implement a "narrow" interface, i.e. *only* the template expansion
> method (__mod__ or something else), so it's clear that other
> compatible template classes shouldn't have to implement anything
> besides that. This avoids the issues we have with the mapping
> protocol: when does an object implement enough of the mapping API to
> be usable? That's currently ill-defined; sometimes, __getitem__ is all
> you need, sometimes __contains__ is required, sometimes keys, rarely
> setdefault.

Looks like it's not even clear what templating itself should
mean... you're talking about a templating interface here, not an
object type, like Barry is (for the sake of making Templates compatible
to i18n tools like gettext).

-- 
Marc-Andre Lemburg
eGenix.com

From mcherm at mcherm.com  Wed Sep  8 17:58:31 2004
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Sep  8 17:57:47 2004
Subject: [Python-Dev] Subversion, Codeville
Message-ID: <1094659111.413f2c2781664@mcherm.com>

Gregory P. Smith writes:
> There should -never- be a reason to remove the entire proof of a file's
> past existence from a repository (unless you live in 1984).  disk space
> is effectively free.

One day a careless Python developer checks in a new cryptography library
based on code she found on the internet. Shortly thereafter, SCO decides
to sue the PSF for using and distributing "their" copyrighted code.
Removing the library from the distributed version isn't sufficient, we
have "a copy" of the code, and that's against the law.

I realize that disk space usually isn't the issue, but as long as laws
make certain information illegal, there will always be reasons to need
to delete information.

-- Michael Chermside


From fredrik at pythonware.com  Wed Sep  8 18:33:00 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Sep  8 18:31:18 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation
	forPEP292:SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>
	<413F1D9C.20209@egenix.com>
Message-ID: <chnc49$psm$1@sea.gmane.org>

M.-A. Lemburg wrote:

>> from a user perspective, there's no reason to make templates a sub-
>> class of unicode, so the rest of your argument is irrelevant.
>
> Templates are meant to template *text* data, so Unicode is
> the right choice of baseclass from a design perspective.

not true.  as I've shown in SRE and ElementTree (just to give a few
examples), 8-bit strings are superior for the *huge* subset of all text
strings that only contain ASCII data.

>> instead of looking at use patterns, you're stuck defending the existing
>> code.  that's not a good way to design usable code.
>
> Perhaps I'm missing something, but where would you use Templates
> for templating binary data (where strings or bytes would be a more
> appropriate design choice) ?

8-bit strings != binary data.

you clearly haven't read my other posts in this thread.  please do that,
instead of repeating the same bogus arguments over again.

</F> 


From trentm at ActiveState.com  Wed Sep  8 18:36:59 2004
From: trentm at ActiveState.com (Trent Mick)
Date: Wed Sep  8 18:37:04 2004
Subject: [Python-Dev] 
	Re: [Python-checkins] python/dist/src/Tools/msi msi.py, 1.7, 1.8
In-Reply-To: <E1C550X-0006W5-7f@sc8-pr-cvs1.sourceforge.net>;
	from loewis@users.sourceforge.net on Wed, Sep 08, 2004 at
	09:09:17AM -0700
References: <E1C550X-0006W5-7f@sc8-pr-cvs1.sourceforge.net>
Message-ID: <20040908093659.A11945@ActiveState.com>

[loewis@users.sourceforge.net wrote]
>...
> 	msi.py 
>...
>      add_data(db, "Verb",
> -            [("py", "open", 1, None, r'-console "%1"'),
> +            [("py", "open", 1, None, r'"%1"'),
>              ("pyw", "open", 1, None, r'"%1"'),
> -            ("pyc", "open", 1, None, r'-console "%1"'),
> -            ("pyo", "open", 1, None, r'-console "%1"')])
> +            ("pyc", "open", 1, None, r'"%1"'),
> +            ("pyo", "open", 1, None, r'"%1"')])
>...

Not sure I am following exactly here, but ActivePython makes the
association:
    <pythonExe> "%1" %*

I don't know if msi.py adds that '%*' later or not. IIRC that allows for
script arguments to be passed as well.

   C:\> foo.py -h blah blah
        |      `--`----`-------- %*
        `--- %1


Trent

-- 
Trent Mick
TrentM@ActiveState.com
From mal at egenix.com  Wed Sep  8 18:40:37 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Sep  8 18:40:41 2004
Subject: [Python-Dev] Re: Re: Alternative
	Implementation	forPEP292:SimpleString Substitutions
In-Reply-To: <chnc49$psm$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com>
	<chnc49$psm$1@sea.gmane.org>
Message-ID: <413F3605.7090707@egenix.com>

Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
> 
>>>from a user perspective, there's no reason to make templates a sub-
>>>class of unicode, so the rest of your argument is irrelevant.
>>
>>Templates are meant to template *text* data, so Unicode is
>>the right choice of baseclass from a design perspective.
> 
> not true.  as I've shown in SRE and ElementTree (just to give a few
> examples), 8-bit strings are superior for the *huge* subset of all text
> strings that only contain ASCII data.
>
>>>instead of looking at use patterns, you're stuck defending the existing
>>>code.  that's not a good way to design usable code.
>>
>>Perhaps I'm missing something, but where would you use Templates
>>for templating binary data (where strings or bytes would be a more
>>appropriate design choice) ?
> 
> 
> 8-bit strings != binary data.
> 
> you clearly haven't read my other posts in this thread.  please do that,
> instead of repeating the same bogus arguments over again.

I've read them all and, to be honest, I don't follow your
argumentation.

The text interpretation of 8-bit strings is only one possible
form of their interpretation. You could just as well have image
data in your 8-bit string and calling .lower() on such a string is
certainly going to render that image data useless.
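
A small sketch of that failure mode, using today's bytes type as a
stand-in for the 8-bit string (the variable names are arbitrary; the
data is the standard 8-byte PNG signature):

```python
# The PNG file signature: binary data that happens to contain ASCII letters.
png_header = b"\x89PNG\r\n\x1a\n"

# A text operation applied to binary data silently changes it.
lowered = png_header.lower()

still_valid = lowered == png_header
print(lowered, still_valid)
```

Nothing in the type stops the call; the only protection is knowing
that this particular string is not text.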

The whole point in adding Unicode to the language was to make
the difference between text and binary data clear and visible
at the type level.

I'm not saying that you can not store text data in 8-bit strings,
but that we should start to make use of the distinction between
text and binary data.

If we start to store text data in Unicode now and leave binary
data in 8-bit strings, then the move to Unicode string literals
will be much smoother in P3k.

-- 
Marc-Andre Lemburg
eGenix.com

From fredrik at pythonware.com  Wed Sep  8 18:35:24 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Sep  8 18:40:43 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
References: <413E0D33.7030703@myrealbox.com>
	<006301c49526$3b46ad40$e841fea9@oemcomputer>
Message-ID: <chnc8q$qcm$1@sea.gmane.org>

Raymond Hettinger wrote:
 . .
>> The second missing feature is the ability to specify start and end
>> indices when doing matches and searches.
>
> +1
>
> I've needed both of these more than once.

any reason why you cannot type "re.compile(p).match(...)" ?

</F> 


From mcherm at mcherm.com  Wed Sep  8 18:49:42 2004
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Sep  8 18:48:36 2004
Subject: [Python-Dev] decorator support
Message-ID: <1094662182.413f3826f1c1a@mcherm.com>

Raymond Hettinger writes:
> In my experiments with decorators, it is common to wrap the original
> function with a new function.
>
> After creating the new function, there are efforts to make it look like
> the old:
    [...]
> All is well and good except the argspec.  Running help() on the new
> function gives:
    [...]
> So, it would be nice if there were some support for carrying forward the
> argspec to inform help(), calltips(), and inspect().

I created something to help address this... the "obvious" solution
of a decorator used for making decorators. It's example #6 in:
   http://www.python.org/cgi-bin/moinmoin/PythonDecoratorLibrary

Currently it doesn't handle argspec, but I propose adding argspec
(it'll be slightly tricky, but shouldn't be impossible), then
including it in a "decorators" package. Until we have a decorators
package, I think it can live in the wiki and/or the cookbook.

If asked to do so, I'll see about updating this to fix the argspec.
If not asked to, I'll probably leave it alone: as-is, it is simple
enough to serve as a decent example as well as being useful, but
with argspec handling it would be complex enough to be useful yet
too complex to be a good example to learn from.
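
For readers who don't want to follow the link, here is a sketch in the
same spirit (reconstructed for illustration, not copied from the wiki;
simple_decorator, logged and add are names I picked).  It carries over
the name, docstring and attribute dict, and shows that the argspec is
what gets lost:

```python
import inspect


def simple_decorator(decorator):
    """Make `decorator` return wrappers that look like the wrapped function."""
    def new_decorator(func):
        wrapper = decorator(func)
        wrapper.__name__ = func.__name__
        wrapper.__doc__ = func.__doc__
        wrapper.__dict__.update(func.__dict__)
        return wrapper
    new_decorator.__name__ = decorator.__name__
    new_decorator.__doc__ = decorator.__doc__
    return new_decorator


calls = []


@simple_decorator
def logged(func):
    """Record the name of every function called through the wrapper."""
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


@logged
def add(x, y):
    """Return x + y."""
    return x + y


total = add(1, 2)
print(add.__name__, add.__doc__, total)
# The signature is still the wrapper's generic one, not (x, y):
print(inspect.signature(add))
```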

-- Michael Chermside

From fredrik at pythonware.com  Wed Sep  8 18:56:10 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Sep  8 18:54:20 2004
Subject: [Python-Dev] Re: Re: Alternative
	Implementationfor	PEP292:SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>	<4138D622.6050807@egenix.com><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><ca471dc2040908080861941ab2@mail.gmail.com>
	<413F23E0.2090908@egenix.com>
Message-ID: <chndfo$u7t$1@sea.gmane.org>

M.-A. Lemburg wrote

> for the sake of making Templates compatible to i18n tools like gettext).

assuming that gettext really always returns a template if you hand it a template,
of course.

given that the 2.4 gettext doesn't seem to map templates to templates on
my machine, that there's no sign of template support in the 2.4 gettext source
code, and that Barry ignored my question about this, I have to assume that
the I18N argument is yet another bogus argument.

</F> 


From mcherm at mcherm.com  Wed Sep  8 19:19:29 2004
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Sep  8 19:18:23 2004
Subject: [Python-Dev] Re: Dangerous exceptions (was
	Re:Another	test_compilermystery)
Message-ID: <1094663969.413f3f217532e@mcherm.com>

 [various discussion about how to NOT catch exceptions by default]

I've been working in the Java world for a while now, and they seem
to have solved this problem quite handily -- no one ever seems to
get confused about it. Ignoring Java's "checked exceptions" (which
are a different problem), they've done the following:

   Throwable
   |
   +-- Error
   |   |
   |   +-- OutOfMemoryError (and others like it)
   |
   +-- Exception
       |
       +-- IndexOutOfBoundsException (and others like it)
       |
       +-- <user defined exceptions mostly go here>

As I see it, there are several common cases:

  (1) You want to catch a particular (normal) exception.
      (eg: to handle the problem -- this is the normal thing!)
  (2) You want to catch any normal exception.
      (eg: to log and ignore after calls to some subsystem)
  (3) You want to catch a particular special exception
      (eg: to deal with KeyboardInterrupt, or MemoryError)
  (4) You want to catch ANYTHING, then re-throw it afterward
      (eg: to cleanup a DB connection)

We currently make (1) easy (of course!) by writing "except Foo:", and
we make (4) easy by writing "except:" (the bare except). But most
users only want to use (1) and (2)... only experts use (3) and (4).
So I certainly agree that bare except is used by people who want (2),
when it ought to be reserved for (4).

I would think that both the end goal (python 3000) AND the transition
plan are straightforward. For now, a bare except can't be changed
because of backward compatibility. So create a top-level class which
is NOT named "Exception" ("Raisable" anyone?). Our hierarchy would
look like this:

   Raisable
   |
   +-- MemoryError
   +-- SystemExit
   +-- KeyboardInterrupt
   +-- oddballs like ZODB's ConflictError
   |
   +-- Exception
       |
       +-- most everything else
       +-- user defined exceptions


That's not quite as flat as today's hierarchy, but it works pretty
well. When non-experts want to catch all exceptions, if bare excepts
are deprecated, they will write "except Exception:" (that's just
psychology), so most users will wind up with (2) when that's what
they want. Experts can easily do (3) by catching the exception by
name, and experts can do (4) by catching Raisable. When Python 3000
is released and backward compatibility is not required, we can either
remove the bare except, or change it to mean the same as
"except Exception" (Python 3000 will, of course, forbid raising
strings or anything not descended from Raisable).
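
To make cases (2) and (4) concrete, here is a toy model of the proposed
tree.  All the class names are stand-ins: Raisable is the hypothetical
root, Interrupt plays the part of KeyboardInterrupt and friends, and
NormalError plays the part of "Exception", since the real builtin
cannot be re-parented in a sketch:

```python
class Raisable(BaseException):
    """Stand-in for the proposed root of everything that can be raised."""


class Interrupt(Raisable):
    """Stand-in for the special cases (KeyboardInterrupt, SystemExit...)."""


class NormalError(Raisable):
    """Stand-in for 'Exception' in the proposed hierarchy."""


class LookupFailed(NormalError):
    """Stand-in for a user-defined exception."""


def call_subsystem(action):
    """Case (2): log and ignore any *normal* exception."""
    try:
        action()
    except NormalError as exc:
        return "logged %s" % type(exc).__name__
    return "ok"


def with_cleanup(action, log):
    """Case (4): catch anything at all, clean up, then re-raise."""
    try:
        action()
    except Raisable:
        log.append("cleanup")
        raise


def fail_lookup():
    raise LookupFailed("no such row")


def interrupt():
    raise Interrupt()


normal_result = call_subsystem(fail_lookup)

try:
    call_subsystem(interrupt)
    interrupt_escaped = False
except Interrupt:
    interrupt_escaped = True  # the special exception was not swallowed

cleanup_log = []
try:
    with_cleanup(interrupt, cleanup_log)
except Interrupt:
    pass

print(normal_result, interrupt_escaped, cleanup_log)
```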

I don't need anyone to buy into this approach, but since it seems so
straightforward to me I thought I should write it out anyhow.

-- Michael Chermside

From raymond.hettinger at verizon.net  Wed Sep  8 19:39:43 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed Sep  8 19:40:36 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <chnc8q$qcm$1@sea.gmane.org>
Message-ID: <001501c495ca$d3a8bd40$e841fea9@oemcomputer>

> >> The second missing feature is the ability to specify start and end
> >> indices when doing matches and searches.
> >
> > +1
> >
> > I've needed both of these more than once.
> > 
> > any reason why you cannot type "re.compile(p).match(...)" ?

That is what I usually do and that is the approach taken by the patch.

If you see a downside, feel free to reject his patch.  IMO, it is only a
small win.


Raymond

From fredrik at pythonware.com  Wed Sep  8 20:04:42 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Sep  8 20:02:53 2004
Subject: [Python-Dev] Re: Re: Missing arguments in RE functions
References: <chnc8q$qcm$1@sea.gmane.org>
	<001501c495ca$d3a8bd40$e841fea9@oemcomputer>
Message-ID: <chnhg8$brq$1@sea.gmane.org>

Raymond Hettinger wrote:

>> > I've needed both of these more than once.
>>
>> any reason why you cannot type "re.compile(p).match(...)" ?
>
> That is what I usually do and that is the approach taken by the patch.
>
> If you see a downside, feel free to reject his patch.  IMO, it is only a
> small win.

If it's up to me, it's a clear "not worth it".  The function API is only there for
trivial cases; if you need the full RE power, use pattern objects (you have
to use them anyway if you're serious about RE:s).
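
for the record, a short sketch of the pattern-object route (the sample
string and variable names are just for illustration).  the compiled
pattern's match and search methods already take start and end indices:

```python
import re

text = "abc123def456"
digits = re.compile(r"\d+")

# The module-level function always starts matching at index 0.
at_start = re.match(r"\d+", text)

# Pattern methods accept pos (and endpos) directly.
from_three = digits.match(text, 3)
in_window = digits.search(text, 6, 10)  # only looks at text[6:10]

print(at_start, from_three.group(), in_window.group(), in_window.span())
```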

but I'm an API minimalist; someone else will have to make the final decision
on this one (Guido, what's your take on API size issues?)

</F> 


From foom at fuhm.net  Wed Sep  8 20:08:17 2004
From: foom at fuhm.net (James Y Knight)
Date: Wed Sep  8 20:08:24 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040908014845.GA52384@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>
Message-ID: <0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>


On Sep 7, 2004, at 9:48 PM, Clark C. Evans wrote:

> I've packaged up the idea of a coroutine facility using iterators and 
> an
> exception, SuspendIteration.

Very interesting.

> This proposal assumes that a corresponding iterator written using
> this class-based method is possible for existing generators.  The
> challenge seems to be the identification of distinct states within
> the generator where suspension could occur.

That is basically impossible. Essentially *every* operation could 
possibly raise SuspendIteration, because essentially every operation 
can call an arbitrary python function, and python functions can raise 
any exception they want. I think you could still make the proposal work 
in CPython: if I understand its internals properly, it doesn't need to 
do a transformation to a class iterator, it simply suspends the frame 
wherever it is. Thus, being able to suspend at any point in the 
function would not cause an undue performance degradation.
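
For comparison, a sketch of what happens today when an exception
escapes a generator frame (SuspendIteration is defined locally as a
stand-in for the proposed exception; flaky_rows and list_rows are
made-up names).  Any call inside the body may raise it, and the frame
is then thrown away rather than suspended:

```python
class SuspendIteration(Exception):
    """Hypothetical; stands in for the exception proposed by PEP 334."""


def flaky_rows():
    """Stand-in for any call in a generator body; it can raise anything."""
    raise SuspendIteration()


def list_rows():
    yield "header"
    yield flaky_rows()  # the exception escapes the generator frame here
    yield "footer"      # never reached


gen = list_rows()
first = next(gen)

try:
    next(gen)
    outcome = "no exception"
except SuspendIteration:
    outcome = "suspended"

# With current semantics the generator is now finished, not paused.
try:
    next(gen)
    after = "resumed"
except StopIteration:
    after = "finished"

print(first, outcome, after)
```

Under the proposal the third call would have to pick up where the
second left off, which is the part that needs the frame kept alive.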

However, I think it is a deal-breaker for Jython. From the generator 
PEP: "It's also believed that efficient implementation in Jython 
requires that the compiler be able to determine potential suspension 
points at compile-time, and a new keyword makes that easy." If this 
quote is right about the implementation of Jython (and it seems likely, 
given the JVM), your proposal makes it impossible to implement 
generators in Jython.

Given that the advantage claimed for this proposal over stackless is 
that it can be implemented in non-CPython runtimes, I think it still 
needs some reworking.

James

From raymond.hettinger at verizon.net  Wed Sep  8 20:16:26 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed Sep  8 20:17:21 2004
Subject: [Python-Dev] Re: Re: Missing arguments in RE functions
In-Reply-To: <chnhg8$brq$1@sea.gmane.org>
Message-ID: <001d01c495cf$f54623c0$e841fea9@oemcomputer>

> > If you see a downside, feel free to reject his patch.  IMO, it is only a
> > small win.
> 
> If it's up to me, it's a clear "not worth it".  The function API is only
> there for trivial cases; if you need the full RE power, use pattern objects
> (you have to use them anyway if you're serious about RE:s).
> 
> but I'm an API minimalist; someone else will have to make the final decision
> on this one (Guido, what's your take on API size issues?)

It is up to you.  You're still the god of re (among other things).

FWIW, I gave extra weight to the OP's usability enhancement request
because it was born out of experience teaching Python to newbies.  The
patch itself is a little rough and needs refinement.


Raymond

From fredrik at pythonware.com  Wed Sep  8 20:20:17 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Sep  8 20:18:28 2004
Subject: [Python-Dev] Re: Re: Re: Alternative Implementation for
	PEP 292: Simple String Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org>
	<413F3605.7090707@egenix.com>
Message-ID: <chnidf$epp$1@sea.gmane.org>

M.-A. Lemburg wrote:

> The whole point in adding Unicode to the language was to make
> the difference between text and binary data clear and visible
> at the type level.

well, when I wrote the Unicode type, the whole point was to be able to
make it easy to handle Unicode text.  no more, no less.

> If we start to store text data in Unicode now and leave binary
> data in 8-bit strings, then the move to Unicode strings literals
> will be much smoother in P3k.

hopefully, the P3K string design will take a lot more into account than
text-vs-binary; there are many ways to represent text, and many ways
to store binary data, and many usage patterns for them both.  a good
design should take most of this into account.  (google for "stringlib" for
some work I'm doing in this area)

</F> 


From martin at v.loewis.de  Wed Sep  8 20:20:37 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  8 20:20:29 2004
Subject: [Python-Dev] 
	Re: [Python-checkins] python/dist/src/Tools/msi msi.py, 1.7, 1.8
In-Reply-To: <20040908093659.A11945@ActiveState.com>
References: <E1C550X-0006W5-7f@sc8-pr-cvs1.sourceforge.net>
	<20040908093659.A11945@ActiveState.com>
Message-ID: <413F4D75.8040803@v.loewis.de>

Trent Mick wrote:
> Not sure I am following exactly here, but ActivePython makes the
> association:
>     <pythonExe> "%1" %*
> 
> I don't know if msi.py adds that '%*' later or not. IIRC that allows for
> script arguments to be passed as well.
> 
>    C:\> foo.py -h blah blah

Until yesterday, I didn't even know that was possible, and today, I did
not make the right association (pun intended). Will fix soon.

Regards,
Martin
From paul.dubois at gmail.com  Wed Sep  8 20:22:28 2004
From: paul.dubois at gmail.com (Paul Du Bois)
Date: Wed Sep  8 20:22:31 2004
Subject: [Python-Dev] Subversion
In-Reply-To: <ca471dc204090808035309607a@mail.gmail.com>
References: <E1C4k4z-0002oC-4y@sc8-pr-cvs1.sourceforge.net>
	<200409071546.26539.fdrake@acm.org> <20040907201720.GB1083@panix.com>
	<200409071623.05499.fdrake@acm.org> <413E1F50.90709@v.loewis.de>
	<4A27E67A-0111-11D9-9892-000A95686CD8@redivi.com>
	<20040907215053.GF5208@solar.trillke>
	<ca471dc204090808035309607a@mail.gmail.com>
Message-ID: <85f6a31f04090811221fbd30fd@mail.gmail.com>

On Wed, 8 Sep 2004 08:03:39 -0700, Guido van Rossum
<gvanrossum@gmail.com> wrote:
> Note that Perforce puts everything in a database and it's rock solid. 

It's actually the other way around (about the everything-in-database
bit, not the rock-solid bit). Snipped from 
http://www.perforce.com/perforce/technotes/note033.html:

The Perforce Server stores two kinds of data: versioned files, and
metadata (changelists, opened files, labels, etc.). Both are stored in
the Perforce Server's root directory. Versioned files are stored in
depot subdirectories; there is one subdirectory for each depot in your
Perforce installation. Metadata are stored in the Perforce database.

p
From martin at v.loewis.de  Wed Sep  8 20:28:12 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  8 20:28:05 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <ca471dc204090721531f02145a@mail.gmail.com>
References: <413E8D78.2030302@v.loewis.de>
	<ca471dc204090721531f02145a@mail.gmail.com>
Message-ID: <413F4F3C.8060905@v.loewis.de>

Guido van Rossum wrote:
> I frequently use the extension feature in a console context; when I am
> in a directory full of .py files, I can run any one of them by simply
> typing its name (and possibly command line arguments). The script will
> then interact through the existing console window. Will this work?

No. I didn't (really) know that was possible (although Mr Rivest's
bug report should have taught me).

I've tried to fix it, and now think this is impossible: Even though
XP provides an AttachConsole call (which doesn't exist in earlier
releases or W9x), which allows writing to the console from which
the binary was started, there is apparently no way to tell cmd.exe
that it should wait for completion, instead of immediately giving
a prompt.

I have now reverted the change to create launcher.exe, and install
python.exe and pythonw.exe twice (the second time as extpy.exe and
extpyw.exe).

P.S. Out of curiosity, and to the WINDOWS GURUS ON THIS LIST:
How does cmd.exe know whether the program started is a console
application or not? Is there any API for that? Just looking at
the file being run is clearly insufficient - if the file is
foo.py, it needs to look at python.exe.


From martin at v.loewis.de  Wed Sep  8 20:30:58 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  8 20:30:52 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>
Message-ID: <413F4FE2.2090602@v.loewis.de>

Moore, Paul wrote:
> With option (3), what happens if you run "launcher -console" from a
> command prompt? Does it produce output in the same console window,
> or does it launch a new console?

That was a problem originally, which I have now fixed into

(3') Install two binaries, extpy.exe, and extpyw.exe.

With that approach, what do you think?

> If it's not, I'd like to see how you coded console.exe, as I've
> often needed this sort of behaviour, and never been able to achieve
> it correctly!

As I found, you can use AttachConsole in WXP, but that isn't a full
solution, either (more like a Unix background process).

Regards,
Martin
From martin at v.loewis.de  Wed Sep  8 20:32:36 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  8 20:32:28 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <ca471dc20409080812183b735a@mail.gmail.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>
	<ca471dc20409080812183b735a@mail.gmail.com>
Message-ID: <413F5044.6080205@v.loewis.de>

Guido van Rossum wrote:
> One more thing. I'd like the launcher app's name to begin with "Py".
> Maybe PyLaunch.exe or PyStart.exe?

I now call them extpy.exe and extpyw.exe. I deliberately avoided
a py *prefix*, as this really hurts tab completion if you
interactively hit c:\py<tab>\py<tab>. I actually want to ban py.ico
for that very reason from the python directory.

Regards,
Martin

From theller at python.net  Wed Sep  8 20:34:57 2004
From: theller at python.net (Thomas Heller)
Date: Wed Sep  8 20:36:32 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <413F4F3C.8060905@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed,
	08 Sep 2004 20:28:12 +0200")
References: <413E8D78.2030302@v.loewis.de>
	<ca471dc204090721531f02145a@mail.gmail.com>
	<413F4F3C.8060905@v.loewis.de>
Message-ID: <4qm8a9ta.fsf@python.net>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Guido van Rossum wrote:
>> I frequently use the extension feature in a console context; when I am
>> in a directory full of .py files, I can run any one of them by simply
>> typing its name (and possibly command line arguments). The script will
>> then interact through the existing console window. Will this work?
>
> No. I didn't (really) know that was possible (although Mr Rivest's
> bug report should have taught me).
>
> I've tried to fix it, and now think this is impossible: Even though
> XP provides an AttachConsole call (which doesn't exist in earlier
> releases or W9x), which allows writing to the console from which
> the binary was started, there is apparently no way to tell cmd.exe
> that it should wait for completion, instead of immediately giving
> a prompt.
>
> I have now reverted the change to create launcher.exe, and install
> python.exe and pythonw.exe twice (the second time as extpy.exe and
> extpyw.exe).
>
> P.S. Out of curiosity, and to the WINDOWS GURUS ON THIS LIST:
> How does cmd.exe know whether the program started is a console
> application or not? Is there any API for that? Just looking at
> the file being run is clearly insufficient - if the file is
> foo.py, it needs to look at python.exe.

It seems to be a flag in the exe header.  A quick google search turned
up this:

http://www.codeguru.com/Cpp/W-P/system/misc/article.php/c2897/
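
For what it's worth, the flag is presumably the Subsystem field of the
PE optional header.  A minimal sketch of reading it, assuming the
standard PE layout (e_lfanew at offset 0x3C, a 4-byte signature, a
20-byte COFF header, and Subsystem at offset 68 of the optional header).
pe_subsystem and is_console_binary are made-up helper names, and this
says nothing about how cmd.exe itself does the check:

```python
import struct

# Subsystem values defined by the PE format.
IMAGE_SUBSYSTEM_WINDOWS_GUI = 2   # GUI program, e.g. pythonw.exe
IMAGE_SUBSYSTEM_WINDOWS_CUI = 3   # console program, e.g. python.exe


def pe_subsystem(data):
    """Return the Subsystem field from the bytes of a PE executable."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Skip the signature (4 bytes) and the COFF file header (20 bytes).
    optional_header = pe_offset + 4 + 20
    (subsystem,) = struct.unpack_from("<H", data, optional_header + 68)
    return subsystem


def is_console_binary(path):
    """Hypothetical helper: True if the file at path is a console program."""
    with open(path, "rb") as f:
        return pe_subsystem(f.read()) == IMAGE_SUBSYSTEM_WINDOWS_CUI
```

This only looks at the binary itself; mapping foo.py to python.exe would
still have to go through the registered file association first.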

Thomas

From martin at v.loewis.de  Wed Sep  8 20:52:35 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  8 20:52:27 2004
Subject: [Python-Dev] Console vs. GUI applications
In-Reply-To: <4qm8a9ta.fsf@python.net>
References: <413E8D78.2030302@v.loewis.de>	<ca471dc204090721531f02145a@mail.gmail.com>	<413F4F3C.8060905@v.loewis.de>
	<4qm8a9ta.fsf@python.net>
Message-ID: <413F54F3.30500@v.loewis.de>

Thomas Heller wrote:
> It seems to be a flag in the exe header.  A quick google search turned
> up this:
> 
> http://www.codeguru.com/Cpp/W-P/system/misc/article.php/c2897/

Sure. However, if I do

foo.py

then some part of the system must determine that python.exe is
to be invoked, and then must determine that this is a console
binary. Does that all happen in cmd.exe?

Regards,
Martin
From cce at clarkevans.com  Wed Sep  8 20:53:54 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Sep  8 20:53:57 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
Message-ID: <20040908185353.GA62848@prometheusresearch.com>

On Wed, Sep 08, 2004 at 02:08:17PM -0400, James Y Knight wrote:
| >This proposal assumes that a corresponding iterator written using
| >this class-based method is possible for existing generators.  The
| >challenge seems to be the identification of distinct states within
| >the generator where suspension could occur.
| 
| That is basically impossible. Essentially *every* operation could 
| possibly raise SuspendIteration, because essentially every operation 
| can call an arbitrary python function, and python functions can raise 
| any exception they want.

If the SuspendIteration() was raised in an arbitrary Python function,
it would close-out the function call due to exception semantics.  So,
a brain-dead situation would have to make each time a function is called
a separate state. The proposal is not implying that this would be
converting arbitrary functions into generators, if they happened to
raise SuspendIteration().

| I think you could still make the proposal work 
| in CPython: if I understand its internals properly, it doesn't need to 
| do a transformation to a class iterator, it simply suspends the frame 
| wherever it is. Thus, being able to suspend at any point in the 
| function would not cause an undue performance degradation.

Ok.

| However, I think it is a deal-breaker for JPython. From the generator 
| PEP: "It's also believed that efficient implementation in Jython 
| requires that the compiler be able to determine potential suspension 
| points at compile-time, and a new keyword makes that easy." If this 
| quote is right about the implementation of Jython (and it seems likely, 
| given the JVM), your proposal makes it impossible to implement 
| generators in Jython.

Ok, because suspension points would now include not only 'yield'
statements, but potentially any function call.  So, it could be quite
inefficient, but it is not impossible.  For an optimization, you could
decorate a function if it could throw a SuspendIteration. If a
non-decorated function threw that exception, it would be a deal-breaker.

| Given that the advantage claimed for this proposal over stackless is 
| that it can be implemented in non-CPython runtimes, I think it still 
| needs some reworking.

Thanks for your feedback.

Best,

Clark
From pedronis at bluewin.ch  Wed Sep  8 20:58:21 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Wed Sep  8 20:56:05 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
Message-ID: <413F564D.2070708@bluewin.ch>

James Y Knight wrote:
> 
> On Sep 7, 2004, at 9:48 PM, Clark C. Evans wrote:
> 
>> I've packaged up the idea of a coroutine facility using iterators and an
>> exception, SuspendIteration.
> 
> 
> Very interesting.
> 
>> This proposal assumes that a corresponding iterator written using
>> this class-based method is possible for existing generators.  The
>> challenge seems to be the identification of distinct states within
>> the generator where suspension could occur.
> 
> 
> That is basically impossible. Essentially *every* operation could 
> possibly raise SuspendIteration, because essentially every operation can 
> call an arbitrary python function, and python functions can raise any 
> exception they want. I think you could still make the proposal work in 
> CPython: if I understand its internals properly, it doesn't need to do a 
> transformation to a class iterator, it simply suspends the frame 
> wherever it is. Thus, being able to suspend at any point in the function 
> would not cause an undue performance degradation.

I don't think it is that simple for CPython either: a single bytecode 
can potentially invoke more than just a single builtin or other python 
code, e.g. calling construction can invoke __new__ and __init__,
and then there are all the cases where descriptors are involved with 
their __get__ etc (and __add__,__radd__...). So bytecodes are not the 
right suspension/resumption granularity because you don't want to 
reinvoke things that could have had side-effects.
So you have all the points per bytecode where python code/builtins can be 
invoked or from another POV an exception can be detected.
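
To make the granularity point concrete, here is a small sketch (written
for current Python; Tracked, Other and run are illustrative names only)
in which one call expression and one '+' each run two separate pieces of
user code.  Any of the four hooks could in principle raise, and the ones
that ran before it may already have had side effects:

```python
calls = []  # records the order in which user-level hooks run


class Tracked:
    def __new__(cls, value):
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")
        self.value = value

    def __add__(self, other):
        calls.append("__add__")
        return NotImplemented   # forces the interpreter to try __radd__


class Other:
    def __radd__(self, left):
        calls.append("__radd__")
        return left.value + 1


def run():
    """Evaluate one small expression and report which hooks it triggered."""
    del calls[:]
    result = Tracked(1) + Other()
    return result, list(calls)
```

Calling run() records __new__, __init__, __add__ and __radd__, in that
order, for what looks like a single line of code.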

If I understand the proposal (which is quite vague), like restartable 
syscalls, there is also the matter that whatever raised the 
SuspendIteration should be retried on resumption of the generator, e.g 
calling nested generator next.

So one would have to cherry pick for each bytecode or similar abstract
operations model relevant suspension/resumption points and it would 
still be quite a daunting task to implement this adding the code
for intra bytecode resumption. (Of course this assumes that capturing
the C stack and similar techniques are out of question)

> 
> However, I think it is a deal-breaker for JPython. From the generator 
> PEP: "It's also believed that efficient implementation in Jython 
> requires that the compiler be able to determine potential suspension 
> points at compile-time, and a new keyword makes that easy." If this 
> quote is right about the implementation of Jython (and it seems likely, 
> given the JVM), your proposal makes it impossible to implement 
> generators in Jython.

a hand coded implementation would be a *lot* of work (beyond practical)
for potentially very bad performance and a resulting messy codebase.
One could also encounter resulting code size problems or issues with the 
verifier.


From theller at python.net  Wed Sep  8 21:10:34 2004
From: theller at python.net (Thomas Heller)
Date: Wed Sep  8 21:10:37 2004
Subject: [Python-Dev] Console vs. GUI applications
In-Reply-To: <413F54F3.30500@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed,
	08 Sep 2004 20:52:35 +0200")
References: <413E8D78.2030302@v.loewis.de>
	<ca471dc204090721531f02145a@mail.gmail.com>
	<413F4F3C.8060905@v.loewis.de> <4qm8a9ta.fsf@python.net>
	<413F54F3.30500@v.loewis.de>
Message-ID: <pt4w8tlh.fsf@python.net>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Thomas Heller wrote:
>> It seems to be a flag in the exe header.  A quick google search turned
>> up this:
>> http://www.codeguru.com/Cpp/W-P/system/misc/article.php/c2897/
>
> Sure. However, if I do
>
> foo.py
>
> then some part of the system must determine that python.exe is
> to be invoked, and then must determine that this is a console
> binary. Does that all happen in cmd.exe?

I cannot answer this question (and I'm not the windows guru either;),
but using regmon from sysinternals shows that cmd.exe does more than 500
registry accesses before python.exe is finally started - so it *does* a
lot of work.

Thomas

From cce at clarkevans.com  Wed Sep  8 21:20:57 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Sep  8 21:20:58 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <413F564D.2070708@bluewin.ch>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
	<413F564D.2070708@bluewin.ch>
Message-ID: <20040908192056.GB62848@prometheusresearch.com>

On Wed, Sep 08, 2004 at 08:58:21PM +0200, Samuele Pedroni wrote:
| If I understand the proposal (which is quite vague), like restartable 
| syscalls, there is also the matter that whatever raised the 
| SuspendIteration should be retried on resumption of the generator, e.g 
| calling nested generator next.

That's exactly the idea.  The SuspendIteration exception could contain,
however, the file/socket that it is blocked on, so a smart scheduler
need not be blindly restarting things.
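
A rough sketch of what I mean, written for current Python and assuming
class-based iterators as in the PEP.  SuspendIteration is the proposed
exception, not a builtin; the blocked_on attribute, Producer and run are
names made up for this illustration:

```python
class SuspendIteration(Exception):
    """Hypothetical exception from the proposal; not part of Python."""

    def __init__(self, blocked_on=None):
        Exception.__init__(self, blocked_on)
        self.blocked_on = blocked_on


class Producer:
    """Class-based iterator that is 'not ready' the first time each item
    is requested, and delivers it on the retry."""

    def __init__(self, items, resource):
        self._items = list(items)
        self._resource = resource
        self._ready = False

    def __iter__(self):
        return self

    def __next__(self):
        if not self._items:
            raise StopIteration
        if not self._ready:
            self._ready = True   # pretend the data arrives in the meantime
            raise SuspendIteration(self._resource)
        self._ready = False
        return self._items.pop(0)


def run(iterator):
    """Toy scheduler: retry after each suspension, noting what it waited on."""
    results, waits = [], []
    while True:
        try:
            results.append(next(iterator))
        except SuspendIteration as exc:
            # A real scheduler could select() on exc.blocked_on here.
            waits.append(exc.blocked_on)
        except StopIteration:
            return results, waits
```

With run(Producer(["a", "b"], "sock")) the scheduler sees one suspension
per item and still collects both items.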

| So one would have to cherry pick for each bytecode or similar abstract
| operations model relevant suspension/resumption points and it would 
| still be quite a daunting task to implement this adding the code
| for intra bytecode resumption. (Of course this assumes that capturing
| the C stack and similar techniques are out of question)

I was assuming that only calls within the generator to next(), implicit
or otherwise, would be suspension points.

This covers all of my use cases anyway.  In the other situations, if
they are even useful, don't do that.  Convert the SuspendIteration to a
RuntimeError, or resume at the previous suspension point?

The idea of the PEP was that a nested-generator context provides this
limited set of suspension points to make an implementation possible.

Kind Regards,

Clark

-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From tim.peters at gmail.com  Wed Sep  8 21:24:49 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Sep  8 21:25:09 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <413E8D78.2030302@v.loewis.de>
References: <413E8D78.2030302@v.loewis.de>
Message-ID: <1f7befae04090812247d1589b0@mail.gmail.com>

[Martin v. Löwis]
> I recently looked into properly implementing the "Register Extensions"
> feature in the installer; in 2.4a3, not selecting that doesn't really
> work. The problem is that MSI only supports installing either both
> the "extension server" (the .exe) and the extension, or neither. So
> you can choose not to install word.exe, and it won't install the .doc
> extension; if you install word.exe, it will associate .doc with it.
>
> For Python, this leaves us with three options:
> 1. Don't make registration of extensions optional; always associate
>    .py, .pyc, .pyw, .pyo.

-1.  If we do that, I'll never install an alpha or beta again <0.5 wink>.

> 2. Don't support installation-on-demand for extensions. This means
>    to not use the MSI extension machinery at all, but to directly
>    write the registry keys that build the extension. Installing
>    these keys can then be made optional.

+1.  I may or may not want to change/create .py (etc) extensions.  I
never before heard of the concept of "install-on-demand for
extensions", and I don't think I want to.

> 3. Provide another binary that is the "extension server", and
>    install that independently of python.exe, and pythonw.exe.
>    In CVS, I have implemented this approach to see whether it
>    works (it does), and called this binary "launcher.exe". It
>    is a Windows app which supports a -console argument which also
>    makes it a console app. This is the binary that gets
>    associated with all four extensions, for the "open" verb.

This is soooo convoluted compared to what it's trying to achieve: 
write the registry entries, or don't, end of story.  It would be
nicest if the code to fiddle the registry were materialized as a .reg
file.  Then (later, and manually) switching among multiple installed
Pythons would be easy.
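
A sketch of what generating such a .reg file could look like.  The
function name, the install path in the usage note and the single
"Python.File" ProgID are assumptions for illustration; the keys the
installer actually writes may differ:

```python
def make_reg_file(python_exe, extensions=(".py",), prog_id="Python.File"):
    """Return .reg file text associating extensions with python_exe."""

    def esc(s):
        # .reg string values escape backslashes and double quotes.
        return s.replace("\\", "\\\\").replace('"', '\\"')

    # "%1" is the script path, %* passes along any extra arguments.
    command = '"%s" "%%1" %%*' % python_exe
    lines = ["Windows Registry Editor Version 5.00", ""]
    for ext in extensions:
        lines += ["[HKEY_CLASSES_ROOT\\%s]" % ext,
                  '@="%s"' % esc(prog_id),
                  ""]
    lines += ["[HKEY_CLASSES_ROOT\\%s\\shell\\open\\command]" % prog_id,
              '@="%s"' % esc(command),
              ""]
    return "\r\n".join(lines)
```

For example, make_reg_file(r"C:\Python24\python.exe") yields text that
could be saved per installed Python and merged by hand when switching.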
From pedronis at bluewin.ch  Wed Sep  8 21:33:10 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Wed Sep  8 21:30:49 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040908192056.GB62848@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>	<413F564D.2070708@bluewin.ch>
	<20040908192056.GB62848@prometheusresearch.com>
Message-ID: <413F5E76.4050805@bluewin.ch>

Clark C. Evans wrote:

> On Wed, Sep 08, 2004 at 08:58:21PM +0200, Samuele Pedroni wrote:
> | If I understand the proposal (which is quite vague), like restartable 
> | syscalls, there is also the matter that whatever raised the 
> | SuspendIteration should be retried on resumption of the generator, e.g 
> | calling nested generator next.
> 
> That's exactly the idea.  The SuspendIteration exception could contain,
> however, the file/socket that it is blocked on, so a smart scheduler
> need not be blindly restarting things.
> 
> | So one would have to cherry pick for each bytecode or similar abstract
> | operations model relevant suspension/resumption points and it would 
> | still be quite a daunting task to implement this adding the code
> | for intra bytecode resumption. (Of course this assumes that capturing
> | the C stack and similar techniques are out of question)
> 
> I was assuming that only calls within the generator to next(), implicit
> or otherwise, would be suspension points.

I missed that.

> 
> This covers all of my use cases anyway.  In the other situations, if
> they are even useful, don't do that.  Convert the SuspendIteration to a
> RuntimeError, or resume at the previous suspension point?
> 
> The idea of the PEP was that a nested-generator context provides this
> limited set of suspension points to make an implementation possible.

then the PEP needs clarification because I had the impression that

def g(src):
    data = src.read()
    yield data
    data = src.read()
    yield data

the read itself could throw a SuspendIteration, and upon the successive 
next the src.read() itself would be retried. But if it's only nexts that 
can be suspension points then the generator would not be resumable in 
this case. Which is the case?
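
For reference, this is how a generator behaves today when an exception
escapes from it; a small sketch in current Python syntax, where
SuspendIteration is the proposed (hypothetical) exception and
FlakySource and demo are made-up names:

```python
class SuspendIteration(Exception):
    """Hypothetical exception from the proposal; not part of Python."""


class FlakySource:
    """read() fails with SuspendIteration on the first call only."""

    def __init__(self):
        self.calls = 0

    def read(self):
        self.calls += 1
        if self.calls == 1:
            raise SuspendIteration("not ready yet")
        return "chunk %d" % self.calls


def g(src):
    data = src.read()
    yield data
    data = src.read()
    yield data


def demo():
    """Show that the generator is finished once the exception escapes."""
    gen = g(FlakySource())
    events = []
    try:
        next(gen)
    except SuspendIteration:
        events.append("suspended")
    try:
        next(gen)               # the read is not retried
    except StopIteration:
        events.append("finished")
    return events
```

So without new machinery the second next() does not retry src.read();
the generator has already terminated.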
From mal at egenix.com  Wed Sep  8 21:44:32 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Sep  8 21:45:00 2004
Subject: [Python-Dev] Re: Re: Re: Alternative Implementation for
	PEP 292: Simple String Substitutions
In-Reply-To: <chnidf$epp$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org>	<413F3605.7090707@egenix.com>
	<chnidf$epp$1@sea.gmane.org>
Message-ID: <413F6120.7090603@egenix.com>

Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
>>The whole point in adding Unicode to the language was to make
>>the difference between text and binary data clear and visible
>>at the type level.
> 
> well, when I wrote the Unicode type, the whole point was to be able to
> make it easy to handle Unicode text.  no more, no less.

... and the Unicode integration made that a reality :-)

In today's globalized world, the only sane way to deal with
different scripts is through Unicode, which is why I
believe that text data should eventually always be stored in
Unicode objects - regardless of whether it takes more memory
or not.

(If you compare development time to prices of a few GB extra
RAM, the effort needed to maintain text in non-Unicode
formats simply doesn't pay off anymore.)

>>If we start to store text data in Unicode now and leave binary
>>data in 8-bit strings, then the move to Unicode strings literals
>>will be much smoother in P3k.
> 
> hopefully, the P3K string design will take a lot more into account than
> text-vs-binary; there are many ways to represent text, and many ways
> to store binary data, and many usage patterns for them both.  a good
> design should take most of this into account.  (google for "stringlib" for
> some work I'm doing in this area)

Ah, now I know where you're coming from :-) Shift tables
don't work well in the Unicode world with its large alphabet.
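
To illustrate the table-size issue: a dense shift table needs one entry
per possible character, which is fine for 256 byte values but not for
all of Unicode.  One way around it is a sparse mapping holding only the
characters of the pattern.  A minimal sketch using Boyer-Moore-Horspool
(chosen here for brevity; it is not claimed to be the algorithm used in
mxTextTools or stringlib, and horspool_find is a made-up name):

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    # Sparse shift table: only characters that occur in the pattern
    # (except its last position) get an entry; all others shift by m.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Shift according to the text character under the pattern's end.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

The dictionary has at most len(pattern) - 1 entries no matter how large
the alphabet is, at the price of a hash lookup per shift.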

BTW, you might want to look at the BMS implementation I did
for mxTextTools. Here's a nice reference for pattern
matching:

    http://www-igm.univ-mlv.fr/~lecroq/string/index.html

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 08 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From cce at clarkevans.com  Wed Sep  8 21:58:53 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Wed Sep  8 21:59:01 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <413F5E76.4050805@bluewin.ch>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
	<413F564D.2070708@bluewin.ch>
	<20040908192056.GB62848@prometheusresearch.com>
	<413F5E76.4050805@bluewin.ch>
Message-ID: <20040908195852.GB98180@prometheusresearch.com>

On Wed, Sep 08, 2004 at 09:33:10PM +0200, Samuele Pedroni wrote:
| Clark C. Evans wrote:
| >I was assuming that only calls within the generator to next(), implicit
| >or otherwise, would be suspension points.
| 
| I missed that.

*nod*  I will fix the PEP.

| >This covers all of my use cases anyway.  In the other situations, if
| >they are even useful, don't do that.  Convert the SuspendIteration to a
| >RuntimeError, or resume at the previous suspension point?
| >
| >The idea of the PEP was that a nested-generator context provides this
| >limited set of suspension points to make an implementation possible.
| 
| then the PEP needs clarification because I had the impression that
| 
| def g(src):
|    data = src.read()
|    yield data
|    data = src.read()
|    yield data

The data producers would all be iterators, ones that could 
possibly raise SuspendIteration() from within their next() method.

| the read itself could throw a SuspendIteration

If read() did raise a SuspendIteration() exception, then it would
make sense to terminate the generator, perhaps with a RuntimeError.
I just hadn't considered this case.  If someone has a clever 
solution that makes this case work, great, but it's not something
that I was contemplating.

| and upon the successive 
| next the src.read() itself would be retried.
| But if it's only nexts that 
| can be suspension points then the generator would not be resumable in 
| this case. 

Right.

I was musing (but it's not in the PEP) that, iter() would sprout an
option that let the producer know if it can suspend.  If a generator
that was itself called with this suspend flag asked for a child
generator, then the suspend flag would be carried.  But this
is a separate issue.

Thanks for thinking about this PEP.

Clark


-- 
Clark C. Evans                      Prometheus Research, LLC.
                                    http://www.prometheusresearch.com/
    o                               office: +1.203.777.2550 
  ~/ ,                              mobile: +1.203.444.0557 
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *
From pedronis at bluewin.ch  Wed Sep  8 22:14:54 2004
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Wed Sep  8 22:12:31 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040908195852.GB98180@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
	<413F564D.2070708@bluewin.ch>
	<20040908192056.GB62848@prometheusresearch.com>
	<413F5E76.4050805@bluewin.ch>
	<20040908195852.GB98180@prometheusresearch.com>
Message-ID: <413F683E.1050204@bluewin.ch>

Clark C. Evans wrote:

> On Wed, Sep 08, 2004 at 09:33:10PM +0200, Samuele Pedroni wrote:
> | Clark C. Evans wrote:
> | >I was assuming that only calls within the generator to next(), implicit
> | >or otherwise, would be suspension points.
> | 
> | I missed that.
> 
> *nod*  I will fix the PEP.
> 
> | >This covers all of my use cases anyway.  In the other situations, if
> | >they are even useful, don't do that.  Convert the SuspendIteration to a
> | >RuntimeError, or resume at the previous suspension point?
> | >
> | >The idea of the PEP was that a nested-generator context provides this
> | >limited set of suspension points to make an implementation possible.
> | 
> | then the PEP needs clarification because I had the impression that
> | 
> | def g(src):
> |    data = src.read()
> |    yield data
> |    data = src.read()
> |    yield data
> 
> The data producers would all be iterators, ones that could 
> possibly raise SuspendIteration() from within their next() method.
> 
> | the read itself could throw a SuspendIteration
> 
> If read() did raise a SuspendIteration() exception, then it would
> make sense to terminate the generator, perhaps with a RuntimeError.
> I just hadn't considered this case.  If someone has a clever 
> solution that makes this case work, great, but it's not something
> that I was contemplating.

thinking about it, but this is not different:

  def g(src):
     data = src.next()
     yield data
     data = src.next()
     yield data

  def g(src):
     demand = src.next
     data = demand()
     yield data
     data = demand()
     yield data

what is supposed to happen here, notice that you may know that src.next 
is an iterator 'next' at runtime but not at compile time.
From pje at telecommunity.com  Wed Sep  8 23:38:32 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Sep  8 23:37:58 2004
Subject: [Python-Dev] PEP 302 and 'reload()'
Message-ID: <5.1.1.6.0.20040908172822.020f0a40@mail.telecommunity.com>

It appears to me there is an error in both PEP 302's specification and its 
implementation concerning the correct operation of reload().  First, it says:

     The load_module() method has a few responsibilities that it must
     fulfill *before* it runs any code:

     - It must create the module object.  From Python this can be done
       via the new.module() function, the imp.new_module() function or
       via the module type object; from C with the PyModule_New()
       function or the PyImport_ModuleAdd() function.

This should probably say that if the module already exists in sys.modules, 
it should reuse the existing module object, rather than creating a new 
one.  Otherwise, 'reload()' cannot fulfill its contract.
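
As a rough sketch of what I mean (this is not the PEP's wording; the
SketchLoader class and its 'sources' dictionary are made up purely for
illustration, and a real loader would also set __file__ and friends), a
load_module() that cooperates with reload() could look like this:

```python
import sys
import types


class SketchLoader(object):
    """Hypothetical PEP 302-style loader, for illustration only."""

    def __init__(self, sources):
        # Assumed storage: a dict mapping module names to source text.
        self.sources = sources

    def load_module(self, fullname):
        code = compile(self.sources[fullname], "<sketch:%s>" % fullname, "exec")
        # Reuse the module object already in sys.modules, if any, so that
        # existing references to the module see the reloaded contents.
        module = sys.modules.get(fullname)
        if module is None:
            module = types.ModuleType(fullname)
            # Register the module before running any of its code.
            sys.modules[fullname] = module
        module.__loader__ = self
        exec(code, module.__dict__)
        return module
```

Calling load_module() a second time for the same name then updates the
existing module in place instead of replacing it.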

Second, the implementation of PyImport_ReloadModule doesn't actually 
use a loader object, so reload() doesn't work with import hooks at 
all.  There's an SF bug report for this, and a patch to fix it (that also 
adds a test to test_importhooks to ensure that 'reload()' actually invokes 
the loader).

Are there any objections to me fixing either/both of these, and backporting 
the bugfix to the 2.3 maintenance branch?

Also, should PyImport_ReloadModule use the import lock?  It doesn't 
currently, but I'm not clear on why it doesn't.

From martin at v.loewis.de  Wed Sep  8 23:51:25 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep  8 23:51:17 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <1f7befae04090812247d1589b0@mail.gmail.com>
References: <413E8D78.2030302@v.loewis.de>
	<1f7befae04090812247d1589b0@mail.gmail.com>
Message-ID: <413F7EDD.7030600@v.loewis.de>

Tim Peters wrote:
> +1.  I may or may not want to change/create .py (etc) extensions.  I
> never before heard of the concept of "install-on-demand for
> extensions", and I don't think I want to.

Ok, I'll wait for some more votes, and failing them, I'll revert the
entire advertisement (of extensions) change.

Creating a .reg file is a different issue, which may or may not happen.

Regards,
Martin
From pf_moore at yahoo.co.uk  Thu Sep  9 00:03:28 2004
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Thu Sep  9 00:03:11 2004
Subject: [Python-Dev] Re: Install-on-first-use vs. optional extensions
References: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>
	<413F4FE2.2090602@v.loewis.de>
Message-ID: <u3c1squz3.fsf@yahoo.co.uk>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Moore, Paul wrote:
>> With option (3), what happens if you run "launcher -console" from a
>> command prompt? Does it produce output in the same console window,
>> or does it launch a new console?
>
> That was a problem originally, which I have now fixed into
>
> (3') Install two binaries, extpy.exe, and extpyw.exe.
>
> With that approach, what do you think?

I tend to agree with Tim - I'm not at all sure what "install-on-first-
use" is doing, and I don't think I care. All I care about are being
able to *not* install the extensions (for alpha/beta releases) and
being able *to* install them (for final releases). I don't have any
problem that this feature could solve, so I don't have a valid reason
to have an opinion...

Paul.
-- 
The only reason some people get lost in thought is because it's
unfamiliar territory -- Paul Fix

From gvanrossum at gmail.com  Thu Sep  9 00:55:16 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  9 00:55:23 2004
Subject: [Python-Dev] Re: Install-on-first-use vs. optional extensions
In-Reply-To: <u3c1squz3.fsf@yahoo.co.uk>
References: <16E1010E4581B049ABC51D4975CEDB8803060F85@UKDCX001.uk.int.atosorigin.com>
	<413F4FE2.2090602@v.loewis.de> <u3c1squz3.fsf@yahoo.co.uk>
Message-ID: <ca471dc2040908155534b6df34@mail.gmail.com>

I'm with Tim too -- the MSI solution seems too convoluted to bother.


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From janssen at parc.com  Thu Sep  9 02:19:47 2004
From: janssen at parc.com (Bill Janssen)
Date: Thu Sep  9 02:22:09 2004
Subject: [Python-Dev] PEP 328 - Relative Imports 
In-Reply-To: Your message of "Wed, 08 Sep 2004 07:47:35 PDT."
	<413F1B87.90301@egenix.com> 
Message-ID: <04Sep8.171951pdt."58612"@synergy1.parc.xerox.com>

> I'd like to request that the latter change be postponed to
> Python 3k, or that some other way of supporting the above
> scenarios is provided that can be enabled in the application.

Well said.

Bill
From gvanrossum at gmail.com  Thu Sep  9 04:20:29 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  9 04:20:39 2004
Subject: [Python-Dev] Re: Re: Missing arguments in RE functions
In-Reply-To: <chnhg8$brq$1@sea.gmane.org>
References: <chnc8q$qcm$1@sea.gmane.org>
	<001501c495ca$d3a8bd40$e841fea9@oemcomputer>
	<chnhg8$brq$1@sea.gmane.org>
Message-ID: <ca471dc20409081920160e29b6@mail.gmail.com>

> If it's up to me, it's a clear "not worth it".  The function API is only there for
> trivial cases; if you need the full RE power, use pattern objects (you have
> to use them anyway if you're serious about RE:s).
> 
> but I'm an API minimalist; someone else will have to make the final decision
> on this one (Guido, what's your take on API size issues?)

I'm with /F here. There are already so many ways to do it, adding more
isn't going to make things easier, and I'd rather see the API stable.
Also, it's awfully close to 2.4b1.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gvanrossum at gmail.com  Thu Sep  9 04:29:06 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Sep  9 04:29:37 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <413F23E0.2090908@egenix.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com> <cheig3$ki8$1@sea.gmane.org>
	<413F1D9C.20209@egenix.com>
	<ca471dc2040908080861941ab2@mail.gmail.com>
	<413F23E0.2090908@egenix.com>
Message-ID: <ca471dc20409081929333228b2@mail.gmail.com>

> > Only in Python 3.0.
> 
> We better start early to ever reach the point of making
> a clear distinction between text and binary data in P3k.

The introduction of a bytes type in Python 2.5 should be a good start.

> > But even so, deriving from Unicode (or str) means the template class
> > inherits a lot of unwanted operations. While I can see that
> > concatenating templates probably works, slicing them or converting to
> > lowercase etc. make no sense. IMO the standard Template class should
> > implement a "narrow" interface, i.e. *only* the template expansion
> > method (__mod__ or something else), so it's clear that other
> > compatible template classes shouldn't have to implement anything
> > besides that. This avoids the issues we have with the mapping
> > protocol: when does an object implement enough of the mapping API to
> > be usable? That's currently ill-defined; sometimes, __getitem__ is all
> > you need, sometimes __contains__ is required, sometimes keys, rarely
> > setdefault.
> 
> Looks like it's not even clear what templating itself should
> mean... you're talking about a templating interface here, not an
> object type, like Barry is (for the sake of making Templates compatible
> to i18n tools like gettext).

I don't know zip about i18n or gettext.
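
To make the "narrow" interface concrete, here is a minimal sketch.  The
NarrowTemplate name and the simplified $name syntax are assumptions for
illustration only; this is not the PEP 292 implementation, and it does not
handle ${name} or $$:

```python
import re


class NarrowTemplate(object):
    """Hypothetical template type that exposes only expansion via %."""

    # Simplified placeholder syntax: a dollar sign followed by an identifier.
    _placeholder = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

    def __init__(self, text):
        self._text = text

    def __mod__(self, mapping):
        # The argument only needs to support mapping[name].
        def lookup(match):
            return str(mapping[match.group(1)])
        return self._placeholder.sub(lookup, self._text)
```

Because the class does not derive from str or unicode, it has no slicing,
lower() and so on, and a compatible class only has to provide __mod__.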

But I thought we had plenty of time since Barry has offered to
withdraw the PEP 292 implementation for 2.4?


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From gmccaughan at synaptics-uk.com  Thu Sep  9 10:39:41 2004
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Thu Sep  9 10:40:17 2004
Subject: [Python-Dev] Re: Re: Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <413F6120.7090603@egenix.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<chnidf$epp$1@sea.gmane.org> <413F6120.7090603@egenix.com>
Message-ID: <200409090939.41873.gmccaughan@synaptics-uk.com>

Marc-Andre Lemburg wrote:

> In today's globalized world, the only sane way to deal with
> different scripts is through Unicode, which is why I
> believe that text data should eventually always be stored in
> Unicode objects - regardless of whether it takes more memory
> or not.
> 
> (If you compare development time to prices of a few GB extra
> RAM, the effort needed to maintain text in non-Unicode
> formats simply doesn't pay off anymore.)

This is not as obvious as it seems, because the "few GB
extra RAM" is a price paid by everyone who *uses* the
software. Granted, it's quite common for software to be
only ever run on one or two machines in the company where
it was developed, but not all software is used that way.

Also: the price of "a few GB extra RAM" is not always as
low as it seems. If adding 2GB means moving from 3GB to
5GB, it may mean replacing the CPU and the OS.

That said, I strongly agree that all textual data should
be Unicode as far as the developer is concerned; but, at
least in the USA :-), it makes sense to have an optimized
representation that saves space for ASCII-only text, just
as we have an optimized representation for small integers.
(The benefit is potentially much greater in that case,
though.)

-- 
g


From Paul.Moore at atosorigin.com  Thu Sep  9 10:56:00 2004
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Thu Sep  9 10:56:05 2004
Subject: [Python-Dev] Console vs. GUI applications
Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060F88@UKDCX001.uk.int.atosorigin.com>

From: "Martin v. Löwis"
>Thomas Heller wrote:
>> It seems to be a flag in the exe header.  A quick google search turned
>> up this:
>> 
>> http://www.codeguru.com/Cpp/W-P/system/misc/article.php/c2897/
>
> Sure. However, if I do
>
> foo.py
>
> then some part of the system must determine that python.exe is
> to be invoked, and then must determine that this is a console
> binary. Does that all happen in cmd.exe?

I believe so. The relevant Windows API call is CreateProcess, which
only handles EXEs (and maybe some obscure cases like COM and PIF files).
Everything else gets done in user code (in this case CMD.EXE).

So CMD.EXE runs CreateProcess on your launcher.exe. CreateProcess checks
a flag (the "subsystem") in the executable header, and acts dependent on
that. For a "console" executable, it leaves the new process attached to
the current console (the one CMD.EXE is using) and for a "windows"
executable, it detaches the process from any console. (Default behaviour
- there are flags which can affect this).
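
For anyone who wants to check how a particular binary is marked, here is a
minimal sketch that reads the subsystem field.  It assumes the usual PE
layout (offset of the PE header stored at 0x3C, a 4-byte signature, a
20-byte COFF header, and the Subsystem field at offset 68 of the optional
header); the pe_subsystem name is made up for this example:

```python
import struct

# Subsystem codes of interest: 2 is the "windows" (GUI) subsystem,
# 3 is the "console" subsystem.
SUBSYSTEM_NAMES = {2: "windows", 3: "console"}


def pe_subsystem(data):
    """Return the subsystem code found in the bytes of a PE executable."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The DOS header stores the file offset of the PE header at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = pe_offset + 4 + 20
    (subsystem,) = struct.unpack_from("<H", data, optional_header + 68)
    return subsystem
```

Reading python.exe and pythonw.exe in binary mode and passing their contents
to this function should show the difference between the two.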

I can't work out how CMD.EXE "knows" to wait for a child process to release
the console (immediately for a windows process, when it terminates for a
console process) but clearly it does... A test shows that it *is* possible
for two console processes to share the console. The result is an unusable
mess, though, so we should be glad cmd.exe does avoid this :-)

Paul.


From michael.walter at gmail.com  Thu Sep  9 11:32:44 2004
From: michael.walter at gmail.com (Michael Walter)
Date: Thu Sep  9 11:32:51 2004
Subject: [Python-Dev] Console vs. GUI applications
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060F88@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F88@UKDCX001.uk.int.atosorigin.com>
Message-ID: <877e9a1704090902326f2373e9@mail.gmail.com>

I guessed CMD.EXE would run ShellExecute(), to which you can pass a
filename such as "foo.py". Didn't verify this tho :)

Cheers,
Michael


On Thu, 9 Sep 2004 09:56:00 +0100, Moore, Paul
<paul.moore@atosorigin.com> wrote:
> From: "Martin v. Löwis"
> >Thomas Heller wrote:
> >> It seems to be a flag in the exe header.  A quick google search turned
> >> up this:
> >>
> >> http://www.codeguru.com/Cpp/W-P/system/misc/article.php/c2897/
> >
> > Sure. However, if I do
> >
> > foo.py
> >
> > then some part of the system must determine that python.exe is
> > to be invoked, and then must determine that this is a console
> > binary. Does that all happen in cmd.exe?
> 
> I believe so. The relevant Windows API call is CreateProcess, which
> only handles EXEs (and maybe some obscure cases like COM and PIF files).
> Everything else gets done in user code (in this case CMD.EXE).
> 
> So CMD.EXE runs CreateProcess on your launcher.exe. CreateProcess checks
> a flag (the "subsystem") in the executable header, and acts dependent on
> that. For a "console" executable, it leaves the new process attached to
> the current console (the one CMD.EXE is using) and for a "windows"
> executable, it detaches the process from any console. (Default behaviour
> - there are flags which can affect this).
> 
> I can't work out how CMD.EXE "knows" to wait for a child process to release
> the console (immediately for a windows process, when it terminates for a
> console process) but clearly it does... A test shows that it *is* possible
> for two console processes to share the console. The result is an unusable
> mess, though, so we should be glad cmd.exe does avoid this :-)
> 
> Paul.
> 
From fredrik at pythonware.com  Thu Sep  9 11:49:57 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Sep  9 11:50:01 2004
Subject: [Python-Dev] Re: Console vs. GUI applications
References: <16E1010E4581B049ABC51D4975CEDB8803060F88@UKDCX001.uk.int.atosorigin.com>
	<877e9a1704090902326f2373e9@mail.gmail.com>
Message-ID: <chp904$9bt$1@sea.gmane.org>

Michael Walter wrote:

> I guessed CMD.EXE would run ShellExecute(), to which you can pass a
> filename such as "foo.py". Didn't verify this tho :)

> dumpbin /imports \windows\system32\cmd.exe | grep Shell

> dumpbin /imports \windows\system32\cmd.exe | grep Create
      7C81E968    4A  CreateDirectoryW
      7C802332    66  CreateProcessW
      7C810976    52  CreateFileW

(I doubt ShellExecute gives cmd.exe the control it needs.  besides, ShellExecute
is part of the shell layer, not the core Windows API.  and the shell layer depends
on everyone and his brother; I doubt they want the command line interface to
depend on the GDI layer, RPC services, etc.)

</F> 


From michael.walter at gmail.com  Thu Sep  9 11:53:10 2004
From: michael.walter at gmail.com (Michael Walter)
Date: Thu Sep  9 11:53:13 2004
Subject: [Python-Dev] Re: Console vs. GUI applications
In-Reply-To: <chp904$9bt$1@sea.gmane.org>
References: <16E1010E4581B049ABC51D4975CEDB8803060F88@UKDCX001.uk.int.atosorigin.com>
	<877e9a1704090902326f2373e9@mail.gmail.com>
	<chp904$9bt$1@sea.gmane.org>
Message-ID: <877e9a1704090902534aaa4ec7@mail.gmail.com>

Ah, I see.

Thanks,
Michael

On Thu, 9 Sep 2004 11:49:57 +0200, Fredrik Lundh <fredrik@pythonware.com> wrote:
> Michael Walter wrote:
> 
> > I guessed CMD.EXE would run ShellExecute(), to which you can pass a
> > filename such as "foo.py". Didn't verify this tho :)
> 
> > dumpbin /imports \windows\system32\cmd.exe | grep Shell
> 
> > dumpbin /imports \windows\system32\cmd.exe | grep Create
>       7C81E968    4A  CreateDirectoryW
>       7C802332    66  CreateProcessW
>       7C810976    52  CreateFileW
> 
> (I doubt ShellExecute gives cmd.exe the control it needs.  besides, ShellExecute
> is part of the shell layer, not the core Windows API.  and the shell layer depends
> on everyone and his brother; I doubt they want the command line interface to
> depend on the GDI layer, RPC services, etc.)
> 
> </F> 
From arigo at tunes.org  Thu Sep  9 12:14:44 2004
From: arigo at tunes.org (Armin Rigo)
Date: Thu Sep  9 12:20:11 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040908192056.GB62848@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
	<413F564D.2070708@bluewin.ch>
	<20040908192056.GB62848@prometheusresearch.com>
Message-ID: <20040909101444.GA2877@vicky.ecs.soton.ac.uk>

Hi,

I agree with Samuele that the proposal is far too vague currently.  You should
try to describe what precisely should occur in each situation.

A major problem I see with the proposal is that you can describe what should
occur in some situations by presenting source code snippets; such descriptions
correspond easily to possible semantics at the bytecode level.  But bytecode
is not a natural granularity for coroutine issues.  Frames (either of
generators or functions) execute operations that may invoke new frames, and
all frames in the chain except possibly the most recent one need to be
suspended *during* the execution of their current bytecode.  For example, a
generator f() may currently be calling a generator g() with a FOR_ITER
bytecode ('for' statement), a CALL_FUNCTION (calling next()), or actually
anything else like a BINARY_ADD which calls a nb_add implemented in C which
indirectly calls back to Python code.
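
To see the granularity I mean, one can disassemble a small example.  This is
only an illustration: the functions f and g are made up, dis.get_instructions
is only available in newer CPython releases, and the exact opcode names (for
the addition and the call in particular) vary between versions:

```python
import dis


def g():
    yield 1
    yield 2


def f():
    total = 0
    # The 'for' statement drives g() from within a single FOR_ITER bytecode.
    for x in g():
        # The addition is another single bytecode that may call arbitrary code.
        total = total + x
    return total


# Collect the opcode names that make up f().
opnames = [instruction.opname for instruction in dis.get_instructions(f)]
```

While g() is running, the frame of f() sits in the middle of its FOR_ITER
instruction, not between two instructions.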

For this reason it is not reasonably possible to implement restartable
exceptions in general: when an exception is caught, not all the C state is
saved (i.e. you don't know where, *within* the execution of a bytecode, you
should restart).  Your PEP is very similar to restartable exceptions: their
possible semantics are difficult to specify in general.  You may try to do
that to understand what I mean.

This doesn't mean that it is impossible to figure out a more limited concept,
like you are trying to do.  However keeping the "restartable exception" idea
in mind should help focusing on the difficult problems and where restrictions
are needed.

I think that Stackless contains all the solutions in this area, and I'm not
talking about the C stack hacking.  Stackless is sometimes able to switch
coroutines without hacking at the C stack.  I think that if any coroutine
support is ever going to be added to CPython it will be done in a similar way.  
(Generators were also inspired by Stackless, BTW.)  (Also note that although
the generator syntax is nice and helpful, it would have been possible to write
generators without any custom 'yield' syntax if we had restartable exceptions;
this makes the latter idea more general and independent from generators.)


A bientôt,

Armin.
From garth at garthy.com  Thu Sep  9 13:20:39 2004
From: garth at garthy.com (Garth)
Date: Thu Sep  9 13:20:51 2004
Subject: [Python-Dev] Re: Console vs. GUI applications
In-Reply-To: <877e9a1704090902534aaa4ec7@mail.gmail.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F88@UKDCX001.uk.int.atosorigin.com>	<877e9a1704090902326f2373e9@mail.gmail.com>	<chp904$9bt$1@sea.gmane.org>
	<877e9a1704090902534aaa4ec7@mail.gmail.com>
Message-ID: <41403C87.3070705@garthy.com>

use depends.exe to open cmd.exe

It delay loads shell32.dll and uses the functions ShellExecuteExW and 
SHChangeNotify so it may use them.

Garth

Michael Walter wrote:

>Ah, I see.
>
>Thanks,
>Michael
>
>On Thu, 9 Sep 2004 11:49:57 +0200, Fredrik Lundh <fredrik@pythonware.com> wrote:
>
>>Michael Walter wrote:
>>
>>>I guessed CMD.EXE would run ShellExecute(), to which you can pass a
>>>filename such as "foo.py". Didn't verify this tho :)
>>>
>>>dumpbin /imports \windows\system32\cmd.exe | grep Shell
>>>
>>>dumpbin /imports \windows\system32\cmd.exe | grep Create
>>>
>>      7C81E968    4A  CreateDirectoryW
>>      7C802332    66  CreateProcessW
>>      7C810976    52  CreateFileW
>>
>>(I doubt ShellExecute gives cmd.exe the control it needs.  besides, ShellExecute
>>is part of the shell layer, not the core Windows API.  and the shell layer depends
>>on everyone and his brother; I doubt they want the command line interface to
>>depend on the GDI layer, RPC services, etc.)
>>
>></F> 

From garth at garthy.com  Thu Sep  9 13:39:33 2004
From: garth at garthy.com (Garth)
Date: Thu Sep  9 13:39:35 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <413E8D78.2030302@v.loewis.de>
References: <413E8D78.2030302@v.loewis.de>
Message-ID: <414040F5.4030706@garthy.com>

Couldn't you conditionally run  RegisterExtensionInfo?

And set this in a dialog checkbox? (Which I don't know how to do in msi 
+ python)

This is my guess at a patch

--- oldsequence.py   Thu Sep  9 12:35:51 2004
+++ sequence.py      Thu Sep  9 12:35:31 2004
@@ -50,7 +50,7 @@
 (u'PublishFeatures', None, 6300),
 (u'PublishProduct', None, 6400),
 (u'RegisterClassInfo', None, 4600),
-(u'RegisterExtensionInfo', None, 4700),
+(u'RegisterExtensionInfo', 'INSTALLEXT=1', 4700),
 (u'RegisterMIMEInfo', None, 4900),
 (u'RegisterProgIdInfo', None, 4800),
 (u'AllocateRegistrySpace', u'NOT Installed', 1550),


Martin v. Löwis wrote:

> I recently looked into properly implementing the "Register Extensions"
> feature in the installer; in 2.4a3, not selecting that doesn't really
> work. The problem is that MSI only supports installing either both
> the "extension server" (the .exe) and the extension, or neither. So
> you can choose not to install word.exe, and it won't install the .doc
> extension; if you install word.exe, it will associate .doc with it.
>
> For Python, this leaves us with three options:
> 1. Don't make registration of extensions optional; always associate
>    .py, .pyc, .pyw, .pyo.
> 2. Don't support installation-on-demand for extensions. This means
>    to not use the MSI extension machinery at all, but to directly
>    write the registry keys that build the extension. Installing
>    these keys can then be made optional.
> 3. Provide another binary that is the "extension server", and
>    install that independently of python.exe, and pythonw.exe.
>    In CVS, I have implemented this approach to see whether it
>    works (it does), and called this binary "launcher.exe". It
>    is a Windows app which supports a -console argument which also
>    makes it a console app. This is the binary that gets
>    associated with all four extensions, for the "open" verb.
>
> Currently, I'm in favour of using option 3, but I'd like to hear
> whether people would prefer something else instead.
>
> Regards,
> Martin

From nas at arctrix.com  Thu Sep  9 20:07:44 2004
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu Sep  9 20:07:49 2004
Subject: [Python-Dev] unicode inconsistency?
Message-ID: <20040909180743.GA31140@mems-exchange.org>

Perhaps this is more appropriate for python-list but it looks like a
bug to me.  Example code:

    class A:
        def __str__(self):
            return u'\u1234'

    '%s' % u'\u1234' # this works
    '%s' % A() # this doesn't work

It will work if 'A' subclasses from 'unicode' but should not be
necessary, IMHO.  Any reason why this shouldn't be fixed?

  Neil
From aahz at pythoncraft.com  Thu Sep  9 20:09:56 2004
From: aahz at pythoncraft.com (Aahz)
Date: Thu Sep  9 20:10:00 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <20040909180743.GA31140@mems-exchange.org>
References: <20040909180743.GA31140@mems-exchange.org>
Message-ID: <20040909180955.GA28902@panix.com>

On Thu, Sep 09, 2004, Neil Schemenauer wrote:
>
> Perhaps this is more appropriate for python-list but it looks like a
> bug to me.  Example code:
> 
>     class A:
>         def __str__(self):
>             return u'\u1234'
> 
>     '%s' % u'\u1234' # this works
>     '%s' % A() # this doesn't work
> 
> It will work if 'A' subclasses from 'unicode' but should not be
> necessary, IMHO.  Any reason why this shouldn't be fixed?

Check the recent python-dev archives for a long and nauseating thread
about interactions between __str__ and unicode.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From martin at v.loewis.de  Thu Sep  9 20:29:26 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep  9 20:29:17 2004
Subject: [Python-Dev] Install-on-first-use vs. optional extensions
In-Reply-To: <414040F5.4030706@garthy.com>
References: <413E8D78.2030302@v.loewis.de> <414040F5.4030706@garthy.com>
Message-ID: <4140A106.4020100@v.loewis.de>

Garth wrote:
> Couldn't you conditionally run  RegisterExtensionInfo?

This is what I currently do (see msi.py:build_database).
Unfortunately, it doesn't work: Installer then unconditionally
runs UnregisterExtensionInfo first, which removes any old
extension information before installing a new one.

Now, this could also be made conditional, although defining
the condition is difficult: If the user changes the extension
from "installed" to "absent", UnregisterExtensionInfo *should*
run. In any case, uninstalling the entire package (i.e.
executing the toplevel REMOVE action) doesn't run the
InstallExecuteSequence (I believe), which further complicates
issues.

I've played with a number of options, and could not make it
to work. I have given up now, but if you find a solution,
please let me know.

Regards,
Martin
From martin at v.loewis.de  Thu Sep  9 20:42:38 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep  9 20:42:29 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <20040909180955.GA28902@panix.com>
References: <20040909180743.GA31140@mems-exchange.org>
	<20040909180955.GA28902@panix.com>
Message-ID: <4140A41E.9080705@v.loewis.de>

Aahz wrote:
>>It will work if 'A' subclasses from 'unicode' but should not be
>>necessary, IMHO.  Any reason why this shouldn't be fixed?
> 
> 
> Check the recent python-dev archives for a long and nauseating thread
> about interactions between __str__ and unicode.

Although that really doesn't answer this particular question. It was
about str() and its interaction with __str__ and __unicode__, and
whether Python should support __unicode__.

For the specific issue, I would maintain that str() should always
return string objects. I'm not so sure about %s since, as Neil
observes, '%s' % unicode_string gives a unicode result. I can't see
any harm by supporting this operation also if __str__ returns
a Unicode object.

Regards,
Martin
From tim.peters at gmail.com  Thu Sep  9 20:44:56 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  9 20:44:58 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <20040909180743.GA31140@mems-exchange.org>
References: <20040909180743.GA31140@mems-exchange.org>
Message-ID: <1f7befae04090911441c85dfd9@mail.gmail.com>

[Neil Schemenauer]
> Perhaps this is more appropriate for python-list but it looks like a
> bug to me.  Example code:
>
>    class A:
>        def __str__(self):
>            return u'\u1234'
> 
>    '%s' % u'\u1234' # this works
>    '%s' % A() # this doesn't work
> 
> It will work if 'A' subclasses from 'unicode' but should not be
> necessary, IMHO.

You know better than to say "doesn't work".  I assume you mean the
latter raises UnicodeEncodeError.

> Any reason why this shouldn't be fixed?

Didn't we just go thru this, last week or so?  PyObject_Str() never
returns a unicode (it returns a str).  That is, str(A()) raises
UnicodeEncodeError, and that's out of interpolation's hands.  As
Martin said last time, a __str__ method that returns a unicode doesn't
make much sense.

I'm not sure you really mean "it will work if 'A' subclasses from
'unicode'" either:

>>> class A(unicode):
...   def __str__(self):
...     return u'\u1234'
...
>>> '%s' % A()
u''
>>> len(_)
0
>>>

That is, A.__str__ is ignored if A subclasses from Unicode.  So
"doesn't blow up" seems more on-target than "works" -- I don't think
you expected an empty Unicode string here.
From nas at arctrix.com  Thu Sep  9 20:50:35 2004
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu Sep  9 20:50:39 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <20040909180955.GA28902@panix.com>
References: <20040909180743.GA31140@mems-exchange.org>
	<20040909180955.GA28902@panix.com>
Message-ID: <20040909185034.GA31277@mems-exchange.org>

On Thu, Sep 09, 2004 at 02:09:56PM -0400, Aahz wrote:
> Check the recent python-dev archives for a long and nauseating
> thread about interactions between __str__ and unicode.

Using __unicode__ doesn't help.  The core problem is that you cannot
create a class that behaves like 'unicode' in this operation without
subclassing from 'unicode'.  That violates the "duck typing" design
principle of Python.  We violate it other places, usually in the
name of efficiency, but I see no good reason in this case.

I suspect the fix will be pretty straightforward (call tp_str and
if the result is 'unicode' then produce a 'unicode' string).  Again,
is there some reason why we don't want this behavior?

  Neil
From tim.peters at gmail.com  Thu Sep  9 21:00:07 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  9 21:00:17 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <4140A41E.9080705@v.loewis.de>
References: <20040909180743.GA31140@mems-exchange.org>
	<20040909180955.GA28902@panix.com> <4140A41E.9080705@v.loewis.de>
Message-ID: <1f7befae04090912007305d532@mail.gmail.com>

[Martin v. Löwis]
> ...
> For the specific issue, I would maintain that str() should always
> return string objects.

__builtin__.str() always does -- or raises an exception.  Same for
PyObject_Str() and PyObject_Repr().

> I'm not so sure about %s since, as Neil observes, '%s' % unicode_string
> gives a unicode result.

That's because PyString_Format()'s '%s' processing special-cases the
snot out of unicode *inputs*.  All other inputs to '%s' (and '%r') go
thru PyObject_Str() or PyObject_Repr(), and, as above, those never
return a unicode.  In Neil's case, they raise the expected exception,
and there's nothing sane PyString_Format can do about that.

> I can't see any harm by supporting this operation also if __str__ returns
> a Unicode object.

It doesn't sound like a good idea to me, at least in part because it
would be darned messy to implement short of saying "OK, we don't give
a rip anymore about what type of objects PyObject_{Str,Repr} return",
and that would have broader consequences than just letting Neil get
away with whatever he's trying to do with str.__mod__.
From tim.peters at gmail.com  Thu Sep  9 21:11:51 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  9 21:11:54 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <20040909185034.GA31277@mems-exchange.org>
References: <20040909180743.GA31140@mems-exchange.org>
	<20040909180955.GA28902@panix.com>
	<20040909185034.GA31277@mems-exchange.org>
Message-ID: <1f7befae040909121126062330@mail.gmail.com>

[Neil Schemenauer]
> ...
> I suspect the fix will be pretty straightforward (call tp_str and
> if the result is 'unicode' then produce a 'unicode' string).  Again,
> is there some reason why we don't want this behavior?

Yes:  '%s' is documented as "String (converts any python object using
str())".  It's str(A()) that raises the exception you're seeing, not
interpolation.  To worm around that, you'll effectively have to
duplicate PyObject_Str's implementation (which is more than just
calling tp_str -- that may not exist -- you'll end up at least
duplicating PyObject_Repr's implementation too) inside
PyString_Format(), and end up with a mess that's harder to explain
too.

The *real* problem (IMO) is that we don't have a format code that
means "stick the unicode representation here", i.e. there's no format
code that triggers PyObject_Unicode() directly.  unicode.__mod__
treats '%s' that way, but that isn't documented.
From FBatista at uniFON.com.ar  Thu Sep  9 21:14:24 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Thu Sep  9 21:18:58 2004
Subject: [Python-Dev] unicode inconsistency?
Message-ID: <A128D751272CD411BC9200508BC2194D053C7925@escpl.tcp.com.ar>

[Tim Peters]

#- The *real* problem (IMO) is that we don't have a format code that
#- means "stick the unicode representation here", i.e. there's no format
#- code that triggers PyObject_Unicode() directly.  unicode.__mod__
#- treats '%s' that way, but that isn't documented.

You mean something like %u? (actually don't know if the "u" is used for
something else)

If %u triggers PyObject_Unicode(), will the following work?

    class A:
        def __unicode__(self):
            return u'\u1234'
 
    '%u' % u'\u1234'
    '%u' % A() 

.	Facundo
From tim.peters at gmail.com  Thu Sep  9 21:28:39 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  9 21:28:44 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <A128D751272CD411BC9200508BC2194D053C7925@escpl.tcp.com.ar>
References: <A128D751272CD411BC9200508BC2194D053C7925@escpl.tcp.com.ar>
Message-ID: <1f7befae04090912281fa118fc@mail.gmail.com>

[Batista, Facundo]
> You mean something like %u? (actually don't know if the "u" is used for
> something else)

'%u' is used for unsigned int formats -- although int/long unification
rendered those senseless.

> If %u triggers PyObject_Unicode(), will the following work?
> 
>    class A:
>        def __unicode__(self):
>            return u'\u1234'
> 
>    '%u' % u'\u1234'
>    '%u' % A()

That's the intent, yes.  Neil's original example would *also* "work"
then (because unlike PyObject_Str(), PyObject_Unicode() is happy to
accept a unicode result as-is from a tp_str implementation).
From nas at arctrix.com  Thu Sep  9 21:57:32 2004
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu Sep  9 21:57:36 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <1f7befae040909121126062330@mail.gmail.com>
References: <20040909180743.GA31140@mems-exchange.org>
	<20040909180955.GA28902@panix.com>
	<20040909185034.GA31277@mems-exchange.org>
	<1f7befae040909121126062330@mail.gmail.com>
Message-ID: <20040909195732.GB31277@mems-exchange.org>

On Thu, Sep 09, 2004 at 03:11:51PM -0400, Tim Peters wrote:
> '%s' is documented as "String (converts any python object using
> str())".  It's str(A()) that raises the exception you're seeing,
> not interpolation.

Shouldn't '%s' % u'\u1234' also raise an exception then?

> To worm around that, you'll effectively have to duplicate
> PyObject_Str's implementation

Yes.  I want something like "PyObject_UnicodeOrStr" that would
return either a unicode object or a str object.  That would make it
easier to write code that produces 'str' results if unicode
characters don't appear in any of the inputs.  Having __str__
methods that can return either 'unicode' or 'str' objects is also
very handy (I don't see how you can say that it doesn't make any
sense).

Perhaps I am on the wrong track.  However, if I understand the /F
bot correctly, he favours a design that does not force everything to
unicode strings.

  Neil
From nas at arctrix.com  Thu Sep  9 22:01:07 2004
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu Sep  9 22:01:10 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <1f7befae04090912281fa118fc@mail.gmail.com>
References: <A128D751272CD411BC9200508BC2194D053C7925@escpl.tcp.com.ar>
	<1f7befae04090912281fa118fc@mail.gmail.com>
Message-ID: <20040909200107.GC31277@mems-exchange.org>

On Thu, Sep 09, 2004 at 03:28:39PM -0400, Tim Peters wrote:
> >    '%u' % u'\u1234'
> >    '%u' % A()
> 
> That's the intent, yes.  Neil's original example would *also* "work"
> then (because unlike PyObject_Str(), PyObject_Unicode() is happy to
> accept a unicode result as-is from a tp_str implementation).

No, it would not "work" the way I want.  I don't want to force
things to unicode strings unless necessary.

  Neil
From nas at arctrix.com  Thu Sep  9 22:03:26 2004
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu Sep  9 22:03:29 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <1f7befae04090912007305d532@mail.gmail.com>
References: <20040909180743.GA31140@mems-exchange.org>
	<20040909180955.GA28902@panix.com> <4140A41E.9080705@v.loewis.de>
	<1f7befae04090912007305d532@mail.gmail.com>
Message-ID: <20040909200326.GD31277@mems-exchange.org>

On Thu, Sep 09, 2004 at 03:00:07PM -0400, Tim Peters wrote:
> [Martin v. Löwis]
> > I can't see any harm by supporting this operation also if __str__ returns
> > a Unicode object.
> 
> It doesn't sound like a good idea to me, at least in part because it
> would be darned messy to implement short of saying "OK, we don't give
> a rip anymore about what type of objects PyObject_{Str,Repr} return"

Just to be clear, I don't propose allowing PyObject_Str and
PyObject_Repr to return unicode objects.  That would be a disaster,
IMO.

  Neil
From fredrik at pythonware.com  Thu Sep  9 22:12:30 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Sep  9 22:10:43 2004
Subject: [Python-Dev] Re: unicode inconsistency?
References: <20040909180743.GA31140@mems-exchange.org><20040909180955.GA28902@panix.com><20040909185034.GA31277@mems-exchange.org><1f7befae040909121126062330@mail.gmail.com>
	<20040909195732.GB31277@mems-exchange.org>
Message-ID: <chqdbr$m28$1@sea.gmane.org>

Neil Schemenauer wrote:

> Perhaps I am on the wrong track.  However, if I understand the /F
> bot correctly, he favours a design that does not force everything to
> unicode strings.

that's correct.

I'm beginning to think that we need an extra method (__text__), that
can return any kind of string that's compatible with Python's text model.

(in today's CPython, that's an 8-bit string with ASCII only, or a Unicode
string.  future Pythons may support more string types, at least at
the C implementation level).

I'm not sure we can change __str__ or __unicode__ without breaking
code in really obscure ways (but I'd be happy to be proven wrong).

</F> 


From aahz at pythoncraft.com  Thu Sep  9 22:21:13 2004
From: aahz at pythoncraft.com (Aahz)
Date: Thu Sep  9 22:21:21 2004
Subject: [Python-Dev] Re: unicode inconsistency?
In-Reply-To: <chqdbr$m28$1@sea.gmane.org>
References: <20040909195732.GB31277@mems-exchange.org>
	<chqdbr$m28$1@sea.gmane.org>
Message-ID: <20040909202112.GB5485@panix.com>

On Thu, Sep 09, 2004, Fredrik Lundh wrote:
>
> I'm beginning to think that we need an extra method (__text__), that
> can return any kind of string that's compatible with Python's text model.

+1

While we're at it, that would be a good opportunity to add the __index__
method (for int-like objects that actually support indexing).  That
would get rid of the issues with using floats as inappropriate inputs.
Can't require __index__ until 3.0, but we can start making it available.
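
The idea above can be sketched in present-day Python, where an __index__
special method did eventually become available.  This is only an
illustration of the behaviour being asked for; the Slot class is a
hypothetical name invented for the example.

```python
class Slot:
    """Hypothetical int-like object that opts in to being used as an index."""

    def __init__(self, position):
        self.position = position

    def __index__(self):
        # Called when the object is used where an exact integer is required.
        return self.position


items = ["a", "b", "c"]
picked = items[Slot(1)]          # accepted: Slot supplies __index__

try:
    items[1.0]                   # a float is not an acceptable index
except TypeError:
    float_rejected = True
else:
    float_rejected = False
```

An object without __index__, such as a float, is rejected outright rather
than being silently truncated.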
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From mal at egenix.com  Thu Sep  9 22:54:26 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu Sep  9 22:54:28 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <20040909200107.GC31277@mems-exchange.org>
References: <A128D751272CD411BC9200508BC2194D053C7925@escpl.tcp.com.ar>	<1f7befae04090912281fa118fc@mail.gmail.com>
	<20040909200107.GC31277@mems-exchange.org>
Message-ID: <4140C302.10302@egenix.com>

Neil Schemenauer wrote:
> On Thu, Sep 09, 2004 at 03:28:39PM -0400, Tim Peters wrote:
> 
>>>   '%u' % u'\u1234'
>>>   '%u' % A()
>>
>>That's the intent, yes.  Neil's original example would *also* "work"
>>then (because unlike PyObject_Str(), PyObject_Unicode() is happy to
>>accept a unicode result as-is from a tp_str implementation).
> 
> No, it would not "work" the way I want.  I don't want to force
> things to unicode strings unless necessary.

Unicode always causes coercion towards Unicode, just like floats
always cause coercion towards floats. Nothing's going to
change at that end.

Note that your examples do work with %s if the format
string itself is Unicode, so in P3k, you'll no longer have
these problems.

Since there must be a reason why you have a __str__ method
that returns Unicode, I'd suggest you make the format string
itself a Unicode string as well :-)
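
As a rough check of the P3k remark, here is Neil's class in a sketch that
assumes a Python 3 interpreter, where every string literal (and so every
format string) is already Unicode.  It does not reproduce the 2.x
behaviour discussed in this thread.

```python
class A:
    def __str__(self):
        # A non-ASCII result; in Python 3 this is an ordinary str.
        return '\u1234'


formatted = '%s' % A()   # the format string is itself Unicode here
converted = str(A())     # str() no longer has to encode anything
```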

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 09 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From tim.peters at gmail.com  Thu Sep  9 22:59:18 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep  9 22:59:39 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <20040909195732.GB31277@mems-exchange.org>
References: <20040909180743.GA31140@mems-exchange.org>
	<20040909180955.GA28902@panix.com>
	<20040909185034.GA31277@mems-exchange.org>
	<1f7befae040909121126062330@mail.gmail.com>
	<20040909195732.GB31277@mems-exchange.org>
Message-ID: <1f7befae0409091359320e34c9@mail.gmail.com>

[Tim]
>> '%s' is documented as "String (converts any python object using
>> str())".  It's str(A()) that raises the exception you're seeing,
>> not interpolation.

[Neil]
> Shouldn't '%s' % u'\u1234' also raise an exception then?

Yes, but the existence of one undocumented extension isn't sufficient
reason to multiply them.  The "Unicode exception" here is at least
easy to explain.  To make your case work, we somehow have to explain
that although virtually all ways of invoking __str__ produce an 8-bit
encoding of a unicode return value, for some magical reason
str.__mod__ does not.  The existing "Unicode exception" consists
solely of saying "but unicode inputs don't invoke str(), and force the
interpolation to get passed to unicode.__mod__ instead".

> Yes.  I want something like "PyObject_UnicodeOrStr" that would
> return either a unicode object or a str object.  That would make it
> easier to write code that produces 'str' results if unicode
> characters don't appear in any of the inputs.

I think biting the Unicode bullet whole is saner, but suit yourself.

>  Having __str__ methods that can return either 'unicode' or 'str' objects
> is also very handy (I don't see how you can say that it doesn't make any
> sense).

Didn't we go thru that last week <wink>?  Yes:

    [Neil]
    [... the same class as today's class ...]

    [Martin]
    > This class is incorrect: it does not support str().

    [Neil]
    > Can you be more specific about what is incorrect with the above
    > class?

    [Martin]
    In the default installation, it gives a UnicodeEncodeError.

You didn't respond to that (at least not that I saw), so I assumed you
accepted Martin's nag.  Having a __str__ that returns a unicode object
that the default encoding can't handle is clearly (IMO) begging for
trouble.

> Perhaps I am on the wrong track.  However, if I understand the /F
> bot correctly, he favours a design that does not force everything to
> unicode strings.

Saying it doesn't make sense to have a __str__ method return a Unicode
value that can't be encoded *as* a str isn't asking anyone to force
anything to Unicode.  __str__ is still trying hard to retain a
*distinction* between str and unicode.  PyObject_Unicode() no longer
plays along with that distinction, but I (mildly) wish it still did.
From martin at v.loewis.de  Thu Sep  9 23:00:12 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu Sep  9 23:00:04 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <4140C302.10302@egenix.com>
References: <A128D751272CD411BC9200508BC2194D053C7925@escpl.tcp.com.ar>	<1f7befae04090912281fa118fc@mail.gmail.com>	<20040909200107.GC31277@mems-exchange.org>
	<4140C302.10302@egenix.com>
Message-ID: <4140C45C.4050009@v.loewis.de>

M.-A. Lemburg wrote:
>> No, it would not "work" the way I want.  I don't want to force
>> things to unicode strings unless necessary.
> 
> 
> Unicode always causes coercion towards Unicode, just like floats
> always cause coercion towards floats. Nothing's going to
> change at that end.

Not always. As we are discussing right now, str() (and indirectly
%s) coerce Unicode objects into string objects. Also,
PyArg_ParseTuple coerces Unicode into byte strings for the "s"
and "t" formats.

Regards,
Martin
From mal at egenix.com  Thu Sep  9 23:11:53 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu Sep  9 23:11:56 2004
Subject: [Python-Dev] unicode inconsistency?
In-Reply-To: <4140C45C.4050009@v.loewis.de>
References: <A128D751272CD411BC9200508BC2194D053C7925@escpl.tcp.com.ar>	<1f7befae04090912281fa118fc@mail.gmail.com>	<20040909200107.GC31277@mems-exchange.org>
	<4140C302.10302@egenix.com> <4140C45C.4050009@v.loewis.de>
Message-ID: <4140C719.1010906@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>> No, it would not "work" the way I want.  I don't want to force
>>> things to unicode strings unless necessary.
>>
>> Unicode always causes coercion towards Unicode, just like floats
>> always cause coercion towards floats. Nothing's going to
>> change at that end.
> 
> Not always. As we are discussing right now, str() (and indirectly
> %s) coerce Unicode objects into string objects. Also,
> PyArg_ParseTuple coerces Unicode into byte strings for the "s"
> and "t" formats.

I may have been misunderstanding Neil, but I was referring
to Neil's comment that he would not like things to get
forced to Unicode.

If I look at his initial posting, it looks as if Neil wanted
'%s' % A() to return u'\u1234'.

The current implementation tests for Unicode-subclasses,
but does not look at the __str__ return object. In order
to add support for the latter we'd have to add a new C API,
e.g. PyObject_Text() that returns a StringTypes
instance, or catch the UnicodeError caused by the ASCII codec
and let this trigger a redirection to the Unicode formatting
routine (however, this is dangerous since it would cause the
object to be evaluated twice).

-- 
Marc-Andre Lemburg
eGenix.com

From cce at clarkevans.com  Thu Sep  9 23:55:48 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Sep  9 23:55:53 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040909101444.GA2877@vicky.ecs.soton.ac.uk>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
	<413F564D.2070708@bluewin.ch>
	<20040908192056.GB62848@prometheusresearch.com>
	<20040909101444.GA2877@vicky.ecs.soton.ac.uk>
Message-ID: <20040909215548.GB61544@prometheusresearch.com>

Armin,

On Thu, Sep 09, 2004 at 11:14:44AM +0100, Armin Rigo wrote:
| I agree with Samuele that the proposal is far too vague currently.  You
| should try to describe what precisely should occur in each situation.

Oh, absolutely.  This was a draft PEP to collect feedback.  It will be a
bit before I have a chunk of time to assimilate the comments and produce
another (more detailed) draft.   Your comments were very helpful, I've
got a bit of education in my future.

| A major problem I see with the proposal is that you can describe what
| should occur in some situations by presenting source code snippets; such
| descriptions correspond easily to possible semantics at the bytecode
| level.  But bytecode is not a natural granularity for coroutine issues.

*nod*

| This doesn't mean that it is impossible to figure out a more limited
| concept, like you are trying to do.  However keeping the "restartable
| exception" idea in mind should help focusing on the difficult problems
| and where restrictions are needed.

Best,

Clark
From noamr at myrealbox.com  Fri Sep 10 01:03:05 2004
From: noamr at myrealbox.com (Noam Raphael)
Date: Fri Sep 10 01:04:22 2004
Subject: [Python-Dev] Missing arguments in RE functions
In-Reply-To: <413EB184.9030604@heneryd.com>
References: <000f01c49535$9ec914c0$e841fea9@oemcomputer>
	<413EB184.9030604@heneryd.com>
Message-ID: <4140E129.1040700@myrealbox.com>

I've read the objections. I understand being careful about extending an 
API, but I still think that there are things to improve, even when being 
conservative about the API.

I think that the straightforward functions should be taken seriously. 
The reason is that although you can write 
re.compile(pattern).match(...), re.match(pattern, ...) is shorter and 
just as clear - I think of the fact that REs are first compiled and then 
applied as an implementation issue, which lets you save time when 
applying the same RE many times. The documentation is with me - let me 
quote:

=====================
The sequence

prog = re.compile(pat)
result = prog.match(str)

is equivalent to

result = re.match(pat, str)

but the version using compile() is more efficient when the expression 
will be used several times in a single program.

=====================
findall(string)
    Identical to the findall() function, using the compiled pattern.
=====================

Not only are the straightforward functions not regarded as being 
"only there for trivial cases"; the methods of the compiled RE are 
regarded as sometimes-more-efficient versions of the straightforward 
functions. This is why I didn't even know, until I did my research 
before sending my message to python-dev, that you could match from a 
given start position - I studied the page documenting the functions, 
because I didn't want to bother my students at an early stage with the 
fact that REs are first compiled and then applied, and I didn't find any 
mention of the start position option.

So, as I see it, there are two options.

The first one is to decide that the functions are a legitimate way of 
using REs in Python, and add the optional parameters that I added in my 
patch. In this way, anything you can do with the compiled pattern you 
could do using the functions. (I'm no great expert in REs, but I 
checked through the documentation and didn't find any functionality that 
was missing from the functions, after adding these parameters.)

The second option is to decide that the functions are only a shortcut, 
meant for use in trivial cases. In that case, two things should be done, 
IMHO: The main thing is to update the documentation, to make that clear. 
It means at least adding a prominent note in the "module contents" page, 
stating something like "these functions are here only as shortcuts; to 
access the full functionality, use compiled patterns". I think that in 
this case, the documentation should be further updated, by changing all 
the function explanations to something like "equivalent to 
re.compile(pattern, flags).match(string)", instead of the detailed 
explanations now given. The second thing that should be done even if the 
functions are considered shortcuts, is to add the "flags" parameter to 
the findall() and finditer() functions - I really can't see any reason 
why the search() and match() functions should have that parameter and 
findall() and finditer() shouldn't - they all get two arguments, pattern 
and string. Why should the optional parameter be available only for the 
older functions?

And a final note: the parameters for start and end positions are already 
available in the findall() and finditer() methods. Should this be left 
an undocumented feature? It seems to me perfectly legitimate to search 
for all the matches of a specific RE in a substring without actually 
copying all the characters of the substring to another string.

Noam

(P.S. Can you please add me to the CC of your replies? It would make it 
easier for me to reply, since I'm not a member of python-dev.)
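
The difference described above can be sketched as follows.  The example
assumes a current Python 3 re module, in which the compiled-pattern
methods take optional start and end positions while the module-level
functions take only pattern, string and flags; the sample text is made up
for illustration.

```python
import re

text = "spam eggs spam"
compiled = re.compile(r"spam")

# Compiled-pattern methods accept a start (and end) position directly.
at_ten = compiled.match(text, 10)            # match attempted at offset 10
later = compiled.search(text, 1)             # search begins at offset 1
in_prefix = compiled.findall(text, 0, 9)     # only looks at "spam eggs"

# The module-level function has no position argument, so the caller has
# to slice, which copies the substring and shifts the reported offsets.
sliced = re.match(r"spam", text[10:])

# Flags, on the other hand, are accepted by the module-level findall().
any_case = re.findall(r"SPAM", text, re.IGNORECASE)
```

Note that at_ten.start() reports an offset into the original string,
whereas sliced.start() is relative to the copied slice.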
From greg at cosc.canterbury.ac.nz  Fri Sep 10 02:58:23 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri Sep 10 02:58:28 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
Message-ID: <200409100058.i8A0wNIV002743@cosc353.cosc.canterbury.ac.nz>

PEP: 335
Title: Overloadable Boolean Operators
Version: $Revision: 1.2 $
Last-Modified: $Date: 2004/09/09 14:17:17 $
Author: Gregory Ewing <greg@cosc.canterbury.ac.nz>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Aug-2004
Python-Version: 2.4
Post-History: 05-Sep-2004


Abstract
========

This PEP proposes an extension to permit objects to define their own
meanings for the boolean operators 'and', 'or' and 'not', and suggests
an efficient strategy for implementation.  A prototype of this
implementation is available for download.


Background
==========

Python does not currently provide any '__xxx__' special methods
corresponding to the 'and', 'or' and 'not' boolean operators.  In the
case of 'and' and 'or', the most likely reason is that these operators
have short-circuiting semantics, i.e. the second operand is not
evaluated if the result can be determined from the first operand.  The
usual technique of providing special methods for these operators
therefore would not work.

There is no such difficulty in the case of 'not', however, and it
would be straightforward to provide a special method for this
operator.  The rest of this proposal will therefore concentrate mainly
on providing a way to overload 'and' and 'or'.


Motivation
==========

There are many applications in which it is natural to provide custom
meanings for Python operators, and in some of these, having boolean
operators excluded from those able to be customised can be
inconvenient.  Examples include:

1. Numeric/Numarray, in which almost all the operators are defined on
   arrays so as to perform the appropriate operation between
   corresponding elements, and return an array of the results.  For
   consistency, one would expect a boolean operation between two
   arrays to return an array of booleans, but this is not currently
   possible.

   There is a precedent for an extension of this kind: comparison
   operators were originally restricted to returning boolean results,
   and rich comparisons were added so that comparisons of Numeric
   arrays could return arrays of booleans.

2. A symbolic algebra system, in which a Python expression is
   evaluated in an environment which results in it constructing a tree
   of objects corresponding to the structure of the expression.

3. A relational database interface, in which a Python expression is
   used to construct an SQL query.

A workaround often suggested is to use the bitwise operators '&', '|'
and '~' in place of 'and', 'or' and 'not', but this has some
drawbacks.  The precedence of these is different in relation to the
other operators, and they may already be in use for other purposes (as
in example 1).  There is also the aesthetic consideration of forcing
users to use something other than the most obvious syntax for what
they are trying to express.  This would be particularly acute in the
case of example 3, considering that boolean operations are a staple of
SQL queries.
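
The precedence drawback can be seen in a small sketch.  The Col class is
a hypothetical toy query builder invented for this illustration, not part
of any real database interface.

```python
class Col:
    """Hypothetical column object that builds SQL text from operators."""

    def __init__(self, sql):
        self.sql = sql

    def __eq__(self, value):
        return Col("%s = %r" % (self.sql, value))

    def __and__(self, other):
        return Col("(%s AND %s)" % (self.sql, other.sql))


a, b = Col("a"), Col("b")

# With '&' the comparisons must be parenthesised by hand.
query = (a == 1) & (b == 2)

# Written the way one would write it with 'and', the expression parses
# as a == (1 & b) == 2, because '&' binds more tightly than '=='.
try:
    a == 1 & b == 2
except TypeError as exc:
    error = exc
else:
    error = None
```

Here the unparenthesised form fails loudly; with operand types that do
support '&' it could instead produce a wrong result without any error.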


Rationale
=========

The requirements for a successful solution to the problem of allowing
boolean operators to be customised are:

1. In the default case (where there is no customisation), the existing
   short-circuiting semantics must be preserved.

2. There must not be any appreciable loss of speed in the default
   case.

3. If possible, the customisation mechanism should allow the object to
   provide either short-circuiting or non-short-circuiting semantics,
   at its discretion.

One obvious strategy, that has been previously suggested, is to pass
into the special method the first argument and a function for
evaluating the second argument.  This would satisfy requirements 1 and
3, but not requirement 2, since it would incur the overhead of
constructing a function object and possibly a Python function call on
every boolean operation.  Therefore, it will not be considered further
here.

The following section proposes a strategy that addresses all three
requirements.  A `prototype implementation`_ of this strategy is
available for download.

.. _prototype implementation:
   http://www.cosc.canterbury.ac.nz/~greg/python/obo//Python_OBO.tar.gz


Specification
=============

Special Methods
---------------

At the Python level, objects may define the following special methods.

===============  =================  ========================
Unary            Binary, phase 1    Binary, phase 2
===============  =================  ========================
* __not__(self)  * __and1__(self)   * __and2__(self, other)
                 * __or1__(self)    * __or2__(self, other)
                                    * __rand2__(self, other)
                                    * __ror2__(self, other)
===============  =================  ========================

The __not__ method, if defined, implements the 'not' operator.  If it
is not defined, or it returns NotImplemented, existing semantics are
used.

To permit short-circuiting, processing of the 'and' and 'or' operators
is split into two phases.  Phase 1 occurs after evaluation of the first
operand but before the second.  If the first operand defines the
appropriate phase 1 method, it is called with the first operand as
argument.  If that method can determine the result without needing the
second operand, it returns the result, and further processing is
skipped.

If the phase 1 method determines that the second operand is needed, it
returns the special value NeedOtherOperand.  This triggers the
evaluation of the second operand, and the calling of an appropriate
phase 2 method. During phase 2, the __and2__/__rand2__ and
__or2__/__ror2__ method pairs work as for other binary operators.

Processing falls back to existing semantics if at any stage a relevant
special method is not found or returns NotImplemented.

As a special case, if the first operand defines a phase 2 method but
no corresponding phase 1 method, the second operand is always
evaluated and the phase 2 method called.  This allows an object which
does not want short-circuiting semantics to simply implement the
relevant phase 2 methods and ignore phase 1.
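
The two-phase protocol for 'and' can be emulated in ordinary Python as a
rough sketch.  Everything here is an assumption made for illustration:
the helper logical_and(), the use of a zero-argument callable to stand in
for the unevaluated second operand, the local NeedOtherOperand object,
and the example classes.  In particular, the sketch assumes that when the
first operand defines neither method, existing semantics apply and
__rand2__ is not consulted; the text above does not spell that case out.

```python
NeedOtherOperand = object()  # stand-in for the PEP's special value


def logical_and(first, second_thunk):
    """Emulate 'first and second'; second_thunk() evaluates the second operand."""
    phase1 = getattr(type(first), "__and1__", None)
    phase2 = getattr(type(first), "__and2__", None)

    if phase1 is None and phase2 is None:
        # No customisation on the first operand: existing semantics.
        return first and second_thunk()

    if phase1 is not None:
        result = phase1(first)
        if result is NotImplemented:
            return first and second_thunk()
        if result is not NeedOtherOperand:
            return result  # short-circuit: second operand never evaluated

    # Phase 2: the second operand is needed (or there was no phase 1 method).
    second = second_thunk()
    if phase2 is not None:
        result = phase2(first, second)
        if result is not NotImplemented:
            return result
    reflected = getattr(type(second), "__rand2__", None)
    if reflected is not None:
        result = reflected(second, first)
        if result is not NotImplemented:
            return result
    # Fall back to what 'first and second' would have produced.
    return second if first else first


class Elementwise:
    """Defines phase 2 only, so the second operand is always evaluated."""

    def __init__(self, items):
        self.items = list(items)

    def __and2__(self, other):
        return Elementwise(a and b for a, b in zip(self.items, other.items))


class Lazy:
    """Uses phase 1 to decide whether the second operand is needed."""

    def __init__(self, value):
        self.value = value

    def __and1__(self):
        return NeedOtherOperand if self.value else self

    def __and2__(self, other):
        return other


calls = []


def second():
    calls.append("evaluated")
    return Lazy(True)


both = logical_and(Elementwise([1, 0, 1]), lambda: Elementwise([1, 1, 0]))
short = logical_and(Lazy(False), second)
```

Elementwise gets non-short-circuiting behaviour by ignoring phase 1,
while Lazy(False) returns from phase 1 without second() ever being
called.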


Bytecodes
---------

The patch adds four new bytecodes, LOGICAL_AND_1, LOGICAL_AND_2,
LOGICAL_OR_1 and LOGICAL_OR_2.  As an example of their use, the
bytecode generated for an 'and' expression looks like this::

            .
            .
            .
            evaluate first operand
            LOGICAL_AND_1  L
            evaluate second operand
            LOGICAL_AND_2
       L:   .
            .
            .

The LOGICAL_AND_1 bytecode performs phase 1 processing.  If it
determines that the second operand is needed, it leaves the first
operand on the stack and continues with the following code.  Otherwise
it pops the first operand, pushes the result and branches to L.

The LOGICAL_AND_2 bytecode performs phase 2 processing, popping both
operands and pushing the result.


Type Slots
----------

At the C level, the new special methods are manifested as five new
slots in the type object.  In the patch, they are added to the
tp_as_number substructure, since this allowed making use of some
existing code for dealing with unary and binary operators.  Their
existence is signalled by a new type flag,
Py_TPFLAGS_HAVE_BOOLEAN_OVERLOAD.

The new type slots are::

    unaryfunc nb_logical_not;
    unaryfunc nb_logical_and_1;
    unaryfunc nb_logical_or_1;
    binaryfunc nb_logical_and_2;
    binaryfunc nb_logical_or_2;


Python/C API Functions
----------------------

There are also five new Python/C API functions corresponding to the
new operations::

    PyObject *PyObject_LogicalNot(PyObject *);
    PyObject *PyObject_LogicalAnd1(PyObject *);
    PyObject *PyObject_LogicalOr1(PyObject *);
    PyObject *PyObject_LogicalAnd2(PyObject *, PyObject *);
    PyObject *PyObject_LogicalOr2(PyObject *, PyObject *);


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
From fredrik at pythonware.com  Fri Sep 10 03:29:27 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Sep 10 03:27:36 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
References: <000f01c49535$9ec914c0$e841fea9@oemcomputer><413EB184.9030604@heneryd.com>
	<4140E129.1040700@myrealbox.com>
Message-ID: <chqvu3$qm9$1@sea.gmane.org>

Noam Raphael wrote:
> This is why I didn't even know, until I did my research before sending my message to python-dev, 
> that you could match from a given start position - I studied the page documenting the functions, 
> because I didn't want to bother my students at an early stage with the fact that REs are first 
> compiled and then applied, and I didn't find any mention of the start position option.

the "I didn't prepare properly, didn't know what I was talking about,
and didn't know what to answer when my students asked me a legitimate
question" argument isn't a good reason to change the language.

if you're doing Python training, make sure you know your Python.  I do,
and I very seldom have problems explaining how things work.

</F> 


From barry at python.org  Fri Sep 10 04:43:55 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 04:44:02 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <ca471dc2040908080861941ab2@mail.gmail.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com> <1094315138.8696.36.camel@geddy.wooz.org>
	<cheig3$ki8$1@sea.gmane.org> <413F1D9C.20209@egenix.com>
	<ca471dc2040908080861941ab2@mail.gmail.com>
Message-ID: <1094784234.13055.14.camel@geddy.wooz.org>

On Wed, 2004-09-08 at 11:08, Guido van Rossum wrote:
> > Templates are meant to template *text* data, so Unicode is
> > the right choice of baseclass from a design perspective.
> 
> Only in Python 3.0.
> 
> But even so, deriving from Unicode (or str) means the template class
> inherits a lot of unwanted operations.

Except that I think in general it'll just be very convenient for
Templates to /be/ unicodes.

But no matter.  It seems like if we make Template a simple class, it
will be possible for applications to mix in Template and unicode if they
want.  E.g. class UTemplate(Template, unicode).

If we go that route, then I agree we probably don't want to use
__mod__(), but I'm not too crazy about using __call__().  "Calling a
template" just seems weird to me.  Besides, extrapolating, I don't think
we need separate Template and SafeTemplate classes.  A single Template
class can have both safe and non-safe substitution methods.

So, I have working code that integrates these changes, and also uses
Tim's metaclass idea to provide a nice, easy-to-document pattern
overloading mechanism.  I chose methods substitute() and
safe_substitute() because, er, that's what they do, and those names also
don't interfere with existing str or unicode methods.
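
A minimal sketch of how the two methods differ, assuming the API as it
later shipped in the standard library's string.Template (the template
text and the variable names are made up for illustration):

```python
from string import Template

t = Template("$who owes $amount")

# substitute() is strict: a missing key raises KeyError.
try:
    t.substitute({"who": "Tim"})
except KeyError as exc:
    missing = exc.args[0]

# safe_substitute() leaves unknown placeholders in place instead.
partial = t.safe_substitute({"who": "Tim"})

# With every key present, both methods give the same result.
full = t.substitute({"who": "Tim", "amount": "5 dollars"})
```

Keeping both behaviours on one class avoids needing a separate
SafeTemplate type.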

And to make effbot and Raymond happy, it won't auto-promote to unicode
if everything's an 8bit string.

I will check this in and hopefully this will put the issue to bed. 
There will be updated unit tests, and I will update the documentation
and the PEP as appropriate -- if we've reached agreement on it.

-Barry

From barry at python.org  Fri Sep 10 04:48:56 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 04:49:00 2004
Subject: [Python-Dev] Re: Re: Alternative
	Implementation	forPEP292:SimpleString Substitutions
In-Reply-To: <413F3605.7090707@egenix.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org>
	<cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com>
	<chnc49$psm$1@sea.gmane.org>  <413F3605.7090707@egenix.com>
Message-ID: <1094784536.13113.17.camel@geddy.wooz.org>

On Wed, 2004-09-08 at 12:40, M.-A. Lemburg wrote:

> If we start to store text data in Unicode now and leave binary
> data in 8-bit strings, then the move to Unicode strings literals
> will be much smoother in P3k.

Not to mention more consistent with established alternative
implementations of the Python language based on Unicode-only runtimes.

-Barry

From barry at python.org  Fri Sep 10 04:53:33 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 04:53:38 2004
Subject: [Python-Dev] Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
In-Reply-To: <ca471dc20409081929333228b2@mail.gmail.com>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com> <cheig3$ki8$1@sea.gmane.org>
	<413F1D9C.20209@egenix.com> <ca471dc2040908080861941ab2@mail.gmail.com>
	<413F23E0.2090908@egenix.com>
	<ca471dc20409081929333228b2@mail.gmail.com>
Message-ID: <1094784813.13113.23.camel@geddy.wooz.org>

On Wed, 2004-09-08 at 22:29, Guido van Rossum wrote:

> But I thought we had plenty of time since Barry has offered to
> withdraw the PEP 292 implementation for 2.4?

Which I will still do if we cannot reach community agreement by beta1. 
But let's see how the latest proposal goes over.

-Barry

From fdrake at acm.org  Fri Sep 10 05:31:39 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri Sep 10 05:32:00 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib string.py,
	1.73, 1.74
In-Reply-To: <E1C5blS-0005wo-1L@sc8-pr-cvs1.sourceforge.net>
References: <E1C5blS-0005wo-1L@sc8-pr-cvs1.sourceforge.net>
Message-ID: <200409092331.39486.fdrake@acm.org>

On Thursday 09 September 2004 11:07 pm, bwarsaw@users.sourceforge.net wrote:
 > - Adopt Tim Peters' idea for giving Template a metaclass, which makes the
 >   delimiter, the identifier pattern, or the entire pattern easy to
 > override and document, while retaining efficiency of class-time
 > compilation of the regexp.

Good documentation would really help for this as well.  One simple and one... 
interesting example would be nice.  ;-)
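
As a rough illustration of the kind of examples asked for here, the
sketch below overrides the delimiter in one subclass and the identifier
pattern in another.  It assumes the class attributes `delimiter` and
`idpattern` as they exist in today's string.Template; the subclass names
and the dotted-name pattern are hypothetical.

```python
from string import Template

class PercentTemplate(Template):
    # Simple case: use '%' instead of '$' as the placeholder marker.
    delimiter = "%"

class DottedTemplate(Template):
    # More interesting case: allow dotted names such as $user.name.
    idpattern = r"[_a-z][_a-z0-9]*(?:\.[_a-z][_a-z0-9]*)*"

simple = PercentTemplate("%user logged in").substitute(user="fred")

dotted = DottedTemplate("$user.name <$user.email>").substitute(
    {"user.name": "Fred", "user.email": "fred@example.org"})
```

The pattern is compiled once when the subclass is created, so
instances do not pay for regexp compilation.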


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From nidoizo at yahoo.com  Fri Sep 10 05:37:39 2004
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Fri Sep 10 05:36:21 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <chqvu3$qm9$1@sea.gmane.org>
References: <000f01c49535$9ec914c0$e841fea9@oemcomputer><413EB184.9030604@heneryd.com>	<4140E129.1040700@myrealbox.com>
	<chqvu3$qm9$1@sea.gmane.org>
Message-ID: <chr7fh$79b$1@sea.gmane.org>

Fredrik Lundh wrote:

> Noam Raphael wrote:
> 
>>This is why I didn't even know, until I made my research before sending my message to python-dev, 
>>that you could match from a given start position - I studied the page documenting the functions, 
>>because I didn't want on an early stage to bother my students with the fact that REs are first 
>>compiled and then applied, and I didn't find any mention of the start position option.
> 
> the "I didn't prepare properly, didn't know what I was talking about,
> and didn't know what to answer when my students asked me a legitimate
> question" argument isn't a good reason to change the language.
> 
> if you're doing Python training, make sure you know your Python.  I do,
> and I very seldom have problems explaining how things work.

I don't know what in Noam's request justifies what I read as insults (and 
hope were not intended to be).  I think Noam's point is just that the 
function API can be considered incomplete/incoherent when compared to 
the one with pattern objects.  It's debatable and personally I always 
use pattern objects.  It basically depends on the goals of the redundant 
function API, and I have no idea what they are.

I tend to agree with Raymond.  FWIW, I think it's clearer to define the 
function API as pattern objects equivalent in functionality than as 
shortcuts for trivial cases.  However, as you pointed out, not extending
the API has the advantage of forcing a move to pattern objects.  (I also
give Python courses, but to be honest I teach regular expressions in
Perl, avoiding focusing on compilation issues.)

Regards,
Nicolas

From stephen at xemacs.org  Fri Sep 10 07:38:38 2004
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri Sep 10 07:38:46 2004
Subject: [Python-Dev] AlternativeImplementation	forPEP292:SimpleString
	Substitutions
In-Reply-To: <200409090939.41873.gmccaughan@synaptics-uk.com> (Gareth
	McCaughan's message of "Thu, 9 Sep 2004 09:39:41 +0100")
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<chnidf$epp$1@sea.gmane.org> <413F6120.7090603@egenix.com>
	<200409090939.41873.gmccaughan@synaptics-uk.com>
Message-ID: <87k6v2smxt.fsf_-_@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Gareth" == Gareth McCaughan <gmccaughan@synaptics-uk.com> writes:

    Gareth> That said, I strongly agree that all textual data should
    Gareth> be Unicode as far as the developer is concerned; but, at
    Gareth> least in the USA :-), it makes sense to have an optimized
    Gareth> representation that saves space for ASCII-only text, just
    Gareth> as we have an optimized representation for small integers.

This is _not at all_ obvious.  As MAL just pointed out, if efficiency
is a goal, text algorithms often need to be different for operations
on texts that are dense in an 8-bit character space, vs texts that are
sparse in a 16-bit or 20-bit character space.  Note that that is what
</F> is talking about too; he points to SRE and ElementTree.

When viewed from that point of view, the subtext to </F>'s comment is
"I don't want to separately maintain 8-bit versions of new text
facilities to support my non-Unicode applications, I want to impose
that burden on the authors of text-handling PEPs."  That may very well
be the best thing for Python; as </F> has done a lot of Unicode
implementation for Python, he's in a good position to make such
judgements.  But the development costs MAL refers to are bigger than
you are estimating, and will continue as long as that policy does.

While I'm very sympathetic to </F>'s view that there's more than one
way to skin a cat, and a good cat-handling design should account for
that, and conceding his expertise, none-the-less I don't think that
Python really wants to _maintain_ more than one text-processing system
by default.  Of course if you restrict yourself to the class of ASCII-
only strings, you can do better, and of course that is a huge class of
strings.  But that, as such, is important only to efficiency fanatics.

The question is, how often are people going to notice that when they
have pure ASCII they get a 100% speedup, or that they actually can
just suck that 3GB ASCII file into their 4GB memory, rather than
buffering it as 3 (or 6) 2GB Unicode strings?  Compare how often
people are going to notice that a new facility "just works" for
Japanese or Hindi.  I just don't see the former being worth the extra
effort, while the latter makes the "this or that" choice clear.  If a
single representation is enough, it had better be Unicode-based, and
the others can be supported in libraries (which turn binary blobs into
non-standard text objects with appropriate methods) as the need arises.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
From raymond.hettinger at verizon.net  Fri Sep 10 07:50:40 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri Sep 10 07:51:34 2004
Subject: [Python-Dev] Re: Alternative Implementation
	forPEP292:SimpleString Substitutions
In-Reply-To: <1094784234.13055.14.camel@geddy.wooz.org>
Message-ID: <003f01c496fa$1b2f6da0$e841fea9@oemcomputer>

[Barry]
> And to make effbot and Raymond happy, it won't auto-promote to unicode
> if everything's an 8bit string.

Glad to see that my happiness now ranks as a development objective ;-)


> There will be updated unit tests, and I will update the documentation
> and the PEP as appropriate -- if we've reached agreement on it.

+1 
Beautiful job.


Barry asked me to bring up one remaining implementation issue for
discussion on python-dev.  

The docs clearly state that only python identifiers are allowed as
placeholders:
 
    [_A-Za-z][_A-Za-z0-9]*

The challenge is that templates can be exposed to non-programmer
end-users with no reason to suspect that one letter of their alphabet is
different from another.  So, as it stands right now, there is a
usability issue with placeholder errors passing silently:

    >>> fechas = {u'hoy':u'lunes', u'mañana':u'martes'}
    >>> t = Template(u'¿Puede volver $hoy o $mañana?')
    >>> t.safe_substitute(fechas)
    u'¿Puede volver lunes o $mañana?'

The substitution failed silently (no ValueError as would have occurred
with $@ or a dangling $).  It may be especially baffling for the user
because one placeholder succeeded and the other failed without a hint of
why (he can see the key in the mapping, it just won't substitute).  No
clue is offered that the Template was looking for $ma, a partial token,
and didn't find it (the situation is even worse if it does find $ma and
substitutes an unintended value).

I suggest that the above should raise an error:

    ValueError:  Invalid token $mañana on line 1, column 24

It is easily possible to detect and report such errors (see an example
in nondist/sandbox/string/curry292.py).
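
One possible way to detect the problem is sketched below.  This is not
the code in curry292.py; the function name `check_placeholders`, the
regular expression, and the error wording are assumptions made for
illustration only.

```python
import re

# '$$' is the escape; otherwise '$' followed by an ASCII identifier.
_PLACEHOLDER = re.compile(r"\$(?:\$|([_A-Za-z][_A-Za-z0-9]*))")

def check_placeholders(text):
    """Raise ValueError if a $name runs straight into a non-ASCII
    alphanumeric character, e.g. '$ma' followed by 'ñana'."""
    for mo in _PLACEHOLDER.finditer(text):
        if mo.group(1) is None:
            continue  # '$$' escape, nothing to check
        end = mo.end()
        if end < len(text) and text[end].isalnum():
            # ASCII letters and digits were already consumed by the
            # pattern, so this character must be non-ASCII.
            stop = end
            while stop < len(text) and (text[stop].isalnum()
                                        or text[stop] == "_"):
                stop += 1
            raise ValueError("Invalid token %s at offset %d"
                             % (text[mo.start():stop], mo.start()))

try:
    check_placeholders("¿Puede volver $hoy o $mañana?")
    error = None
except ValueError as exc:
    error = str(exc)
```

A braced placeholder such as ${ma}ñana passes the check, so anyone who
really does want $ma still has a way to spell it.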

The arguments against such reporting are:
* Raymond is smoking crack.  End users will never make this mistake.
* The docs say python identifiers only.  You blew it.  Tough.  Not a
bug.
* For someone who understands exactly what they are doing, perhaps $ma
is the intended placeholder -- why force them to use braces:
${ma}ñana.


In addition to the above usability issue, there is one other nit.  The
new invocation syntax offers us the opportunity to also accept
keyword arguments as mapping alternatives:

    def substitute(self, mapping=None, **kwds):
        if mapping is None:
            mapping = kwds
        . . .

When applicable, this makes for beautiful, readable calls:

    t.substitute(who="Barry", what="mailmeister", when=now())

This would be a simple and nice enhancement to Barry's excellent
implementation.  I recommend that keyword arguments be adopted.
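
For illustration, this is how keyword arguments behave in the
string.Template that the standard library provides today (assumed here
to match the proposal; the template text is made up):

```python
from string import Template

t = Template("$who is the $what")

by_keyword = t.substitute(who="Barry", what="mailmeister")
by_mapping = t.substitute({"who": "Barry", "what": "mailmeister"})

# When both are given, keyword arguments take precedence over the mapping.
mixed = t.substitute({"who": "Barry", "what": "mailmeister"}, who="Raymond")
```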


Raymond 

From raymond.hettinger at verizon.net  Fri Sep 10 09:50:31 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri Sep 10 09:51:25 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <chr7fh$79b$1@sea.gmane.org>
Message-ID: <005801c4970a$d9227e00$e841fea9@oemcomputer>

> I tend to agree with Raymond.  FWIW, I think it's clearer to define
> the function API as pattern objects equivalent in functionality than
> as shortcuts for trivial cases

I'm down to +0 on the request.  Keeping the API stable is also
important.  And, Fred's effort to separate basic from advanced seems
reasonable.

Filling in the missing docs for existing flag, start, and stop args is a
good idea and should probably be done even if the function API changes
are rejected.  
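
For reference, a short sketch of the start and stop arguments that
compiled pattern objects accept (the sample text and variable names are
made up):

```python
import re

digits = re.compile(r"\d+")
text = "abc123def456"

at_start = digits.match(text)         # no digits at position 0
from_pos = digits.match(text, 3)      # start matching at index 3
bounded = digits.search(text, 6, 11)  # only look at indices 6..10
```

The module-level re.match() and re.search() take (pattern, string,
flags) and have no positional start or stop, which is the gap under
discussion.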


Raymond

From mal at egenix.com  Fri Sep 10 11:05:58 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri Sep 10 11:06:07 2004
Subject: [Python-Dev] PEP 328 - Relative Imports
In-Reply-To: <ca471dc204090808119668687@mail.gmail.com>
References: <413F1B87.90301@egenix.com>
	<ca471dc204090808119668687@mail.gmail.com>
Message-ID: <41416E76.8030603@egenix.com>

Guido van Rossum wrote:
>>I know that this has been discussed a few times in the past,
>>but the more I have to deal with building applications using
>>third-party libs or packages, the more I get the feeling that
>>the choice of making "import module" absolute is the wrong
>>path to follow.
>>
>>The typical scenario goes like this:
>>
>>* you build an application that uses various third-party
>>   packages and has to maintain them inside another package,
>>   e.g. ThirdPartyCode
>>
>>* you don't have access to the (third-party) package source code or
>>   it's not feasible to make changes to it for maintenance reasons
>>
>>Another common case is that you have to deal with third-party
>>code that is not properly packaged as Python package, but comes
>>as a set of top-level modules.
>>
>>In this scenario you typically put all those files into a
>>newly created Python package directory and access the modules
>>in that directory using the package name.
>>
>>In Python 2.3 and 2.4 (as well as all previous versions), both
>>scenarios can easily be implemented without having to change
>>the third-party code.
>>
>>The PEP however suggests that starting with 2.5, the interpreter
>>will issue a warning and 2.6 should default to absolute paths.
>>
>>I'd like to request that the latter change be postponed to
>>Python 3k, or that some other way of supporting the above
>>scenarios is provided that can be enabled in the application.
>>
>>Please remember that changes to application code are well
>>possible. What's not possible is making changes to the
>>packaged third-party code.
> 
> As long as it's clear that this is a compatibility requirement only I
> think it's a good idea to support this way of developing apps (even
> though I think that clever sys.path manipulation can probably get
> around it, it's not worth breaking existing approaches). All new apps
> should however use relative imports to reference their own code, so
> the problem won't be repeated in the future.

I have my doubts that this is going to happen.

People are more likely going to make all imports absolute (like
they already do in Java and other languages) - which
is good, since it makes reading code much easier and allows for
writing packages which are compatible with older Python versions,
but it also prevents developing applications using the above
approach.

I also don't think that extension writers will care enough to
make their packages fully relocatable by using relative
imports all over - these are hard to read and don't buy
the developer of the extension anything.

Anyway, what should the strategy for the PEP look like ?

1. postpone the defaulting to absolute until P3k

2. provide a way to customize the behaviour using
    e.g. a sys function

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 10 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From jim at zope.com  Fri Sep 10 13:46:58 2004
From: jim at zope.com (Jim Fulton)
Date: Fri Sep 10 13:47:02 2004
Subject: [Python-Dev] Re: PEP 328 - Relative Imports
In-Reply-To: <41416E76.8030603@egenix.com>
References: <413F1B87.90301@egenix.com>	<ca471dc204090808119668687@mail.gmail.com>
	<41416E76.8030603@egenix.com>
Message-ID: <41419432.2000600@zope.com>

M.-A. Lemburg wrote:
> Guido van Rossum wrote:
> 

...

>> As long as it's clear that this is a compatibility requirement only I
>> think it's a good idea to support this way of developing apps (even
>> though I think that clever sys.path manipulation can probably get
>> around it, it's not worth breaking existing approaches). All new apps
>> should however use relative imports to reference their own code, so
>> the problem won't be repeated in the future.
> 
> 
> I have my doubts that this is going to happen.
> 
> People are more likely going to make all imports absolute (like
> they already do in Java and other languages) - which
> is good, since it makes reading code much easier and allows for
> writing packages which are compatible with older Python versions,
> but it also prevents developing applications using the above
> approach.
> 
> I also don't think that extension writers will care enough to
> make their packages fully relocatable by using relative
> imports all over - these are hard to read and don't buy
> the developer of the extension anything.

I find explicit relative imports easier to read, as it
reduces the noise level.

I like the fact that local imports look different from non-local ones.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From gmccaughan at synaptics-uk.com  Fri Sep 10 13:57:13 2004
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Fri Sep 10 13:57:46 2004
Subject: [Python-Dev] AlternativeImplementation	forPEP292:SimpleString
	Substitutions
In-Reply-To: <87k6v2smxt.fsf_-_@tleepslib.sk.tsukuba.ac.jp>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<200409090939.41873.gmccaughan@synaptics-uk.com>
	<87k6v2smxt.fsf_-_@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <200409101257.13802.gmccaughan@synaptics-uk.com>

On Friday 2004-09-10 06:38, Stephen J. Turnbull wrote:
> >>>>> "Gareth" == Gareth McCaughan <gmccaughan@synaptics-uk.com> writes:
> 
>     Gareth> That said, I strongly agree that all textual data should
>     Gareth> be Unicode as far as the developer is concerned; but, at
>     Gareth> least in the USA :-), it makes sense to have an optimized
>     Gareth> representation that saves space for ASCII-only text, just
>     Gareth> as we have an optimized representation for small integers.
> 
> This is _not at all_ obvious.  As MAL just pointed out, if efficiency
> is a goal, text algorithms often need to be different for operations
> on texts that are dense in an 8-bit character space, vs texts that are
> sparse in a 16-bit or 20-bit character space.  Note that that is what
> </F> is talking about too; he points to SRE and ElementTree.

I hope you aren't expecting me to disagree.

> When viewed from that point of view, the subtext to </F>'s comment is
> "I don't want to separately maintain 8-bit versions of new text
> facilities to support my non-Unicode applications, I want to impose
> that burden on the authors of text-handling PEPs."  That may very well
> be the best thing for Python; as </F> has done a lot of Unicode
> implementation for Python, he's in a good position to make such
> judgements.  But the development costs MAL refers to are bigger than
> you are estimating, and will continue as long as that policy does.

How do you know what I am estimating?

> While I'm very sympathetic to </F>'s view that there's more than one
> way to skin a cat, and a good cat-handling design should account for
> that, and conceding his expertise, none-the-less I don't think that
> Python really wants to _maintain_ more than one text-processing system
> by default.  Of course if you restrict yourself to the class of ASCII-
> only strings, you can do better, and of course that is a huge class of
> strings.  But that, as such, is important only to efficiency fanatics.

No, it's important to ... well, people to whom efficiency
matters. There's no need for them to be fanatics.

> The question is, how often are people going to notice that when they
> have pure ASCII they get a 100% speedup, or that they actually can
> just suck that 3GB ASCII file into their 4GB memory, rather than
> buffering it as 3 (or 6) 2GB Unicode strings?  Compare how often
> people are going to notice that a new facility "just works" for
> Japanese or Hindi.

Why is that the question, rather than "how often are people
going to benefit from getting a 100% speedup when they have
pure ASCII"? Or even "how often are people going to try out
Python on an application that uses pure-ASCII strings, and
decide to use some other language that seems to do the job
much faster"?

>                    I just don't see the former being worth the extra
> effort, while the latter makes the "this or that" choice clear.  If a
> single representation is enough, it had better be Unicode-based, and
> the others can be supported in libraries (which turn binary blobs into
> non-standard text objects with appropriate methods) as the need arises.

No question that if a single representation is enough then it
had better be Unicode.

-- 
g

From andrew at andreweland.org  Fri Sep 10 13:45:25 2004
From: andrew at andreweland.org (Andrew Eland)
Date: Fri Sep 10 13:58:10 2004
Subject: [Python-Dev] Adding status code constants to httplib
Message-ID: <414193D5.6010405@andreweland.org>

Hi,

Over in web-sig, we're discussing PEP 333, the Web Server Gateway 
Interface. Rather than defining our own set of constants for the HTTP 
status code integers, we thought it would be a good idea to add them to 
httplib, allowing other applications to benefit. I've uploaded a 
patch[1] to httplib.py and the corresponding documentation. Do people 
think this is a good idea?

   -- Andrew Eland (http://www.andreweland.org)

[1] 
http://sourceforge.net/tracker/index.php?func=detail&aid=1025790&group_id=5470&atid=305470
From skip at pobox.com  Fri Sep 10 15:57:12 2004
From: skip at pobox.com (Skip Montanaro)
Date: Fri Sep 10 15:57:21 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <005801c4970a$d9227e00$e841fea9@oemcomputer>
References: <chr7fh$79b$1@sea.gmane.org>
	<005801c4970a$d9227e00$e841fea9@oemcomputer>
Message-ID: <16705.45752.540765.442498@montanaro.dyndns.org>


    Raymond> I'm down to +0 on the request.  Keeping the API stable is also
    Raymond> important.  And, Fred's effort to separate basic from advanced
    Raymond> seems reasonable.

Adding my two cents, I'm -1 on the idea.  I view re.match() and friends as
convenience functions.  There's no reason to provide all the functionality
of the slightly lower-level re.compile().  If we were to do that, I'd
propose (facetiously) that we deprecate re.compile() as well.

Skip
From fdrake at acm.org  Fri Sep 10 16:14:52 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri Sep 10 16:15:06 2004
Subject: [Python-Dev] Adding status code constants to httplib
In-Reply-To: <414193D5.6010405@andreweland.org>
References: <414193D5.6010405@andreweland.org>
Message-ID: <200409101014.52091.fdrake@acm.org>

On Friday 10 September 2004 07:45 am, Andrew Eland wrote:
 > Over in web-sig, we're discussing PEP 333, the Web Server Gateway
 > Interface. Rather than defining our own set of constants for the HTTP
 > status code integers, we thought it would be a good idea to add them to
 > httplib,

+1

Some of us really don't remember what all the numeric codes mean, especially 
the ones we don't see often.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From aahz at pythoncraft.com  Fri Sep 10 16:26:09 2004
From: aahz at pythoncraft.com (Aahz)
Date: Fri Sep 10 16:26:17 2004
Subject: [Python-Dev] PEP292 vs Unicode
In-Reply-To: <87k6v2smxt.fsf_-_@tleepslib.sk.tsukuba.ac.jp>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<chnidf$epp$1@sea.gmane.org> <413F6120.7090603@egenix.com>
	<200409090939.41873.gmccaughan@synaptics-uk.com>
	<87k6v2smxt.fsf_-_@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <20040910142609.GA15723@panix.com>

On Fri, Sep 10, 2004, Stephen J. Turnbull wrote:
>
> While I'm very sympathetic to </F>'s view that there's more than one
> way to skin a cat, and a good cat-handling design should account for
> that, and conceding his expertise, none-the-less I don't think that
> Python really wants to _maintain_ more than one text-processing system
> by default.  Of course if you restrict yourself to the class of ASCII-
> only strings, you can do better, and of course that is a huge class of
> strings.  But that, as such, is important only to efficiency fanatics.

That's a good point, and that's what Python is moving toward.  The thing
is, we currently have two text processing systems, and there's no reason
(given Python's dynamic dispatch capabilities) to treat one of them as
second-class for this issue.  It's particularly onerous in this instance
because Unicode is unfortunately second-class in a number of respects,
and doing what is in some respects a silent switch here would be
needlessly confusing and irritating for users.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From pje at telecommunity.com  Fri Sep 10 17:01:08 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Sep 10 17:01:37 2004
Subject: [Python-Dev] Adding status code constants to httplib
In-Reply-To: <414193D5.6010405@andreweland.org>
Message-ID: <5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>

At 12:45 PM 9/10/04 +0100, Andrew Eland wrote:
>Over in web-sig, we're discussing PEP 333, the Web Server Gateway 
>Interface. Rather than defining our own set of constants for the HTTP 
>status code integers, we thought it would be a good idea to add them to 
>httplib, allowing other applications to benefit. I've uploaded a patch[1] 
>to httplib.py and the corresponding documentation. Do people think this is 
>a good idea?

I would also put the statuses in a dictionary, such that:

     status_code[BAD_GATEWAY] = "Bad Gateway"

This could be accomplished via something like:

     status_code = dict([
        (val, key.replace('_',' ').title())
            for key,val in globals().items()
                if key==key.upper() and not key.startswith('HTTP')
                    and not key.startswith('_')
     ])
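
A self-contained sketch of the same idea, run against a small
hypothetical set of constants rather than a module's globals (the
`constants` dict and its entries are assumptions, not taken from the
patch):

```python
# Hypothetical subset of the proposed constants; the patch defines more.
constants = {
    "OK": 200,
    "NOT_FOUND": 404,
    "BAD_GATEWAY": 502,
    "HTTP_PORT": 80,  # not a status code, so it must be filtered out
}

# Derive "Bad Gateway" from BAD_GATEWAY, and so on.
status_code = dict(
    (val, key.replace("_", " ").title())
    for key, val in constants.items()
    if key == key.upper()
    and not key.startswith("HTTP")
    and not key.startswith("_")
)
```

Note that title-casing turns OK into "Ok", so deriving the messages
from the names does not reproduce every standard reason phrase; an
explicit table avoids that.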

From barry at python.org  Fri Sep 10 17:04:32 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 17:04:37 2004
Subject: [Python-Dev] Re: PEP 328 - Relative Imports
In-Reply-To: <41419432.2000600@zope.com>
References: <413F1B87.90301@egenix.com>
	<ca471dc204090808119668687@mail.gmail.com>
	<41416E76.8030603@egenix.com> <41419432.2000600@zope.com>
Message-ID: <1094828671.30837.23.camel@geddy.wooz.org>

On Fri, 2004-09-10 at 07:46, Jim Fulton wrote:

> I find explicit relative imports easier to read, as it
> reduces the noise level.
> 
> I like the fact that local imports look different from non-local ones.

Yes, +1.  The most important thing IMO is that there be an explicit way
to spell whatever the default isn't.  I was just grumbling the other day
because I had to rename a submodule foologging.py instead of the more
natural logging.py because that module suddenly wanted to start
importing the global logging package.
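
To make the logging.py case concrete, here is a sketch that builds a
throwaway package on disk and imports it.  It assumes a Python where
absolute imports are the default and the PEP's leading-dot syntax is
available; the package name `demo_pkg` and its contents are
hypothetical.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

def write(name, body):
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(body)

write("__init__.py", "")
# A submodule that shares its name with a standard library package.
write("logging.py", "WHO = 'package-local logging'\n")
write("main.py",
      "import logging                  # absolute: the global package\n"
      "from . import logging as local  # explicit relative: the sibling\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
main = importlib.import_module("demo_pkg.main")

global_name = main.logging.__name__
local_name = main.local.__name__
```

With both spellings available, the submodule can keep the natural name
logging.py and still reach the global package.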

-Barry

From andrew at andreweland.org  Fri Sep 10 17:12:02 2004
From: andrew at andreweland.org (Andrew Eland)
Date: Fri Sep 10 17:24:53 2004
Subject: [Python-Dev] Adding status code constants to httplib
In-Reply-To: <5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>
References: <5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>
Message-ID: <4141C442.8050005@andreweland.org>

Phillip J. Eby wrote:

> I would also put the statuses in a dictionary, such that:
> 
>     status_code[BAD_GATEWAY] = "Bad Gateway"

There's a table mapping status codes to messages on 
BaseHTTPRequestHandler at the moment. It could be moved into httplib to 
make it more publicly visible.

   -- Andrew
From andrew at andreweland.org  Fri Sep 10 17:46:44 2004
From: andrew at andreweland.org (Andrew Eland)
Date: Fri Sep 10 17:59:36 2004
Subject: [Web-SIG] Re: [Python-Dev] Adding status code constants to httplib
In-Reply-To: <4141CC1F.4000207@xhaus.com>
References: <5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>	<4141C442.8050005@andreweland.org>
	<4141CC1F.4000207@xhaus.com>
Message-ID: <4141CC64.2090205@andreweland.org>

Alan Kennedy wrote:


> And that mapping has 2 levels of human readable messages on it, for example
> 304: ('Not modified', 'Document has not changed singe given time'),
> I think that, since the human readable versions are seldom heeded 
> anyway, perhaps a single message is all we need?

A simple move would mean we'd have to keep both, for backwards 
compatibility. I guess BaseHTTPRequestHandler could mix its long 
messages in with those in a httplib table, but it sounds ugly.
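
For reference, a sketch of what the two-level table looks like and how a single-message view could be derived from it. This assumes current Python 3, where the handler class lives in http.server instead of the BaseHTTPServer module discussed here; the name short_only is made up for the example.

```python
from http.server import BaseHTTPRequestHandler

# responses maps each status code to a (short message, long message) pair.
short_message, long_message = BaseHTTPRequestHandler.responses[304]

# One possible single-message table, keeping only the short form.
short_only = {
    int(code): messages[0]
    for code, messages in BaseHTTPRequestHandler.responses.items()
}
```

Keeping the two-level table where it is and deriving the short one avoids having to maintain the messages in two places.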

> And I'm -1 on forcing servers, particularly CGI servers, to import the 
> client-side httplib (2.3 httplib.pyc == 42K) just to get this mapping.

I think the number of people who wouldn't import httplib on 
speed/process size grounds is very small. If they're that worried about 
efficiency, they could copy and paste the table, and manage the extra 
development complexity.

   -- Andrew

From pje at telecommunity.com  Fri Sep 10 18:08:37 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Sep 10 18:09:09 2004
Subject: [Python-Dev] Adding status code constants to httplib
In-Reply-To: <4141C442.8050005@andreweland.org>
References: <5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>
	<5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040910120714.032b5b80@mail.telecommunity.com>

At 04:12 PM 9/10/04 +0100, Andrew Eland wrote:
>Phillip J. Eby wrote:
>
>>I would also put the statuses in a dictionary, such that:
>>     status_code[BAD_GATEWAY] = "Bad Gateway"
>
>There's a table mapping status codes to messages on BaseHTTPRequestHandler 
>at the moment. It could be moved into httplib to make it more publicly 
>visible.


It doesn't appear to include HTTP/1.1 status codes.


From tim.hochberg at ieee.org  Fri Sep 10 18:33:09 2004
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Fri Sep 10 18:33:20 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <200409100058.i8A0wNIV002743@cosc353.cosc.canterbury.ac.nz>
References: <200409100058.i8A0wNIV002743@cosc353.cosc.canterbury.ac.nz>
Message-ID: <4141D745.805@ieee.org>

Greg Ewing wrote:

[SNIP]

Just a couple of quick comments on the motivation:


> Motivation
> ==========
> 
> There are many applications in which it is natural to provide custom
> meanings for Python operators, and in some of these, having boolean
> operators excluded from those able to be customised can be
> inconvenient.  Examples include:
> 
> 1. Numeric/Numarray, in which almost all the operators are defined on
>    arrays so as to perform the appropriate operation between
>    corresponding elements, and return an array of the results.  For
>    consistency, one would expect a boolean operation between two
>    arrays to return an array of booleans, but this is not currently
>    possible.
> 
>    There is a precedent for an extension of this kind: comparison
>    operators were originally restricted to returning boolean results,
>    and rich comparisons were added so that comparisons of Numeric
>    arrays could return arrays of booleans.

For Numeric/Numarray, I think and1/or1 would be unnecessary. If that 
were true in general it would simplify the proposal significantly: 
and2/or2 could be renamed to and/or and and1/or1 could be dropped.


> 2. A symbolic algebra system, in which a Python expression is
>    evaluated in an environment which results in it constructing a tree
>    of objects corresponding to the structure of the expression.
> 
> 3. A relational database interface, in which a Python expression is
>    used to construct an SQL query.

I would be interested in seeing use cases for either or both of these 
last two examples that show how and1/or1 are useful.


Regards,

-tim

From mal at egenix.com  Fri Sep 10 19:05:32 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri Sep 10 19:05:36 2004
Subject: [Python-Dev] Re: PEP 328 - Relative Imports
In-Reply-To: <1094828671.30837.23.camel@geddy.wooz.org>
References: <413F1B87.90301@egenix.com><41416E76.8030603@egenix.com>	<41419432.2000600@zope.com>
	<1094828671.30837.23.camel@geddy.wooz.org>
Message-ID: <4141DEDC.8080503@egenix.com>

Barry Warsaw wrote:
> On Fri, 2004-09-10 at 07:46, Jim Fulton wrote:
> 
> 
>>I find explicit relative imports easier to read, as it
>>reduces the noise level.
>>
>>I like the fact that local imports look different from non-local ones.
> 
> 
> Yes, +1.  The most important thing IMO is that there be an explicit way
> to spell whatever the default isn't.  I was just grumbling the other day
> because I had to rename a submodule foologging.py instead of the more
> natural logging.py because that module suddenly wanted to start
> importing the global logging package.

If that's the only reason, then placing the whole Python standard
lib under a new top-level package name would be the better
solution, starting with P3k.

I wasn't suggesting not to have relative imports. It is just
that most third-party packages nowadays rely on the current
import lookup mechanism (first local, then global). All of these
would break the day absolute imports become the default.

Whether or not relative imports look right is probably more a question of
taste than anything else... I find getting the number of dots right just
as hard as getting the number of '../' right in a relative
path name.

But back to the original question: should absolute imports be
made a P3k feature or will we have a sys.setimportscheme()
hook to tune the setting on a per application basis ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 10 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From nidoizo at yahoo.com  Fri Sep 10 19:18:47 2004
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Fri Sep 10 19:18:47 2004
Subject: [Python-Dev] Re: PEP 328 - Relative Imports
In-Reply-To: <4141DEDC.8080503@egenix.com>
References: <413F1B87.90301@egenix.com><41416E76.8030603@egenix.com>	<41419432.2000600@zope.com>	<1094828671.30837.23.camel@geddy.wooz.org>
	<4141DEDC.8080503@egenix.com>
Message-ID: <chsnli$l3a$1@sea.gmane.org>

M.-A. Lemburg wrote:
> I wasn't suggesting not to have relative imports. It is just
> that most third-party packages nowadays rely on the current
> import lookup mechanism (first local, then global). All of these
> would break the day absolute imports become the default.

Don't you think that they have enough time to adapt?  We're talking 
about 2.6 for the final step and 2.4 is not even released.  If a 
third-party package doesn't adapt, I'm sure it's possible to have a 
workaround, but isn't that a different issue?  Don't forget that you can 
make your imports in if/else blocks on version.

Regards,
Nicolas

From mcherm at mcherm.com  Fri Sep 10 19:21:25 2004
From: mcherm at mcherm.com (Michael Chermside)
Date: Fri Sep 10 19:19:56 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
Message-ID: <1094836885.4141e2953f7df@mcherm.com>

Fredrik Lundh writes:
> the "I didn't prepare properly, didn't know what I was talking about,
> and didn't know what to answer when my students asked me a legitimate
> question" argument isn't a good reason to change the language.
>
> if you're doing Python training, make sure you know your
> Python.  I do,
> and I very seldom have problems explaining how things work.

Fredrik, a less hostile response would be appropriate here. No one
knows every detail of every API of any reasonably sized library
(like Python's). Students ask questions about the darndest things.
If you have never been stumped by a student's question then you're
not teaching the right people.

My opinion on the underlying question is this: We have two ways
of doing things: using compiled REs, and using the RE functions.
Our goal is to make Python's API be so simple and easy to understand
that people DON'T have to memorize every little detail -- it should
be "obvious". That is, in my opinion, the strongest reason in favor
of minimal APIs.

Right now, there are some things you can do with the RE functions
and a DIFFERENT set of things you can do with the compiled REs.
That's TWO sets of functionality to learn. If Noam's patch can
make the feature set of the RE functions the SAME as the feature
set of the compiled REs, then there's only ONE set of features to
memorize. On the whole, there are MORE individual "pieces" to the
API but because of orthogonality the API as a whole is simpler.
Therefore in this case I favor using Noam's patch.
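
To make the "two sets of functionality" concrete, here is a small sketch against the re module as it exists in current Python 3 (the details in 2.3/2.4 may differ). The sample string is made up.

```python
import re

line = "Parrot sketch, parrot sketch"

# Module-level function: the pattern, the string, then optional flags.
first = re.search(r"parrot", line, re.IGNORECASE)

# Compiled pattern: flags go to compile(); search() instead accepts pos
# and endpos, which the module-level function has no parameter for.
pattern = re.compile(r"parrot", re.IGNORECASE)
second = pattern.search(line, 1)

first_start = first.start()
second_start = second.start()
```

Starting the second search at position 1 skips the leading "Parrot", so it finds the later occurrence; the function form has no direct way to ask for that.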

My next-favorite alternative would actually be to remove the
RE functions so there's "only one way to do it". But the functions
are conceptually simpler (as Noam showed, the docs describe the
functions then say the compiled REs work "just the same"), and
they've been in place for years... removing them is not an option.

-- Michael Chermside

From amk at amk.ca  Fri Sep 10 19:42:47 2004
From: amk at amk.ca (A.M. Kuchling)
Date: Fri Sep 10 19:43:13 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <1094836885.4141e2953f7df@mcherm.com>
References: <1094836885.4141e2953f7df@mcherm.com>
Message-ID: <20040910174247.GA17451@rogue.amk.ca>

On Fri, Sep 10, 2004 at 10:21:25AM -0700, Michael Chermside wrote:
> (as Noam showed, the docs describe the
> functions then say the compiled REs work "just the same"),

This fact is just a historical accident, because I initially wrote the
docs for the re module starting with the functions and then moving on
to the methods.  I can restructure the docs to make regex objects
paramount.  (The Regex HOWTO takes this approach; the module-level
functions are mentioned in only one section, and not used outside of
that section.)

--amk

From michel at dialnetwork.com  Thu Sep  9 09:46:24 2004
From: michel at dialnetwork.com (Michel Pelletier)
Date: Fri Sep 10 19:56:43 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
Message-ID: <1094715983.1472.7.camel@debbie>


> Message: 4
> Date: Fri, 10 Sep 2004 12:58:23 +1200
> From: Greg Ewing <greg@cosc.canterbury.ac.nz>
> Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
> To: python-dev@python.org
> Message-ID:
>         <200409100058.i8A0wNIV002743@cosc353.cosc.canterbury.ac.nz>
> 
> 
> Python does not currently provide any '__xxx__' special methods
> corresponding to the 'and', 'or' and 'not' boolean operators.  

I like the PEP with 'and' and 'or', but isn't the 'not' special method
essentially the inverse of __nonzero__?

-Michel

From pje at telecommunity.com  Fri Sep 10 20:18:08 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Sep 10 20:18:45 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <1094715983.1472.7.camel@debbie>
Message-ID: <5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>

At 12:46 AM 9/9/04 -0700, Michel Pelletier wrote:

> > Message: 4
> > Date: Fri, 10 Sep 2004 12:58:23 +1200
> > From: Greg Ewing <greg@cosc.canterbury.ac.nz>
> > Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
> > To: python-dev@python.org
> > Message-ID:
> >         <200409100058.i8A0wNIV002743@cosc353.cosc.canterbury.ac.nz>
> >
> >
> > Python does not currently provide any '__xxx__' special methods
> > corresponding to the 'and', 'or' and 'not' boolean operators.
>
>I like the PEP with 'and' and 'or', but isn't the 'not' special method
>essentially the inverse of __nonzero__?

There isn't such a method currently.  Also, note that the expression 'not 
x' is currently guaranteed to return a boolean value.  The purpose of the 
PEP is to allow 'not x' to potentially return an arbitrary object, as for 
use in algebraic and query systems that want to use Python code as their 
syntax.  Such systems currently use e.g. '~x' instead of 'not x' because 
the former allows return of arbitrary objects.
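
A minimal sketch of that workaround, using a hypothetical Expr class that only records the text of the expression it represents. The bitwise operators can return arbitrary objects, while 'not' cannot.

```python
class Expr:
    """Hypothetical node in a tiny query/algebra expression tree."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Expr("(%s AND %s)" % (self.text, other.text))

    def __or__(self, other):
        return Expr("(%s OR %s)" % (self.text, other.text))

    def __invert__(self):
        return Expr("(NOT %s)" % self.text)


a, b = Expr("a"), Expr("b")
built = (~a & b) | a     # the operators return Expr objects, so a tree is built
collapsed = not a        # always a plain bool, so the structure is lost
```

Here built.text is "(((NOT a) AND b) OR a)", whereas collapsed is just False because an Expr instance is truthy by default.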

IMO, the algebraic/query use cases would be better served by some sort of 
"code literal" or "AST literal" syntax, rather than adding more special 
methods.  The reason is that all too often you want to include "normal" 
Python values in such an expression, but still manipulate them 
symbolically, or have some other sort of special treatment.  A literal 
syntax for Python expressions is more useful for this, which is why I've 
moved to using strings and the parser module to accomplish such 
processing.  At that level, boolean operator methods are moot.

(Code literals would be useful primarily in the ability to have them parsed 
and syntax checked at import time, rather than waiting until runtime.  This 
consideration also applies to PEP 335, but PEP 335 may consume all of its 
compilation performance gains by losing runtime performance at all boolean 
operation sites.)

But anyway, I digress.  Since PEP 335 doesn't significantly help (IMO) with 
algebraic and query systems, that leaves the numeric use cases, which I 
don't have enough experience to comment on.

From barry at python.org  Fri Sep 10 20:32:30 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 20:32:35 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib string.py,
	1.74, 1.75
In-Reply-To: <E1C5emk-0004lS-8g@sc8-pr-cvs1.sourceforge.net>
References: <E1C5emk-0004lS-8g@sc8-pr-cvs1.sourceforge.net>
Message-ID: <1094841150.30836.38.camel@geddy.wooz.org>

On Fri, 2004-09-10 at 02:21, rhettinger@users.sourceforge.net wrote:
> Update of /cvsroot/python/python/dist/src/Lib
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18307
> 
> Modified Files:
> 	string.py 
> Log Message:
> __slots__ went missing from Template.

On purpose though.  With __slots__ you can't mix in Template and
unicode.  I don't see any reason to limit the attributes of a Template
instance, so I backed this out.
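
A small sketch of the conflict, assuming current Python 3, where str plays the role unicode has in this thread. The class names are hypothetical stand-ins and not the real Template.

```python
class Slotted:
    """Stand-in for a Template that declares a non-empty __slots__."""
    __slots__ = ("template",)


class Plain:
    """Stand-in for a Template without __slots__."""


def can_mix_with_str(base):
    """Return True if a class deriving from both base and str can be created."""
    try:
        type("Mixed", (base, str), {})
    except TypeError:
        # CPython reports an instance layout conflict between the bases.
        return False
    return True


slotted_ok = can_mix_with_str(Slotted)
plain_ok = can_mix_with_str(Plain)
```

Only the class without __slots__ can be combined with the string type, which is the reason given above for leaving __slots__ out.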

-Barry

From barry at python.org  Fri Sep 10 20:38:14 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 20:38:18 2004
Subject: [Python-Dev] Re: Alternative Implementation
	forPEP292:SimpleString Substitutions
In-Reply-To: <003f01c496fa$1b2f6da0$e841fea9@oemcomputer>
References: <003f01c496fa$1b2f6da0$e841fea9@oemcomputer>
Message-ID: <1094841494.30829.45.camel@geddy.wooz.org>

On Fri, 2004-09-10 at 01:50, Raymond Hettinger wrote:
> [Barry]
> > And to make effbot and Raymond happy, it won't auto-promote to unicode
> > if everything's an 8bit string.
> 
> Glad to see that my happiness now ranks as a development objective ;-)

Well, if I want to get other work done... :)

> > There will be updated unit tests, and I will update the documentation
> > and the PEP as appropriate -- if we've reached agreement on it.
> 
> +1 
> Beautiful job.

Cool!

> The arguments against such reporting are:
> * Raymond is smoking crack.  End users will never make this mistake.
> * The docs say python identifiers only.  You blew it.  Tough.  Not a
> bug.
> * For someone who understands exactly what they are doing, perhaps $ma
> is the intended placeholder -- why force them to use braces:
> ${ma}ñana.

It also makes it more difficult to document.  IOW, right now the PEP and
the documentation say that the first non-identifier character terminates
the placeholder.  How would you word the rules with your change?

> In addition to the above usability issue, there is one other nit.  The
> new invocation syntax offers us the opportunity to also accept
> keyword arguments as mapping alternatives:
> 
>     def substitute(self, mapping=None, **kwds):
>         if mapping is None:
>             mapping = kwds
>      . . .
> 
> When applicable, this makes for beautiful, readable calls:
> 
>     t.substitute(who="Barry", what="mailmeister", when=now())
> 
> This would be a simple and nice enhancement to Barry's excellent
> implementation.  I recommend that keyword arguments be adopted.

My only problem with that is the interference that the 'mapping'
argument presents.  IOW, kwds can't contain 'mapping'.  We could solve
that in a couple of ways:

1. ignore the problem and tell people not to do that
2. change 'mapping' to something less likely to collide, such as
'_mapping' or '__mapping__', and then see #1.
3. get rid of the mapping altogether and only have kwds.  This would
change the non-keyword invocation from

mytemplate.substitute(mymapping)

to

mytemplate.substitute(**mymapping)

A bit uglier and harder to document.

Note that there's also a potential collision on 'self'.
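
One further way around both collisions, sketched under the assumption of Python 3.8 or later: positional-only parameters, which did not exist when this thread was written. MiniTemplate is a hypothetical stand-in with a deliberately simplified pattern, not the real string.Template.

```python
import re


class MiniTemplate:
    """Hypothetical stand-in for Template, handling only $identifier."""

    _pattern = re.compile(r"\$([_a-zA-Z][_a-zA-Z0-9]*)")

    def __init__(self, template):
        self.template = template

    def substitute(self, mapping=None, /, **kwds):
        # The "/" makes self and mapping positional-only, so keyword names
        # "self" and "mapping" land in kwds without colliding.
        values = dict(mapping or {})
        values.update(kwds)  # keywords override entries from the mapping
        return self._pattern.sub(
            lambda match: str(values[match.group(1)]), self.template
        )


t = MiniTemplate("$self told $mapping about $what")
result = t.substitute({"what": "templates"}, self="Barry", mapping="Raymond")
```

The call keeps the plain mytemplate.substitute(mymapping) form and still lets placeholders be named 'self' or 'mapping'.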

-Barry

From barry at python.org  Fri Sep 10 20:41:24 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 20:41:28 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib
	string.py, 1.73, 1.74
In-Reply-To: <200409092331.39486.fdrake@acm.org>
References: <E1C5blS-0005wo-1L@sc8-pr-cvs1.sourceforge.net>
	<200409092331.39486.fdrake@acm.org>
Message-ID: <1094841684.30831.50.camel@geddy.wooz.org>

On Thu, 2004-09-09 at 23:31, Fred L. Drake, Jr. wrote:
> On Thursday 09 September 2004 11:07 pm, bwarsaw@users.sourceforge.net wrote:
>  > - Adopt Tim Peters' idea for giving Template a metaclass, which makes the
>  >   delimiter, the identifier pattern, or the entire pattern easy to
>  > override and document, while retaining efficiency of class-time
>  > compilation of the regexp.
> 
> Good documentation would really help for this as well.  One simple and one... 
> interesting example would be nice.  ;-)

Yep.  I'm definitely planning on updating the docs.  I'll make sure to
include some examples.  After re-organizing libstring.tex, there's
plenty of room to do so without increasing the clutter.

-Barry

From barry at python.org  Fri Sep 10 20:47:56 2004
From: barry at python.org (Barry Warsaw)
Date: Fri Sep 10 20:48:03 2004
Subject: [Python-Dev] Re: PEP 328 - Relative Imports
In-Reply-To: <4141DEDC.8080503@egenix.com>
References: <413F1B87.90301@egenix.com><41416E76.8030603@egenix.com>
	<41419432.2000600@zope.com> <1094828671.30837.23.camel@geddy.wooz.org>
	<4141DEDC.8080503@egenix.com>
Message-ID: <1094842075.30831.55.camel@geddy.wooz.org>

On Fri, 2004-09-10 at 13:05, M.-A. Lemburg wrote:

> If that's the only reason, then placing the whole Python standard
> lib under a new top-level package name would be the better
> solution, starting with P3k.

One of my earliest suggestions on the topic did just that.  In fact, you
could do it in a backward compatible way, by introducing an optional
global package.  E.g.

import logging

That would import the local logging.py module if it existed, otherwise
it would import the global logging module.  This is exactly what Python
does today.

from __global__ import logging

That would always import the global logging package.  __global__ is the
optional "fake" global package and would only be used when you want to
explicitly skip any local imports.

IIRC though, Guido never liked this proposal much.  I repost it here on
the off chance that he's way too busy to read every message in this
thread <wink>.

-Barry

From edloper at gradient.cis.upenn.edu  Fri Sep 10 21:59:31 2004
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Fri Sep 10 21:59:44 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <20040910183258.A1DCE1E4009@bag.python.org>
References: <20040910183258.A1DCE1E4009@bag.python.org>
Message-ID: <EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu>

> Right now, there are some things you can do with the RE functions
> and a DIFFERENT set of things you can do with the compiled REs.
> That's TWO sets of functionality to learn. If Noam's patch can
> make the feature set of the RE functions the SAME as the feature
> set of the compiled REs, then there's only ONE set of features to
> memorize. On the whole, there are MORE individual "pieces" to the
> API but because of orthogonality the API as a whole is simpler.
> Therefore in this case I favor using Noam's patch.

+1.  Consistency makes the API conceptually simpler, even if the 
absolute number of parameters is larger.

-Edward

From martin at v.loewis.de  Fri Sep 10 22:59:06 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri Sep 10 22:58:57 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
References: <5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
Message-ID: <4142159A.7030309@v.loewis.de>

Phillip J. Eby wrote:
>> I like the PEP with 'and' and 'or', but isn't the 'not' special method
>> essentially the inverse of __nonzero__?
> 
> 
> There isn't such a method currently.  

Did you mean to say that there is currently no method named __nonzero__?
This is not true:

 >>> class X:
...   def __nonzero__(self):
...     print "Called"
...     return 13
...
 >>> not X()
Called
False

Regards,
Martin
From pje at telecommunity.com  Fri Sep 10 23:29:51 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri Sep 10 23:30:31 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <4142159A.7030309@v.loewis.de>
References: <5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
	<5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040910172922.02a930f0@mail.telecommunity.com>

At 10:59 PM 9/10/04 +0200, Martin v. Löwis wrote:
>Phillip J. Eby wrote:
>>>I like the PEP with 'and' and 'or', but isn't the 'not' special method
>>>essentially the inverse of __nonzero__?
>>
>>There isn't such a method currently.
>
>Did you mean to say that there is currently no method named __nonzero__?

No; that there was no method named '__not__'.

From raymond.hettinger at verizon.net  Sat Sep 11 00:22:54 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat Sep 11 00:23:50 2004
Subject: [Python-Dev] Re: Alternative ImplementationforPEP292:SimpleString
	Substitutions
In-Reply-To: <1094841494.30829.45.camel@geddy.wooz.org>
Message-ID: <003501c49784$b8292d00$e841fea9@oemcomputer>

> > * For someone who understands exactly what they are doing, perhaps $ma
> > is the intended placeholder -- why force them to use braces:
> > ${ma}ñana.
> 
> It also makes it more difficult to document.  IOW, right now the PEP and
> the documentation say that the first non-identifier character terminates
> the placeholder.  How would you word the rules with your change?

"""Placeholders must be a valid Python identifier (containing only ASCII
alphanumeric characters and an underscore).  If an unbraced identifier
ends with a non-ASCII alphanumeric character, such as the latin letter n
with tilde in $mañana, then a ValueError is raised for the specious
identifier."""
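
A rough sketch of how such a check could work, assuming Python 3.7 or later for str.isascii(). The helper name check_placeholders and its simplified pattern are hypothetical; this is not how string.Template is implemented.

```python
import re

# Unbraced placeholders only: "$" followed by an ASCII identifier.
_placeholder = re.compile(r"\$([_a-zA-Z][_a-zA-Z0-9]*)")


def check_placeholders(template):
    """Return unbraced placeholder names, applying the proposed rule."""
    names = []
    for match in _placeholder.finditer(template):
        following = template[match.end():match.end() + 1]
        # str.isalnum() is Unicode-aware; a non-ASCII alphanumeric right
        # after the identifier suggests the author meant a longer name.
        if following and following.isalnum() and not following.isascii():
            raise ValueError(
                "suspicious placeholder: $%s%s" % (match.group(1), following)
            )
        names.append(match.group(1))
    return names


ordinary = check_placeholders("$who owes $what")
braced = check_placeholders("${ma}ñana")   # braces are outside this check

try:
    check_placeholders("$mañana")
except ValueError:
    rejected = True
else:
    rejected = False
```

The braced form passes untouched, so someone who really does mean $ma followed by ñana still has a way to say so.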


> My only problem with that is the interference that the 'mapping'
> argument presents.  IOW, kwds can't contain 'mapping'. 

To support a case where both a mapping and keywords are present, perhaps
an auxiliary class could simplify matters:

   def substitute(self, mapping=None, **kwds):
       if mapping is None:
           mapping = kwds
       elif kwds:
           mapping = _altmap(kwds, mapping)
        . . .


class _altmap:
    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
    def __getitem__(self, key):
        try:
            return self.primary[key]
        except KeyError:
            return self.secondary[key]
        

This matches the way keywords are used with the dict().
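
In current Python 3 the same layered lookup is available ready-made as collections.ChainMap, which did not exist at the time of this thread. A short sketch with made-up values, assuming today's string.Template.substitute, which accepts any mapping:

```python
from collections import ChainMap
from string import Template

mapping = {"who": "Raymond", "what": "mailmeister"}
kwds = {"who": "Barry"}

# Lookups try kwds first and fall back to mapping, like _altmap above.
layered = ChainMap(kwds, mapping)
result = Template("$who is the $what").substitute(layered)

# dict() resolves the same clash the same way: the keyword wins.
merged = dict(mapping, **kwds)
```

Unlike dict(mapping, **kwds), the layered object does not copy the mapping, so it also works with mappings that compute values on demand.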


Raymond

From nidoizo at yahoo.com  Sat Sep 11 00:51:26 2004
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sat Sep 11 00:51:35 2004
Subject: [Python-Dev] Re: PEP 328 - Relative Imports
In-Reply-To: <1094842075.30831.55.camel@geddy.wooz.org>
References: <413F1B87.90301@egenix.com><41416E76.8030603@egenix.com>	<41419432.2000600@zope.com>
	<1094828671.30837.23.camel@geddy.wooz.org>	<4141DEDC.8080503@egenix.com>
	<1094842075.30831.55.camel@geddy.wooz.org>
Message-ID: <chtb5d$3hu$1@sea.gmane.org>

Barry Warsaw wrote:
> from __global__ import logging
> 
> That would always import the global logging package.  __global__ is the
> optional "fake" global package and would only be used when you want to
> explicitly skip any local imports.
> 
> IIRC though, Guido never liked this proposal much.  I repost it here on
> the off chance that he's way too busy to read every message in this
> thread <wink>.

I agree with Guido.  FWIW, I think imports should be absolute by default 
and that the status quo is a mistake.  The __global__ solution makes 
absolute imports too verbose, when they are usually in majority.  I also 
don't see any advantage (but clear disadvantages) to mix relative and 
absolute imports with the same syntax, so PEP328 is the way to go. 
Third party packages have 3 releases to adapt, so I don't see the problem.

You have to understand that with the __global__ solution, I would make 
all my imports use that syntax, and that's really verbose.  Where I 
work, we are many working in a root package and right now it's a mess 
because any new module can hide global modules from modules in the same 
directory, so module names must be chosen accordingly (we even run a 
test at night to make sure no import is relative).  And yes, I would 
want to be able to name modules in a package with names like "math", 
"os", "pickle", "test", "unittest", etc. and not wait for Python 3 for that 
capability.  I also expect more standard modules to be in packages in 
future.

Regards,
Nicolas

From gvanrossum at gmail.com  Sat Sep 11 02:46:20 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat Sep 11 02:46:23 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu>
References: <20040910183258.A1DCE1E4009@bag.python.org>
	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu>
Message-ID: <ca471dc2040910174644d6ebff@mail.gmail.com>

> +1.  Consistency makes the API conceptually simpler, even if the
> absolute number of parameters is larger.

And how is it more consistent that in one form you have to write

re.compile(r"[a-z]+", re.I).search(line)

while in the other form you have to write

re.search(r"[a-z]+", line, re.I)

???

This parameter ordering issue alone makes me cringe at adding the
flags to the functions.
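
For the record, a sketch showing that the two spellings do find the same match, run against the re module in current Python 3. The sample string is made up, and the flags= keyword form is the present-day signature, which may not match the 2.x functions discussed here.

```python
import re

line = "Spam and EGGS"

# Flags travel with the pattern when it is compiled...
via_compile = re.compile(r"[a-z]+", re.I).search(line)

# ...but trail the string in the function form.
via_function = re.search(r"[a-z]+", line, re.I)

# Naming the argument sidesteps the question of position.
via_keyword = re.search(r"[a-z]+", line, flags=re.I)

same_match = via_compile.group() == via_function.group() == via_keyword.group()
```

The results are identical; the difference is purely in where the flags sit relative to the string being searched.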

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From py-web-sig at xhaus.com  Fri Sep 10 17:45:35 2004
From: py-web-sig at xhaus.com (Alan Kennedy)
Date: Sat Sep 11 07:00:48 2004
Subject: [Web-SIG] Re: [Python-Dev] Adding status code constants to httplib
In-Reply-To: <4141C442.8050005@andreweland.org>
References: <5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>
	<4141C442.8050005@andreweland.org>
Message-ID: <4141CC1F.4000207@xhaus.com>

[Phillip J. Eby]
>> I would also put the statuses in a dictionary, such that:
>>
>>     status_code[BAD_GATEWAY] = "Bad Gateway"

[Andrew Eland]
> There's a table mapping status codes to messages on 
> BaseHTTPRequestHandler at the moment. It could be moved into httplib to 
> make it more publicly visible.

And that mapping has 2 levels of human readable messages on it, for example

304: ('Not modified', 'Document has not changed singe given time'),

I think that, since the human readable versions are seldom heeded 
anyway, perhaps a single message is all we need?

And I'm -1 on forcing servers, particularly CGI servers, to import the 
client-side httplib (2.3 httplib.pyc == 42K) just to get this mapping.

If the changes are not going to make it in until the next release of 
cpython anyway, then maybe we should just aim for a new module? Or is 
some version of 2.4 the target, in which case minimal patches might make 
it in, whereas new modules won't?

Just my 0,02 euro.

Alan.
From bac at OCF.Berkeley.EDU  Sat Sep 11 07:22:12 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sat Sep 11 07:22:17 2004
Subject: [Python-Dev] Re: Alternative ImplementationforPEP292:SimpleString
	Substitutions
In-Reply-To: <003501c49784$b8292d00$e841fea9@oemcomputer>
References: <003501c49784$b8292d00$e841fea9@oemcomputer>
Message-ID: <41428B84.8090709@ocf.berkeley.edu>

Raymond Hettinger wrote:

>>>* For someone who understands exactly what they are doing, perhaps $ma
>>>is the intended placeholder -- why force them to use braces:
>>>${ma}ñana.
>>
>>It also makes it more difficult to document.  IOW, right now the PEP and
>>the documentation say that the first non-identifier character terminates
>>the placeholder.  How would you word the rules with your change?
> 
> 
> """Placeholders must be a valid Python identifier (containing only ASCII
> alphanumeric characters and an underscore).  If an unbraced identifier
> ends with a non-ASCII alphanumeric character, such as the latin letter n
> with tilde in $mañana, then a ValueError is raised for the specious
> identifier.
> 

I don't think any of this is needed.  If a non-programmer is being told 
to use string substitution chances are someone is either going to 
explain it to them or there will be another set of docs to explain 
things in a simple way.  I suspect stating exactly what a valid Python 
identifier contains as you did in parentheses above will be enough.

-Brett
From shane.holloway at ieee.org  Sat Sep 11 08:25:38 2004
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Sat Sep 11 08:26:06 2004
Subject: [Python-Dev] PEP 328 - Relative Imports
In-Reply-To: <41416E76.8030603@egenix.com>
References: <413F1B87.90301@egenix.com>	<ca471dc204090808119668687@mail.gmail.com>
	<41416E76.8030603@egenix.com>
Message-ID: <41429A62.3080201@ieee.org>


M.-A. Lemburg wrote:
> People are more likely going to make all imports absolute (like
> they already do in Java and other languages) - which
> is good, since it makes reading code much easier and allows for
> writing packages which are compatible with older Python versions,
> but it also prevents developing applications using the above
> approach.
> 
> I also don't think that extension writers will care enough to
> make their packages fully relocateable by using relative
> imports all over - these are hard to read and don't buy
> the developer of the extension anything.
> 
> Anyway, what should the strategy for the PEP look like ?
> 
> 1. postpone the defaulting to absolute until P3k
> 
> 2. provide a way to customize the behaviour using
>    e.g. a sys function

As a package writer, I will go through the effort to write relative 
imports for a few reasons.

  * One is that I often don't know what the final layout of the larger 
package group will be; however, I am usually fairly certain about local 
dependencies.  Having a way to refer to a parent package will enable me 
to be "complete" in this development style.

  * Second, it's really handy to develop something in the sandbox, and 
then move it to production in one fell swoop.  BTW, will __path__ work 
for relative parent references?

  * A third reason I will go through the effort to use relative imports 
is that I'd like to allow application frameworks (and other package 
writers) to "scoop up" any whole packages if they so desire.
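
A minimal sketch of the relocation argument above, assuming the leading-dot
spelling that PEP 328 later settled on.  The package names `sandbox`,
`prod.vendor`, `core` and `shared` are hypothetical, and
`importlib.util.resolve_name` comes from modern Python 3, not from the 2.4
release discussed here.  It only shows how one relative reference resolves
before and after a package tree is moved:

```python
from importlib.util import resolve_name

# A module inside the package "<root>.core" that wants the sibling package
# "shared" would write:  from ..shared import helpers
RELATIVE = "..shared"

# During development the tree lives under a hypothetical "sandbox" package.
in_sandbox = resolve_name(RELATIVE, "sandbox.core")

# Later a framework "scoops up" the whole tree under "prod.vendor".
in_production = resolve_name(RELATIVE, "prod.vendor.core")

print(in_sandbox)
print(in_production)
```

The import statement itself never changes; only the package it is resolved
against does.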


Unfortunately, it'll be a while before I can target Python 2.4 directly. 
  But it will be good when I can!

Thanks,
-Shane Holloway
From shane.holloway at ieee.org  Sat Sep 11 08:25:49 2004
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Sat Sep 11 08:26:14 2004
Subject: [Python-Dev] Re: Alternative ImplementationforPEP292:SimpleString
	Substitutions
In-Reply-To: <41428B84.8090709@ocf.berkeley.edu>
References: <003501c49784$b8292d00$e841fea9@oemcomputer>
	<41428B84.8090709@ocf.berkeley.edu>
Message-ID: <41429A6D.90405@ieee.org>

> Raymond Hettinger wrote:
>> """Placeholders must be a valid Python identifier (containing only ASCII
>> alphanumeric characters and an underscore).  If an unbraced identifier
>> ends with a non-ASCII alphanumeric character, such as the latin letter n
>> with tilde in $mañana, then a ValueError is raised for the specious
>> identifier.

Brett C. wrote:
> I don't think any of this is needed.  If a non-programmer is being told 
> to use string substitution chances are someone is either going to 
> explain it to them or there will be another set of docs to explain 
> things in a simple way.  I suspect stating exactly what a valid Python 
> identifier contains as you did in parentheses above will be enough.

Also, since Barry has gone to great lengths to make Template 
overrideable, applications can replace the regular expression in their 
derived Template class when there is a need to allow for end-users 
inputting template strings.  So, I'd suggest keeping safe_substitute 
relatively simple, but document the limitation and/or solution.

Thanks,
-Shane Holloway

From fredrik at pythonware.com  Sat Sep 11 08:52:06 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 11 08:50:17 2004
Subject: [Python-Dev] Re: Re: Alternative
	ImplementationforPEP292:SimpleStringSubstitutions
References: <1094841494.30829.45.camel@geddy.wooz.org>
	<003501c49784$b8292d00$e841fea9@oemcomputer>
Message-ID: <chu772$fdk$1@sea.gmane.org>

Raymond Hettinger wrote:

> """Placeholders must be a valid Python identifier (containing only ASCII
> alphanumeric characters and an underscore).  If an unbraced identifier
> ends with a non-ASCII alphanumeric character, such as the latin letter n
> with tilde in $mañana, then a ValueError is raised for the specious
> identifier.

so why keep the python identifier limitation?  the RE engine you're using to
parse the template has a concept of "alphanumeric character".  just define
the placeholder syntax as "one or more alphanumeric characters or under-
scores" (\w+), use re.UNICODE if the template is created from a unicode
string, and you're done.

this doesn't mean that people *have* to use non-ASCII characters, of course.
but if they do, things just work.
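
A rough sketch of that suggestion against the `string.Template` class as it
exists in current Python 3 (an assumption; the class was still being designed
in this thread).  `WordTemplate` is a hypothetical name.  `idpattern` is the
documented subclass hook, and for `str` patterns in Python 3 `\w` already
matches non-ASCII letters, so no explicit `re.UNICODE` is needed:

```python
from string import Template


class WordTemplate(Template):
    # Hypothetical subclass: accept any run of "word" characters as a
    # placeholder name instead of ASCII-only Python identifiers.
    idpattern = r"\w+"


template = WordTemplate("¿Puede volver $hoy o $mañana?")
text = template.substitute({"hoy": "today", "mañana": "tomorrow"})
print(text)
```

Note that `\w+` also accepts names that start with a digit, which the
identifier rule would reject.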

</F> 


From raymond.hettinger at verizon.net  Sat Sep 11 08:57:24 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat Sep 11 08:58:20 2004
Subject: [Python-Dev] Re: Alternative ImplementationforPEP292:SimpleString
	Substitutions
In-Reply-To: <41428B84.8090709@ocf.berkeley.edu>
Message-ID: <000e01c497cc$983d6720$e841fea9@oemcomputer>

[Brett]
> I suspect stating exactly what a valid Python
> identifier contains as you did in parentheses above will be enough.

Given the template, u'¿Puede volver $hoy o $mañana?', you think $ma is
an intended placeholder name and that ñ should be a delimiter just like
whitespace and punctuation?

If end users always follow the rules, this will never come up.  If they
don't, should there be an error message or a silent failure?


Raymond


From martin at v.loewis.de  Sat Sep 11 09:01:34 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Sep 11 09:01:23 2004
Subject: [Web-SIG] Re: [Python-Dev] Adding status code constants to httplib
In-Reply-To: <4141CC1F.4000207@xhaus.com>
References: <5.1.1.6.0.20040910105252.020caec0@mail.telecommunity.com>	<4141C442.8050005@andreweland.org>
	<4141CC1F.4000207@xhaus.com>
Message-ID: <4142A2CE.5060405@v.loewis.de>

Alan Kennedy wrote:
> And I'm -1 on forcing servers, particularly CGI servers, to import the 
> client-side httplib (2.3 httplib.pyc == 42K) just to get this mapping.

It might be somewhat comforting that the 2.4 httplib.pyc is only 33K.

Regards,
Martin
From stephen at xemacs.org  Sat Sep 11 09:35:08 2004
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat Sep 11 09:35:18 2004
Subject: [Python-Dev] AlternativeImplementation	forPEP292:SimpleString
	Substitutions
In-Reply-To: <200409101257.13802.gmccaughan@synaptics-uk.com> (Gareth
	McCaughan's message of "Fri, 10 Sep 2004 12:57:13 +0100")
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<200409090939.41873.gmccaughan@synaptics-uk.com>
	<87k6v2smxt.fsf_-_@tleepslib.sk.tsukuba.ac.jp>
	<200409101257.13802.gmccaughan@synaptics-uk.com>
Message-ID: <87mzzxqmvn.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Gareth" == Gareth McCaughan <gmccaughan@synaptics-uk.com> writes:

    Gareth> On Friday 2004-09-10 06:38, Stephen J. Turnbull wrote:

    >> But [efficiency], as such, is important only to efficiency
    >> fanatics.

    Gareth> No, it's important to ... well, people to whom efficiency
    Gareth> matters. There's no need for them to be fanatics.

If it matters just because they care, they're fanatics.  If it matters
because they get some other benefit (response time less than the
threshold of notice, twice as many searches per unit time, half as
many boxes to serve a given load), they're not.  </F>'s talk of many
ways to do things "and Python should account for most of them" strikes
me as fanaticism by that definition; the vast majority of developers
will never deal with the special cases, or write apps that anticipate
dealing with huge ASCII strings.  Those costs should be borne by the
developers who do, and their clients.

I apologize for shoehorning that into my reply to you.

    >> The question is, how often are people going to notice that when
    >> they have pure ASCII they get a 100% speedup [...]?

    Gareth> Why is that the question, rather than "how often are
    Gareth> people going to benefit from getting a 100% speedup when
    Gareth> they have pure ASCII"?

Because "benefit" is very subjective for _one_ person, and I don't
want to even think about putting coefficients on your benefit versus
mine.  If the benefit is large enough, a single person will be willing
to do the extra work.  The question is, should all Python users and
developers bear some burden to make it easier for that person to do
what he needs to do?

I think "notice" is something you can get consensus on.  If a lot of
people are _noticing_ the difference, I think that's a reasonable rule
of thumb for when we might want to put "it", or facilities for making
individual efforts to deal with "it" simpler, into "standard Python"
at some level.  If only a few people are noticing, let them become
expert at dealing with it.

    Gareth> Or even "how often are people going to try out Python on
    Gareth> an application that uses pure-ASCII strings, and decide to
    Gareth> use some other language that seems to do the job much
    Gareth> faster"?

See?  You're now using a "notice" standard, too.  I don't think that's
an accident.

    >> I just don't see the former being worth the extra effort, while
    >> the latter makes the "this or that" choice clear.  If a single
    >> representation is enough, it had better be Unicode-based, and
    >> the others can be supported in libraries (which turn binary
    >> blobs into non-standard text objects with appropriate methods)
    >> as the need arises.

    Gareth> No question that if a single representation is enough then
    Gareth> it had better be Unicode.

Not for you, not for me, not for </F>, I'm pretty sure.  The point
here is that there is a reasonable way to support the others, too, but
their users will have to make more effort than if it were a goal to
support them in the "standard language and libraries."  I think that's
the way to go, and </F> thinks the opposite AFAICT.


-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba           
From fredrik at pythonware.com  Sat Sep 11 10:13:44 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 11 10:11:57 2004
Subject: [Python-Dev] 
	Re: AlternativeImplementation	forPEP292:SimpleStringSubstitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><200409090939.41873.gmccaughan@synaptics-uk.com><87k6v2smxt.fsf_-_@tleepslib.sk.tsukuba.ac.jp><200409101257.13802.gmccaughan@synaptics-uk.com>
	<87mzzxqmvn.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <chuc04$n1s$1@sea.gmane.org>

Stephen J. Turnbull wrote:

> I think "notice" is something you can get consensus on.  If a lot of
> people are _noticing_ the difference, I think that's a reasonable rule
> of thumb for when we might want to put "it", or facilities for making
> individual efforts to deal with "it" simpler, into "standard Python"
> at some level.

who are "we"?  does that group include you?

</F> 


From martin at v.loewis.de  Sat Sep 11 10:39:14 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Sep 11 10:39:04 2004
Subject: [Python-Dev] Re: Alternative ImplementationforPEP292:SimpleString
	Substitutions
In-Reply-To: <000e01c497cc$983d6720$e841fea9@oemcomputer>
References: <000e01c497cc$983d6720$e841fea9@oemcomputer>
Message-ID: <4142B9B2.7060306@v.loewis.de>

Raymond Hettinger wrote:
> [Brett]
> 
>>I suspect stating exactly what a valid Python
>>identifier contains as you did in parentheses above will be enough.
> 
> 
> Given the template, u'¿Puede volver $hoy o $mañana?', you think $ma is
> an intended placeholder name and that ñ should be a delimiter just like
> whitespace and punctuation?

No, I think Brett (and apparently nearly everybody else) thinks that
such a template will not be written over the course of the next five
years, except for demonstration purposes. Instead, what will be written
is u'¿Puede volver $today o $tomorrow?' because the template will be
a translation of the original English template, and, during translation,
placeholder names must not be changed (although I have difficulties
imagining possible values for today or tomorrow so that this becomes
meaningful).

> If end users always follow the rules, this will never come up.  If they
> don't, should there be an error message or a silent failure?

There is always a chance of a silent failure in SafeTemplates, even with
this rule added - this is the purpose of SafeTemplates. With a Template,
you will get a KeyError. In any case, the failure will not be completely
silent, as the user will see $mañana show up in the output.
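
The two failure modes can be sketched with the `string.Template` API as it
ships in current Python 3, where the SafeTemplate behaviour discussed here is
the `safe_substitute` method (an assumption relative to the draft under
discussion).  With the default ASCII-only placeholder rule, `$mañana` is read
as the placeholder `ma` followed by literal text:

```python
from string import Template

template = Template("¿Puede volver $hoy o $mañana?")

# Strict substitution: the unintended placeholder "ma" has no value,
# so the mistake surfaces as a KeyError naming it.
try:
    template.substitute(hoy="today")
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# Lenient substitution: the unknown placeholder is left in place,
# so the original "$mañana" text shows up in the output.
leftover = template.safe_substitute(hoy="today")

print(missing)
print(leftover)
```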

My prediction is that the typical application is to use Templates, as
users know very well what the placeholders are. Furthermore, the
typical application will use locals/globals/vars(), or dict(key="value")
to create the replacement dictionary. In this application, nobody
would even think of using mañana as a key, because you can't get
it into the dictionary.

If this never comes up, it is better to not complicate the rules.
Simple is better than complex.

Regards,
Martin

From fredrik at pythonware.com  Sat Sep 11 10:47:17 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 11 10:56:47 2004
Subject: [Python-Dev] Re: Re: Missing arguments in RE functions
References: <1094836885.4141e2953f7df@mcherm.com>
Message-ID: <chueka$rcu$1@sea.gmane.org>

Michael Chermside wrote:

> Fredrik, a less hostile response would be appropriate here. No one
> knows every detail of every API of any reasonably sized library
> (like Python's).

We're not talking about Python's library, we're talking about Python's RE
library.  It's not that big, really.  The documentation is five moderately-sized
HTML pages, plus a page with examples.  Seven functions (plus two trivial
variations) and two object types.  You cannot use the library at all without
knowing the stuff that's discussed on the first, third, and fifth page; the two
other pages discuss pos/endpos issues within the first few paragraphs.  Are
we trying to optimize Python for people who won't read evenly-numbered
sections?

> On the whole, there are MORE individual "pieces" to the
> API but because of orthogonality the API as a whole is
> simpler.

Given that there's no way to order the arguments consistently (since some
arguments apply to the compilation process, other to the match process),
you're obviously using "orthogonal" and "simple" in the Perl sense ;-)

</F> 


From fredrik at pythonware.com  Sat Sep 11 11:51:23 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 11 11:49:32 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292: Simple
	String Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org>	<413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org>
	<413F6120.7090603@egenix.com>
Message-ID: <chuhn8$11s$1@sea.gmane.org>

M.-A. Lemburg wrote:

>> (google for "stringlib" for some work I'm doing in this area)
>
> Ah, now I know where you're coming from :-) Shift tables
> don't work well in the Unicode world with its large alphabet.

since most real-life text uses characters from only a small number of regions
in that alphabet, compressed shift tables work extremely well (the algorithm
on the stringlib page shows one way to do that, in constant space and O(m)
time).
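
For illustration only, here is one way a shift table can be compressed to
constant space: a single integer used as a small bit mask of the characters
that occur in the pattern, combined with a Horspool-style skip for the last
character.  This is a sketch under that assumption and is not taken from the
stringlib page; `bloom_find` and `_bit` are hypothetical names.

```python
def _bit(ch):
    # Map any character, whatever its code point, onto one of 64 mask bits.
    return 1 << (ord(ch) & 63)


def bloom_find(s, p):
    """Return the lowest index of p in s, or -1 (same contract as str.find)."""
    n, m = len(s), len(p)
    w = n - m
    if w < 0:
        return -1
    if m == 0:
        return 0
    mlast = m - 1
    last = p[mlast]

    # Preprocessing is O(m) time and constant space: one mask, one skip.
    mask = 0
    skip = mlast
    for i in range(mlast):
        mask |= _bit(p[i])
        if p[i] == last:
            skip = mlast - i - 1
    mask |= _bit(last)

    i = 0
    while i <= w:
        if s[i + mlast] == last:
            if s[i:i + mlast] == p[:mlast]:
                return i
            # Mismatch: if the character just past the window cannot occur
            # in the pattern, no overlapping alignment can match.
            if i + m < n and not (mask & _bit(s[i + m])):
                i += m
            else:
                i += skip
        elif i + m < n and not (mask & _bit(s[i + m])):
            i += m
        i += 1
    return -1


print(bloom_find("¿Puede volver hoy o mañana?", "mañana"))
```

The mask can report false positives (two characters sharing a bit), which
only costs a shorter jump, never a missed match.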

> BTW, you might want to look at the BMS implementation I did
> for mxTextTools.

did you ever get around to adding Unicode support to mxTextTools ?

</F> 


From erik at heneryd.com  Sat Sep 11 13:54:52 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 11 13:55:00 2004
Subject: [Python-Dev] PEP 292: method names
Message-ID: <4142E78C.7010800@heneryd.com>

I haven't followed the template threads very closely, but reading the 
pep/implementation it clearly looks useful.  I don't know if I like the 
method names substitute/safe_substitute though.

* Too long
10/15 character names for something so simple it up until now just 
needed a %?  Programs using templates will probably use them 
frequently...  I'd prefer sub instead of substitute.

* Safe?
safe_substitute doesn't tell you much upon first glance.  Safe?  In 
what way?  You could even argue that the "plain" version really is the 
safer one, as you'll notice typos and thus get a more solid program.  I 
think a name hinting that this method uses the var name as a fallback 
would be better, but can't think of (a short) one...  defaultsub? 
fallbacksub?  loosesub?  Guess I could live with safe, but...


Erik

From erik at heneryd.com  Sat Sep 11 15:04:16 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 11 15:04:20 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <4142E78C.7010800@heneryd.com>
References: <4142E78C.7010800@heneryd.com>
Message-ID: <4142F7D0.5080807@heneryd.com>

Erik Heneryd wrote:
> * Safe?
> safe_substitute doesn't tell you much upon first glance.  Safe?  In 
> what way?  You could even argue that the "plain" version really is the 
> safer one, as you'll notice typos and thus get a more solid program.  I 
> think a name hinting that this method uses the var name as a fallback 
> would be better, but can't think of (a short) one...  defaultsub? 
> fallbacksub?  loosesub?  Guess I could live with safe, but...

Come to think of it, I really like the more OO-ish approach better than 
cramming everything into a single class.  Is the safe_substitute really 
that special it deserves a special method?  Is it really the one, true 
way to do a "safe" substitution?  IIRC DOS and sh don't agree, so it's 
not that obvious.

I say keep the inheritance thing, it's much more flexible, and delegate 
the KeyError condition to an overridable method.
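
One way such a hook could look, sketched against the `string.Template` class
in current Python 3.  The names `HookedTemplate`, `LenientTemplate`,
`missing` and `_HookedDict` are hypothetical and not part of the standard
library; the sketch relies only on the fact that `substitute` looks values up
with `mapping[name]`, so a dict subclass with `__missing__` can redirect
failed lookups:

```python
from string import Template


class _HookedDict(dict):
    """Dict that hands missing keys to a callback instead of raising."""

    def __init__(self, on_missing, data):
        super().__init__(data)
        self._on_missing = on_missing

    def __missing__(self, key):
        return self._on_missing(key)


class HookedTemplate(Template):
    def missing(self, name):
        # Default policy: behave like plain substitute().
        raise KeyError(name)

    def substitute(self, mapping=None, **kws):
        values = _HookedDict(self.missing, dict(mapping or {}, **kws))
        return super().substitute(values)


class LenientTemplate(HookedTemplate):
    def missing(self, name):
        # Alternative policy: fall back to the variable name itself.
        return "$" + name


print(LenientTemplate("$who likes $what").substitute(who="tim"))
```

A subclass picks its own policy by overriding one method, at the cost of
normalising `${what}` to `$what` in the fallback shown here.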

Erik

From nidoizo at yahoo.com  Sat Sep 11 18:01:28 2004
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sat Sep 11 18:00:15 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <ca471dc2040910174644d6ebff@mail.gmail.com>
References: <20040910183258.A1DCE1E4009@bag.python.org>	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu>
	<ca471dc2040910174644d6ebff@mail.gmail.com>
Message-ID: <chv7e6$97v$1@sea.gmane.org>

Guido van Rossum wrote:
> And how is it more consistent that in one form you have to write
> 
> re.compile(r"[a-z]+", re.I).search(line)
> 
> while in the other form you have to write
> 
> re.search(r"[a-z]+", line, re.I)
> 
> ???
> 
> This parameter ordering issue alone makes me cringe at adding the
> flags to the functions.

I agree.  In fact, probably the line parameter should have been the 
first parameter for all functions, but anyway it's too late for that. 
The few times I have not used pattern objects in quick scripts, I always 
instinctively put the line in the wrong place at first, probably for the 
reason you mention (and I'm not pretending my instinct is universal). 
In that context, keeping the API is even more reasonable.
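
For reference, the two spellings from Guido's message can be run side by
side; the sample `line` value is made up for illustration:

```python
import re

line = "42 Spam eggs"

# Form 1: flags go to compile(), the text goes to the method.
compiled = re.compile(r"[a-z]+", re.I)
m1 = compiled.search(line)

# Form 2: the pattern comes first, then the text, then the flags.
m2 = re.search(r"[a-z]+", line, re.I)

print(m1.group(), m2.group())
```

Both find the same match; only the position of the flags relative to the
text differs.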

Regards,
Nicolas

From jlgijsbers at planet.nl  Sat Sep 11 18:26:51 2004
From: jlgijsbers at planet.nl (Johannes Gijsbers)
Date: Sat Sep 11 18:25:12 2004
Subject: [Python-Dev] doctest and inspect.getmodule
Message-ID: <20040911162650.GA9132@mail.planet.nl>

I just checked in a change to inspect.getmodule (without running the tests
beforehand, not a smart move) which broke a whole bunch of tests for doctest.
The tests mostly seem to fail because doctest can find modules for objects it
previously couldn't. 

I think the change is basically correct, but I'm not sure how to fix doctest.
Should doctest omit the module, or should the doctest tests be changed to
expect the module being printed?

Oh, I promise I'll run the tests before checking in next time.

Johannes

P.S.: here's the checkin message for the change:

Modified Files:
	inspect.py
Log Message:
Use __module__ attribute when available instead of using isclass()
predicate (functions and methods have grown the __module__ attribute too).
See bug #570300.

Index: inspect.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/inspect.py,v
retrieving revision 1.54
retrieving revision 1.55
diff -u -d -r1.54 -r1.55
--- inspect.py	18 Aug 2004 12:40:30 -0000	1.54
+++ inspect.py	11 Sep 2004 15:53:22 -0000	1.55
@@ -370,7 +370,7 @@
     """Return the module an object was defined in, or None if not found."""
     if ismodule(object):
         return object
-    if isclass(object):
+    if hasattr(object, '__module__'):
         return sys.modules.get(object.__module__)
     try:
         file = getabsfile(object)
From fredrik at pythonware.com  Sat Sep 11 18:23:27 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 11 18:30:27 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
References: <20040910183258.A1DCE1E4009@bag.python.org>	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu><ca471dc2040910174644d6ebff@mail.gmail.com>
	<chv7e6$97v$1@sea.gmane.org>
Message-ID: <chv8mb$ccq$1@sea.gmane.org>

Nicolas Fleury wrote:

>> This parameter ordering issue alone makes me cringe at adding the
>> flags to the functions.
>
> I agree.  In fact, probably the line parameter should have been the first parameter for all 
> functions

so where would you put the pattern?

</F> 


From erik at heneryd.com  Sat Sep 11 18:34:40 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 11 18:34:45 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <chv7e6$97v$1@sea.gmane.org>
References: <20040910183258.A1DCE1E4009@bag.python.org>	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu>	<ca471dc2040910174644d6ebff@mail.gmail.com>
	<chv7e6$97v$1@sea.gmane.org>
Message-ID: <41432920.2020701@heneryd.com>

Nicolas Fleury wrote:
> Guido van Rossum wrote:
> 
>> And how is it more consistent that in one form you have to write
>>
>> re.compile(r"[a-z]+", re.I).search(line)
>>
>> while in the other form you have to write
>>
>> re.search(r"[a-z]+", line, re.I)
>>
>> ???
>>
>> This parameter ordering issue alone makes me cringe at adding the
>> flags to the functions.
> 
> 
> I agree.  In fact, probably the line parameter should have been the 
> first parameter for all functions, but anyway it's too late for that. 
> The few times I have not used pattern objects in quick scripts, I always 
> put the line at first at the wrong place instinctively, probably for the 
> reason you mention (and I'm not pretending my instinct is universal). In 
> that context, keeping the API is even more reasonable.

Well, considering that mandatory parameters must come before optional 
ones, there's really not much to do.  At least the mandatory function 
parameters are in the "right" order.


Erik


From nidoizo at yahoo.com  Sat Sep 11 18:44:01 2004
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sat Sep 11 18:42:44 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <chv8mb$ccq$1@sea.gmane.org>
References: <20040910183258.A1DCE1E4009@bag.python.org>	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu><ca471dc2040910174644d6ebff@mail.gmail.com>	<chv7e6$97v$1@sea.gmane.org>
	<chv8mb$ccq$1@sea.gmane.org>
Message-ID: <chv9tv$g6r$1@sea.gmane.org>

Fredrik Lundh wrote:
>>I agree.  In fact, probably the line parameter should have been the first parameter for all 
>>functions
> 
> so where would you put the pattern?

Just after.  You "insert" the line parameter first, since it's the 
additional parameter to the pattern objects functions.  It's basically 
the input followed by everything to modify/search it.  I think it's 
better to insert it at first, since it's a mandatory argument, while it 
can be logical to have optional flags for patterns (and I'm not talking 
about the current request, but in general in API design).  But again, 
it's too late for that and I don't pretend my instinct is universal.

Regards,
Nicolas

From nidoizo at yahoo.com  Sat Sep 11 18:47:32 2004
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sat Sep 11 18:50:54 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <41432920.2020701@heneryd.com>
References: <20040910183258.A1DCE1E4009@bag.python.org>	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu>	<ca471dc2040910174644d6ebff@mail.gmail.com>	<chv7e6$97v$1@sea.gmane.org>
	<41432920.2020701@heneryd.com>
Message-ID: <chva4h$g6r$2@sea.gmane.org>

Erik Heneryd wrote:
> At least the mandatory function 
> parameters are in the "right" order.

I think otherwise.  See reply to Fredrik.
Regards,
Nicolas

From erik at heneryd.com  Sat Sep 11 18:51:32 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 11 18:51:38 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <chv9tv$g6r$1@sea.gmane.org>
References: <20040910183258.A1DCE1E4009@bag.python.org>	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu><ca471dc2040910174644d6ebff@mail.gmail.com>	<chv7e6$97v$1@sea.gmane.org>	<chv8mb$ccq$1@sea.gmane.org>
	<chv9tv$g6r$1@sea.gmane.org>
Message-ID: <41432D14.6050106@heneryd.com>

Nicolas Fleury wrote:
> Fredrik Lundh wrote:
> 
>>> I agree.  In fact, probably the line parameter should have been the 
>>> first parameter for all functions
>>
>>
>> so where would you put the pattern?
> 
> 
> Just after.  You "insert" the line parameter first, since it's the 
> additional parameter to the pattern objects functions.  It's basically 
> the input followed by everything to modify/search it.  I think it's 
> better to insert it at first, since it's a mandatory argument, while it 
> can be logical to have optional flags for patterns (and I'm not talking 
> about the current request, but in general in API design).  But again, 
> it's too late for that and I don't pretend my instinct is universal.

compile() doesn't have that many additional parameters, just the 
optional flags.  OTOH the regex object methods do (both mandatory and 
optional).  Wouldn't it be stupid to insert the pattern in the middle of 
the method parameters?
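
The extra method parameters in question are `pos` and `endpos`.  A short
sketch with a made-up `line` value shows them on a compiled pattern, next to
the slicing workaround needed with the module-level function:

```python
import re

word = re.compile(r"[a-z]+", re.I)
line = "spam EGGS ham"

first = word.search(line)          # scans the whole string
later = word.search(line, 5)       # pos=5: start scanning at "E"
bounded = word.search(line, 5, 7)  # endpos=7: stop before index 7

# The module-level function has no pos/endpos, so the caller slices,
# and the reported offsets become relative to the slice.
sliced = re.search(r"[a-z]+", line[5:7], re.I)

print(first.group(), later.group(), bounded.group(), sliced.start())
```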


Erik
From gvanrossum at gmail.com  Sat Sep 11 18:55:17 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat Sep 11 18:55:22 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <41432D14.6050106@heneryd.com>
References: <20040910183258.A1DCE1E4009@bag.python.org>
	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu>
	<ca471dc2040910174644d6ebff@mail.gmail.com>
	<chv7e6$97v$1@sea.gmane.org> <chv8mb$ccq$1@sea.gmane.org>
	<chv9tv$g6r$1@sea.gmane.org> <41432D14.6050106@heneryd.com>
Message-ID: <ca471dc204091109555f4b46dc@mail.gmail.com>

I don't see any reason to continue this debate. Patch rejected. Go
argue somewhere else if you can't stop arguing. In case any of the
participants think they can convince the rest of the world with *one*
more post, *one* more clever argument: when was the last time that
worked? They didn't change their mind on any of your previous posts,
so why would they now? Think about it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
Ask me about gmail.
From bac at OCF.Berkeley.EDU  Sat Sep 11 19:07:29 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sat Sep 11 19:07:39 2004
Subject: [Python-Dev] Re: Alternative ImplementationforPEP292:SimpleString
	Substitutions
In-Reply-To: <4142B9B2.7060306@v.loewis.de>
References: <000e01c497cc$983d6720$e841fea9@oemcomputer>
	<4142B9B2.7060306@v.loewis.de>
Message-ID: <414330D1.4060202@ocf.berkeley.edu>

Martin v. Löwis wrote:
> Raymond Hettinger wrote:
> 
>> [Brett]
>>
>>> I suspect stating exactly what a valid Python
>>> identifier contains as you did in parentheses above will be enough.
>>
>>
>>
>> Given the template, u'¿Puede volver $hoy o $mañana?', you think $ma is
>> an intended placeholder name and that ñ should be a delimiter just like
>> whitespace and punctuation?
> 
> 
> No, I think Brett (and apparently nearly everybody else) thinks that
> such a template will not be written over the course of the next five
> years, except for demonstration purposes. Instead, what will be written
> is u'¿Puede volver $today o $tomorrow?' because the template will be
> a translation of the original English template, and, during translation,
> placeholder names must not be changed (although I have difficulties
> imagining possible values for today or tomorrow so that this becomes
> meaningful).
> 

Actually, that wasn't what I was thinking, but that also works.  My 
original thinking is that Template will throw a fit and that's fine 
since they didn't follow the rules.

>> If end users always follow the rules, this will never come up.  If they
>> don't, should there be error message or a silent failure?
> 
> 
> There is always a chance of a silent failure in SafeTemplates, even with
> this rule added - this is the purpose of SafeTemplates. With a Template,
> you will get a KeyError. In any case, the failure will not be completely
> silent, as the user will see $mañana show up in the output.
> 

Right, my other reason for not thinking this is a big issue.  If you use 
SafeTemplate you will have to watch out for silent problems like this 
anyway.

I just don't think it will be a big problem.  And if people want the 
support they will just use a pure Unicode Template subclass (perhaps we 
should include that in the module?).

-Brett
From nidoizo at yahoo.com  Sat Sep 11 19:28:35 2004
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sat Sep 11 19:27:19 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <41432D14.6050106@heneryd.com>
References: <20040910183258.A1DCE1E4009@bag.python.org>	<EE123D32-0363-11D9-B197-000393C78C88@gradient.cis.upenn.edu><ca471dc2040910174644d6ebff@mail.gmail.com>	<chv7e6$97v$1@sea.gmane.org>	<chv8mb$ccq$1@sea.gmane.org>	<chv9tv$g6r$1@sea.gmane.org>
	<41432D14.6050106@heneryd.com>
Message-ID: <chvchh$m2i$1@sea.gmane.org>

Erik Heneryd wrote:
> compile() doesn't have that many additional parameters, just the 
> optional flags.  OTOH the regex object methods do (both mandatory and 
> optional).  Wouldn't it be stupid to insert the pattern in the middle of 
> the method parameters?

Just a last post to end the debate.  I agree with you.  I looked at the 
API and I think my suggestion was wrong.  I think my basic instinct was 
due to the fact that I have written a lot of Perl and that I mostly use only 
match/search/sub with only pattern flags.  Since there's no really 
intuitive way to mix two APIs with mandatory and optional arguments, and 
also since using pattern objects is the way to go, I'm now -1 on the 
patch.  Sorry for not working this out on my own; if anyone wants to 
continue the discussion, I will do it privately.
Regards,
Nicolas

From tim.peters at gmail.com  Sat Sep 11 19:46:20 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sat Sep 11 19:46:26 2004
Subject: [Python-Dev] doctest and inspect.getmodule
In-Reply-To: <20040911162650.GA9132@mail.planet.nl>
References: <20040911162650.GA9132@mail.planet.nl>
Message-ID: <1f7befae04091110466ceb66cc@mail.gmail.com>

[Johannes Gijsbers]
> I just checked in a change to inspect.getmodule (without running the tests
> beforehand, not a smart move) which broke a whole bunch of tests for doctest.
> The tests mostly seem to fail because doctest can find modules for objects it
> previously couldn't.

All failures were like that.  test_doctest.py contains lots of
"recursive" uses of doctest, where test_doctest.py functions contain
docstrings that themselves contain both definitions of functions with
their own docstrings, and calls to doctest functions.  Before your
change, functions defined inside docstrings and dynamically compiled
by doctest.py were a mystery to inspect.getmodule(), but after your
change getmodule() figured it knew which module they came from.  This
had no effect on doctest doctests that showed succeeding doctest
examples, but for doctest doctests showing failing doctest examples,
the failure-output "and which doctest failed?" meta line changed, from
stuff like:

    Line 3, in f

to stuff like:

    File "C:\Code\python\lib\test\test_doctest.py", line 4, in f

Couldn't be more obvious <wink>.
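
The effect can be reproduced in a few lines, assuming a current Python 3
where `inspect.getmodule` still consults `__module__` first.  The namespace
below is a made-up stand-in for what doctest builds when it compiles an
example; `textwrap` is just an arbitrary module that is already imported:

```python
import inspect
import textwrap

# Compile a function dynamically, in a namespace that claims to be an
# existing module (here, arbitrarily, "textwrap").
namespace = {"__name__": "textwrap"}
exec("def f():\n    pass\n", namespace)
f = namespace["f"]

# f is not a class, so the old isclass() test skipped this branch.
# With the hasattr() test, the __module__ attribute alone decides.
owner = inspect.getmodule(f)

print(f.__module__, owner is textwrap)
```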

> I think the change is basically correct,

Me too.

> but I'm not sure how to fix doctest.

That's OK, I already did.  doctest didn't need any changes, but the
expected output in test_doctest.py had to be fiddled.

...
> Oh, I promise I'll run the tests before checking in next time.

Everyone is entitled to one screwup per century.  This was yours <wink>.
From mal at egenix.com  Sat Sep 11 23:00:18 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat Sep 11 23:00:17 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292: Simple
	String Substitutions
In-Reply-To: <chuhn8$11s$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org>	<413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org>	<413F6120.7090603@egenix.com>
	<chuhn8$11s$1@sea.gmane.org>
Message-ID: <41436762.7040207@egenix.com>

Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
> 
>>>(google for "stringlib" for some work I'm doing in this area)
>>
>>Ah, now I know where you're coming from :-) Shift tables
>>don't work well in the Unicode world with its large alphabet.
> 
> since most real-life text uses characters from only a small number of regions
> in that alphabet, compressed shift tables work extremely well (the algorithm
> on the stringlib page shows one way to do that, in constant space and O(m)
> time).

You mean: a compressed shift table for Unicode patterns ?
I'll have a look.

>>BTW, you might want to look at the BMS implementation I did
>>for mxTextTools.
> 
> 
> did you ever get around to adding Unicode support to mxTextTools ?

Yes in egenix-mx-base 2.1.0. It's not yet released, but Google will
find the most recent snapshot :-) The package has been available
as beta for more than a year now; just haven't found time to cut
a release.

The search functions from 2.0 were replaced with search objects
that can deal with both 8-bit strings and Unicode. However, the
Unicode search implementation uses a rather naive approach
due to the shift table problem (and my lack of time).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 11 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From noamr at myrealbox.com  Sat Sep 11 23:19:50 2004
From: noamr at myrealbox.com (Noam Raphael)
Date: Sat Sep 11 23:21:05 2004
Subject: [Python-Dev] Re: Missing arguments in RE functions
In-Reply-To: <1094836885.4141e2953f7df@mcherm.com>
References: <1094836885.4141e2953f7df@mcherm.com>
Message-ID: <41436BF6.6080903@myrealbox.com>

Ok, so I understand that Guido doesn't want to extend the functions' API 
to have the full functionality. Fine.
However, I've suggested three things that I think should be done in that 
case, and nobody objected.
Here they are:
1. Add a prominent note in the module contents page or in the module's 
main page, stating that some functionality can only be achieved by using 
compiled REs.
2. Document the optional parameters which let you specify the start and 
end pos in the findall and finditer methods of a compiled RE object.
3. Add the optional parameter "flags" to the findall and finditer 
functions. Then, the four functions match, search, findall and finditer 
would have the same interface: function(pattern, string[, flags]).
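
A rough illustration of points 2 and 3, written against present-day
Python 3 (an assumption relative to the module being discussed here): the
module-level function takes flags but no positions, while the compiled
pattern's methods accept optional pos/endpos arguments.  The sample text
and pattern are made up for the example.

```python
import re

text = "Spam spam SPAM eggs spam"

# Module-level function: pattern, string, flags (no start/end positions).
all_hits = re.findall(r"spam", text, re.IGNORECASE)

# Compiled pattern: flags are fixed at compile time, and the methods
# accept optional pos/endpos arguments to restrict the search window.
pattern = re.compile(r"spam", re.IGNORECASE)
window_hits = pattern.findall(text, 5, 14)      # searches text[5:14] only
window_spans = [m.span() for m in pattern.finditer(text, 5, 14)]

print(all_hits)
print(window_hits)
print(window_spans)
```

Restricting the window is the part that has no module-level equivalent,
which is what the proposed documentation note would point out.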

Does anyone have any objections?

Noam

From tim.peters at gmail.com  Sun Sep 12 00:43:27 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sun Sep 12 00:43:41 2004
Subject: [Python-Dev] SHA-256 module
Message-ID: <1f7befae04091115436d5a70fa@mail.gmail.com>

[Michael Hudson, on 30 June 2004]
>> Nevertheless, am I right to still believe that there are no known
>> distinct strings which even MD5 to the same hash?

[Andrew Kuchling]
> Correct.

And two months later, the world is all different again:

"""
import md5

S = ('\xd11\xdd\x02\xc5\xe6\xee\xc4i=\x9a\x06\x98\xaf\xf9\\'
     '/\xca\xb5\x87\x12F~\xab@\x04X>\xb8\xfb\x7f\x89U\xad4'
     '\x06\t\xf4\xb3\x02\x83\xe4\x88\x83%qAZ\x08Q%\xe8\xf7'
     '\xcd\xc9\x9f\xd9\x1d\xbd\xf2\x807<[\x96\x0b\x1d\xd1'
     '\xdcA{\x9c\xe4\xd8\x97\xf4ZeU\xd55s\x9a\xc7\xf0\xeb'
     '\xfd\x0c0)\xf1f\xd1\t\xb1\x8fu\'\x7fy0\xd5\\\xeb"'
     '\xe8\xad\xbay\xcc\x15\\\xedt\xcb\xdd_\xc5\xd3m\xb1'
     '\x9b\n\xd85\xcc\xa7\xe3')

T = ('\xd11\xdd\x02\xc5\xe6\xee\xc4i=\x9a\x06\x98\xaf\xf9\\'
     '/\xca\xb5\x07\x12F~\xab@\x04X>\xb8\xfb\x7f\x89U\xad4'
     '\x06\t\xf4\xb3\x02\x83\xe4\x88\x83%\xf1AZ\x08Q%\xe8\xf7'
     '\xcd\xc9\x9f\xd9\x1d\xbdr\x807<[\x96\x0b\x1d\xd1\xdcA{'
     '\x9c\xe4\xd8\x97\xf4ZeU\xd55s\x9aG\xf0\xeb\xfd\x0c0)'
     '\xf1f\xd1\t\xb1\x8fu\'\x7fy0\xd5\\\xeb"\xe8\xad\xbayL'
     '\x15\\\xedt\xcb\xdd_\xc5\xd3m\xb1\x9b\nX5\xcc\xa7\xe3')

assert S != T
print md5.new(S).hexdigest()
print md5.new(T).hexdigest()
print "oops"
"""

A number of hash functions got cracked since this thread started, by
some researchers in China:

    http://eprint.iacr.org/2004/199.pdf

MD5 is truly dead now for "secure" applications.  Maybe someone who
gives a rip <wink> could update the docs.

Best I understand it, SHA-1 still stands, although a variant with half
the rounds has been cracked.  It does increase the desirability (IMO)
of adding SHA-256, lest SHA-1 get cracked too while Python 2.4.j is
still current.
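
For comparison, a minimal sketch of what hashing with SHA-256 next to MD5
and SHA-1 can look like.  It assumes the hashlib module of later Python
releases, which is not the module API under discussion in this thread;
the input bytes are arbitrary.

```python
import hashlib

data = b"Python-Dev"

# All three algorithms share the same update()/hexdigest() interface;
# only the digest length differs.
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name)
    h.update(data)
    print(name, h.digest_size * 8, "bits", h.hexdigest())

# Well-known digest of the empty message, handy as a sanity check.
empty_sha256 = hashlib.sha256(b"").hexdigest()
print(empty_sha256)
```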
From fredrik at pythonware.com  Sun Sep 12 13:23:15 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Sep 12 13:24:03 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292:
	SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org>	<413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org>	<413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org>
	<41436762.7040207@egenix.com>
Message-ID: <ci1bk5$9e3$1@sea.gmane.org>

M.-A. Lemburg wrote:

> You mean: a compressed shift table for Unicode patterns ?
> I'll have a look.

It's a lossy compression: the entire delta1 table is represented as
two 32-bit values, independent of the size of the source alphabet.
Works amazingly well, at least when combined with the BM-variant
it was designed for...
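
A sketch of the general idea in Python, not a copy of the stringlib code:
it assumes the "two 32-bit values" are a 32-bit membership mask built
from the low 5 bits of each pattern character plus a single skip
distance for the last pattern character.  The function name bloom_find
is made up for the example.

```python
def bloom_find(s, p):
    """Return the lowest index of p in s, or -1 (same contract as str.find)."""
    n, m = len(s), len(p)
    if m == 0:
        return 0
    if m > n:
        return -1
    mlast = m - 1
    last = p[mlast]
    skip = mlast - 1          # shift used after a failed candidate
    mask = 0                  # lossy "is this character in the pattern?" set
    for i in range(mlast):
        mask |= 1 << (ord(p[i]) & 31)
        if p[i] == last:
            skip = mlast - i - 1
    mask |= 1 << (ord(last) & 31)

    def maybe_in_pattern(ch):
        # False means "definitely absent"; True may be a false positive.
        return (mask >> (ord(ch) & 31)) & 1

    i = 0
    while i <= n - m:
        if s[i + mlast] == last:
            if s[i:i + mlast] == p[:mlast]:
                return i
            if i + m < n and not maybe_in_pattern(s[i + m]):
                i += m        # next character rules out every overlapping window
            else:
                i += skip
        elif i + m < n and not maybe_in_pattern(s[i + m]):
            i += m
        i += 1
    return -1
```

False positives in the mask only cost speed, never correctness, which is
why the table can be this small regardless of alphabet size.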

(I suppose it's too late for 2.4, but it would probably be a good
idea to switch to this algorithm in 2.5)

</F>


From mwh at python.net  Sun Sep 12 18:02:04 2004
From: mwh at python.net (Michael Hudson)
Date: Sun Sep 12 18:02:06 2004
Subject: [Python-Dev] SHA-256 module
In-Reply-To: <1f7befae04091115436d5a70fa@mail.gmail.com> (Tim Peters's
	message of "Sat, 11 Sep 2004 18:43:27 -0400")
References: <1f7befae04091115436d5a70fa@mail.gmail.com>
Message-ID: <2my8jfxypv.fsf@starship.python.net>

Tim Peters <tim.peters@gmail.com> writes:

> [Michael Hudson, on 30 June 2004]
>>> Nevertheless, am I right to still believe that there are no known
>>> distinct strings which even MD5 to the same hash?
>
> [Andrew Kuchling]
>> Correct.
>
> And two months later, the world is all different again:

Heh, I'd already blogged about that:

http://starship.python.net/crew/mwh/blog/nb.cgi/view/weblog/2004/08/18/0

> """
> import md5
>
> S = ('\xd11\xdd\x02\xc5\xe6\xee\xc4i=\x9a\x06\x98\xaf\xf9\\'
>      '/\xca\xb5\x87\x12F~\xab@\x04X>\xb8\xfb\x7f\x89U\xad4'
>      '\x06\t\xf4\xb3\x02\x83\xe4\x88\x83%qAZ\x08Q%\xe8\xf7'
>      '\xcd\xc9\x9f\xd9\x1d\xbd\xf2\x807<[\x96\x0b\x1d\xd1'
>      '\xdcA{\x9c\xe4\xd8\x97\xf4ZeU\xd55s\x9a\xc7\xf0\xeb'
>      '\xfd\x0c0)\xf1f\xd1\t\xb1\x8fu\'\x7fy0\xd5\\\xeb"'
>      '\xe8\xad\xbay\xcc\x15\\\xedt\xcb\xdd_\xc5\xd3m\xb1'
>      '\x9b\n\xd85\xcc\xa7\xe3')
>
> T = ('\xd11\xdd\x02\xc5\xe6\xee\xc4i=\x9a\x06\x98\xaf\xf9\\'
>      '/\xca\xb5\x07\x12F~\xab@\x04X>\xb8\xfb\x7f\x89U\xad4'
>      '\x06\t\xf4\xb3\x02\x83\xe4\x88\x83%\xf1AZ\x08Q%\xe8\xf7'
>      '\xcd\xc9\x9f\xd9\x1d\xbdr\x807<[\x96\x0b\x1d\xd1\xdcA{'
>      '\x9c\xe4\xd8\x97\xf4ZeU\xd55s\x9aG\xf0\xeb\xfd\x0c0)'
>      '\xf1f\xd1\t\xb1\x8fu\'\x7fy0\xd5\\\xeb"\xe8\xad\xbayL'
>      '\x15\\\xedt\xcb\xdd_\xc5\xd3m\xb1\x9b\nX5\xcc\xa7\xe3')
>
> assert S != T
> print md5.new(S).hexdigest()
> print md5.new(T).hexdigest()
> print "oops"
> """
>
> A number of hash functions got cracked since this thread started, by
> some researchers in China:
>
>     http://eprint.iacr.org/2004/199.pdf

Is there any resource that explains these guys' results any more fully?
The only examples I've seen only differ in a very few bits.

> MD5 is truly dead now for "secure" applications.

I'd say it's resting :)

> Maybe someone who gives a rip <wink> could update the docs.

> Best I understand it, SHA-1 still stands, although a variant with half
> the rounds has been cracked.  It does increase the desirability (IMO)
> of adding SHA-256, lest SHA-1 get cracked too while Python 2.4.j is
> still current.

I'm hardly an expert, but I'd still like to know more about this
attack.  If it's as limited as it could possibly be (i.e. it can only
make very specific strings differing by a handful of bits hash the
same) then it's only an issue for the paranoid.  If it's as wide as it
could possibly be it seems that all hash functions we currently know
could be doomed.

Cheers,
mwh

-- 
  Q: Isn't it okay to just read Slashdot for the links?
  A: No. Reading Slashdot for the links is like having "just one hit"
     off the crack pipe.
     -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#faq
From tim.peters at gmail.com  Sun Sep 12 21:44:30 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sun Sep 12 21:44:33 2004
Subject: [Python-Dev] SHA-256 module
In-Reply-To: <2my8jfxypv.fsf@starship.python.net>
References: <1f7befae04091115436d5a70fa@mail.gmail.com>
	<2my8jfxypv.fsf@starship.python.net>
Message-ID: <1f7befae0409121244712506d0@mail.gmail.com>

[Tim Peters]
...
>> A number of hash functions got cracked since this thread started, by
>> some researchers in China:
>>
>>     http://eprint.iacr.org/2004/199.pdf
 
[Michael Hudson]
> Is there any resource that explains these guys' results any more fully?

Not that I know of.  I've read that they're writing a paper on *how*
their approach works, but it will take time to finish it.  There's no
doubt that they're on to something.  Apparently the first version of
the paper provided collisions for a hash that wasn't actually MD5, due
(at least) to confusing endianness in places.  This was pointed out at
the conference, and by the next morning they produced two collisions
for "the real" MD5.

> The only examples I've seen only differ in a very few bits.

Probably due to the method, which apparently makes a sequence of
small, controlled changes, based more on analysis than on brute force.
 Given the uses of MD5 for verifying downloads, it doesn't take much
of a change to open "a security hole" in C code, so even if they can't
extend the method beyond a few bits' difference, that would be cold
comfort.  I note that they got to pick both msgs here, and haven't
claimed to be able to derive a collision for a given msg.  When more
about their method is known, it may or may not prove feasible to
extend.

>> MD5 is truly dead now for "secure" applications.

> I'd say it's resting :)

I based "truly dead" on press reaction.  MD5 had been falling out of
favor for years anyway (due to earlier cracks of various weakened
versions); this is just nail-in-the-coffin news.

> ...
> I'm hardly an expert, but I'd still like to know more about this
> attack.  If it's as limited as it could possibly be (i.e. it can only
> make very specific strings differing by a handful of bits hash the
> same) then it's only an issue for the paranoid.  If it's as wide as it
> could possibly be it seems that all hash functions we currently know
> could be doomed.

Security weenies are paranoid by necessity -- paranoia is part of
their field.  I'm not sure there's ever been a real-world attack based
on a "double free" bug, for example, but finding such a bug is
sufficient to kill a product release anyway.

They don't claim to have an attack against SHA-1, BTW.  Someone else
reported collisions using a grossly weakened SHA-1, with 42 rounds
instead of 80.
From martin at v.loewis.de  Sun Sep 12 23:51:27 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Sep 12 23:51:16 2004
Subject: [Python-Dev] SHA-256 module
In-Reply-To: <2my8jfxypv.fsf@starship.python.net>
References: <1f7befae04091115436d5a70fa@mail.gmail.com>
	<2my8jfxypv.fsf@starship.python.net>
Message-ID: <4144C4DF.5020100@v.loewis.de>

Michael Hudson wrote:
> I'm hardly an expert, but I'd still like to know more about this
> attack.  If it's as limited as it could possibly be (i.e. it can only
> make very specific strings differing by a handful of bits hash the
> same) then it's only an issue for the paranoid.  If it's as wide as it
> could possibly be it seems that all hash functions we currently know
> could be doomed.

The nicest summary I have seen on this so far was Tim Churches' message
<mailman.3198.1094942493.5135.python-list@python.org>. In his
terminology, "collision resistance" has been attacked (i.e. it is now
possible to create pairs of plaintexts that hash the same). "Preimage
resistance" and "2nd preimage resistance" remain unattacked, at least
with respect to this paper. IOW, it is still not possible to easily reconstruct
some plaintext given the hash (good for password hashing), and it is
still not possible to modify a given plaintext so that it still hashes
the same (good for signing).

However, trust in the "pseudo-randomness" of the hash is gone now -
for a cryptographically "secure" hash, it should not be possible to
create a collision until the sun collapses.
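
To make the distinction concrete, here is a toy sketch using a
deliberately weak 16-bit hash (the first two bytes of SHA-256 via
hashlib, a later module than anything in this thread).  It says nothing
about the actual MD5 attack; it only shows that finding *some* colliding
pair is a different, usually far cheaper, problem than matching a
message fixed in advance.  All names and messages are made up.

```python
import hashlib

def tiny_hash(msg, nbytes=2):
    """Deliberately weak hash: the first nbytes of SHA-256 (16 bits by default)."""
    return hashlib.sha256(msg).digest()[:nbytes]

def find_collision():
    """Attacker picks both messages: stop at the first repeated digest."""
    seen = {}
    for i in range(1 << 20):
        msg = b"msg-%d" % i
        digest = tiny_hash(msg)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg
    return None

def find_second_preimage(target_msg):
    """One message is fixed in advance: search for a different one matching it."""
    target = tiny_hash(target_msg)
    for i in range(1 << 20):
        msg = b"forged-%d" % i
        if msg != target_msg and tiny_hash(msg) == target:
            return msg, i + 1
    return None

a, b, collision_tries = find_collision()
original = b"pay Alice 10 dollars"
forged, preimage_tries = find_second_preimage(original)
print("collision after", collision_tries, "tries")
print("second preimage after", preimage_tries, "tries")
```

With a 16-bit digest the first search typically needs a few hundred
tries and the second tens of thousands; the gap grows enormously with
the digest size.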

Regards,
Martin
From dave.l.harrison at gmail.com  Mon Sep 13 03:34:06 2004
From: dave.l.harrison at gmail.com (David Harrison)
Date: Mon Sep 13 03:34:08 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
Message-ID: <a1581f70409121834130848e9@mail.gmail.com>

Hi all,

Quick pep265 summary : People frequently want to count the occurrences
of values in a dict, or sort the results of a d.items() call by value.
 This could be done by extending the current items() definition, or by
creating a new function for the dict object (both requiring a C
implementation).

I've had a read through pep265 a few times now, and every time I've had
two immediate reactions.

First, that I've been there too.  I've found myself innumerable times
needing to count the occurrences of values in a dict.  Second, however,
that dicts shouldn't be naturally sortable.  A dict does not guarantee the
order of the results returned by calls such as items(), keys() or values().
It's my feeling that we should not encourage people to rely on a dict
returning a set ordering, since as hash-based data structures dicts are
designed for key lookup, not sequential traversal - if you want to sort
something, massage the data into a list and then sort the list (I've seen
a proposal before that the sort function be able to handle objects, which
would allow sorting of 2-dimensional lists).

With regards to the two arguments put forward by Grant, the first - that
it is an idiom known only to experienced campaigners - does not seem to
be a supportable argument to me.   I think the problem has quite a
simple elegant solution which is rather easily discovered - there are
lots of differences in Python that require an inexperienced programmer
to learn a new idiom (such as the looping construct).

The second, that the solution is full of 'grunge', seems a matter of
taste and use to me.  As mentioned in the pep there are different kinds
of comparison that may be wanted, but could not be supported.  Further,
it is a natural use case of a dict that items held within it need not be
of the same type (and therefore makes the idea of a comparison between
them meaningless).

With respect to implementation suggestions, numbers 1 2 and 3 definitely
don't work for me.  To extend the usage of items() without similarly
extending the usage of keys() and values() would mean that we are
special casing the items() function in a way that makes it inconsistent
with the other dict functions.  Number 5 seems too specific to me.  I
could live with 4 ;-)

I think in the end it's my feeling that these kinds of idioms belong in
the cookbook - where, incidentally, this one already is to a certain extent
under 'Sorting a Dictionary'; another recipe could always be added
for this ;-)

cheers
Dave Harrison
From barry at python.org  Mon Sep 13 03:40:32 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep 13 03:40:36 2004
Subject: [Python-Dev] Re: PEP 328 - Relative Imports
In-Reply-To: <chtb5d$3hu$1@sea.gmane.org>
References: <413F1B87.90301@egenix.com><41416E76.8030603@egenix.com>
	<41419432.2000600@zope.com> <1094828671.30837.23.camel@geddy.wooz.org>
	<4141DEDC.8080503@egenix.com>
	<1094842075.30831.55.camel@geddy.wooz.org>
	<chtb5d$3hu$1@sea.gmane.org>
Message-ID: <1095039632.30217.7.camel@geddy.wooz.org>

On Fri, 2004-09-10 at 18:51, Nicolas Fleury wrote:

> I agree with Guido.  FWIW, I think imports should be absolute by default 
> and that the status quo is a mistake.  The __global__ solution makes 
> absolute imports too verbose, when they are usually in majority.  

I'm really not trying to argue strongly that __global__ is a solution,
but let me just point out that I think they wouldn't be that common. 
You'd add an __global__ only when the "normal" import statement didn't
do what you want, primarily because of a local module name that
conflicted with a global module, and you really wanted the global. 
Ordinarily, those conflicts don't occur.  OTOH, when they do, you can
often "fix" the problem by renaming your local module, but that's a bit
ugly.

-Barry

From barry at python.org  Mon Sep 13 03:49:37 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep 13 03:49:41 2004
Subject: [Python-Dev] Re: Alternative
	ImplementationforPEP292:SimpleString Substitutions
In-Reply-To: <4142B9B2.7060306@v.loewis.de>
References: <000e01c497cc$983d6720$e841fea9@oemcomputer>
	<4142B9B2.7060306@v.loewis.de>
Message-ID: <1095040177.30216.12.camel@geddy.wooz.org>

On Sat, 2004-09-11 at 04:39, "Martin v. Löwis" wrote:

> No, I think Brett (and apparently nearly everybody else) thinks that
> such a template will not be written over the course of the next five
> years, except for demonstration purposes. Instead, what will be written
> is u'¿Puede volver $today o $tomorrow?' because the template will be
> a translation of the original English template, and, during translation,
> placeholder names must not be changed (although I have difficulties
> imagining possible values for today or tomorrow so that this becomes
> meaningful).
> 
> > If end users always follow the rules, this will never come up.  If they
> > don't, should there be error message or a silent failure?
> 
> There is always a chance of a silent failure in SafeTemplates, even with
> this rule added - this is the purpose of SafeTemplates. With a Template,
> you will get a KeyError. In any case, the failure will not be completely
> silent, as the user will see $mañana show up in the output.
> 
> My prediction is that the typical application is to use Templates, as
> users know very well what the placeholders are. Furthermore, the
> typical application will use locals/globals/vars(), or dict(key="value")
> to create the replacement dictionary. In this application, nobody
> would even think of using mañana as a key, because you can't get
> it into the dictionary.
> 
> If this never comes up, it is better to not complicate the rules.
> Simple is better than complex.

I tend to agree, so I'd like to keep the rules as they currently stand. 
Your prediction is aligned with what I think the most common use cases
are too.
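
A small sketch of the two failure modes being discussed, using
string.Template as it later shipped, where the "safe" behaviour is the
safe_substitute() method rather than a separate SafeTemplate class.  The
replacement values are made up.

```python
from string import Template

t = Template("¿Puede volver $today o $tomorrow?")

# Every placeholder supplied: plain substitution.
full = t.substitute(today="hoy", tomorrow="mañana")

# A placeholder is missing: substitute() fails loudly with KeyError...
try:
    t.substitute(today="hoy")
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# ...while safe_substitute() leaves the placeholder visible in the output.
partial = t.safe_substitute(today="hoy")

print(full)
print(missing)
print(partial)
```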

-Barry

From adurdin at gmail.com  Mon Sep 13 04:17:13 2004
From: adurdin at gmail.com (Andrew Durdin)
Date: Mon Sep 13 04:17:25 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
In-Reply-To: <a1581f70409121834130848e9@mail.gmail.com>
References: <a1581f70409121834130848e9@mail.gmail.com>
Message-ID: <59e9fd3a04091219174fa7c0f4@mail.gmail.com>

On Mon, 13 Sep 2004 11:34:06 +1000, David Harrison
<dave.l.harrison@gmail.com> wrote:
> 
> With regards to the two arguments put forward by Grant, the first - that
> it is an idiom known only to experienced campaigners - does not seem to
> be a supportable argument to me.   I think the problem has quite a
> simple elegant solution which is rather easily discovered - there are
> lots of differences in Python that require an inexperienced programmer
> to learn a new idiom (such as the looping construct).

And of course, it is better to teach these idioms to newbies so they
become competent. A python newbie will be much better off if they
learn the decorate, sort, [undecorate] idiom and list comprehensions,
neither of which are particularly difficult; and both will serve the
newbie well in many other areas.
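
For reference, a minimal sketch of the idiom on a made-up dict, next to
the equivalent spelling with a key function (sorted() and its key
argument are assumed to be available, as in current Python):

```python
from operator import itemgetter

counts = {"spam": 3, "eggs": 1, "ham": 2}

# Decorate: build (value, key) pairs so plain tuple comparison sorts by value.
decorated = [(value, key) for key, value in counts.items()]
decorated.sort()
# Undecorate: put the pairs back into (key, value) order.
by_value_dsu = [(key, value) for value, key in decorated]

# Same result using a key function instead of decorating by hand.
by_value_key = sorted(counts.items(), key=itemgetter(1))

print(by_value_dsu)
print(by_value_key)
```

The two spellings differ only on ties: the decorated version falls back
to comparing keys, while the key-function version keeps the items in
their original relative order.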
 
> With respect to implementation suggestions, numbers 1 2 and 3 definitely
> don't work for me.  To extend the usage of items() without similarly
> extending the usage of keys() and values() would mean that we are
> special casing the items() function in a way that makes it inconsistent
> with the other dict functions.  Number 5 seems too specific to me.  I
> could live with 4 ;-)

To quote the PEP:
"""
Alternatively, items() could simply let us control the (key, value) 
order:

    (3) items(values_first=0)
"""
This suggestion No. 3 from the PEP does not special case the items()
function in a way that makes it "inconsistent with the other dict
functions" (i.e. keys(), values()); however it would suggest that
dict() then also ought to take such an inverted, values-first list of
tuples if given an optional values_first parameter. But this IMHO
makes the dict() constructor too complicated, as well as having a
potential conflict with named keywords.
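
A sketch of what suggestion 3 and its inverse would amount to, written
as ordinary helper functions.  Both function names are hypothetical;
neither dict.items() nor dict() takes a values_first argument.

```python
def items_values_first(d):
    """Hypothetical stand-in for the PEP's items(values_first=1)."""
    return [(value, key) for key, value in d.items()]

def dict_from_values_first(pairs):
    """Hypothetical inverse: rebuild a dict from (value, key) pairs."""
    return dict((key, value) for value, key in pairs)

d = {"a": 2, "b": 1}
pairs = sorted(items_values_first(d))       # sorts by value, then key
round_trip = dict_from_values_first(pairs)

print(pairs)
print(round_trip == d)
```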
From dave.l.harrison at gmail.com  Mon Sep 13 04:51:13 2004
From: dave.l.harrison at gmail.com (David Harrison)
Date: Mon Sep 13 04:51:19 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
In-Reply-To: <a1581f7040912192932fd8048@mail.gmail.com>
References: <a1581f70409121834130848e9@mail.gmail.com>
	<59e9fd3a04091219174fa7c0f4@mail.gmail.com>
	<a1581f7040912192932fd8048@mail.gmail.com>
Message-ID: <a1581f70409121951edde8e@mail.gmail.com>

> > With respect to implementation suggestions, numbers 1 2 and 3 definitely
> > don't work for me.  To extend the usage of items() without similarly
> > extending the usage of keys() and values() would mean that we are
> > special casing the items() function in a way that makes it inconsistent
> > with the other dict functions.  Number 5 seems too specific to me.  I
> > could live with 4 ;-)
>
> To quote the PEP:
> """
> Alternatively, items() could simply let us control the (key, value)
> order:
>
>     (3) items(values_first=0)
> """
> This suggestion No. 3 from the PEP does not special case the items()
> function in a way that makes it "inconsistent with the other dict
> functions" (i.e. keys(), values()); however it would suggest that
> dict() then also ought to take such an inverted, values-first list of
> tuples if given an optional values_first parameter. But this IMHO
> makes the dict() constructor too complicated, as well as having a
> potential conflict with named keywords.

In the sense that items() can still be used as before, it remains
consistent.  However since the same argument could be used to equally
promote the modification of other  dict functions to accept such
arguments - such as keys(values_first=0)  - to make the change to
items() alone is (in my humble opinion) inconsistent.
But that's just my opinion, others may feel differently.
From greg at cosc.canterbury.ac.nz  Mon Sep 13 04:59:41 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon Sep 13 04:59:48 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <4141D745.805@ieee.org>
Message-ID: <200409130259.i8D2xfoK008493@cosc353.cosc.canterbury.ac.nz>

> For Numeric/Numarray, I think and1/or1 would be unnecessary. If that 
> were true in general it would simplify the proposal significantly: 
> and2/or2 could be renamed to and/or and and1/or1 could be dropped.

It's true that none of the use cases I put forward need and1/or1. But
I was trying to think of the future and at least show how the general
case could be accommodated.

Leaving out and1/or1 would make things simpler, but at the risk of
someone coming up with a use case for them in the future, requiring
yet another change. Wouldn't it be best to get things right from the
beginning if possible?

Also, the simplification wouldn't be all that great.  There would
still be the need for two bytecodes per boolean operation to
accommodate either short-circuiting or not. All that would be saved is
testing for and calling the and1/or1 methods.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From greg at cosc.canterbury.ac.nz  Mon Sep 13 05:05:26 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon Sep 13 05:05:34 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <1094715983.1472.7.camel@debbie>
Message-ID: <200409130305.i8D35QmS008516@cosc353.cosc.canterbury.ac.nz>

> I like the PEP with 'and' and 'or', but isn't the 'not' special method
> essentially the inverse of __nonzero__?

No, because:

(1) __nonzero__ is restricted to returning a boolean result.

(2) There are other contexts besides 'not' in which __nonzero__
    gets called.
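
Both points can be seen with a small made-up class.  The sketch uses
Python 3, where the method is spelled __bool__ rather than __nonzero__,
and where a non-bool return value is rejected outright:

```python
class Flag:
    """Records every truth-value test made on the instance."""
    def __init__(self, value):
        self.value = value
        self.calls = 0

    def __bool__(self):              # spelled __nonzero__ in Python 2
        self.calls += 1
        return self.value

f = Flag(True)
negated = not f                      # 'not' consults the truth-value method
if f:                                # so does 'if'
    pass
chosen = f and "rhs"                 # so does 'and'
as_bool = bool(f)                    # so does bool()

class Bad:
    def __bool__(self):
        return "not a bool"          # anything other than a bool is rejected

try:
    not Bad()
    rejected = False
except TypeError:
    rejected = True

print(f.calls, negated, chosen, as_bool, rejected)
```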

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From tim.hochberg at ieee.org  Mon Sep 13 06:21:14 2004
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Mon Sep 13 06:21:24 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <200409130259.i8D2xfoK008493@cosc353.cosc.canterbury.ac.nz>
References: <4141D745.805@ieee.org>
	<200409130259.i8D2xfoK008493@cosc353.cosc.canterbury.ac.nz>
Message-ID: <4145203A.8070303@ieee.org>

Greg Ewing wrote:
>>For Numeric/Numarray, I think and1/or1 would be unnecessary. If that 
>>were true in general it would simplify the proposal significantly: 
>>and2/or2 could be renamed to and/or and and1/or1 could be dropped.
> 
> 
> It's true that none of the use cases I put forward need and1/or1. But
> I was trying to think of the future and at least show how the general
> case could be accommodated.
> 
> Leaving out and1/or1 would make things simpler, but at the risk of
> someone coming up with a use case for them in the future, requiring
> yet another change. Wouldn't it be best to get things right from the
> beginning if possible?

Sure, assuming using and1/or1 is the right approach. I'm not convinced 
it is. I think it would be better to start with something simple, but 
design the syntax so that it can be gracefully upgraded if compelling 
use cases emerge.

My first thought on seeing the current proposal was that the special 
method names need changing. and2/or2 should be just and/or since these 
are the methods that will actually be used. I don't have a good name for 
and1/or1, but it's probably not hard to be more descriptive than the 
current names. Some imperfect possibilities: shortcircand, scand, 
preand, andsc. scand is my favorite of these. I suppose and1 could even 
be kept and only and2 renamed.

After renaming stuff, we're halfway to the simpler solution. The next 
step is, having established both that it's possible to implement full, 
custom short circuiting as per your patch and that there are no use 
cases for the custom short circuiting yet, we then just drop scand/scor 
until a compelling use case shows up, if it ever does.

> Also, the simplification wouldn't be all that great.  There would
> still be the need for two bytecodes per boolean operation to
> accommodate either short-circuiting or not. All that would be saved is
> testing for and calling the and1/or1 methods.

I'll take your word for it that the implementation would not be 
appreciably simpler. However, conceptually it's much simpler without 
__and1__/__or1__. Explaining the full version looks difficult, so why 
burden ourselves with that if we don't have to?  At least not yet.

-tim

From adurdin at gmail.com  Mon Sep 13 06:21:38 2004
From: adurdin at gmail.com (Andrew Durdin)
Date: Mon Sep 13 06:21:44 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
In-Reply-To: <a1581f7040912192932fd8048@mail.gmail.com>
References: <a1581f70409121834130848e9@mail.gmail.com>
	<59e9fd3a04091219174fa7c0f4@mail.gmail.com>
	<a1581f7040912192932fd8048@mail.gmail.com>
Message-ID: <59e9fd3a04091221216aae55e9@mail.gmail.com>

On Mon, 13 Sep 2004 12:29:11 +1000, David Harrison
<dave.l.harrison@gmail.com> wrote:
> 
> In the sense that items() can still be used as before, it remains
> consistent.  However since the same argument could be used to equally
> promote the modification of other  dict functions to accept such
> arguments - such as keys(values_first=0)  - to make the change to
> items() alone is (in my humble opinion) inconsistent.
> But that's just my opinion, others may feel differently.

To me, neither mydict.keys(values_first=whatever) nor
mydict.values(values_first=whatever) make any sense: these methods
return a list of only keys or only values, so saying "values first"
when you're getting a list of keys is meaningless.
From stephen at xemacs.org  Mon Sep 13 06:21:32 2004
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon Sep 13 06:21:54 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP 292: Simple
	String Substitutions
In-Reply-To: <chuhn8$11s$1@sea.gmane.org> (Fredrik Lundh's message of "Sat,
	11 Sep 2004 11:51:23 +0200")
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com>
	<1094315138.8696.36.camel@geddy.wooz.org> <cheig3$ki8$1@sea.gmane.org>
	<413F1D9C.20209@egenix.com> <chnc49$psm$1@sea.gmane.org>
	<413F3605.7090707@egenix.com> <chnidf$epp$1@sea.gmane.org>
	<413F6120.7090603@egenix.com> <chuhn8$11s$1@sea.gmane.org>
Message-ID: <87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Fredrik" == Fredrik Lundh <fredrik@pythonware.com> writes:

    Fredrik> M.-A. Lemburg wrote:

    >>> (google for "stringlib" for some work I'm doing in this area)

    >> Ah, now I know where you're coming from :-) Shift tables don't
    >> work well in the Unicode world with its large alphabet.

    Fredrik> since most real-life text use characters from only a
    Fredrik> small number of regions in that alphabet,

This is true of "most real-life text", but it's going to be false most
of the time for a large (and rapidly growing) minority of users: those
working with texts comprised mostly of Asian ideographs.  Unihan
(spread over about 80 256-character rows) has a potential big problem:
because it is ordered by root, then stroke count, the simpler (and
usually more frequently used) ideographs with a common root cluster
near the root.  Whether those clusters frequently overlap based on a
simple compression method like "lowest 5 bits" I don't know offhand.
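
To make the "lowest 5 bits" question concrete, here is a minimal sketch
(modern Python 3, and not the stringlib code itself) that folds each
character onto its low 5 bits and counts how many of the 32 buckets end
up occupied.  The names low_bits_mask and buckets_used, and the sample
strings, are illustrative assumptions only.

```python
def low_bits_mask(text, bits=5):
    """Return a bitmask with one bit set per occupied low-bits bucket."""
    mask = 0
    for ch in text:
        # Keep only the lowest `bits` bits of the code point.
        mask |= 1 << (ord(ch) & ((1 << bits) - 1))
    return mask


def buckets_used(text, bits=5):
    """Count how many distinct buckets the characters of text fall into."""
    return bin(low_bits_mask(text, bits)).count("1")


# Three ASCII letters land in three different buckets.
print(buckets_used("abc"))
# Three ideographs whose code points differ by a multiple of 32 share a bucket.
print(buckets_used("\u4e00\u4e20\u4e40"))
```

Running something like this over a real corpus would show how often
frequently used ideographs actually collide; the sample above only shows
that collisions are possible.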

I don't know whether the composed Hangul (~ 40 rows) would show
clustering; that would depend on phonetic frequencies in the Korean
language.

Of course the find algorithm you present is almost surely a big win
over the brute-force method, even in the presence of some degree of
clustering in Unihan and Hangul.  But I worry that it's an exceptional
example, when you use assumptions like "real-life text uses characters
drawn from a small number of short contiguous regions in the alphabet."

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
From dave.l.harrison at gmail.com  Mon Sep 13 06:44:25 2004
From: dave.l.harrison at gmail.com (David Harrison)
Date: Mon Sep 13 06:44:28 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
In-Reply-To: <a1581f7040912214378869071@mail.gmail.com>
References: <a1581f70409121834130848e9@mail.gmail.com>
	<59e9fd3a04091219174fa7c0f4@mail.gmail.com>
	<a1581f7040912192932fd8048@mail.gmail.com>
	<59e9fd3a04091221216aae55e9@mail.gmail.com>
	<a1581f7040912214378869071@mail.gmail.com>
Message-ID: <a1581f7040912214457468e9b@mail.gmail.com>

> > In the sense that items() can still be used as before, it remains
> > consistent.  However since the same argument could be used to equally
> > promote the modification of other  dict functions to accept such
> > arguments - such as keys(values_first=0)  - to make the change to
> > items() alone is (in my humble opinion) inconsistent.
> > But that's just my opinion, others may feel differently.
>
> To me, neither mydict.keys(values_first=whatever) nor
> mydict.values(values_first=whatever) make any sense: these methods
> return a list of only keys or only values, so saying "values first"
> when you're getting a list of keys is meaningless.

Oops my mistake, I was thinking along the lines of requesting a sort
order based on value (i.e. keys() returns in the order of its values,
increasing or decreasing).

So my misunderstanding aside ;-) ...

We would have the following situation (just to clarify) :

>>> d = { 'a' : 1 , 'b' :2, 'c':0 }
>>> itemList = d.items(values_first=1)
>>> itemList
[(1, 'a'), (2, 'b'), (0, 'c')]
>>> itemList.sort()
>>> itemList
[(0, 'c'), (1, 'a'), (2, 'b')]

That is probably my preferred option for this usage.

But to play devil's advocate for a minute.  Considering that a dict is
a key based data structure and not a sequential structure, does it
really make sense to be able to request its inversion ?

For example :
>>> d = { 'a' : [1,2,3] , 'b' : [4,5,6], 'c':[7,8,9] }
>>> d.items(values_first=1)
[ ([1,2,3], 'a'), ([4,5,6], 'b'), ([7,8,9], 'c') ]

This just doesn't make sense, nor would it make sense if the values were objects.

The only use case raised was for counting instances of an item in a
dict, and then inverting _that_ dict (ie. the one that stored the
count values) .

e.g.

for key in d.keys():
        d[key] = d.get(key, 0) + 1

items = [(v,k) for k,v in d.items()]
items.sort()
items.reverse()
items = [(k,v) for v,k in items]
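
A runnable variant of the counting and decorate-sort-undecorate steps,
as a sketch only.  It assumes the counts are built from a separate
sequence (the made-up list words); the helper names count_occurrences
and items_by_count are hypothetical and not part of any proposal.

```python
def count_occurrences(seq):
    """Build a dict mapping each item of seq to the number of times it occurs."""
    counts = {}
    for item in seq:
        counts[item] = counts.get(item, 0) + 1
    return counts


def items_by_count(counts):
    """Return (key, count) pairs, highest count first, via decorate-sort-undecorate."""
    decorated = [(v, k) for k, v in counts.items()]  # decorate: count first
    decorated.sort()                                 # sort by count, then key
    decorated.reverse()                              # highest count first
    return [(k, v) for v, k in decorated]            # undecorate


words = ["spam", "eggs", "spam", "ham", "spam", "eggs"]
print(items_by_count(count_occurrences(words)))
```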

So I'll also raise again my question of whether it is reasonable to
implement a functionality that is not going to be used as a part of
the primary purpose of a dict.  This functionality extension is just
an implementation of one way of using a dict - not necessarily an
example of 'missing functionality' to me.
From ncoghlan at iinet.net.au  Mon Sep 13 06:46:53 2004
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Sep 13 06:47:50 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
In-Reply-To: <a1581f70409121834130848e9@mail.gmail.com>
References: <a1581f70409121834130848e9@mail.gmail.com>
Message-ID: <4145263D.8010402@iinet.net.au>

David Harrison wrote:
> Hi all,
> 
> Quick pep265 summary : People frequently want to count the occurrences
> of values in a dict, or sort the results of a d.items() call by value.
>  This could be done by extending the current items() definition, or by
> creating a new function for the dict object (both requiring a C
> implementation).

In Python 2.4:

->>> ud = dict(a=1, b=2, c=3)
->>> from operator import itemgetter
->>> print sorted(ud.items(), key=itemgetter(1), reverse=True)
[('c', 3), ('b', 2), ('a', 1)]


I'm not entirely sure who needs to be thanked for this addition, but it 
sure makes the 'decorate-sort-undecorate' idiom very, very easy to 
follow (which was, in fact, the point - I do remember that much of the 
discussion).

I think the addition of 'sorted', and the keyword arguments for both it 
and list.sort make PEP 265 somewhat redundant.
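
One detail worth knowing when swapping the old idiom for the keyword
arguments: the two do not order ties the same way.  A small sketch
(the sample data is made up) comparing sorted(), in-place list.sort()
and a hand-written decorate-sort-undecorate:

```python
from operator import itemgetter

items = [("a", 1), ("b", 2), ("c", 1)]

# key= with reverse=True keeps equal-valued items in their original order.
by_key = sorted(items, key=itemgetter(1), reverse=True)

# The same keywords work for an in-place sort of a list.
in_place = list(items)
in_place.sort(key=itemgetter(1), reverse=True)

# Decorate-sort-undecorate compares the keys as well, so ties come out
# in reverse key order after the reverse() step.
decorated = [(v, k) for k, v in items]
decorated.sort()
decorated.reverse()
by_dsu = [(k, v) for v, k in decorated]

print(by_key)
print(in_place)
print(by_dsu)
```

With distinct values, as in the example above, all three agree.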

Cheers,
Nick.
From shane.holloway at ieee.org  Mon Sep 13 06:59:08 2004
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Mon Sep 13 06:59:39 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <4145203A.8070303@ieee.org>
References: <4141D745.805@ieee.org>	<200409130259.i8D2xfoK008493@cosc353.cosc.canterbury.ac.nz>
	<4145203A.8070303@ieee.org>
Message-ID: <4145291C.6000204@ieee.org>

Tim Hochberg wrote:
 > that there are no use
 > cases for the custom short circuiting yet, we then just drop scand/scor
 > until a compelling use case shows up, if it ever does.

A boolean calculus (predicate) engine would make use of 
short-circuiting.  Or perhaps a state machine would make use of this 
feature.  I agree with Greg that I'd rather the implementation be 
"complete".  Computer Scientists have already been down this road, and 
we know that there are two useful forms.  :)

I like [__and1__, __and__, __or__, __or1__] -- the abbreviation would 
have to be documented anyway, and the '1' says "one argument: self" to me.

Respectfully,
-Shane Holloway
From dave.l.harrison at gmail.com  Mon Sep 13 08:15:42 2004
From: dave.l.harrison at gmail.com (David Harrison)
Date: Mon Sep 13 08:15:47 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
In-Reply-To: <4145263D.8010402@iinet.net.au>
References: <a1581f70409121834130848e9@mail.gmail.com>
	<4145263D.8010402@iinet.net.au>
Message-ID: <a1581f7040912231578399068@mail.gmail.com>

> > Quick pep265 summary : People frequently want to count the occurrences
> > of values in a dict, or sort the results of a d.items() call by value.
> >  This could be done by extending the current items() definition, or by
> > creating a new function for the dict object (both requiring a C
> > implementation).
> 
> In Python 2.4:
> 
> ->>> ud = dict(a=1, b=2, c=3)
> ->>> from operator import itemgetter
> ->>> print sorted(ud.items(), key=itemgetter(1), reverse=True)
> [('c', 3), ('b', 2), ('a', 1)]
> 
> I'm not entirely sure who needs to be thanked for this addition, but it
> sure makes the 'decorate-sort-undecorate' idiom very, very easy to
> follow (which was, in fact, the point - I do remember that much of the
> discussion).
> 
> I think the addition of 'sorted', and the keyword arguments for both it
> and list.sort make PEP 265 somewhat redundant.

Seems like another solution to the problem, which makes this pep even
less meaningful I'd say.  Guess this pep should be closed then ?
From fredrik at pythonware.com  Mon Sep 13 08:53:53 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Sep 13 08:52:03 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for PEP 292:
	SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com><1094315138.8696.36.camel@geddy.wooz.org>
	<cheig3$ki8$1@sea.gmane.org><413F1D9C.20209@egenix.com>
	<chnc49$psm$1@sea.gmane.org><413F3605.7090707@egenix.com>
	<chnidf$epp$1@sea.gmane.org><413F6120.7090603@egenix.com>
	<chuhn8$11s$1@sea.gmane.org>
	<87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <ci3g2d$m3g$1@sea.gmane.org>

Stephen J. Turnbull wrote:

> But I worry that it's an exceptional example, when you use assumptions
> like "real-life text uses characters drawn from a small number of short
> contiguous regions in the alphabet."

The problem is that I cannot tell if you've studied search issues, or if you're
just applying general "but wait, it's different for asian languages" arguments
here.

There are many issues here, all pulling in different directions:

- If you look at usage statistics, you'll find that the absolute majority
of all searches are for a single character (usually separators, like colons,
spaces, commas).  The second largest category is computer-level
keywords (usually pure ASCII, also in localized programs), used to
process network protocols, file formats, message headers, etc.  Searches
for "human text" are not that common, really, and search terms are usually
limited to only a few words.

- This means that most searches have exactly the same characteristics,
independent of the locale.  Even if a new algorithm would only be better
for pure-ASCII text, everyone would benefit.

- As for non-ASCII search terms, the "human text" search terms are
usually shorter in languages with many ideographs (my non-scientific
tests indicate that Chinese text uses about 4 times fewer symbols than
English; I'm sure someone can dig up better figures).

- This means that even if you are more likely to get collisions in the
compressed skip table, there are fewer characters in the table.

- This means that you'll probably be able to make long skips as often
as for non-Asian text.

- On the other hand, the long skips are shorter than for non-Asian text,
so you may have to make more of them.

- On the other hand, the target strings are also likely to be shorter, so
that might not matter.

- And so on.
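
For readers who have not looked at skip tables before, a minimal sketch
of a Horspool-style search with a plain (uncompressed) skip table.  This
is a textbook variant written for illustration in modern Python 3, not
the algorithm under discussion, and horspool_find is a hypothetical name.

```python
def horspool_find(text, pattern):
    """Return the lowest index of pattern in text, or -1 if it is absent."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For every pattern character except the last one, record the distance
    # from its last occurrence to the end of the pattern.
    skip = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            return i
        # Shift according to the text character aligned with the pattern's
        # last position; characters not in the table allow a full-length skip.
        i += skip.get(text[i + m - 1], m)
    return -1


print(horspool_find("key: value, other: thing", "other"))
print(horspool_find("\u6771\u4eac\u90fd\u5343\u4ee3\u7530\u533a", "\u5343\u4ee3\u7530"))
```

The longer the pattern and the fewer of its characters occur in the
text, the more often the full-length skip applies, which is the
trade-off the points above are weighing.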

The only way to know for sure is if anyone has the time and energy to carry
out tests on real-life datasets.  (or at least prepare some datasets; I can run
the tests if someone provides me with a file with search terms and a number
of files containing texts to apply them to, preferably using UTF-8 encoding).

</F> 


From ncoghlan at iinet.net.au  Mon Sep 13 13:46:47 2004
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Sep 13 13:47:37 2004
Subject: [Python-Dev] PEP 265 - Sorting dicts by value
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060F8D@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F8D@UKDCX001.uk.int.atosorigin.com>
Message-ID: <414588A7.3040602@iinet.net.au>

Moore, Paul wrote:
> From: Nick Coghlan
> 
>>In Python 2.4:
>>
>>->>> ud = dict(a=1, b=2, c=3)
>>->>> from operator import itemgetter
>>->>> print sorted(ud.items(), key=itemgetter(1), reverse=True)
>>[('c', 3), ('b', 2), ('a', 1)]
> 
> 
> If you haven't done so already, I think this should be submitted
> to the cookbook. It's a nice idiom, and demonstrates some useful
> Python 2.4 features, and how they work well in combination.

It's submitted now.

I have a feeling Raymond is the one who should get the credit for the 
approach, though. I'd be surprised if he made it through the discussions 
about the introduction of sorted without using this example at least once :)

Cheers,
Nick.

From gmccaughan at synaptics-uk.com  Mon Sep 13 14:32:00 2004
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Mon Sep 13 14:32:34 2004
Subject: [Python-Dev] AlternativeImplementation	forPEP292:SimpleString
	Substitutions
In-Reply-To: <87mzzxqmvn.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<200409101257.13802.gmccaughan@synaptics-uk.com>
	<87mzzxqmvn.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <200409131332.00927.gmccaughan@synaptics-uk.com>

On Saturday 2004-09-11 08:35, Stephen J. Turnbull wrote:

>     >> But [efficiency], as such, is important only to efficiency
>     >> fanatics.
> 
>     Gareth> No, it's important to ... well, people to whom efficiency
>     Gareth> matters. There's no need for them to be fanatics.
> 
> If it matters just because they care, they're fanatics.  If it matters
> because they get some other benefit (response time less than the
> threshold of notice, twice as many searches per unit time, half as
> many boxes to serve a given load), they're not.  </F>'s talk of many
> ways to do things "and Python should account for most of them" strikes
> me as fanaticism by that definition; the vast majority of developers
> will never deal with the special cases, or write apps that anticipate
> dealing with huge ASCII strings.  Those costs should be borne by the
> developers who do, and their clients.

I am unconvinced that "the vast majority of developers"
will not have work to do that involves a large volume of
ASCII data ... but I'm not sure this is something either
of us is in a position to know. (If it turns out that
you're just completing a PhD thesis entitled "Use of
large-volume string data among software developers",
or something, then please accept my apologies for guessing
wrong and enlighten me!)

> I apologize for shoehorning that into my reply to you.

That's OK.

>     >> The question is, how often are people going to notice that when
>     >> they have pure ASCII they get a 100% speedup [...]?
> 
>     Gareth> Why is that the question, rather than "how often are
>     Gareth> people going to benefit from getting a 100% speedup when
>     Gareth> they have pure ASCII"?
> 
> Because "benefit" is very subjective for _one_ person, and I don't
> want to even think about putting coefficients on your benefit versus
> mine.  If the benefit is large enough, a single person will be willing
> to do the extra work.  The question is, should all Python users and
> developers bear some burden to make it easier for that person to do
> what he needs to do?

"Burden" is just as subjective as "benefit". But let's take
a look at these burdens and benefits.

  - Burden for a very small number of Python developers:
    having to write and maintain a larger body of code,
    with duplication (at least of purpose) between Unicode
    and ASCII strings.

      - Consequent burden on all Python users: more risk
        of those developers getting burned out and giving
        up, less time for them to work on other aspects of
        Python, more danger of bugs in code, larger executables.

        They won't notice this, of course.

  + Benefit for a small (but not nearly so small) number of
    Python users: important code runs twice as fast, and
    this makes a real difference to them.

      + Consequent benefit for all Python users: more
        use of Python means more people contributing
        code, bug reports, useful libraries, etc.

        They won't notice this, either.

  + Benefit for all Python users: some of their code runs
    a little faster.

    They won't notice this, either.

Perhaps I'm being obtuse, but it's far from clear to me that
this is a net loss for Python users at large. In any case,
the burdens seem less likely to be noticed than the benefits.

> I think "notice" is something you can get consensus on.  If a lot of
> people are _noticing_ the difference, I think that's a reasonable rule
> of thumb for when we might want to put "it", or facilities for making
> individual efforts to deal with "it" simpler, into "standard Python"
> at some level.  If only a few people are noticing, let them become
> expert at dealing with it.

But even if "noticing the difference" is the key point,
it is a mistake (I think) to make it specifically "noticing
that when they have pure ASCII they get a 100% speedup".
Hence my comment quoted below:

>     Gareth> Or even "how often are people going to try out Python on
>     Gareth> an application that uses pure-ASCII strings, and decide to
>     Gareth> use some other language that seems to do the job much
>     Gareth> faster"?
> 
> See?  You're now using a "notice" standard, too.  I don't think that's
> an accident.

It isn't. It's because I was replying to someone who apparently
took "notice" standards as the only relevant ones, in order to
point out that even with that assumption there are relevant
questions other than "will anyone notice getting a speedup when
their data are pure ASCII?".

And I, in turn, apologize for shoehorning all *that* into the
word "even". :-)

I still think, though, that a "notice" standard makes for
bad designs. Most people would not notice if all floating-point
operations gave results with the last couple of bits wrong,
but it is a good thing that they don't. Some people wouldn't
notice but would get badly unsatisfactory results. Some people
would notice but would find it impractical to work around the
problems because that would mean tons of code and major losses
in speed.

Most people would not notice if by inserting the magic word
"wibble" at the start of their programs they could make them
10 times faster, but if for some weird reason it were possible
to make that so (but not possible to provide the speedup for
programs without "wibble") then it should be done.

What people notice is easier to define and to measure
than what actually makes a difference to them. That is
not enough reason to treat it as the only criterion.

-- 
g

From stephen at xemacs.org  Mon Sep 13 16:00:57 2004
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon Sep 13 16:01:05 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for PEP 292:
	SimpleString Substitutions
In-Reply-To: <ci3g2d$m3g$1@sea.gmane.org> (Fredrik Lundh's message of "Mon,
	13 Sep 2004 08:53:53 +0200")
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com>
	<1094315138.8696.36.camel@geddy.wooz.org> <cheig3$ki8$1@sea.gmane.org>
	<413F1D9C.20209@egenix.com> <chnc49$psm$1@sea.gmane.org>
	<413F3605.7090707@egenix.com> <chnidf$epp$1@sea.gmane.org>
	<413F6120.7090603@egenix.com> <chuhn8$11s$1@sea.gmane.org>
	<87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>
	<ci3g2d$m3g$1@sea.gmane.org>
Message-ID: <87pt4qp8ti.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Fredrik" == Fredrik Lundh <fredrik@pythonware.com> writes:

    Fredrik> Stephen J. Turnbull wrote:

    >> But I worry that it's an exceptional example, when you use
    >> assumptions like "real-life text uses characters drawn from a
    >> small number of short contiguous regions in the alphabet."

    Fredrik> The problem is that I cannot tell if you've studied
    Fredrik> search issues,

Enough to understand Boyer-Moore and how the proposed algorithm
differs, and to recognize that your statements about the distribution
of search applications are true.  Not that I want to argue about
search, I'm all in favor of better search.  I was startled to read
that Python still uses a brute-force algorithm for searching.

My point about distribution of ideographs was simply that you made an
unjustified assumption in the context of what is (to me, anyway) an
important subdomain of text processing.  Here, it is "obviously
harmless," but that's because brute force search is so bad.  In other
applications, or with a better status quo, there very well may be real
tradeoffs between what's good for 8-bit text and what's good for
Unicode.

    Fredrik> or if you're just applying general "but wait, it's
    Fredrik> different for asian languages" arguments here.

No, I know that ostrich won't fly.

    Fredrik> Searches for "human text" are not that common, really,
    Fredrik> and search terms are usually limited to only a few words.

In the context of PEP 292 is a focus on "human text" unwarranted?
After all, what motivated the PEP and the implementation was evidently
"human text" processing.  In my experience, the notation for
interpolation it uses would have much bigger advantages over the
format string style for "human text" than for the "non-human text"
applications I know of.  Not that it's useless for the latter, just
that it's much more of a luxury there.

If that's valid, there's a point where it makes sense for people who
develop human-text-oriented features based on Unicode strings to say
"pick the features you really want for 8-bit strings, because you have
to support them yourselves."

    Fredrik> The only way to know for sure is if anyone has the time
    Fredrik> and energy to carry out tests on real-life datasets.  (or
    Fredrik> at least prepare some datasets;

I can prepare datasets and do some statistical work for Japanese, but
it probably won't happen this month.  Sounds like a worthwhile thing
to have around, though.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
From barry at python.org  Mon Sep 13 16:32:55 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep 13 16:33:01 2004
Subject: [Python-Dev] Re: Alternative
	ImplementationforPEP292:SimpleString Substitutions
In-Reply-To: <003501c49784$b8292d00$e841fea9@oemcomputer>
References: <003501c49784$b8292d00$e841fea9@oemcomputer>
Message-ID: <1095085975.10676.40.camel@geddy.wooz.org>

On Fri, 2004-09-10 at 18:22, Raymond Hettinger wrote:

> > My only problem with that is the interference that the 'mapping'
> > argument presents.  IOW, kwds can't contain 'mapping'. 
> 
> To support a case where both a mapping and keywords are present, perhaps
> an auxiliary class could simplify matters:
> 
>    def substitute(self, mapping=None, **kwds):
>        if mapping is None:
>            mapping = kwds
>        elif kwds:
>            mapping = _altmap(kwds, mapping)
>         . . .
> 
> class _altmap:
>     def __init__(self, primary, secondary):
>         self.primary = primary
>         self.secondary = secondary
>     def __getitem__(self, key):
>         try:
>             return self.primary[key]
>         except KeyError:
>             return self.secondary[key]

> This matches the way keywords are used with the dict().

This isn't exactly what I was concerned about, but I agree that it's a
worthwhile approach.  (I'm going to accept your patch and check it in,
with slight modifications.)

What I was worried about was if you're providing 'mapping' positionally,
and kwds contained a 'mapping' key, you'll get a TypeError.  I'm going
to change the positional argument to '__mapping' so collisions of that
kind are less likely, and will document it in libstring.tex.
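
The collision is easy to reproduce with a toy stand-in.  The function
below is a hypothetical module-level sketch with the same signature
shape (minus self), not the Template method itself:

```python
def substitute(mapping=None, **kwds):
    """Toy stand-in: return the mapping that would be used for lookups."""
    if mapping is None:
        mapping = kwds
    return mapping


# Keywords alone are fine.
print(substitute(who="tim"))

# A positional mapping plus a keyword that happens to be called 'mapping'
# binds the same parameter twice, so the call itself fails.
try:
    substitute({"who": "tim"}, mapping="oops")
except TypeError as exc:
    print("TypeError:", exc)
```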

-Barry

From barry at python.org  Mon Sep 13 16:42:12 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep 13 16:42:18 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <4142E78C.7010800@heneryd.com>
References: <4142E78C.7010800@heneryd.com>
Message-ID: <1095086532.10677.46.camel@geddy.wooz.org>

On Sat, 2004-09-11 at 07:54, Erik Heneryd wrote:

> * Too long
> 10/15 character names for something so simple it up until now just 
> needed a %?  Programs using templates will probably use them 
> frequently...  I'd prefer sub instead of substitute.

Noted, thanks.  In general I'm not a fan of abbreviations in APIs
though.  Also note that it is trivial for applications to derive and
override __mod__(), aliasing it to whichever version of substitute()
they want.  For example, I plan on multiply inheriting Template and
unicode, and aliasing __mod__() to safe_substitute().
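
A minimal sketch of the aliasing part, written against the
string.Template class as it exists in current Python 3.  The subclass
name SafeTemplate is made up, and the multiple inheritance from unicode
is deliberately left out:

```python
from string import Template


class SafeTemplate(Template):
    """Template whose % operator is an alias for safe_substitute()."""

    def __mod__(self, mapping):
        return self.safe_substitute(mapping)


t = SafeTemplate("$who likes $what")
# 'what' is missing from the mapping, so its placeholder is left in place.
print(t % {"who": "tim"})
```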

> * Safe?
> safe_substitution doesn't tell you much upon first glance.  Safe?  In 
> what way?  You could even argue that the "plain" version really is the 
> safer one, as you'll notice typos and thus get a more solid program.  I 
> think a name hinting that this method uses the var name as a fallback 
> would be better, but can't think of (a short) one...  defaultsub? 
> fallbacksub?  loosesub?  Guess I could live with safe, but...

Yeah, that's the problem, there are no good alternatives.  As for which
version is "safer", when you're using Templates in an i18n environment,
where the actual Template you're going to be interpolating into comes
from 3rd party language translation teams, the safe_substitute() version
is definitely safer to the application.
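
To illustrate the difference, a sketch using string.Template from
current Python 3.  The template text stands in for a translated string
containing a placeholder typo; the message and names are made up:

```python
from string import Template

# Imagine this came back from a translator with '$name' mistyped.
translated = Template("Hello $nmae")
values = {"name": "Guido"}

# substitute() refuses to guess and raises KeyError for the unknown name.
try:
    translated.substitute(values)
except KeyError as exc:
    print("KeyError:", exc)

# safe_substitute() leaves the unknown placeholder in the output instead.
print(translated.safe_substitute(values))
```

The first behaviour catches the typo during development; the second
keeps a running application from falling over on a bad catalog entry.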

Thanks for the feedback.
-Barry

From barry at python.org  Mon Sep 13 16:44:37 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep 13 16:44:42 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <4142F7D0.5080807@heneryd.com>
References: <4142E78C.7010800@heneryd.com>  <4142F7D0.5080807@heneryd.com>
Message-ID: <1095086677.10672.49.camel@geddy.wooz.org>

On Sat, 2004-09-11 at 09:04, Erik Heneryd wrote:

> Come to think of it, I really like the more OO-ish approach better, than 
> to cram everything into a single class.  Is the safe_substitute really 
> that special it deserves a special method?

Yes.

>   Is it really the one, true 
> way to do a "safe" substitution?  

Probably not.

> IIRC DOS and sh don't agree, so it's 
> not that obvious.

I'm sorry I don't follow that one.

> I say keep the inheritance thing, it's much more flexible, and delegate 
> the KeyError condition to an overridable method.

After the lengthy discussions on python-dev, I'm viewing the role of the
Template class a little differently, so I think it's fine to put them
both in one class.

-Barry

From Paul.Moore at atosorigin.com  Mon Sep 13 16:50:45 2004
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Mon Sep 13 16:50:50 2004
Subject: [Python-Dev] Re: AlternativeImplementationforPEP292:SimpleString
	Substitutions
Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060F8F@UKDCX001.uk.int.atosorigin.com>

From: Barry Warsaw
> What I was worried about was if you're providing 'mapping' positionally,
> and kwds contained a 'mapping' key, you'll get a TypeError.  I'm going
> to change the positional argument to '__mapping' so collisions of that
> kind are less likely, and will document it in libstring.tex.

Can't you do something like

    def substitute(self, *args, **kwds):
        if len(args) > 1:
            raise TypeError # mild hack...
        if len(args) == 1:
            mapping = args[0]
            mapping.update(kwds)
        else:
            mapping = kwds

        # etc...

This avoids the use of a strangely-named positional argument, at the cost
of a check for too many positional arguments (because the interpreter no
longer does it)
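
One caveat with the sketch above is that mapping.update(kwds) modifies
the caller's mapping.  A variant that avoids this, assuming modern
Python 3 where collections.ChainMap is available (it was not in 2004);
build_mapping is a hypothetical helper name used only for illustration:

```python
from collections import ChainMap


def build_mapping(*args, **kwds):
    """Combine an optional positional mapping with keyword arguments."""
    if len(args) > 1:
        raise TypeError(
            "expected at most 1 positional argument, got %d" % len(args))
    if not args:
        return kwds
    if not kwds:
        return args[0]
    # Keywords are consulted first; the caller's mapping is never modified.
    return ChainMap(kwds, args[0])


defaults = {"who": "tim", "what": "spam"}
combined = build_mapping(defaults, mapping="no collision", what="eggs")
print(combined["who"], combined["what"], combined["mapping"])
print(defaults)
```

Because nothing is bound by name, a keyword called 'mapping' is just
another key.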

Paul.


From mnot at mnot.net  Sat Sep 11 07:24:29 2004
From: mnot at mnot.net (Mark Nottingham)
Date: Mon Sep 13 17:19:50 2004
Subject: [Python-Dev] Re: [Web-SIG] Adding status code constants to httplib
In-Reply-To: <414193D5.6010405@andreweland.org>
References: <414193D5.6010405@andreweland.org>
Message-ID: <DAB8C847-03B2-11D9-A26E-000A95BD86C0@mnot.net>

FYI; status codes as exceptions;
   http://www.mnot.net/python/http/status.py


On Sep 10, 2004, at 9:45 PM, Andrew Eland wrote:

> Hi,
>
> Over in web-sig, we're discussing PEP 333, the Web Server Gateway  
> Interface. Rather than defining our own set of constants for the HTTP  
> status code integers, we thought it would be a good idea to add them  
> to httplib, allowing other applications to benefit. I've uploaded a  
> patch[1] to httplib.py and the corresponding documentation. Do people  
> think this is a good idea?
>
>   -- Andrew Eland (http://www.andreweland.org)
>
> [1]  
> http://sourceforge.net/tracker/index.php?func=detail&aid=1025790&group_id=5470&atid=305470
>
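
For reference, a sketch of what named status constants look like in
use.  It is written against current Python 3, where httplib has been
renamed http.client, and says nothing about the exact names in the
patch under discussion:

```python
import http.client
from http import HTTPStatus

# Module-level constants compare equal to the plain integers.
print(int(http.client.OK), int(http.client.NOT_FOUND))

# The HTTPStatus enum also carries the standard reason phrase.
print(HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.phrase)


def is_success(status):
    """Return True for 2xx status codes."""
    return http.client.OK <= status < http.client.MULTIPLE_CHOICES


print(is_success(204), is_success(404))
```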

--
Mark Nottingham     http://www.mnot.net/

From tim.hochberg at cox.net  Mon Sep 13 17:05:29 2004
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Sep 13 17:19:52 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <4145291C.6000204@ieee.org>
References: <4141D745.805@ieee.org>	<200409130259.i8D2xfoK008493@cosc353.cosc.canterbury.ac.nz>
	<4145203A.8070303@ieee.org> <4145291C.6000204@ieee.org>
Message-ID: <4145B739.4040803@cox.net>

Shane Holloway (IEEE) wrote:

> Tim Hochberg wrote:
> > that there are no use
> > cases for the custom short circuiting yet, we then just drop scand/scor
> > until a compelling use case shows up, if it ever does.
>
> A boolean calculus (predicate) engine would make use of 
> short-circuiting.  Or perhaps a state machine would make use of this 
> feature.  I agree with Greg that I'd rather the implementation be 
> "complete".  Computer Scientists have already been down this road, and 
> we know that there are two useful forms.  :)

I have no objections if someone can actually come up with use cases. 
However, I still think the names should change: and2/or2 will be used 
the vast majority of the time. Of course, my earlier suggestion to use 
and/or is completely bogus since that's what &/| map to. Doh!

Still, I think the use cases need to be more concrete than what we've 
seen so far. I can come up with a case where short circuiting could be 
used in numarray, but not one where I think it should, so I won't be of 
any help here.

>
> I like [__and1__, __and__, __or__, __or1__] -- the abbreviation would 
> have to be documented anyway, and the '1' says "one argument: self" to 
> me.


Sadly, and/or are already taken. I don't think this helps the and1/and2 
case much though -- having three methods and/and1/and2 is just 
confusing. Maybe booland or logand or logicaland? I dunno, none of those 
are particularly satisfying.

Regards,

-tim


From barry at python.org  Mon Sep 13 17:24:19 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep 13 17:24:23 2004
Subject: [Python-Dev] Re:
	AlternativeImplementationforPEP292:SimpleString Substitutions
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060F8F@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8803060F8F@UKDCX001.uk.int.atosorigin.com>
Message-ID: <1095089059.10677.94.camel@geddy.wooz.org>

On Mon, 2004-09-13 at 10:50, Moore, Paul wrote:
> From: Barry Warsaw
> > What I was worried about was if you're providing 'mapping' positionally,
> > and kwds contained a 'mapping' key, you'll get a TypeError.  I'm going
> > to change the positional argument to '__mapping' so collisions of that
> > kind are less likely, and will document it in libstring.tex.
> 
> Can't you do something like
> 
>     def substitute(self, *args, **kwds):
>         if len(args) > 1:
>             raise TypeError # mild hack...
>         if len(args) == 1:
>             mapping = args[0]
>             mapping.update(kwds)
>         else:
>             mapping = kwds
> 
>         # etc...
> 
> This avoids the use of a strangely-named positional argument, at the cost
> of a check for too many positional arguments (because the interpreter no
> longer does it)

Nice.  That's a better hack IMO than the crappy argument name hack.
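
For illustration, here is a minimal runnable sketch of the *args approach
quoted above.  The class name SketchTemplate and its toy $name regex are
assumptions made for this example, not the PEP 292 implementation.  Unlike
the quoted snippet, it copies the positional mapping before merging the
keywords, so the caller's dict is not mutated.

```python
import re


class SketchTemplate:
    """Toy stand-in for a PEP 292 template (hypothetical, for illustration)."""

    _pattern = re.compile(r"\$([_a-zA-Z][_a-zA-Z0-9]*)")

    def __init__(self, template):
        self.template = template

    def substitute(self, *args, **kwds):
        # The interpreter no longer checks the positional count, so do it here.
        if len(args) > 1:
            raise TypeError("substitute() takes at most one positional argument")
        if args:
            # Copy so the caller's mapping is left untouched.
            mapping = dict(args[0])
            mapping.update(kwds)
        else:
            mapping = kwds
        return self._pattern.sub(lambda m: str(mapping[m.group(1)]), self.template)


t = SketchTemplate("$mapping maps to $value")
caller_dict = {"value": 1}
# 'mapping' can be passed as a keyword without colliding with a parameter name.
result = t.substitute(caller_dict, mapping="m")
print(result)
```

Because no parameter is named 'mapping', a keyword of that name is just
another substitution key.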

Thanks,
-Barry

From stephen at xemacs.org  Mon Sep 13 18:01:21 2004
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon Sep 13 18:01:28 2004
Subject: [Python-Dev] AlternativeImplementation	forPEP292:SimpleString
	Substitutions
In-Reply-To: <200409131332.00927.gmccaughan@synaptics-uk.com> (Gareth
	McCaughan's message of "Mon, 13 Sep 2004 13:32:00 +0100")
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<200409101257.13802.gmccaughan@synaptics-uk.com>
	<87mzzxqmvn.fsf@tleepslib.sk.tsukuba.ac.jp>
	<200409131332.00927.gmccaughan@synaptics-uk.com>
Message-ID: <87d60qp38u.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Gareth" == Gareth McCaughan <gmccaughan@synaptics-uk.com> writes:

    Gareth> I am unconvinced that "the vast majority of developers"
    Gareth> will not have work to do that involves a large volume of
    Gareth> ASCII data ... but I'm not sure this is something either
    Gareth> of us is in a position to know.

Oh, I'm pretty sure that an awful lot of developers _will_ have work
to do that involves large volumes of ASCII data.  The question is how
much will that work be facilitated by having all (as opposed to a few
well-chosen) text processing features support returning 8-bit strings
as well as Unicodes?

    Gareth> Perhaps I'm being obtuse, but it's far from clear to me
    Gareth> that this is a net loss for Python users at large.

It's not clear to me, either.  I am just not convinced by hand-waving
that says "there's no difference between human text processing and
other text processing, so any text processing facility should be
available in an 8-bit version."  Maybe that's a straw man, but that's
what </F> was advocating AFAICT.

    Gareth> I still think, though, that a "notice" standard makes for
    Gareth> bad designs.

We're not talking about design here, IMO.  We're talking about
requirements.  Of course if you're going to implement a capability,
you should design it "right."

    Gareth> What people notice is easier to define and to measure than
    Gareth> what actually makes a difference to them. That is not
    Gareth> enough reason to treat it as the only criterion.

It's not.  What I'm saying is that if very few people see a noticeable
difference, it should be left up to those few to implement what they
need.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
From erik at heneryd.com  Mon Sep 13 18:16:08 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Mon Sep 13 18:16:16 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <1095086532.10677.46.camel@geddy.wooz.org>
References: <4142E78C.7010800@heneryd.com>
	<1095086532.10677.46.camel@geddy.wooz.org>
Message-ID: <4145C7C8.5080101@heneryd.com>

Barry Warsaw wrote:
> On Sat, 2004-09-11 at 07:54, Erik Heneryd wrote:
> 
> 
>>* Too long
>>10/15 character names for something so simple that up until now it just 
>>needed a %?  Programs using templates will probably use them 
>>frequently...  I'd prefer sub instead of substitute.
> 
> 
> Noted, thanks.  In general I'm not a fan of abbreviations in APIs
> though.  Also note that it is trivial for applications to derive and
> override __mod__(), aliasing it to whichever version of substitute()
> they want.  For example, I plan on multiply inheriting Template and
> unicode, and aliasing __mod__() to safe_substitute().

-1

Well, even if it's trivial, I still think the out-of-the-box API 
shouldn't be hostile against frequent use.  Subclassing just to get a 
decent name/operator feels stupid.  Why not __mod__ = safe_substitute 
per default then?
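
A minimal sketch of the aliasing being discussed, written against
string.Template as it ships in the standard library today.  The two
subclass names are hypothetical; which of them (if either) should be the
default is exactly the open question.

```python
from string import Template


class SafeModTemplate(Template):
    # '%' does a safe substitution: unknown placeholders are left in place.
    __mod__ = Template.safe_substitute


class StrictModTemplate(Template):
    # '%' does a strict substitution: unknown placeholders raise KeyError.
    __mod__ = Template.substitute


safe_result = SafeModTemplate("$who owes $amount") % {"who": "tim"}
print(safe_result)
```

The alias costs one line per subclass, so the disagreement is about what
the out-of-the-box class should do, not about feasibility.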


Erik
From erik at heneryd.com  Mon Sep 13 18:18:28 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Mon Sep 13 18:18:32 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <1095086677.10672.49.camel@geddy.wooz.org>
References: <4142E78C.7010800@heneryd.com> <4142F7D0.5080807@heneryd.com>
	<1095086677.10672.49.camel@geddy.wooz.org>
Message-ID: <4145C854.4070200@heneryd.com>

Barry Warsaw wrote:
> On Sat, 2004-09-11 at 09:04, Erik Heneryd wrote:
> 
> 
>>Come to think of it, I really like the more OO-ish approach better, than 
>>to cram everything into a single class.  Is the safe_substitute really 
>>that special it deserves a special method?
> 
> 
> Yes.
> 
> 
>>  Is it really the one, true 
>>way to do a "safe" substitution?  
> 
> 
> Probably not.
> 
> 
>>IIRC DOS and sh don't agree, so it's 
>>not that obvious.
> 
> 
> I'm sorry I don't follow that one.

DOS: '%NOTFOUND%' => '%NOTFOUND%'
sh: '$NOTFOUND' => ''

BTW, what about a closing delimiter in the standard regex?

>>I say keep the inheritance thing, it's much more flexible, and delegate 
>>the KeyError condition to an overridable method.
> 
> 
> After the lengthy discussions on python-dev, I'm viewing the role of the
> Template class a little differently, so I think it's fine to put them
> both in one class.

I think there are more use cases for a KeyError hook than just sh-style 
substitution; a default value, a computed value (think replacing html 
entities - returning chr(idpattern) on KeyError) etc...
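
As a rough sketch of what such a hook could look like with today's
string.Template: the fallback is placed on the mapping (via __missing__)
rather than on an overridable template method, which is a different design
from the one proposed here.  The class names ShStyle and Computed are made
up for the example.

```python
from string import Template


class ShStyle(dict):
    """sh-like behaviour: an unset variable expands to the empty string."""

    def __missing__(self, key):
        return ""


class Computed(dict):
    """Fallback that computes a value from the missing key itself."""

    def __missing__(self, key):
        return key.upper()


t = Template("[$FOUND] [$NOTFOUND]")
dos_like = t.safe_substitute({"FOUND": "x"})   # placeholder left alone
sh_like = t.substitute(ShStyle(FOUND="x"))     # placeholder dropped
computed = Template("$a-$b").substitute(Computed(a="1"))
print(dos_like, sh_like, computed, sep=" | ")
```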

I hope you don't do pep-292 just to fill your own needs (i18n?), but 
also keep your mind open to other uses...


Erik
From bac at OCF.Berkeley.EDU  Mon Sep 13 19:41:08 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Sep 13 19:41:32 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <4145C7C8.5080101@heneryd.com>
References: <4142E78C.7010800@heneryd.com>	<1095086532.10677.46.camel@geddy.wooz.org>
	<4145C7C8.5080101@heneryd.com>
Message-ID: <4145DBB4.8010601@ocf.berkeley.edu>

Erik Heneryd wrote:
> Barry Warsaw wrote:
> 
>> On Sat, 2004-09-11 at 07:54, Erik Heneryd wrote:
>>
>>
>>> * Too long
>>> 10/15 character names for something so simple that up until now it just 
>>> needed a %?  Programs using templates will probably use them 
>>> frequently...  I'd prefer sub instead of substitute.
>>
>>
>>
>> Noted, thanks.  In general I'm not a fan of abbreviations in APIs
>> though.  Also note that it is trivial for applications to derive and
>> override __mod__(), aliasing it to whichever version of substitute()
>> they want.  For example, I plan on multiply inheriting Template and
>> unicode, and aliasing __mod__() to safe_substitute().
> 
> 
> -1
> 
> Well, even if it's trivial, I still think the out-of-the-box API 
> shouldn't be hostile against frequent use.  Subclassing just to get a 
> decent name/operator feels stupid.  Why not __mod__ = safe_substitute 
> per default then?
> 

I'm with Barry on this.  Verbosity is going to overtake practicality here.  And 
I think this is a good thing; the last thing the stdlib should start doing is 
trying to force people to use some shorthand that we come up with that won't 
necessarily be intuitive to other people ('sub' just doesn't seem right here; 
and don't ask for justification since this is a gut feeling).  I am sure the 
way I tend to abbreviate things is not how anyone else would.  So why would the 
stdlib try to?  We have tried to come up with good names and this is the best 
we came up with.

And as Barry said, you can add __mod__ to your own subclass.  And another 
option entirely is to just assign the method to a shorter name in your code.

And if you *really* want to argue the length thing, you can take into account 
that "substitute" has a decent amount of hand alternation on QWERTY to allow 
for pretty good typing speed.

-Brett
From mal at egenix.com  Mon Sep 13 22:15:01 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon Sep 13 22:15:11 2004
Subject: [Python-Dev] Re: Alternative Implementation for PEP
	292:	SimpleString Substitutions
In-Reply-To: <ci1bk5$9e3$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org>	<413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org>	<413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org>	<41436762.7040207@egenix.com>
	<ci1bk5$9e3$1@sea.gmane.org>
Message-ID: <4145FFC5.8090208@egenix.com>

Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
> 
>>You mean: a compressed shift table for Unicode patterns ?
>>I'll have a look.
> 
> 
> It's a lossy compression: the entire delta1 table is represented as
> two 32-bit values, independent of the size of the source alphabet.
> Works amazingly well, at least when combined with the BM-variant
> it was designed for...
> 
> (I suppose it's too late for 2.4, but it would probably be a good
> idea to switch to this algorithm in 2.5)

Here's a reference that might be interesting for you:

http://citeseer.ist.psu.edu/boldi02compact.html

They use statistical approaches to dealing with the problem of
large alphabets. Their motivation is making Java's Unicode string
implementation faster... sounds familiar, eh :-)

Their motivation was based on work done for the "Managing Gigabytes"
project:

http://www.cs.mu.oz.au/mg/

and

http://www.mds.rmit.edu.au/mg/

Too bad their code is GPLed, but I suppose getting some ideas
is OK ;-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 13 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From fredrik at pythonware.com  Mon Sep 13 22:18:01 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Sep 13 22:16:11 2004
Subject: [Python-Dev] Re: PEP 292: method names
References: <4142E78C.7010800@heneryd.com>	<1095086532.10677.46.camel@geddy.wooz.org><4145C7C8.5080101@heneryd.com>
	<4145DBB4.8010601@ocf.berkeley.edu>
Message-ID: <ci4v65$s7f$1@sea.gmane.org>

Brett C wrote:

> I am sure the way I tend to abbreviate things is not how anyone
> else would.  So why would the stdlib try to?

it's pretty amazing that you've been able to use Python without noticing
that the standard library is full of abbreviations.

doesn't anyone here think before they post, these days?

</F> 


From fredrik at pythonware.com  Mon Sep 13 22:20:28 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Sep 13 22:20:30 2004
Subject: [Python-Dev] Re: Re: Re: Alternative Implementation for PEP
	292:SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com><1094315138.8696.36.camel@geddy.wooz.org>
	<cheig3$ki8$1@sea.gmane.org><413F1D9C.20209@egenix.com>
	<chnc49$psm$1@sea.gmane.org><413F3605.7090707@egenix.com>
	<chnidf$epp$1@sea.gmane.org><413F6120.7090603@egenix.com>
	<chuhn8$11s$1@sea.gmane.org><87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp><ci3g2d$m3g$1@sea.gmane.org>
	<87pt4qp8ti.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <ci4van$sk1$1@sea.gmane.org>

Stephen J. Turnbull wrote:

> In the context of PEP 292 is a focus on "human text" unwarranted?

I'm pretty sure this subthread left the PEP quite a few posts ago.  The rest
of us were talking about string searches, of the find/replace/split variety.

</F> 


From fredrik at pythonware.com  Mon Sep 13 22:47:40 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Sep 13 22:45:54 2004
Subject: [Python-Dev] Re: PEP 335: Overloadable Boolean Operators
References: <200409100058.i8A0wNIV002743@cosc353.cosc.canterbury.ac.nz>
Message-ID: <ci50tm$1b6$1@sea.gmane.org>

Greg Ewing wrote:

> To permit short-circuiting, processing of the 'and' and 'or' operators
> is split into two phases.  Phase 1 occurs after evaluation of the first
> operand but before the second.  If the first operand defines the
> appropriate phase 1 method, it is called with the first operand as
> argument.  If that method can determine the result without needing the
> second operand, it returns the result, and further processing is
> skipped.

nice.

+1 from here (but only +0 on the method names).
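
The two-phase protocol quoted above can be emulated in plain Python to see
how it behaves.  This is only a sketch under assumptions: the helper
logical_and(), the sentinel NeedOtherOperand, and the classes Cond and
Never are names chosen for this example, and the second operand is passed
as a zero-argument callable to stand in for delayed evaluation.

```python
NeedOtherOperand = object()  # sentinel name assumed for this sketch


def logical_and(left, right_thunk):
    """Emulate a two-phase 'and'; right_thunk delays the second operand."""
    phase1 = getattr(type(left), "__and1__", None)
    if phase1 is None:
        # No phase 1 method: ordinary short-circuit semantics.
        return right_thunk() if left else left
    result = phase1(left)
    if result is not NeedOtherOperand:
        return result  # phase 1 decided; the second operand is never evaluated
    # Phase 2: now the second operand is needed.
    return type(left).__and2__(left, right_thunk())


class Cond:
    """Expression builder that always wants both operands."""

    def __init__(self, text):
        self.text = text

    def __and1__(self):
        return NeedOtherOperand

    def __and2__(self, other):
        return Cond(f"({self.text} AND {other.text})")


class Never:
    """Operand that can determine the result on its own."""

    def __and1__(self):
        return self


combined = logical_and(Cond("a"), lambda: Cond("b"))
print(combined.text)
```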

</F> 


From raymond.hettinger at verizon.net  Mon Sep 13 23:23:01 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon Sep 13 23:24:05 2004
Subject: [Python-Dev] Decorator PEP elaborations
Message-ID: <000a01c499d7$dbed5900$e841fea9@oemcomputer>

If one of the authors gets a chance, it would be nice to document the
rationale for the order of application being inside-out instead of
top-down:

    @deco3
    @deco2
    @deco1
    def myfunc(args):
        . . .

Also, it would be nice to document the reasons for the approach to
argument handling:

    @deco             # calls deco(f)
    @decomaker(arg)   # calls tmp(f) where tmp=decomaker(arg)


Raymond Hettinger

From fredrik at pythonware.com  Mon Sep 13 23:29:24 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Sep 13 23:27:33 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for
	PEP292:	SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com>	<1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org>	<413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org>	<413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org>	<413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org>	<41436762.7040207@egenix.com><ci1bk5$9e3$1@sea.gmane.org>
	<4145FFC5.8090208@egenix.com>
Message-ID: <ci53bv$3tt$1@sea.gmane.org>

M.-A. Lemburg wrote:

>> (I suppose it's too late for 2.4, but it would probably be a good
>> idea to switch to this algorithm in 2.5)
>
> Here's a reference that might be interesting for you:
>
> http://citeseer.ist.psu.edu/boldi02compact.html
>
> They use statistical approaches to dealing with the problem of
> large alphabets. Their motivation is making Java's Unicode string
> implementation faster... sounds familiar, eh :-)

thanks for the reference.  but I have to admit that I found the following
paper by the same authors to be more interesting ...

    http://citeseer.ist.psu.edu/boldi03rethinking.html

... both because they've looked into efficient designs for mutable strings,
and because of how they use a 32-bit "bloom filter" hashed by the least
significant bits in the Unicode characters...  oh well, there are never any
new ideas ;-)
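
To make the 32-bit filter idea concrete, here is a simplified sketch of a
substring search that uses such a mask to skip ahead.  It is not the
algorithm from either paper, nor the BM variant mentioned earlier in the
thread; the function name bloom_find and the skip rule are assumptions
chosen to keep the example short.

```python
def bloom_find(text, pattern):
    """Return the lowest index of pattern in text, or -1 (like str.find)."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0

    # Lossy "alphabet table": one bit per character, hashed by the low 5 bits.
    mask = 0
    for ch in pattern:
        mask |= 1 << (ord(ch) & 31)

    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            return i
        nxt = i + m  # character just past the current window
        if nxt < n and not (mask >> (ord(text[nxt]) & 31)) & 1:
            # text[nxt] is definitely not in the pattern, so no window
            # that covers it can match: jump past it.
            i = nxt + 1
        else:
            # Possibly in the pattern (false positives are allowed).
            i += 1
    return -1


print(bloom_find("hello world", "world"))
```

The mask can report a character as present when it is not, which only
costs a smaller skip; it never reports a pattern character as absent, so
no match is missed.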

</F> 


From barry at python.org  Tue Sep 14 00:40:48 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Sep 14 00:40:54 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <4145C7C8.5080101@heneryd.com>
References: <4142E78C.7010800@heneryd.com>
	<1095086532.10677.46.camel@geddy.wooz.org>
	<4145C7C8.5080101@heneryd.com>
Message-ID: <1095115247.10672.187.camel@geddy.wooz.org>

On Mon, 2004-09-13 at 12:16, Erik Heneryd wrote:

> Well, even if it's trivial, I still think the out-of-the-box API 
> shouldn't be hostile against frequent use.  

It's no more hostile than os.path.splitext or KeyboardInterrupt <wink>. 
Seriously, although Python does use abbreviations sometimes, I
personally think that doing so can create ambiguity and can cause
problems for non-native English speakers.  Python is not Unix.  Besides,
don't most editors and IDEs provide completion these days?

> Subclassing just to get a 
> decent name/operator feels stupid.  Why not __mod__ = safe_substitute 
> per default then?

Because I don't know which version will be more generally preferred by
application authors and in the face of ambiguity I refuse the temptation
to guess.  (I know which version my own applications will prefer but I
think you're the same person arguing that my own needs shouldn't drive
all decisions.)

-Barry

From barry at python.org  Tue Sep 14 00:43:37 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Sep 14 00:43:48 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <4145C854.4070200@heneryd.com>
References: <4142E78C.7010800@heneryd.com>  <4142F7D0.5080807@heneryd.com>
	<1095086677.10672.49.camel@geddy.wooz.org>
	<4145C854.4070200@heneryd.com>
Message-ID: <1095115417.10677.191.camel@geddy.wooz.org>

On Mon, 2004-09-13 at 12:18, Erik Heneryd wrote:
> >>IIRC DOS and sh don't agree, so it's 
> >>not that obvious.
> > 
> > 
> > I'm sorry I don't follow that one.
> 
> DOS: '%NOTFOUND%' => '%NOTFOUND%'
> sh: '$NOTFOUND' => ''

Okay, thanks.

> BTW, what about a closing delimiter in the standard regex?

There isn't one.  The PEP explains the rationale.

> I hope you don't do pep-292 just to fill your own needs (i18n?), but 
> also keep your mind open to other uses...

What can't you do with PEP 292 as it now stands?

-Barry

From erik at heneryd.com  Tue Sep 14 01:44:26 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Tue Sep 14 01:44:31 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <1095115247.10672.187.camel@geddy.wooz.org>
References: <4142E78C.7010800@heneryd.com>	
	<1095086532.10677.46.camel@geddy.wooz.org>
	<4145C7C8.5080101@heneryd.com>
	<1095115247.10672.187.camel@geddy.wooz.org>
Message-ID: <414630DA.1000009@heneryd.com>

Barry Warsaw wrote:
> On Mon, 2004-09-13 at 12:16, Erik Heneryd wrote:
>>Subclassing just to get a 
>>decent name/operator feels stupid.  Why not __mod__ = safe_substitute 
>>per default then?
> 
> 
> Because I don't know which version will be more generally preferred by
> application authors and in the face of ambiguity I refuse the temptation
> to guess.  (I know which version my own applications will prefer but I
> think you're the same person arguing that my own needs shouldn't drive
> all decisions.)

You know, that's an argument for going back to subclasses.


Erik
From erik at heneryd.com  Tue Sep 14 01:48:30 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Tue Sep 14 01:48:35 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <1095115417.10677.191.camel@geddy.wooz.org>
References: <4142E78C.7010800@heneryd.com> <4142F7D0.5080807@heneryd.com>	
	<1095086677.10672.49.camel@geddy.wooz.org>
	<4145C854.4070200@heneryd.com>
	<1095115417.10677.191.camel@geddy.wooz.org>
Message-ID: <414631CE.4060205@heneryd.com>

Barry Warsaw wrote:
> On Mon, 2004-09-13 at 12:18, Erik Heneryd wrote:
>>BTW, what about a closing delimiter in the standard regex?
> 
> 
> There isn't one.  The PEP explains the rationale.

Sorry, must've missed it?  Note that I'm not saying there should be a 
default closer, but an empty group in the regex, for subclasses to fill 
in (for example ml entities could use this).

>>I hope you don't do pep-292 just to fill your own needs (i18n?), but 
>>also keep your mind open to other uses...
> 
> 
> What can't you do with PEP 292 as it now stands?

Examples from previous posts: sh-style safe variables, ml entities.  As 
you really can't reuse anything now, it would be like starting from scratch.


Erik
From gvanrossum at gmail.com  Tue Sep 14 02:12:55 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Sep 14 02:13:00 2004
Subject: [Python-Dev] Decorator PEP elaborations
In-Reply-To: <000a01c499d7$dbed5900$e841fea9@oemcomputer>
References: <000a01c499d7$dbed5900$e841fea9@oemcomputer>
Message-ID: <ca471dc2040913171211cb3153@mail.gmail.com>

No PEP text from me, but:

> If one of the authors gets a chance, it would be nice to document the
> rationale for the order of application being inside-out instead of
> top-down:
> 
>     @deco3
>     @deco2
>     @deco1
>     def myfunc(args):
>         . . .

This is the usual order for function-application. @f @g def foo() ->
foo=f(g(foo)).

> Also, it would be nice to document the reasons for the approach to
> argument handling:
> 
>     @deco             # calls deco(f)
>     @decomaker(arg)   # calls tmp(f) where tmp=decomaker(arg)

The thing after the @ can be considered to be an expression (never mind
that syntactically you are restricted), and whatever that expression
returns is called.
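
A small sketch that shows both points at once.  The helper tag() and the
list calls are made up for the example; tag(name) plays the role of
decomaker(arg).

```python
calls = []


def tag(name):
    """Decorator maker: tag(name) returns the actual decorator."""
    def decorator(func):
        calls.append(name)  # record the order in which decorators are applied
        def wrapper(*args, **kwargs):
            return f"{name}({func(*args, **kwargs)})"
        return wrapper
    return decorator


deco1, deco2 = tag("deco1"), tag("deco2")


@tag("deco3")   # the expression is evaluated, then its result is called
@deco2
@deco1
def myfunc():
    return "x"


print(calls)
print(myfunc())
```

The decorator nearest the def is applied first, so the result is the same
as writing myfunc = tag("deco3")(deco2(deco1(myfunc))).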

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From raymond.hettinger at verizon.net  Tue Sep 14 04:00:02 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue Sep 14 04:01:11 2004
Subject: [Python-Dev] PEP 292: method names
In-Reply-To: <414631CE.4060205@heneryd.com>
Message-ID: <000d01c499fe$8d71b9c0$e841fea9@oemcomputer>

[Barry]
> > What can't you do with PEP 292 as it now stands?

[Erik] 
> Examples from previous posts: sh-style safe variables, ml entities.  As
> you really can't reuse anything now, it would be like starting from
> scratch.

It took a good while to refine the existing implementation to handle all
the nuances of the $var format.  My suspicion is that a format with
opening and closing delimiters would have its own share of issues
(nesting and escaping for example) and would warrant its own separate
solution.

The current implementation is pretty darned good and strikes a nice
balance between extensibility goals and simplification goals (using $var
instead of a %(var)s format).  The API is clean and friendly for most
purposes.

Barry has made it possible to create unicode coercing subclasses, to
substitute alternate identifier patterns (such as dotted names), to
specify an alternative delimiter, and to use polymorphism for changing
the implementation without changing client code.  That is quite a bit of
extensibility.  Further hypergeneralization would stray too far from the
original simplification goals.
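
For a sense of what that extensibility looks like in practice, here is a
sketch against string.Template as found in the standard library today.
The subclass names are hypothetical, and the dotted-name pattern is one
possible regex rather than a blessed one.

```python
from string import Template


class PercentTemplate(Template):
    # Alternative delimiter: %name instead of $name.
    delimiter = "%"


class DottedTemplate(Template):
    # Alternate identifier pattern: names may contain dots, e.g. $user.name.
    idpattern = r"[_a-z][_a-z0-9]*(?:\.[_a-z][_a-z0-9]*)*"


percent_result = PercentTemplate("%who likes %what").substitute(
    who="tim", what="pie")
dotted_result = DottedTemplate("hi $user.name").substitute(
    {"user.name": "erik"})
print(percent_result)
print(dotted_result)
```

Client code calls substitute() the same way on either subclass, which is
the polymorphism point made above.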

After experimenting with alternative approaches and writing subclasses,
I learned that no design easily accommodated the most complex use cases.
The pattern, convert function, flags, and invocation are so tightly
coupled that you really are better off coding from scratch.
Fortunately, with the string.py source available as a model, it is not
hard to do.

So, for applications beyond the limits of the current design, my
suggestion is to use regexes to roll your own.  At some point, it is
easier to write a regex than to write a subclass overriding all existing
behaviors.
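
As an example of rolling your own, here is a short regex-based sketch for
a format with a closing delimiter, along the lines of the ml entities
mentioned earlier in the thread.  The function name expand_entities, the
pattern, and the policy of leaving unknown names untouched are all
assumptions made for this example.

```python
import re

# &name; or &#NNN; with an explicit closing delimiter.
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*|#[0-9]+);")


def expand_entities(text, table):
    """Replace &name; using table, and &#NNN; with the matching character."""
    def repl(match):
        name = match.group(1)
        if name.startswith("#"):
            return chr(int(name[1:]))          # computed value
        return table.get(name, match.group(0))  # unknown names stay as-is
    return ENTITY.sub(repl, text)


expanded = expand_entities("a &lt; b &amp; c &#65; &nope;",
                           {"lt": "<", "amp": "&"})
print(expanded)
```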


Raymond

From greg at cosc.canterbury.ac.nz  Tue Sep 14 04:58:22 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Sep 14 04:58:28 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
Message-ID: <200409140258.i8E2wMDW010792@cosc353.cosc.canterbury.ac.nz>

> IMO, the algebraic/query use cases would be better served by some
> sort of "code literal" or "AST literal" syntax

You may be right about the symbolic algebra case, if the intent is to
be able to write code that manipulates expressions, in which case
writing the expressions to be manipulated as literals of some kind may
make sense.

But I don't agree in the SQL case, where my intent is for the user to
simply write Python code that performs database queries, not write
Python code that constructs trees of SQL expressions that perform
database queries. The fact that expression manipulation is going on
should be an implementation detail that the user doesn't need to be
aware of. Having to write the query expressions using some special
syntax would interfere with that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From bac at OCF.Berkeley.EDU  Tue Sep 14 04:58:27 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Sep 14 04:58:35 2004
Subject: [Python-Dev] Re: PEP 292: method names
In-Reply-To: <ci4v65$s7f$1@sea.gmane.org>
References: <4142E78C.7010800@heneryd.com>	<1095086532.10677.46.camel@geddy.wooz.org><4145C7C8.5080101@heneryd.com>	<4145DBB4.8010601@ocf.berkeley.edu>
	<ci4v65$s7f$1@sea.gmane.org>
Message-ID: <41465E53.6050606@ocf.berkeley.edu>

Fredrik Lundh wrote:
> Brett C wrote:
> 
> 
>>I am sure the way I tend to abbreviate things is not how anyone
>>else would.  So why would the stdlib try to?
> 
> 
> it's pretty amazing that you've been able to use Python without noticing
> that the standard library is full of abbreviations.
> 

Just because the stdlib is full of abbreviations does not mean the practice 
should be continued.  Precedent != acceptance.

-Brett
From pje at telecommunity.com  Tue Sep 14 06:37:06 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Sep 14 06:36:37 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <200409140258.i8E2wMDW010792@cosc353.cosc.canterbury.ac.nz>
References: <5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>

At 02:58 PM 9/14/04 +1200, Greg Ewing wrote:
> > IMO, the algebraic/query use cases would be better served by some
> > sort of "code literal" or "AST literal" syntax
>
>You may be right about the symbolic algebra case, if the intent is to
>be able to write code that manipulates expressions, in which case
>writing the expressions to be manipulated as literals of some kind may
>make sense.
>
>But I don't agree in the SQL case, where my intent is for the user to
>simply write Python code that performs database queries, not write
>Python code that constructs trees of SQL expressions that perform
>database queries.

So, something like this:

      query("x and y or z")

isn't "code that performs database queries"?

My main concern about the PEP is that it adds overhead to *all* logical 
operations, but the feature will only benefit code that hasn't yet been 
written.  I also fear that as a result, people will start writing complex 
if-then blocks to "optimize" performance of conditionals to get them back 
to where they were before the facility was added.  Also, it considerably 
expands the scope of understanding that someone needs in order to grasp the 
meaning of a logical expression.

For these reasons, I'd feel more comfortable with either a literal syntax 
(to address algebra, SQL, etc.) or some type of special infix notation to 
allow new operators to be defined in Python, so that it isn't necessary to 
use prefix or method notation to perform operations like these.  Neither of 
these solutions burdens applications that don't need the feature(s).

From tjreedy at udel.edu  Tue Sep 14 08:54:44 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue Sep 14 08:54:55 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for PEP
	292:SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com><1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org><413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org><413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org><413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org><87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>
	<ci3g2d$m3g$1@sea.gmane.org>
Message-ID: <ci64jr$565$1@sea.gmane.org>


"Fredrik Lundh" <fredrik@pythonware.com> wrote in message 
news:ci3g2d$m3g$1@sea.gmane.org...
> usually shorter in languages with many ideographs (my non-scientific
> tests indicate that chinese text uses about 4 times less symbols than
> english; I'm sure someone can dig up better figures).

This is why I am not especially enamored of Unicode and the prospect of 
Python becoming married to it.  It is heavily weighted in favor of 
efficiently representing Chinese and inefficiently representing English. 
To give English equivalent treatment, the 20,000 or so most common words, 
roots, prefixes, and suffixes would each get its own codepoint.

Terry J. Reedy


From stephen at xemacs.org  Tue Sep 14 09:03:19 2004
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue Sep 14 09:03:30 2004
Subject: [Python-Dev] Re: Re: Re: Alternative Implementation for PEP
	292:SimpleString Substitutions
In-Reply-To: <ci4van$sk1$1@sea.gmane.org> (Fredrik Lundh's message of "Mon,
	13 Sep 2004 22:20:28 +0200")
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com>
	<1094315138.8696.36.camel@geddy.wooz.org> <cheig3$ki8$1@sea.gmane.org>
	<413F1D9C.20209@egenix.com> <chnc49$psm$1@sea.gmane.org>
	<413F3605.7090707@egenix.com> <chnidf$epp$1@sea.gmane.org>
	<413F6120.7090603@egenix.com> <chuhn8$11s$1@sea.gmane.org>
	<87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>
	<ci3g2d$m3g$1@sea.gmane.org>
	<87pt4qp8ti.fsf@tleepslib.sk.tsukuba.ac.jp>
	<ci4van$sk1$1@sea.gmane.org>
Message-ID: <87k6uxnxhk.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Fredrik" == Fredrik Lundh <fredrik@pythonware.com> writes:

    Fredrik> Stephen J. Turnbull wrote:

    >> In the context of PEP 292 is a focus on "human text"
    >> unwarranted?

    Fredrik> I'm pretty sure this subthread left the PEP quite a few
    Fredrik> posts ago.

That's a funny way to spell "I don't like the way this is going,
good-bye", but it works for me. <wink>

Have a nice day, thanks for the information on search algorithms and
usage patterns.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
From fredrik at pythonware.com  Tue Sep 14 10:33:03 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Sep 14 10:33:08 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com><1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org><413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org><413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org><413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org><87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp><ci3g2d$m3g$1@sea.gmane.org>
	<ci64jr$565$1@sea.gmane.org>
Message-ID: <ci6abv$ici$1@sea.gmane.org>

Terry Reedy wrote:

>> usually shorter in languages with many ideographs (my non-scientific
>> tests indicate that chinese text uses about 4 times less symbols than
>> english; I'm sure someone can dig up better figures).
>
> This is why I am not especially enamored of Unicode and the prospect of
> Python becoming married to it.  It is heavily weighted in favor of
> efficiently representing Chinese and inefficiently representing English.

Don't confuse Unicode with its UCS-2 and UCS-4 encodings.  On a conceptual
level, good old 7-bit ASCII and 8-bit ISO-Latin-1 are both Unicode.
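
A minimal sketch of the point above, assuming a current Python 3 interpreter rather than the Python 2 of this thread: every ASCII and ISO-Latin-1 byte value decodes to the Unicode code point with the same number, so both character sets are simply the first 128 and 256 code points of Unicode.

```python
# Every ASCII byte maps to the Unicode code point with the same number.
ascii_bytes = bytes(range(128))
ascii_points = [ord(ch) for ch in ascii_bytes.decode("ascii")]

# The same holds for all 256 byte values under ISO-Latin-1 (ISO-8859-1).
latin1_bytes = bytes(range(256))
latin1_points = [ord(ch) for ch in latin1_bytes.decode("latin-1")]

print(ascii_points == list(range(128)))    # True
print(latin1_points == list(range(256)))   # True
```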

</F> 


From jacobs at theopalgroup.com  Tue Sep 14 14:04:45 2004
From: jacobs at theopalgroup.com (Kevin Jacobs)
Date: Tue Sep 14 14:04:41 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>
References: <5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
	<5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>
Message-ID: <4146DE5D.3040702@theopalgroup.com>

Phillip J. Eby wrote:

> My main concern about the PEP is that it adds overhead to *all* 
> logical operations, but the feature will only benefit code that hasn't 
> yet been written.

Actually, there are several packages that implement ugly
workarounds for exactly this issue.  So, in a sense, there
is a significant amount of code that exists that will benefit
from this feature.  Some that come to mind are my own
SQL ADT library, SQLObject, and several parser tools.

> For these reasons, I'd feel more comfortable with either a literal 
> syntax (to address algebra, SQL, etc.) or some type of special infix 
> notation to allow new operators to be defined in Python, so that it 
> isn't necessary to use prefix or method notation to perform operations 
> like these.  Neither of these solutions burdens applications that 
> don't need the feature(s).

Both of your alternatives are being used in some form and
neither is really satisfactory.  Literal representations require
complex parsers, when the Python parser is really what is
desired.  The infix notation idea is interesting, however the
operators desired are usually 'logical and' and 'logical or',
which are clearly spelled 'and' and 'or' in Python.  I see it
as a semantic limitation that Python does not allow overriding
these operators.  Adding extra indirection (i.e., extra byte
codes) _will_ affect performance, but my view is that
correctness and completeness are more important than
performance.

-Kevin

From ndbecker2 at verizon.net  Tue Sep 14 14:48:39 2004
From: ndbecker2 at verizon.net (Neal D. Becker)
Date: Tue Sep 14 14:50:59 2004
Subject: [Python-Dev] find_first (and relatives)
Message-ID: <ci6pb7$o62$2@sea.gmane.org>

I was a bit surprised to find out that python doesn't seem to have builtin
functors, such as find_first.  Although there are ways to simulate such
functions, it would be good to have an expanded set of functional
programming tools which are coded in C for speed.

From pinard at iro.umontreal.ca  Tue Sep 14 15:08:08 2004
From: pinard at iro.umontreal.ca (François Pinard)
Date: Tue Sep 14 15:08:52 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for PEP
	292:SimpleString Substitutions
In-Reply-To: <ci64jr$565$1@sea.gmane.org>
References: <ci3g2d$m3g$1@sea.gmane.org> <ci64jr$565$1@sea.gmane.org>
Message-ID: <20040914130808.GA2294@alcyon.progiciels-bpi.ca>

[Terry Reedy]

> [Unicode] is heavily weighted in favor of efficiently representing
> Chinese and inefficiently representing English.

You undoubtedly forgot the smiley! :-)

Many people consider that Unicode, or UTF-8 at least, is strongly
favouring English (boldly American) over any other script or language.
If it had not been so, Americans would never have promoted it so much,
and would have rather shown an infinite and eternal reluctance...

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard
From exarkun at divmod.com  Tue Sep 14 15:32:33 2004
From: exarkun at divmod.com (exarkun@divmod.com)
Date: Tue Sep 14 15:33:05 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <4146DE5D.3040702@theopalgroup.com>
Message-ID: <20040914133233.29723.1901953221.divmod.quotient.1109@ohm>

On Tue, 14 Sep 2004 08:04:45 -0400, Kevin Jacobs <jacobs@theopalgroup.com> wrote:
>Phillip J. Eby wrote:
> 
> > For these reasons, I'd feel more comfortable with either a literal 
> > syntax (to address algebra, SQL, etc.) or some type of special infix 
> > notation to allow new operators to be defined in Python, so that it 
> > isn't necessary to use prefix or method notation to perform operations 
> > like these.  Neither of these solutions burdens applications that 
> > don't need the feature(s).
> 
> Both of your alternatives are being used in some form and
> neither is really satisfactory.  Literal representations require
> complex parsers, when the Python parser is really what is
> desired.
  Python's parser is already available, through the compiler module.  The example given earlier, query("x and y or z"), is relatively straightforward to implement as a set of AST manipulations.
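
As a rough illustration of that idea, here is a sketch using the ast module of current Python 3 (a substitution: the compiler module mentioned above is not available there). The names to_query and query are hypothetical, and only names joined by 'and'/'or' are handled.

```python
import ast

def to_query(node):
    """Turn a parsed boolean expression into nested ('and'|'or', ...) tuples."""
    if isinstance(node, ast.Expression):
        return to_query(node.body)
    if isinstance(node, ast.BoolOp):
        op = "and" if isinstance(node.op, ast.And) else "or"
        return (op,) + tuple(to_query(v) for v in node.values)
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError("unsupported syntax: %s" % type(node).__name__)

def query(text):
    # mode="eval" parses the string as a single expression.
    return to_query(ast.parse(text, mode="eval"))

result = query("x and y or z")
print(result)   # ('or', ('and', 'x', 'y'), 'z')
```

None of the names x, y or z needs to be bound, because the expression is parsed but never evaluated.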
  Jp
From mal at egenix.com  Tue Sep 14 15:56:09 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue Sep 14 15:56:17 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for
	PEP	292:SimpleString Substitutions
In-Reply-To: <ci64jr$565$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com><1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org><413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org><413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org><413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org><87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>	<ci3g2d$m3g$1@sea.gmane.org>
	<ci64jr$565$1@sea.gmane.org>
Message-ID: <4146F879.6070805@egenix.com>

Terry Reedy wrote:
> "Fredrik Lundh" <fredrik@pythonware.com> wrote in message 
> news:ci3g2d$m3g$1@sea.gmane.org...
> 
>>usually shorter in languages with many ideographs (my non-scientific
>>tests indicate that chinese text uses about 4 times less symbols than
>>english; I'm sure someone can dig up better figures).
> 
> This is why I am not especially enamored of Unicode and the prospect of 
> Python becoming married to it.  It is heavily weighted in favor of 
> efficiently representing Chinese and inefficiently representing English. 

Hmm, the Asian world has a very different view on these things.

Representing English ASCII text in UTF-8 is very efficient (1-1), while
typical Asian texts use between 1.5-2 times as much space as their equivalent
in one of the resp. Asian encodings, e.g. take the Japanese translation
of the bible from (only parts of New Testament):

	http://www.cozoh.org/denmo/

 >>> bible = unicode(open('denmo.txt', 'rb').read(), 'shift-jis')
 >>> len(bible)
386980
 >>> len(bible.encode('utf-8'))
1008272
 >>> len(bible.encode('shift-jis'))
697626

Some stats:
-----------

Number of unique code points: 1512

Code point frequency (truncated):

u'\u305f' : =================================
u' '      : =============================
u'\u306e' : ===========================
u'\uff0c' : ==========================
u'\r'     : ========================
u'\n'     : ========================
u'\u306b' : =====================
u'\u3044' : =================
u'\u3066' : =================
u'\u3057' : ================
u'\u3002' : ================
u'\u306f' : ================
u'\u306a' : ===============
u'\u3092' : ==============
u'\u3068' : ============
u'\u308b' : ============
u'\u3089' : ===========
u'\u3063' : ===========
u':'      : ===========
u'}'      : ===========
u'{'      : ===========
u'\u304c' : ==========
u'\u308c' : ==========
u'\u304b' : =========
u'\u3067' : =========
u'1'      : =========
u'\u5f7c' : ========
u'\u3053' : ========
u'\u3042' : =======
u'\u3061' : =======
u'\u3046' : =======
u'2'      : =======
...

As you can see, most code points live in the 0x3000 area. These
code points require 3 bytes in UTF-8, 2 bytes in UTF-16.
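
The byte counts can be checked directly. A small sketch, assuming Python 3 (where str is already Unicode), using the most frequent code point from the table above:

```python
# U+305F (hiragana "ta") tops the frequency table above.
ch = "\u305f"

utf8_len = len(ch.encode("utf-8"))        # bytes in UTF-8
utf16_len = len(ch.encode("utf-16-le"))   # bytes in UTF-16, without a byte-order mark
sjis_len = len(ch.encode("shift-jis"))    # bytes in Shift-JIS
ascii_len = len("a".encode("utf-8"))      # ASCII text stays at one byte in UTF-8

print(utf8_len, utf16_len, sjis_len, ascii_len)   # 3 2 2 1
```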

> To give English equivalent treatment, the 20,000 or so most common words, 
> roots, prefixes, and suffixes would each get its own codepoint.

I suggest you take this one up with the Unicode Consortium :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 14 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From jacobs at theopalgroup.com  Tue Sep 14 17:29:10 2004
From: jacobs at theopalgroup.com (Kevin Jacobs)
Date: Tue Sep 14 17:29:14 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <20040914133233.29723.1901953221.divmod.quotient.1109@ohm>
References: <20040914133233.29723.1901953221.divmod.quotient.1109@ohm>
Message-ID: <41470E46.5020802@theopalgroup.com>

exarkun@divmod.com wrote:

>On Tue, 14 Sep 2004 08:04:45 -0400, Kevin Jacobs <jacobs@theopalgroup.com> wrote:
>  
>
>>Phillip J. Eby wrote:
>>    
>>
>>>For these reasons, I'd feel more comfortable with either a literal 
>>>syntax (to address algebra, SQL, etc.) or some type of special infix 
>>>notation to allow new operators to be defined in Python, so that it 
>>>isn't necessary to use prefix or method notation to perform operations 
>>>like these.  Neither of these solutions burdens applications that 
>>>don't need the feature(s).
>>>      
>>>
>>Both of your alternatives are being used in some form and
>>neither is really satisfactory.  Literal representations require
>>complex parsers, when the Python parser is really what is
>>desired.
>>    
>>
>  Python's parser is already available, through the compiler module.  The example given earlier, query("x and y or z"), is relatively straightforward to implement as a set of AST manipulations.
>  
>

While strictly true, your suggestion still requires two
distinct parsers (although one implementation) and
two distinct parsing contexts (one embedded in a literal
string). 

The use cases I care about involve minimizing the difference
between evaluating regular Python expressions and
ADT instances -- plus the ability to mix constructs from both
in a seamless way.  If Python didn't support any over-loadable
ADT methods, then this wouldn't be an issue.  However,
the problem is that virtually all ADT methods _are_ defined
_except_ logical conjunction and disjunction.  Thus, I am
more concerned with correcting this oversight than I am with
a fraction of a percent in slowdown in real applications.
(or at least micro-benchmarks are _not_ representative
of any real world situations I've ever cared about)
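
The asymmetry can be seen in a few lines. This is only a sketch, assuming Python 3; the Term class is hypothetical and not taken from any of the packages named in this thread.

```python
class Term:
    """Hypothetical query term that records how it was combined."""
    def __init__(self, text):
        self.text = text

    def __and__(self, other):          # invoked by  a & b
        return Term("(%s AND %s)" % (self.text, other.text))

    def __or__(self, other):           # invoked by  a | b
        return Term("(%s OR %s)" % (self.text, other.text))

a, b = Term("a"), Term("b")

overloaded = (a & b).text    # the class decides what the result is
builtin = (a and b)          # no hook: truth-tests a, then hands back b untouched

print(overloaded)            # (a AND b)
print(builtin is b)          # True
```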

-Kevin


From pje at telecommunity.com  Tue Sep 14 17:43:05 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Sep 14 17:42:56 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <4146DE5D.3040702@theopalgroup.com>
References: <5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>
	<5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>
	<5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040914112226.0354d8d0@mail.telecommunity.com>

At 08:04 AM 9/14/04 -0400, Kevin Jacobs wrote:

>>For these reasons, I'd feel more comfortable with either a literal syntax 
>>(to address algebra, SQL, etc.) or some type of special infix notation to 
>>allow new operators to be defined in Python, so that it isn't necessary 
>>to use prefix or method notation to perform operations like 
>>these.  Neither of these solutions burdens applications that don't need 
>>the feature(s).
>
>Both of your alternatives are being used in some form and
>neither is really satisfactory.  Literal representations require
>complex parsers, when the Python parser is really what is
>desired.

Maybe you missed the earlier part of the thread, where I was suggesting 
that a Python "code literal" or "AST literal" syntax would be helpful.  For 
example, if backquotes didn't already have a use, one might say something like:

     db.query(`x.y==z and foo*bar<27`)

To pass an AST object to the db.query() method.  The advantage would be 
that the AST would be parsed and syntax checked at compile time, rather 
than runtime.

After several experiments with using &, |, and ~ for query expressions, 
I've pretty much quit and gone to using string literals, since AST literals 
don't exist.  But if AST literals *did* exist, I'd certainly use them in 
preference to strings.

But, even if PEP 335 *were* implemented, creating a query system using 
Python expressions would *still* be kludgy, because you still need "seed 
variables" in the current scope to write a query expression.  In my example 
above, I didn't need to bind 'x' or 'y' or 'z' or 'foo' or 'bar', because 
the db.query() method is going to interpret those in some context.  If I 
were using a PEP 335-based query system, I'd have to initialize those 
variables to special querying objects first.

 From my POV, the use of &, |, and ~ were very minor issues.  Being able to 
use 'and', 'or', and 'not' would provide some minor syntactic sugar at 
best.  Trying to implement every *other* Python operator correctly, and 
having to have seed variables is IMO where the bulk of the complexity comes 
from, when trying to use Python syntax as a query language.
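
To make the "seed variable" burden concrete, here is a sketch assuming Python 3. The Expr class is hypothetical; it shows that every name in the query must first be bound to a special object and that each operator needs its own method.

```python
class Expr:
    """Hypothetical expression node; every operator returns a new Expr."""
    def __init__(self, text):
        self.text = text

    def _binary(self, symbol, other):
        right = other.text if isinstance(other, Expr) else repr(other)
        return Expr("(%s %s %s)" % (self.text, symbol, right))

    def __eq__(self, other):
        return self._binary("=", other)

    def __lt__(self, other):
        return self._binary("<", other)

    def __mul__(self, other):
        return self._binary("*", other)

    def __and__(self, other):
        return self._binary("AND", other)

# The seed variables: each name must be bound before the query can be written.
x_y, z, foo, bar = Expr("x.y"), Expr("z"), Expr("foo"), Expr("bar")

# & binds tighter than == and <, hence the extra parentheses.
where = ((x_y == z) & (foo * bar < 27)).text
print(where)   # ((x.y = z) AND ((foo * bar) < 27))
```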

That's why I say that an AST literal syntax would be much more useful to me 
than PEP 335 for this type of use case.

As for the numeric use cases, I'm not at all clear why &, |, and ~ (or 
special methods/functions) aren't suitable.


>   The infix notation idea is interesting, however the
>operators desired are usually 'logical and' and 'logical or',
>which are clearly spelled 'and' and 'or' in Python.

Actually, from a pure functionality perspective, the logical operators are 
shortcuts for writing if-then-else blocks, and they compile to almost the 
same bytecode as if-then-else blocks.
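
A sketch of that equivalence, assuming Python 3. The helpers and_ and or_ are hypothetical; the second operand is passed as a callable so that it is evaluated only when the if/else reaches it, as with the real operators.

```python
def and_(a, b_thunk):
    """Spell out  a and b  as an if/else; b is evaluated only when needed."""
    if a:
        return b_thunk()
    else:
        return a

def or_(a, b_thunk):
    """Spell out  a or b  the same way."""
    if a:
        return a
    else:
        return b_thunk()

calls = []

def b():
    calls.append("evaluated")
    return "b"

results = [
    (0 and b()) == and_(0, b),       # short-circuit: b never runs on either side
    ("a" and b()) == and_("a", b),   # b runs once on each side
    ("a" or b()) == or_("a", b),     # short-circuit again
    ("" or b()) == or_("", b),       # b runs once on each side
]
print(results, len(calls))   # [True, True, True, True] 4
```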


>   I see it
>as a semantic limitation that Python does not allow overriding
>these operators.

Python doesn't allow overriding of 'is' or 'type()' either.  I see the 
logical operators as being rather in the same plane of fundamentals.

From dgm at ecs.soton.ac.uk  Tue Sep 14 11:14:06 2004
From: dgm at ecs.soton.ac.uk (David G Mills)
Date: Tue Sep 14 18:00:12 2004
Subject: [Python-Dev] httplib is not v6 compatible,
	is this going to be fixed?
Message-ID: <Pine.LNX.4.44.0409141008310.29513-100000@login.ecs.soton.ac.uk>

As the link below shows httplib can't handle an IPv6 address, it checks 
for a port number by checking for a : but this simply cuts the IPv6 
address in two and tries to set the port variable as nonsense.

http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=81926#c2

Are there any plans to rectify this, or even an alternative version of 
httplib kicking about that can handle IPv6.

Regards,

David.


From skip at pobox.com  Tue Sep 14 18:36:45 2004
From: skip at pobox.com (Skip Montanaro)
Date: Tue Sep 14 18:37:05 2004
Subject: [Python-Dev] httplib is not v6 compatible,
	is this going to be fixed?
In-Reply-To: <Pine.LNX.4.44.0409141008310.29513-100000@login.ecs.soton.ac.uk>
References: <Pine.LNX.4.44.0409141008310.29513-100000@login.ecs.soton.ac.uk>
Message-ID: <16711.7709.255870.851658@montanaro.dyndns.org>


    David> As the link below shows httplib can't handle an IPv6 address, it
    David> checks for a port number by checking for a : but this simply cuts
    David> the IPv6 address in two and tries to set the port variable as
    David> nonsense.

    David> http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=81926#c2

    David> Are there any plans to rectify this, or even an alternative
    David> version of httplib kicking about that can handle IPv6.

I just checked in a fix for this (it was a one-character change, not
including unit test update).  I don't know if there is a Python bug report
open which now needs to be closed.  Considering the ease of the fix, I sort
of think not (otherwise it would have been fixed long ago).  Note that in
general we have plenty of other things to do with our time without
monitoring other projects' bug trackers looking for possible Python bug
reports.  If they aren't reported on SF we won't hear about them.  (The
Debian folks routinely open SF items when Python bug reports wind up in the
Debian tracker.)

-- 
Skip Montanaro
Got spam? http://www.spambayes.org/
skip@pobox.com
From jcarlson at uci.edu  Tue Sep 14 18:58:56 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue Sep 14 19:06:17 2004
Subject: [Python-Dev] find_first (and relatives)
In-Reply-To: <ci6pb7$o62$2@sea.gmane.org>
References: <ci6pb7$o62$2@sea.gmane.org>
Message-ID: <20040914092447.7B21.JCARLSON@uci.edu>


> I was a bit surprised to find out that python doesn't seem to have builtin
> functors, such as find_first.  Although there are ways to simulate such
> functions, it would be good to have an expanded set of functional
> programming tools which are coded in C for speed.

I think I've been here long enough, and it is getting to be my turn to
do this kind of thing once, so here it goes...

Address your concerns of "Python should or should not have this thing"
on python-list first (available as a newsgroup as comp.lang.python if
you so prefer).  Python-dev is about developing the core language, not
about fielding requests without background, support, salutation, and
"thank you for your consideration".

As for "find_first", the first mentions of such things I found on the net
were the Boost C++ libraries for searching strings.  Considering your
recent posts on the C++ sig with regards to Boost::Python, this seems
like what you were referring to.  Searching strings is generally done
via "string literal".find("substring literal") or
string_variable.find(substring_variable), and any other such variations
you would care to use.  There also exists a findall mechanism in the
regular expression module re, which comes standard with Python. Lists
also include find methods, though they call them index().  If your list
is sorted, you may want to consider the bisect module.
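
For the sorted-list case, a minimal sketch assuming Python 3. The helper index_sorted is hypothetical; bisect.bisect_left, str.find and list.index are the standard calls mentioned above.

```python
import bisect

def index_sorted(sorted_items, item):
    """Return the first position of item in a sorted list, or -1 if absent."""
    pos = bisect.bisect_left(sorted_items, item)
    if pos < len(sorted_items) and sorted_items[pos] == item:
        return pos
    return -1

data = [2, 3, 5, 5, 8, 13]
found = index_sorted(data, 5)       # binary search, first of the two 5s
missing = index_sorted(data, 7)     # not present

# The str and list spellings, for comparison.
str_pos = "spam and eggs".find("and")
list_pos = data.index(8)

print(found, missing, str_pos, list_pos)   # 2 -1 5 4
```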

If you desire your finding methods to return an iterable through the
sequence of positions of the item, perhaps this little throwaway
generator would be sufficient (which I'm sure a python-list user could
have helped you with)...

def find_first(str_or_list, item):
    # Yield every position at which item occurs in a string or list.
    if type(str_or_list) in (str, unicode):
        f = str_or_list.find(item)
        while f != -1:
            yield f
            # Resume one past the last hit; restarting at f finds it again.
            f = str_or_list.find(item, f + 1)
    elif type(str_or_list) is list:
        try:
            f = str_or_list.index(item)
        except ValueError:
            return
        while 1:
            yield f
            try:
                f = str_or_list.index(item, f + 1)
            except ValueError:
                return
    else:
        raise ValueError(
          "type %s is not supported for searching" % type(str_or_list))

 - Josiah

From skip at pobox.com  Tue Sep 14 19:12:18 2004
From: skip at pobox.com (Skip Montanaro)
Date: Tue Sep 14 19:12:26 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going to be
	fixed?
In-Reply-To: <Pine.LNX.4.44.0409141751260.29513-100000@login.ecs.soton.ac.uk>
References: <16711.7709.255870.851658@montanaro.dyndns.org>
	<Pine.LNX.4.44.0409141751260.29513-100000@login.ecs.soton.ac.uk>
Message-ID: <16711.9842.758365.619457@montanaro.dyndns.org>


    David> Have you actually tested it, cus I made my own fix that was at
    David> least an additional 7 lines....

I don't have access to ipv6.  I used the ipaddr:port combination in the
RedHat bug report as a test input.  Here's the change to httplib.py.
Replace:

            i = host.find(':')

with

            i = host.rfind(':')

Like I said, it was a one-character fix.  rfind() looks from the back of the
host/port combination for the colon separating the host and port.  Since
port numbers can't contain colons, the first colon found from the back has
to be the colon separating the host-or-address from the port number.
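
The difference between the two calls can be sketched as follows, assuming Python 3. The helper functions are hypothetical stand-ins for illustration, not the httplib code itself.

```python
def split_with_find(host):
    """Old behaviour: split at the first colon."""
    i = host.find(":")
    return (host[:i], host[i + 1:]) if i >= 0 else (host, None)

def split_with_rfind(host):
    """Changed behaviour: split at the last colon."""
    i = host.rfind(":")
    return (host[:i], host[i + 1:]) if i >= 0 else (host, None)

addr = "[fe80::207:e9ff:fe9b]:8000"
old = split_with_find(addr)
new = split_with_rfind(addr)

print(old)   # ('[fe80', ':207:e9ff:fe9b]:8000')
print(new)   # ('[fe80::207:e9ff:fe9b]', '8000')
```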

Here's the change to the test case (Lib/test/test_httplib.py).  After the
for loop that checks for invalid ports, add this analogous for loop that
checks valid host/port combinations:

    for hp in ("[fe80::207:e9ff:fe9b]:8000", "www.python.org:80",
               "www.python.org"):
        try:
            h = httplib.HTTP(hp)
        except httplib.InvalidURL:
            print "InvalidURL raised erroneously"

The test case failed before applying the patch and succeeded after.
According to the principles of test-driven development, I'm done until
another bug surfaces.  In short, I've done what I can to fix the obvious
problem.  Drilling down any deeper than that is impossible for me.  As I
indicated, I have no ipv6 access.

If you test it out and still find problems, please submit a bug report on
SF.  Please *don't* follow up to python-dev.  It's not the appropriate place
to discuss the ins and outs of specific patches.  I only did so because that
was the easiest way to tell the other developers that I'd applied a fix for
the problem.

back-to-my-paying-job-ly, y'rs,

Skip
From dgm at ecs.soton.ac.uk  Tue Sep 14 18:52:11 2004
From: dgm at ecs.soton.ac.uk (David G Mills)
Date: Tue Sep 14 19:12:51 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going to be
	fixed?
In-Reply-To: <16711.7709.255870.851658@montanaro.dyndns.org>
Message-ID: <Pine.LNX.4.44.0409141751260.29513-100000@login.ecs.soton.ac.uk>

Have you actually tested it, cus I made my own fix that was at least an 
additional 7 lines....

David.

On Tue, 14 Sep 2004, Skip Montanaro wrote:

> 
>     David> As the link below shows httplib can't handle an IPv6 address, it
>     David> checks for a port number by checking for a : but this simply cuts
>     David> the IPv6 address in two and tries to set the port variable as
>     David> nonsense.
> 
>     David> http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=81926#c2
> 
>     David> Are there any plans to rectify this, or even an alternative
>     David> version of httplib kicking about that can handle IPv6.
> 
> I just checked in a fix for this (it was a one-character change, not
> including unit test update).  I don't know if there is a Python bug report
> open which now needs to be closed.  Considering the ease of the fix, I sort
> of think not (otherwise it would have been fixed long ago).  Note that in
> general we have plenty of other things to do with our time without
> monitoring other projects' bug trackers looking for possible Python bug
> reports.  If they aren't reported on SF we won't hear about them.  (The
> Debian folks routinely open SF items when Python bug reports wind up in the
> Debian tracker.)
> 
> -- 
> Skip Montanaro
> Got spam? http://www.spambayes.org/
> skip@pobox.com
> 

From pje at telecommunity.com  Tue Sep 14 19:20:08 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Sep 14 19:20:30 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going
	to be fixed?
In-Reply-To: <16711.9842.758365.619457@montanaro.dyndns.org>
References: <Pine.LNX.4.44.0409141751260.29513-100000@login.ecs.soton.ac.uk>
	<16711.7709.255870.851658@montanaro.dyndns.org>
	<Pine.LNX.4.44.0409141751260.29513-100000@login.ecs.soton.ac.uk>
Message-ID: <5.1.1.6.0.20040914131736.05819db0@mail.telecommunity.com>

At 12:12 PM 9/14/04 -0500, Skip Montanaro wrote:

>Here's the change to the test case (Lib/test/test_httplib.py).  After the
>for loop that checks for invalid ports, add this analogous for loop that
>checks valid host/port combinations:
>
>     for hp in ("[fe80::207:e9ff:fe9b]:8000", "www.python.org:80",
>                "www.python.org"):

Here's the test case that's missing, then:

     "[fe80::207:e9ff:fe9b]"

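The bracketed address with no port matters because its last colon sits inside the brackets. A minimal sketch of one way to handle it, assuming Python 3; split_host_port and its default_port parameter are hypothetical and are not the code that was checked in.

```python
def split_host_port(host, default_port=80):
    """Hypothetical helper: split 'host[:port]', allowing a bracketed IPv6 host."""
    i = host.rfind(":")
    j = host.rfind("]")          # end of a bracketed IPv6 literal, if any
    if i > j:                    # colon lies after any ']', so it introduces a port
        return host[:i], int(host[i + 1:])
    return host, default_port

cases = [split_host_port(h) for h in (
    "[fe80::207:e9ff:fe9b]:8000",
    "[fe80::207:e9ff:fe9b]",
    "www.python.org:80",
    "www.python.org",
)]
print(cases)
```

With a plain rfind(':') and no bracket check, the second input would be cut at the colon before "fe9b]".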
From skip at pobox.com  Tue Sep 14 19:55:53 2004
From: skip at pobox.com (Skip Montanaro)
Date: Tue Sep 14 19:56:03 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going
	to be fixed?
In-Reply-To: <5.1.1.6.0.20040914131736.05819db0@mail.telecommunity.com>
References: <Pine.LNX.4.44.0409141751260.29513-100000@login.ecs.soton.ac.uk>
	<16711.7709.255870.851658@montanaro.dyndns.org>
	<5.1.1.6.0.20040914131736.05819db0@mail.telecommunity.com>
Message-ID: <16711.12457.647107.816397@montanaro.dyndns.org>


    Phillip> Here's the test case that's missing, then:

    Phillip>      "[fe80::207:e9ff:fe9b]"

Whoops.  Fixed.

Skip
From alloydflanagan at comcast.net  Tue Sep 14 19:57:47 2004
From: alloydflanagan at comcast.net (alloydflanagan@comcast.net)
Date: Tue Sep 14 19:57:50 2004
Subject: [Python-Dev] OT: Unicode history (was Alternative Impl. for PEP 292)
Message-ID: <091420041757.13290.414731190005E51E000033EA2200751150020E090E020E04000B970104040E@comcast.net>

[François Pinard]
>>Many people consider that Unicode, or UTF-8 at least, is strongly
>>favouring English (boldly American) over any other script or language.
>>If it had not been so, Americans would never have promoted it so much,
>>and would have rather shown an infinite and eternal reluctance...
To be fair to the developers of Unicode, I'd suggest that the issue is not favoring (note spelling! :) ) English, but rather keeping compatibility with an enormous amount of existing data which was encoded in ASCII.  Which was an English standard, but you can only do so much in 7 bits...
As for American reluctance, how are you going to convince anyone to double (at least) the storage requirements for their data, to support languages they never use?  That would have cost a great deal of money.
From foom at fuhm.net  Tue Sep 14 20:12:35 2004
From: foom at fuhm.net (James Y Knight)
Date: Tue Sep 14 20:12:41 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for PEP
	292:SimpleString Substitutions
In-Reply-To: <ci64jr$565$1@sea.gmane.org>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com><1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org><413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org><413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org><413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org><87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>
	<ci3g2d$m3g$1@sea.gmane.org> <ci64jr$565$1@sea.gmane.org>
Message-ID: <A796EB1C-0679-11D9-AC9C-000A95A50FB2@fuhm.net>

On Sep 14, 2004, at 2:54 AM, Terry Reedy wrote:
> This is why I am not especially enamored of Unicode and the prospect of
> Python becoming married to it.  It is heavily weighted in favor of
> efficiently representing Chinese and inefficiently representing 
> English.
> To give English equivalent treatment, the 20,000 or so most common 
> words,
> roots, prefixes, and suffixes would each get its own codepoint.

Of course it is perfectly possible to have the Python unicode 
implementation choose to represent some unicode strings with only 8 
bits per character. There is no (conceptual) reason it could not 
represent (u'a' * 8) with 8 bytes + class header overhead. That is 
simply an implementation detail and really has nothing to do with 
Unicode itself.

It would also be possible to use UTF-8 string storage, although this 
has the tradeoff that indexing an element takes linear time w.r.t. 
position instead of constant time.
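
A sketch of why indexing into UTF-8 storage is linear, assuming Python 3. The function utf8_index is hypothetical: it has to walk the bytes from the start, counting character boundaries, because characters have variable width.

```python
def utf8_index(data, n):
    """Return the n-th code point of UTF-8 bytes by scanning from the start."""
    if n < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1
    start = None
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:        # not a continuation byte: a character starts
            count += 1
            if count == n:
                start = pos
            elif count == n + 1:       # next character begins: the slice is complete
                return data[start:pos].decode("utf-8")
    if start is not None:              # the n-th character was the last one
        return data[start:].decode("utf-8")
    raise IndexError("string index out of range")

text = "a\u305fb\u306e"                # mixes 1-byte and 3-byte characters
data = text.encode("utf-8")
chars = [utf8_index(data, i) for i in range(len(text))]
print(chars == list(text))             # True
```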

James

From fumanchu at amor.org  Tue Sep 14 20:39:54 2004
From: fumanchu at amor.org (Robert Brewer)
Date: Tue Sep 14 20:45:51 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3027BDA@exchange.hqamor.amorhq.net>

Kevin Jacobs:
> >Both of your alternatives are being used in some form and
> >neither is really satisfactory.  Literal representations require
> >complex parsers, when the Python parser is really what is
> >desired.

Phillip J. Eby:
> Maybe you missed the earlier part of the thread, where I was 
> suggesting that a Python "code literal" or "AST literal"
> syntax would be helpful.  For example, if backquotes didn't
> already have a use, one might say something like:
> 
>      db.query(`x.y==z and foo*bar<27`)
> 
> To pass an AST object to the db.query() method.  The 
> advantage would be that the AST would be parsed and
> syntax checked at compile time, rather than runtime.

We already have a de facto "code literal syntax": lambdas.

db.query(lambda x: x.y==z and foo*bar<27)

I use this technique in my ORM, dejavu. When declared, the lambda gets
passed immediately into a wrapper which early-binds as much as possible
(using Raymond's cookbook technique). See
http://www.aminus.org/rbre/python/logic.py and /codewalk.py for the
guts. SQL is generated from the lambda as needed (not online at the
moment, sorry, coming soon). The bonus is that you can pass ordinary
Python objects into the lambda and evaluate them. The current downside
is that it's a bytecode hack and therefore limited to CPython, certain
versions. I'd love a generic early-binder mechanism at the language
level to help get around that, but it's not critical for my users (=
me).
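
As a small illustration of why a lambda can serve as a "code literal", here is a sketch assuming CPython 3. It only inspects the compiled code object and is not the dejavu implementation; the lambda is never called, so z, foo and bar can stay unbound.

```python
query = lambda x: x.y == z and foo * bar < 27   # z, foo, bar are unbound on purpose

code = query.__code__            # the compiled body; nothing has been evaluated
arg_names = code.co_varnames     # the lambda's own parameter(s)
free_names = code.co_names       # attribute and global names the body refers to

print(arg_names)                 # ('x',)
print(sorted(free_names))        # ['bar', 'foo', 'y', 'z']
```

Going from here to a full query translation means reading the bytecode itself, which is the CPython-specific part.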

> After several experiments with using &, |, and ~ for query 
> expressions, 
> I've pretty much quit and gone to using string literals, 
> since AST literals 
> don't exist.  But if AST literals *did* exist, I'd certainly 
> use them in 
> preference to strings.

I tried &|~ also and quit pretty quickly (sorry, Greg ;). Using the
lambdas allowed me to do more of the parsing earlier, much of it at
compile-time, the rest at declaration time (I can then pickle the
lambdas so users can persist ones they create).

> But, even if PEP 335 *were* implemented, creating a query 
> system using Python expressions would *still* be kludgy,
> because you still need "seed variables" in the current
> scope to write a query expression.
> In my example above, I didn't need to bind 'x' or 'y'
> or 'z' or 'foo' or 'bar', because the db.query() method
> is going to interpret those in some context.  If I 
> were using a PEP 335-based query system, I'd have to
> initialize those variables to special querying objects first.

A lot of that becomes a non-issue if you bind early. Once the constants
are bound, you're left with attribute access on your core objects (x.y)
and special functions (see logic.ieq or logic.today for example). Again,
too, I can use the lambda to evaluate Python objects, the 'Object' side
of "ORM". In that situation, the binding is a benefit.

>8
> 
> That's why I say that an AST literal syntax would be much 
> more useful to me than PEP 335 for this type of use case.

I seem to recall my AST version was quite slow, in pure Python. Can't
recall whether that was all the tuple-unpacking or just my naive
function-call overhead at the time.

Anyway, for those reasons, I'm -0.5.


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From pje at telecommunity.com  Tue Sep 14 21:02:31 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Sep 14 21:02:47 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3027BDA@exchange.hqamor.amo
	rhq.net>
Message-ID: <5.1.1.6.0.20040914144853.0230b0e0@mail.telecommunity.com>

At 11:39 AM 9/14/04 -0700, Robert Brewer wrote:
>We already have a de facto "code literal syntax": lambdas.
>
>db.query(lambda x: x.y==z and foo*bar<27)

Right, but that requires non-portable bytecode disassembly.  If there was a 
simple way to convert from a function object to an AST, I'd be happy with 
that too.


> > But, even if PEP 335 *were* implemented, creating a query
> > system using Python expressions would *still* be kludgy,
> > because you still need "seed variables" in the current
> > scope to write a query expression.
> > In my example above, I didn't need to bind 'x' or 'y'
> > or 'z' or 'foo' or 'bar', because the db.query() method
> > is going to interpret those in some context.  If I
> > were using a PEP 335-based query system, I'd have to
> > initialize those variables to special querying objects first.
>
>A lot of that becomes a non-issue if you bind early. Once the constants
>are bound, you're left with attribute access on your core objects (x.y)
>and special functions (see logic.ieq or logic.today for example). Again,
>too, I can use the lambda to evaluate Python objects, the 'Object' side
>of "ORM". In that situation, the binding is a benefit.

I'm not following what you mean by "bind early".  My point was that in 
order to have bindings for seeds like 'x' and 'z' and 'foo', most query 
languages end up with hacks like 'tables.tablename.columname' or 
'_.table.column' or other rigamarole, and that this is usually more awkward 
to deal with than the &/|/~ operator spelling.


> > That's why I say that an AST literal syntax would be much
> > more useful to me than PEP 335 for this type of use case.
>
>I seem to recall my AST version was quite slow, in pure Python. Can't
>recall whether that was all the tuple-unpacking or just my naive
>function-call overhead at the time.

When I say AST, I just mean "some kind of syntax representation", not 
necessarily the 'parser' module's current AST implementation.  However, I 
have found that it's possible to translate parser-module ASTs to query 
specifications quite efficiently in pure Python, such that the overhead is 
minor compared to whatever actual computation you're doing.  The key is 
that the vast majority of AST nodes are a trivial wrapper around another 
AST node.  The core of my AST-handling engine, therefore, looks like this:

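     def build(builder, nodelist):
         # Skip trivial wrapper nodes (a symbol ID plus exactly one child).
         while len(nodelist) == 2:
             nodelist = nodelist[1]
         # Dispatch on the symbol ID of the first non-trivial node.
         return production[nodelist[0]](builder, nodelist)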

Where 'production' is a table mapping symbol IDs to helper functions that 
invoke methods on 'builder', which then may recursively invoke 'build' on 
items in 'nodelist'.  The first two lines of this function eliminate 
enormous amounts of overhead by ignoring all the zillions of trivial 
wrapper nodes.  (Note that you must include line number information in the 
generated AST, or it will mistake tokens for unnecessary symbols.)
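
As a rough illustration of the wrapper-skipping loop above, here is a
self-contained sketch.  It does not use the 'parser' module: the tree is
hand-built in the same nested-tuple shape, and the symbol IDs, the Builder
class and the two helper functions are all invented for the example.
Tokens carry a line number, so they are 3-tuples and the len() == 2 test
does not mistake them for wrappers.

```python
# Hypothetical IDs for illustration only; real ones come from the grammar.
EXPR, TERM, ATOM, ARITH = 300, 301, 302, 303
NAME, PLUS = 1, 14


def build(builder, nodelist):
    # Skip trivial wrapper nodes (a symbol ID plus exactly one child).
    while len(nodelist) == 2:
        nodelist = nodelist[1]
    return production[nodelist[0]](builder, nodelist)


class Builder:
    """Toy builder that returns plain tuples instead of query objects."""

    def name(self, text):
        return ("name", text)

    def add(self, left, right):
        return ("add", left, right)


def do_name(builder, node):
    # node looks like (NAME, 'x', lineno)
    return builder.name(node[1])


def do_arith(builder, node):
    # node looks like (ARITH, left, (PLUS, '+', lineno), right)
    return builder.add(build(builder, node[1]), build(builder, node[3]))


production = {NAME: do_name, ARITH: do_arith}

# "x + y" buried under several single-child wrapper nodes.
tree = (EXPR, (TERM, (ARITH,
                      (ATOM, (NAME, "x", 1)),
                      (PLUS, "+", 1),
                      (ATOM, (NAME, "y", 1)))))

result = build(Builder(), tree)
```

Only two helper calls per leaf are made here; every EXPR, TERM and ATOM
wrapper is stepped over without a function call.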


>Anyway, for those reasons, I'm -0.5.

On what?  AST literals, or PEP 335?

From aahz at pythoncraft.com  Tue Sep 14 21:04:36 2004
From: aahz at pythoncraft.com (Aahz)
Date: Tue Sep 14 21:04:39 2004
Subject: [Python-Dev] Re: PEP 292: method names
In-Reply-To: <41465E53.6050606@ocf.berkeley.edu>
References: <4142E78C.7010800@heneryd.com> <4145DBB4.8010601@ocf.berkeley.edu>
	<ci4v65$s7f$1@sea.gmane.org> <41465E53.6050606@ocf.berkeley.edu>
Message-ID: <20040914190436.GA11541@panix.com>

On Mon, Sep 13, 2004, Brett C. wrote:
> Fredrik Lundh wrote:
>>Brett C wrote:
>>>
>>>I am sure the way I tend to abbreviate things is not how anyone
>>>else would.  So why would the stdlib try to?
>>
>>it's pretty amazing that you've been able to use Python without noticing
>>that the standard library is full of abbreviations.
> 
> Just because the stdlib is full of abbreviations does not mean it should be 
> continued.  Precedent != acceptance.

What I find interesting about your responses is that you're using the
abbreviation "stdlib", assuming that your audience will understand that
easily enough.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From pinard at iro.umontreal.ca  Tue Sep 14 21:15:28 2004
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Tue Sep 14 21:16:10 2004
Subject: [Python-Dev] OT: Unicode history (was Alternative Impl. for PEP
	292)
In-Reply-To: <091420041757.13290.414731190005E51E000033EA2200751150020E090E020E04000B970104040E@comcast.net>
References: <091420041757.13290.414731190005E51E000033EA2200751150020E090E020E04000B970104040E@comcast.net>
Message-ID: <20040914191528.GA7964@alcyon.progiciels-bpi.ca>

[alloydflanagan@comcast.net]
> [François Pinard]

> >>Many people consider that Unicode, or UTF-8 at least, is strongly
> >>favouring English (boldly American) over any other script or
> >>language.  If it has not been so, Americans would never have
> >>promoted it so much, and would have rather shown an infinite and
> >>eternal reluctance...

> To be fair to the developers of Unicode, I'd suggest that the issue
> is not favoring (note spelling! :) ) English, but rather keeping
> compatibility with an enormous amount of existing data which was
> encoded in ASCII.

Of course, this is the standard and official reason.  Yet, the net
effect of that concern and constraint, noticed by many foreigners, is
that Unicode favours English.  (About "favouring" spelling, I find it
amusing to spell-check my out-going email with a British dictionary.)

> Which was an English standard, but you can only do so much in 7
> bits...  As for American reluctance, how are you going to convince
> anyone to double (at least) the storage requirements for their data,
> to support languages they never use?  That would have cost a great
> deal of money.

I would not think money has to be expressed in terms of storage.  Storage
considerations are more likely a justification than an explanation for
the reluctance.  UTF-8 is such that on disk, and for applications using
UTF-8 internally (there are a few), not a single bit is spent on extra
storage for English.  There are cases, and the current Python approach
is one of them, where Unicode may be made fairly unobtrusive on memory
consumption, at least in English contexts.

The complexity added by Unicode, however, may undoubtedly be a concern,
for any implementor wanting to really address that standard, that is,
further than merely toying with 16-bit characters.  *This* means human
time, and this is where the real cost lies.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard
From bac at OCF.Berkeley.EDU  Tue Sep 14 21:44:16 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Sep 14 21:44:28 2004
Subject: [Python-Dev] Re: PEP 292: method names
In-Reply-To: <20040914190436.GA11541@panix.com>
References: <4142E78C.7010800@heneryd.com>
	<4145DBB4.8010601@ocf.berkeley.edu>	<ci4v65$s7f$1@sea.gmane.org>
	<41465E53.6050606@ocf.berkeley.edu>
	<20040914190436.GA11541@panix.com>
Message-ID: <41474A10.4080008@ocf.berkeley.edu>

Aahz wrote:
> On Mon, Sep 13, 2004, Brett C. wrote:
> 
>>Fredrik Lundh wrote:
>>
>>>Brett C wrote:
>>>
>>>>I am sure the way I tend to abbreviate things is not how anyone
>>>>else would.  So why would the stdlib try to?
>>>
>>>it's pretty amazing that you've been able to use Python without noticing
>>>that the standard library is full of abbreviations.
>>
>>Just because the stdlib is full of abbreviations does not mean it should be 
>>continued.  Precedent != acceptance.
> 
> 
> What I find interesting about your responses is that you're using the
> abbreviation "stdlib", assuming that your audience will understand that
> easily enough.

My audience is python-dev, and so I do assume they will know what the 
abbreviation is.

But Template is not just for python-dev but the whole Python community so 
making assumptions is a little dangerous.

-Brett
From tim.hochberg at ieee.org  Tue Sep 14 21:50:00 2004
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue Sep 14 21:53:39 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040914112226.0354d8d0@mail.telecommunity.com>
References: <5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>	<5.1.1.6.0.20040910140546.0298f130@mail.telecommunity.com>	<5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>
	<4146DE5D.3040702@theopalgroup.com>
	<5.1.1.6.0.20040914112226.0354d8d0@mail.telecommunity.com>
Message-ID: <ci7i1g$3jm$1@sea.gmane.org>

Phillip J. Eby wrote:

[CHOP]
>
> As for the numeric use cases, I'm not at all clear why &, |, and ~ (or 
> special methods/functions) aren't suitable.

They often are, but sometimes you want a logical and/or/not and &/|/~ 
are mapped to bitwise and/or/not, which isn't always what you want. 
Presumably, if Greg's proposal were adopted, and/or/not would get mapped 
to numarray.logical_and/or/not.

What I find more interesting about this proposal is that one could 
probably finagle it so that (A < B < C) worked correctly for arrays. It 
can't work now since it is equivalent to ((A < B) and (B < C)) and 'and' 
doesn't do anything sensible for arrays at present. This is one I 
always expect to work even though I know that and/or/not don't work for 
arrays.
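
To make the equivalence concrete, here is a small sketch using a made-up
Vec class (not numarray).  Its comparison returns an elementwise result,
and its truth hook (spelled __bool__ in current Python) refuses to pick a
single answer, which is what makes the chained form fail while the
explicit & form works.

```python
class Vec:
    """Toy elementwise array, hypothetical and for illustration only."""

    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        return Vec(a < b for a, b in zip(self.items, other.items))

    def __and__(self, other):
        return Vec(a and b for a, b in zip(self.items, other.items))

    def __bool__(self):
        # 'and' asks for a single truth value, which an array cannot give.
        raise ValueError("truth value of an array is ambiguous")


A, B, C = Vec([1, 5]), Vec([2, 4]), Vec([3, 6])


def chained():
    # Evaluated as (A < B) and (B < C), so bool(A < B) is requested.
    return A < B < C


# The explicit spelling avoids 'and' and stays elementwise.
explicit = (A < B) & (B < C)
```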

-tim

From fumanchu at amor.org  Tue Sep 14 22:41:57 2004
From: fumanchu at amor.org (Robert Brewer)
Date: Tue Sep 14 22:47:55 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3022ED4@exchange.hqamor.amorhq.net>

Phillip J. Eby wrote:
> At 11:39 AM 9/14/04 -0700, Robert Brewer wrote:
> >We already have a de facto "code literal syntax": lambdas.
> >
> >db.query(lambda x: x.y==z and foo*bar<27)
> 
> Right, but that requires non-portable bytecode disassembly.  
> If there was a simple way to convert from a function object
> to an AST, I'd be happy with that too.

If it were fast enough, I would too.

> > > But, even if PEP 335 *were* implemented, creating a query
> > > system using Python expressions would *still* be kludgy,
> > > because you still need "seed variables" in the current
> > > scope to write a query expression.
> > > In my example above, I didn't need to bind 'x' or 'y'
> > > or 'z' or 'foo' or 'bar', because the db.query() method
> > > is going to interpret those in some context.  If I
> > > were using a PEP 335-based query system, I'd have to
> > > initialize those variables to special querying objects first.
> >
> >A lot of that becomes a non-issue if you bind early. Once the constants
> >are bound, you're left with attribute access on your core objects (x.y)
> >and special functions (see logic.ieq or logic.today for example). Again,
> >too, I can use the lambda to evaluate Python objects, the 'Object' side
> >of "ORM". In that situation, the binding is a benefit.
> 
> I'm not following what you mean by "bind early".  My point was that in
> order to have bindings for seeds like 'x' and 'z' and 'foo', most query
> languages end up with hacks like 'tables.tablename.columname' or
> '_.table.column' or other rigamarole, and that this is usually more awkward
> to deal with than the &/|/~ operator spelling.

Dejavu addresses that by separating the "table binding" from the
expression. That is, given:

z = "Hansel"
e = logic.Expression(lambda x: x.Name.startswith(z))
books = recall(myapp.Book, e)
authors = recall(myapp.Author, e)

...'x' isn't bound within the Expression declaration; it is supplied as
the first param to recall(). For example, you could apply the same
Expression to both a Book class/table and an Author class/table within
the same application, as above. IMO, this is a natural way to map the
lambda-calculus to a query language, where the bound variable =
ORM-object instances (a "table row"). But any free variables need to be
resolved ASAP; therefore, z gets evaluated completely and immediately;
Expression() rewrites the lambda co_code, replacing the closure lookup
with a LOAD_CONST (sticking the value of z into co_consts).
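
The effect of early binding can be approximated without touching
bytecode.  The sketch below is not how Dejavu does it (it rewrites no
co_code and inserts no LOAD_CONST); it simply copies the function with a
snapshot of its free variables.  bind_early, Row and this recall() are
stand-in names invented for the example, and types.CellType assumes a
reasonably recent Python 3.

```python
import builtins
import types


def bind_early(func):
    """Copy func, freezing the current values of its free variables."""
    code = func.__code__
    # co_names also lists attribute names; keep only current globals.
    frozen = {n: func.__globals__[n]
              for n in code.co_names if n in func.__globals__}
    frozen["__builtins__"] = builtins
    closure = None
    if func.__closure__:
        # Fresh cells holding today's values, detached from the outer scope.
        closure = tuple(types.CellType(c.cell_contents)
                        for c in func.__closure__)
    return types.FunctionType(code, frozen, func.__name__,
                              func.__defaults__, closure)


class Row:
    def __init__(self, Name):
        self.Name = Name


def recall(rows, expr):
    """Stand-in for a recall() call: filter rows with the expression."""
    return [row for row in rows if expr(row)]


def demo():
    z = "Hansel"
    e = bind_early(lambda x: x.Name.startswith(z))
    z = "Gretel"  # rebinding z afterwards does not change e
    books = recall([Row("Hansel and Gretel"), Row("Gretel Alone")], e)
    return [b.Name for b in books]
```

Here 'x' stays unbound until recall() supplies each row, while 'z' is
fixed at the moment the expression is created.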

> > > That's why I say that an AST literal syntax would be much
> > > more useful to me than PEP 335 for this type of use case.
> >
> >I seem to recall my AST version was quite slow, in pure Python. Can't
> >recall whether that was all the tuple-unpacking or just my naive
> >function-call overhead at the time.
> 
> When I say AST, I just mean "some kind of syntax representation", not 
> necessarily the 'parser' module's current AST implementation.

Sure.

> However, I have found that it's possible to translate parser-module 
> AST's to query specifications quite efficiently in pure Python,
> such that the overhead is minor compared to whatever actual
> computation you're doing...

Hmm, perhaps I'll look again.

> >Anyway, for those reasons, I'm -0.5.
> 
> On what?  AST literals, or PEP 335?

The PEP. ASTs would be better. A builtin early-binder would make me
happiest, but I won't hold my breath. I don't think it would require new
syntax, either, just something like codewalk.EarlyBinder() and
.LambdaDecompiler() in a standard lib module somewhere. But I may go
back and look at ASTs again.


Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org
From nhodgson at bigpond.net.au  Tue Sep 14 23:41:45 2004
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Tue Sep 14 23:41:52 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for
	PEP292:SimpleString Substitutions
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer>
	<4138D622.6050807@egenix.com>
	<1094315138.8696.36.camel@geddy.wooz.org>
	<cheig3$ki8$1@sea.gmane.org> <413F1D9C.20209@egenix.com>
	<chnc49$psm$1@sea.gmane.org> <413F3605.7090707@egenix.com>
	<chnidf$epp$1@sea.gmane.org> <413F6120.7090603@egenix.com>
	<chuhn8$11s$1@sea.gmane.org>
	<87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>
	<ci3g2d$m3g$1@sea.gmane.org> <ci64jr$565$1@sea.gmane.org>
	<A796EB1C-0679-11D9-AC9C-000A95A50FB2@fuhm.net>
Message-ID: <024501c49aa3$a1afb450$a44a8890@neil>

James Y Knight:

> It would also be possible to use UTF-8 string storage, although this
> has the tradeoff that indexing an element takes linear time w.r.t.
> position instead of constant time.

   At the cost of additional storage, indexing into UTF-8 by character
rather than byte can be made better than linear. Two techniques are (1)
maintain a list containing the byte index of some character index values
(such as each line start) then use linear access from the closest known
index and (2) cache the most recent access due to the likelihood that the
next access will be close.
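
A minimal sketch of technique (1), under the assumption that checkpoints
are taken every 'stride' characters rather than at line starts.  The
class name and its methods are invented for the example; it indexes code
points, not user-perceived characters, and does not implement the cache
of technique (2).

```python
class IndexedUTF8:
    """UTF-8 bytes plus a sparse table of character-index checkpoints."""

    def __init__(self, text, stride=4):
        self.data = text.encode("utf-8")
        self.stride = stride
        self.checkpoints = []  # byte offset of characters 0, stride, ...
        count = 0
        for offset, byte in enumerate(self.data):
            if byte & 0xC0 != 0x80:  # not a continuation byte
                if count % stride == 0:
                    self.checkpoints.append(offset)
                count += 1
        self.length = count

    def byte_offset(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        # Jump to the nearest checkpoint at or before the target ...
        offset = self.checkpoints[index // self.stride]
        remaining = index % self.stride
        # ... then walk forward over at most stride - 1 characters.
        while remaining:
            offset += 1
            if self.data[offset] & 0xC0 != 0x80:
                remaining -= 1
        return offset

    def __getitem__(self, index):
        start = self.byte_offset(index)
        end = start + 1
        while end < len(self.data) and self.data[end] & 0xC0 == 0x80:
            end += 1
        return self.data[start:end].decode("utf-8")


sample = "na\u00efve caf\u00e9 \u2603 ok"
indexed = IndexedUTF8(sample, stride=4)
```

The extra storage is one integer per 'stride' characters, and a lookup
scans a bounded number of bytes regardless of position.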

   While I have thought about this problem, it has only once come up
seriously for Scintilla (an editing component) and that was when someone was
trying to provide a UCS2 facade that matched existing interfaces.

   Neil

From barry at barrys-emacs.org  Tue Sep 14 23:46:41 2004
From: barry at barrys-emacs.org (Barry Scott)
Date: Tue Sep 14 23:47:24 2004
Subject: [Python-Dev] OT: Unicode history (was Alternative Impl. for PEP
	292)
In-Reply-To: <20040914191528.GA7964@alcyon.progiciels-bpi.ca>
References: <091420041757.13290.414731190005E51E000033EA2200751150020E090E020E04000B970104040E@comcast.net>
	<20040914191528.GA7964@alcyon.progiciels-bpi.ca>
Message-ID: <904FAAD7-0697-11D9-9E6D-000A95A8705A@barrys-emacs.org>


On Sep 14, 2004, at 20:15, François Pinard wrote:

> Of course, this is the standard and official reason.  Yet, the net
> effect of that concern and constraint, noticed by many foreigners, is
> that Unicode favours English.  (About "favouring" spelling, I find it
> amusing to spell-check my out-going email with a British dictionary.)

First there were national character sets. Working in more than one
language was a nightmare.

Then came ISO 10646, which gave every language its own unique set
of code points. But ISO 10646 is not easy to process, which led to the
development of Unicode, which is easier to implement and work with but
could not originally deal with all the code points required for all the
world's languages. I believe that has been fixed now; you can have
32-bit Unicode.

Somewhere in the code point space you have to have ASCII. I'd be
charitable and say that it's pragmatic that it's in code page 0, given
the history of the computer industry.

 From now on, if you use Unicode, no language has an advantage; all are
equal, and software authors stand a chance to create international
software.

Barry

From martin at v.loewis.de  Tue Sep 14 23:48:16 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Sep 14 23:48:21 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going	to be
	fixed?
In-Reply-To: <16711.12457.647107.816397@montanaro.dyndns.org>
References: <Pine.LNX.4.44.0409141751260.29513-100000@login.ecs.soton.ac.uk>	<16711.7709.255870.851658@montanaro.dyndns.org>	<5.1.1.6.0.20040914131736.05819db0@mail.telecommunity.com>
	<16711.12457.647107.816397@montanaro.dyndns.org>
Message-ID: <41476720.9060803@v.loewis.de>

Skip Montanaro wrote:
>     Phillip> Here's the test case that's missing, then:
> 
>     Phillip>      "[fe80::207:e9ff:fe9b]"
> 
> Whoops.  Fixed.

The code was still incorrect. The square brackets don't belong
to the host name - they are part of the URL syntax. Before passing
them to the socket module, they need to be stripped off. I have now
changed httplib to do that right when parsing host:port.
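
For illustration, the kind of host:port splitting described above might
look like the sketch below.  This is not the actual httplib code;
split_host_port and its default_port parameter are names invented for
the example.

```python
def split_host_port(netloc, default_port=80):
    """Split 'host[:port]', stripping the brackets of an IPv6 literal."""
    i = netloc.rfind(":")
    j = netloc.rfind("]")  # colons inside [...] belong to the address
    if i > j:
        host, port = netloc[:i], int(netloc[i + 1:])
    else:
        host, port = netloc, default_port
    # The brackets are URL syntax only; the socket layer wants the bare address.
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    return host, port


bare = split_host_port("[fe80::207:e9ff:fe9b]")
with_port = split_host_port("[fe80::1]:8080")
```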

Regards,
Martin

From martin at v.loewis.de  Wed Sep 15 00:03:05 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 00:03:10 2004
Subject: [Python-Dev] OT: Unicode history (was Alternative Impl. for PEP
	292)
In-Reply-To: <904FAAD7-0697-11D9-9E6D-000A95A8705A@barrys-emacs.org>
References: <091420041757.13290.414731190005E51E000033EA2200751150020E090E020E04000B970104040E@comcast.net>	<20040914191528.GA7964@alcyon.progiciels-bpi.ca>
	<904FAAD7-0697-11D9-9E6D-000A95A8705A@barrys-emacs.org>
Message-ID: <41476A99.9030305@v.loewis.de>

Barry Scott wrote:
> Then came ISO 10646 which gave every language its own unique set
> of code points. But ISO 10646 is not easy to process, which led to the
> development of Unicode, which is easier to implement and work with but could
> not originally deal with all the code points required for all the world's
> languages. 

I think this is historically incorrect. ISO 10646 and Unicode were
developed in lock-step, and the very first publication of ISO 10646
(in 1993) had precisely the same character assignments as Unicode 1.1.
Ever since then, both standards have been roughly the same.

> I believe that has been fixed now; you can have 32-bit Unicode.

This is also incorrect. Unicode now has roughly 20.09 bits. ISO 10646
used to have 32 bits, but now also restricts itself to 20.09 bits.
There are encodings of it which take four octets per code point.
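
The 20.09 figure follows from the size of the code space, U+0000 through
U+10FFFF, that is 17 planes of 65536 code points each.  A quick check,
written for current Python 3 (the variable names are just for the
example):

```python
import math

code_points = 0x10FFFF + 1        # U+0000 through U+10FFFF inclusive
bits = math.log2(code_points)     # 16 + log2(17), a little over 20

# UTF-32 is one encoding that spends four octets on every code point.
utf32 = "\U0010FFFF".encode("utf-32-be")
```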

> Somewhere in the code point space you have to have ASCII. I'd be charitable
> and say that it's pragmatic that it's in code page 0 given the history of
> the computer industry.

Strictly speaking, this is group 0, plane 0, row 0 (actually, only the
first 128 cells of this row).

>  From now on if you use unicode no language has an advantage,
> all are equal and software authors stand a chance to create international
> software.

... assuming encodings are the only issue in creating international
software.

Regards,
Martin
From martin at v.loewis.de  Wed Sep 15 00:04:52 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 00:04:55 2004
Subject: [Python-Dev] Re: Re: Alternative Implementation for
	PEP	292:SimpleString Substitutions
In-Reply-To: <A796EB1C-0679-11D9-AC9C-000A95A50FB2@fuhm.net>
References: <007301c491e7$13f1c0a0$e841fea9@oemcomputer><4138D622.6050807@egenix.com><1094315138.8696.36.camel@geddy.wooz.org><cheig3$ki8$1@sea.gmane.org><413F1D9C.20209@egenix.com><chnc49$psm$1@sea.gmane.org><413F3605.7090707@egenix.com><chnidf$epp$1@sea.gmane.org><413F6120.7090603@egenix.com><chuhn8$11s$1@sea.gmane.org><87ekl6re7n.fsf@tleepslib.sk.tsukuba.ac.jp>	<ci3g2d$m3g$1@sea.gmane.org>
	<ci64jr$565$1@sea.gmane.org>
	<A796EB1C-0679-11D9-AC9C-000A95A50FB2@fuhm.net>
Message-ID: <41476B04.3020009@v.loewis.de>

James Y Knight wrote:
> Of course it is perfectly possible to have the Python unicode 
> implementation choose to represent some unicode strings with only 8 bits 
> per character.

That would break the C API, though, which is part of Python.

Regards,
Martin
From pinard at iro.umontreal.ca  Wed Sep 15 01:58:19 2004
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Wed Sep 15 01:59:01 2004
Subject: [Python-Dev] OT: Unicode history (was Alternative Impl. for PEP
	292)
In-Reply-To: <904FAAD7-0697-11D9-9E6D-000A95A8705A@barrys-emacs.org>
References: <091420041757.13290.414731190005E51E000033EA2200751150020E090E020E04000B970104040E@comcast.net>
	<20040914191528.GA7964@alcyon.progiciels-bpi.ca>
	<904FAAD7-0697-11D9-9E6D-000A95A8705A@barrys-emacs.org>
Message-ID: <20040914235819.GA10975@alcyon.progiciels-bpi.ca>

[Barry Scott]

> Then came ISO 10646 which gave every language its own unique set
> of code points.

Many languages at most.  That's far from "every language".  And some
languages, and not the least, were not satisfied with ISO 10646; many
countries long resisted its adoption as a national standard.

> But ISO 10646 is not easy to process, which led to the development of
> Unicode [...]

ISO 10646 and Unicode converged.  Unicode was the work of an industry
consortium, ISO 10646 was more in the realm of international standards.
Why do you say that ISO 10646 was especially "not easy to process"?

> that is easier to implement and work with but could not originally deal
> with all the code points required for all the world's languages.

Before the convergence, ISO 10646 more than Unicode was designed for
many code points, and so, ISO 10646 was more open to many languages.

> I believe that has been fixed now; you can have 32-bit Unicode.

Neither ISO 10646 nor Unicode are 32 bits.  The limit is 31 bits.

> From now on if you use unicode no language has an advantage, all are
> equal and software authors stand a chance to create international
> software.

English has a clear and definite advantage in Unicode, and this is
reflected in various Unicode-aware programs.  Taking Python as a mere
example, English texts may be translated from `unicode' to `str' without
raising an exception -- not many languages benefit from this property.

Some languages have all their characters pre-combined in Unicode, and
these have the advantage over the others of needing only one code
point per character.  Lately introduced languages met the established
resistance of Unicode (and W3C) to any new pre-combined characters, and
have to cope with zero-width diacritics, so inducing purely artificial
complexities in programs.  Unicode might well have granted them the same
service as early comers.
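
Both points can be seen in a few lines.  The sketch below is written for
current Python 3, where the conversion that used to be `unicode' to
`str' corresponds roughly to encoding with the ASCII codec; ascii_safe
is a helper name invented for the example.

```python
import unicodedata


def ascii_safe(text):
    """Return True if text survives encoding to ASCII without an error."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


english = ascii_safe("favour")
french = ascii_safe("Fran\u00e7ois")

# A pre-combined character is one code point; the decomposed form needs
# a base letter plus a zero-width combining diacritic.
precomposed = unicodedata.normalize("NFC", "e\u0301")
decomposed = unicodedata.normalize("NFD", "\u00e9")
```

For scripts without pre-combined forms there is no one-code-point
spelling for NFC to produce, so programs have to handle the combining
sequences directly.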

And there are more complex or difficult things which are needed by
some languages when Unicoded, still unneeded by the above languages;
directionality marks quickly come to mind.

Software authors will support Unicode more or less deeply depending
on whether they target German, Hebrew or Korean.  I do not think most
American-centric applications will go very far supporting Unicode.  For
real and complete Unicode support, software authors are only equal by
the hell they have to suffer.  I hardly call this a "chance"! :-)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard
From greg at cosc.canterbury.ac.nz  Wed Sep 15 03:15:09 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Sep 15 03:15:16 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>
Message-ID: <200409150115.i8F1F9Y4012592@cosc353.cosc.canterbury.ac.nz>

"Phillip J. Eby" <pje@telecommunity.com>:

> So, something like this:
> 
>       query("x and y or z")
> 
> isn't "code that performs database queries"?

Yes, but it's not Python code - it's SQL code wrapped
in a string wrapped in Python code. I want just Python
code.

> My main concern about the PEP is that it adds overhead to *all*
> logical operations, but the feature will only benefit code that
> hasn't yet been written.

The overhead shouldn't be substantially worse than that already
incurred by all the other operators being overloadable.  Also,
realistically, how much code do you think has boolean operations as a
speed bottleneck? I find it hard to imagine what such code would be
like.

> I also fear that as a result, people will start writing complex
> if-then blocks to "optimize" performance of conditionals to get them
> back to where they were before the facility was added.

If people do that, they're guilty of premature optimisation if they
haven't actually measured the speed of their code and found an actual
problem with it. I expect such cases will be extremely rare if they
occur at all.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From greg at cosc.canterbury.ac.nz  Wed Sep 15 03:54:51 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Sep 15 03:54:56 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <20040914133233.29723.1901953221.divmod.quotient.1109@ohm>
Message-ID: <200409150154.i8F1spLt012644@cosc353.cosc.canterbury.ac.nz>

exarkun@divmod.com:

> Python's parser is already available, through the compiler module.
> The example given earlier, query("x and y or z"), is relatively
> straightforward to implement as a set of AST manipulations.

But that misses the point, which is to have the expression
blend in seamlessly with the rest of the Python code. Anything
which requires the explicit invocation of a separate parsing
phase prevents that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From jhylton at gmail.com  Wed Sep 15 04:24:24 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Wed Sep 15 04:24:56 2004
Subject: [Python-Dev] --with-tsc compile fails
Message-ID: <e8bf7a530409141924139b7036@mail.gmail.com>

I'm feeling pretty out of it :-).  I'm very happy to see that the
Pentium tsc patch made it into the core; I had missed it.  I'm amused
that the Pentium tsc patch works for PPC, too.  Anyway, I tried to use
it this evening and the compilation failed:

../Python/ceval.c:50:21: asm/msr.h: No such file or directory
../Python/ceval.c: In function `PyEval_EvalFrame':
../Python/ceval.c:575: warning: implicit declaration of function `rdtscll'
../Python/ceval.c:572: warning: `inst0' might be used uninitialized in
this function
../Python/ceval.c:572: warning: `inst1' might be used uninitialized in
this function
../Python/ceval.c:572: warning: `loop0' might be used uninitialized in
this function
../Python/ceval.c:572: warning: `loop1' might be used uninitialized in
this function

It sounds like <asm/msr.h> is for Microsoft platforms, but I'm
building on Linux.  Perhaps the change to add PPC support screwed up
the ifdefs that were detecting a Windows compile?  Does it work for
anyone else?

Jeremy
From greg at cosc.canterbury.ac.nz  Wed Sep 15 04:44:12 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Sep 15 04:44:22 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <ci7i1g$3jm$1@sea.gmane.org>
Message-ID: <200409150244.i8F2iCfi012752@cosc353.cosc.canterbury.ac.nz>

> What I find more interesting about this proposal is that one could 
> probably finagle it so that (A < B < C) worked correctly for arrays.

Yes. Despite what I said earlier, I've now decided that
the new semantics should be extended to A < B < C as well.
I'll update the pep & patch at some point to reflect this.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From ilya at bluefir.net  Wed Sep 15 05:06:30 2004
From: ilya at bluefir.net (Ilya Sandler)
Date: Wed Sep 15 05:07:36 2004
Subject: [Python-Dev] --with-tsc compile fails
In-Reply-To: <e8bf7a530409141924139b7036@mail.gmail.com>
References: <e8bf7a530409141924139b7036@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0409141956550.16206@bagira>

> It sounds like <asm/msr.h> is for Microsoft platforms, but I'm
> building on Linux.  Perhaps the change to add PPC support screwed up
> the ifdefs that were detecting a Windows compile?  Does it work for
> anyone else?

/usr/include/asm/msr.h exists on my linux system (mixed Debian 3.0)
(msr.h came with linux-kernel-headers package, my kernel version 2.4.25)

and compile with WITH_TSC defined worked fine for me about a week ago

Ilya

PS. just checked my other ancient RedHat 7.2 install and it also has
/usr/include/asm/msr.h


On Tue, 14 Sep 2004, Jeremy Hylton wrote:

> I'm feeling pretty out of it :-).  I'm very happy to see that the
> Pentium tsc patch made it into the core; I had missed it.  I'm amused
> that the Pentium tsc patch works for PPC, too.  Anyway, I tried to use
> it this evening and the compilation failed:
>
> ../Python/ceval.c:50:21: asm/msr.h: No such file or directory
> ../Python/ceval.c: In function `PyEval_EvalFrame':
> ../Python/ceval.c:575: warning: implicit declaration of function `rdtscll'
> ../Python/ceval.c:572: warning: `inst0' might be used uninitialized in
> this function
> ../Python/ceval.c:572: warning: `inst1' might be used uninitialized in
> this function
> ../Python/ceval.c:572: warning: `loop0' might be used uninitialized in
> this function
> ../Python/ceval.c:572: warning: `loop1' might be used uninitialized in
> this function
>
> It sounds like <asm/msr.h> is for Microsoft platforms, but I'm
> building on Linux.  Perhaps the change to add PPC support screwed up
> the ifdefs that were detecting a Windows compile?  Does it work for
> anyone else?
>
> Jeremy
From pje at telecommunity.com  Wed Sep 15 05:15:37 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Sep 15 05:16:00 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <200409150115.i8F1F9Y4012592@cosc353.cosc.canterbury.ac.nz>
References: <5.1.1.6.0.20040914002619.02aaf8c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040914225734.02433b50@mail.telecommunity.com>

At 01:15 PM 9/15/04 +1200, Greg Ewing wrote:
>"Phillip J. Eby" <pje@telecommunity.com>:
>
> > So, something like this:
> >
> >       query("x and y or z")
> >
> > isn't "code that performs database queries"?
>
>Yes, but it's not Python code - it's SQL code wrapped
>in a string wrapped in Python code. I want just Python
>code.

But if this were possible:

     query(``x and y or z``)

such that the expression ``x and y or z`` results in a Python AST for that 
expression, then you'd be able to do whatever you want with it.
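
The backquote literal above is hypothetical syntax.  As a rough stand-in,
the sketch below gets a tree from a string with the standard 'ast'
module (so it still needs the explicit parsing step) and walks it to
produce SQL-like text.  to_sql and this query() are names invented for
the example, and only names, 'and' and 'or' are handled.

```python
import ast


def to_sql(node):
    """Translate a tiny subset of Python expressions to SQL-like text."""
    if isinstance(node, ast.Expression):
        return to_sql(node.body)
    if isinstance(node, ast.BoolOp):
        joiner = " AND " if isinstance(node.op, ast.And) else " OR "
        return "(" + joiner.join(to_sql(v) for v in node.values) + ")"
    if isinstance(node, ast.Name):
        # Names are left for the query layer to interpret in its own context.
        return node.id
    raise ValueError("unsupported expression: %r" % (node,))


def query(source):
    return to_sql(ast.parse(source, mode="eval"))


sql = query("x and y or z")
```

Note that 'x', 'y' and 'z' never have to be bound in the calling scope,
which is the property being argued for.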


> > My main concern about the PEP is that it adds overhead to *all*
> > logical operations, but the feature will only benefit code that
> > hasn't yet been written.
>
>The overhead shouldn't be substantially worse than that already
>incurred by all the other operators being overloadable.  Also,
>realistically, how much code do you think has boolean operations as a
>speed bottleneck? I find it hard to imagine what such code would be
>like.

So it's acceptable to slow down all logical operations, add new byte codes, 
and expand the size of the eval loop, all to support a niche usage?  That 
doesn't make sense to me.

Again, I'm not familiar with the numeric use cases, but I am familiar with 
algebraic manipulation of Python code for SQL generation and other 
purposes, and I honestly don't see any benefit to the PEP for those 
purposes.  ASTs are more useful, and I'd support a PEP to make code 
expressible as literals, because that wouldn't impose overhead on systems 
that don't use them.  (For one thing, they could be expressed as 
constants in code objects, so the bytecode would just be LOAD_CONST.)

For the numeric use cases, frankly I don't see why one would want to apply 
short-circuiting boolean operators to arrays, since presumably the values 
in them have already been evaluated.
And if the idea is to make them *not* be short-circuiting operators, that
seems to me to corrupt the whole point of the logical operators versus 
their bitwise counterparts.
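
A minimal sketch of the "expression as syntax tree" idea above, using the
standard library's ast module (a later addition to Python, so this is not
what was available in 2004). The to_sql helper and its output format are
hypothetical, purely for illustration; the backquote literal syntax
discussed above is not assumed.

```python
import ast

# Parse the expression without evaluating it; mode="eval" yields one expression node.
tree = ast.parse("x and y or z", mode="eval")
top = tree.body

# 'and' binds tighter than 'or', so the outer node is an Or wrapping an And.
assert isinstance(top, ast.BoolOp) and isinstance(top.op, ast.Or)


def to_sql(node):
    """Hypothetical translator: render names and and/or as a SQL-like string."""
    if isinstance(node, ast.BoolOp):
        joiner = " AND " if isinstance(node.op, ast.And) else " OR "
        return "(" + joiner.join(to_sql(v) for v in node.values) + ")"
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError("unsupported node: %r" % node)


sql = to_sql(top)
```

Because the tree is built by the parser, nothing here adds cost to ordinary
'and'/'or' evaluation, which is the trade-off being argued for.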

From greg at cosc.canterbury.ac.nz  Wed Sep 15 06:34:46 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Sep 15 06:34:58 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040914225734.02433b50@mail.telecommunity.com>
Message-ID: <200409150434.i8F4YkpL012903@cosc353.cosc.canterbury.ac.nz>

"Phillip J. Eby" <pje@telecommunity.com>:

> For the numeric use cases, frankly I don't see why one would want to
> apply short-circuiting boolean operators to arrays, since presumably
> the values in them have already been evaluated.  And if the idea is
> to make them *not* be short-circuiting operators, that seems to me to
> corrupt the whole point of the logical operators versus their
> bitwise counterparts.

There's more to it than short-circuiting. Consider

  a = array([42, ""])
  b = array([(), "spam"])

One might reasonably expect the result of 'a or b' to
be

  array([42, "spam"])

which is considerably different from a bitwise operation.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From martin at v.loewis.de  Wed Sep 15 07:53:13 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 07:53:17 2004
Subject: [Python-Dev] tempfile.TemporaryFile on Windows NT
Message-ID: <4147D8C9.3020508@v.loewis.de>

The tempfile module has a wrapper class to implement
delete on close. On NT+, this is not necessary, since
the system supports the O_TEMPORARY flag. However
the wrapper is still created 'so that file.name is useful
(i.e. not "(fdopen)"'. I find this a weak argument, since
file.name is also "fdopen" on POSIX.

So I would like to drop the wrapper object on Windows NT,
and have tempfile.TemporaryFile return a proper file
object. Any objections?

If there are objections, would they change if file.name
would point uniformly to the file name of the temporary
file?

If so, should this be better achieved by os.fdopen grow
a name argument, or by using builtin open() in the first
place? On Windows, one can pass the additional "D" flag
to open() to get a delete-on-close file.

Regards,
Martin
From foom at fuhm.net  Wed Sep 15 08:33:50 2004
From: foom at fuhm.net (James Y Knight)
Date: Wed Sep 15 08:33:57 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <200409150434.i8F4YkpL012903@cosc353.cosc.canterbury.ac.nz>
References: <200409150434.i8F4YkpL012903@cosc353.cosc.canterbury.ac.nz>
Message-ID: <34EB398C-06E1-11D9-AC9C-000A95A50FB2@fuhm.net>

On Sep 15, 2004, at 12:34 AM, Greg Ewing wrote:
> There's more to it than short-circuiting. Consider
>
>   a = array([42, ""])
>   b = array([(), "spam"])
>
> One might reasonably expect the result of 'a or b' to
> be
>
>   array([42, "spam"])
>
> which is considerably different from a bitwise operation.

One might, but *I* would reasonably expect it to give me array a, by 
extrapolation from every other data type in python.

Consider also this:
   x and 4 or 5
which is of course a common idiom to work around the lack of an
if-then-else expression.

So, try with x = array([42, 0])

Currently, doing this with numarray raises an exception "An array 
doesn't make sense as a truth value.  Use sometrue(a) or alltrue(a).". 
Odd, since nearly all python objects can somehow be turned into a truth 
value, but ok. [Forbidding __nonzero__ prevents horrible mistakes from 
occurring because of the misuse of the comparison operators as 
element-wise comparison.  "if array([1,2,3]) == array([3,2,1]): print 
'Bad'" of course oughtn't print 'Bad'.]

However, with this change, it may instead return:
  array([4, 5])
and that's nothing like what was meant.

The idiom would change to:
   bool(x) and 4 or 5
I suppose...

James

PS: Perl6 has distinct element-wise operators ("hyper" operators). I 
find that less distasteful than misusing regular operators as 
element-wise operators, when they really have vastly different 
semantics.
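
A small sketch of the idiom discussed above. NoTruthValue is a hypothetical
stand-in for an array type that refuses to be a truth value (it is not
numarray itself), and it uses __bool__, the present-day spelling of
__nonzero__.

```python
def pick(x):
    # Old-style conditional expression: 4 when x is true, otherwise 5.
    return x and 4 or 5


class NoTruthValue:
    """Hypothetical array-like object that refuses to be a truth value."""

    def __bool__(self):
        raise ValueError("An array doesn't make sense as a truth value.")


# Ordinary objects: the idiom picks exactly one of the two branches.
results = [pick(1), pick(0), pick([]), pick("spam")]

# An object without a truth value makes the idiom fail loudly
# instead of silently producing an element-wise result.
try:
    pick(NoTruthValue())
    raised = False
except ValueError:
    raised = True
```

With overloadable 'and'/'or', the same expression could quietly return an
element-wise result rather than raising, which is the concern raised above.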

From tim.hochberg at ieee.org  Wed Sep 15 08:48:16 2004
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Wed Sep 15 08:48:29 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <200409150434.i8F4YkpL012903@cosc353.cosc.canterbury.ac.nz>
References: <5.1.1.6.0.20040914225734.02433b50@mail.telecommunity.com>
	<200409150434.i8F4YkpL012903@cosc353.cosc.canterbury.ac.nz>
Message-ID: <ci8ojo$li4$1@sea.gmane.org>

Greg Ewing wrote:
> "Phillip J. Eby" <pje@telecommunity.com>:
> 
> 
>>For the numeric use cases, frankly I don't see why one would want to
>>apply short-circuiting boolean operators to arrays, since presumably
>>the values in them have already been evaluated.  And if the idea is
>>to make them *not* be short-circuiting operators, that seems to me to
>>corrupt the whole point of the logical operators versus their
>>bitwise counterparts.
> 
> 
> There's more to it than short-circuiting. Consider
> 
>   a = array([42, ""])
>   b = array([(), "spam"])
> 
> One might reasonably expect the result of 'a or b' to
> be
> 
>   array([42, "spam"])
> 
> which is considerably different from a bitwise operation.

Another example from numarray land. You can pick out subarrays, by 
indexing with an array of booleans, which can be pretty slick.

 >>> import numarray as na
 >>> a = na.arange(9)
 >>> a[a < 4]
array([0, 1, 2, 3])

You would like a[2 < a < 4] to work, but instead you need:

 >>> a[(2 < a) & (a < 4)]

Greg's proposal could fix this.

Or suppose you want to find the logical and of a, b. Consider trying to 
use bitwise ops:

 >>> a = na.array([1,1,1,1]) # all true
 >>> b = na.array([2,2,2,2]) # all true
 >>> a & b
array([0, 0, 0, 0]) # oops, that's why there's logical_and
 >>> na.logical_and(a,b)
array([1, 1, 1, 1], type=Bool)
 >>> (a!=0) & (b!=0) # this also works, but it does 3x as much work
array([1, 1, 1, 1], type=Bool)

Again with Greg's proposal one could write 'a and b' for this. Much nicer.

It's not that you couldn't make numarrays short circuit. In the 
expression "a and b", if all the elements of a are false, then we can 
skip evaluating b. I'm just not sure that this is a good idea.

-tim
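
The difference between bitwise and logical "and" described above can be
reproduced without numarray. This sketch uses plain Python lists as an
assumed stand-in for arrays, with the element-wise loops written out by hand.

```python
a = [1, 1, 1, 1]  # all true
b = [2, 2, 2, 2]  # all true

# Bitwise and: 1 & 2 == 0, so every element comes out false.
bitwise = [x & y for x, y in zip(a, b)]

# Logical and, applied per element: every element comes out true.
logical = [bool(x and y) for x, y in zip(a, b)]

# A chained comparison is shorthand for two comparisons joined by 'and',
# which is why a[2 < a < 4] would need an overloadable 'and' to work on arrays.
values = list(range(9))
mask = [(2 < v) and (v < 4) for v in values]
selected = [v for v, keep in zip(values, mask) if keep]
```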


From dgm at ecs.soton.ac.uk  Wed Sep 15 11:24:48 2004
From: dgm at ecs.soton.ac.uk (David G Mills)
Date: Wed Sep 15 11:32:20 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going to be
	fixed?
In-Reply-To: <41476720.9060803@v.loewis.de>
Message-ID: <Pine.LNX.4.44.0409151024140.29513-100000@login.ecs.soton.ac.uk>

And where can we get a copy of this new 'official' httplib?

David.

On Tue, 14 Sep 2004, "Martin v. Löwis" wrote:

> Skip Montanaro wrote:
> >     Phillip> Here's the test case that's missing, then:
> >
> >     Phillip>      "[fe80::207:e9ff:fe9b]"
> >
> > Whoops.  Fixed.
>
> The code was still incorrect. The square brackets don't belong
> to the host name - they are part of the URL syntax. Before passing
> them to the socket module, they need to be stripped off. I have now
> changed httplib to do that right when parsing host:port.
>
> Regards,
> Martin
>
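
The bracket handling described in the quoted text can be sketched as
follows. This is an illustration only, not the code that was checked into
httplib; split_host_port and its default_port parameter are hypothetical
names.

```python
def split_host_port(netloc, default_port=80):
    """Hypothetical helper: split 'host[:port]' and strip IPv6 brackets."""
    host, port = netloc, default_port
    # Only a colon that comes after the closing bracket separates the port;
    # colons inside the brackets belong to the IPv6 address itself.
    i = netloc.rfind(":")
    j = netloc.rfind("]")
    if i > j:
        host, port = netloc[:i], int(netloc[i + 1:])
    # The brackets are URL syntax, not part of the host name.
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    return host, port


bare = split_host_port("[fe80::207:e9ff:fe9b]")
with_port = split_host_port("[fe80::207:e9ff:fe9b]:8080")
plain = split_host_port("www.python.org:8000")
```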

From FBatista at uniFON.com.ar  Wed Sep 15 14:53:55 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Wed Sep 15 14:58:36 2004
Subject: [Python-Dev] Trying to extract documentation from the CVS
Message-ID: <A128D751272CD411BC9200508BC2194D053C7962@escpl.tcp.com.ar>

I'm packaging decimal, partly at Alex Martelli's suggestion and partly
because I need it for my SiGeFi project: users who have Py2.3 and cannot
upgrade must be able to download it separately.

The main issue I have is including the documentation. I want to include
something like a "decimal.pdf" with the decimal documentation only. So I
copied libdecimal.tex and tried to convert it, but I couldn't.

I'm a complete TeX newbie, but I think there's an issue with the syntax
(the file uses its own, and I don't know which other files it depends on).
I've generated the documentation from the CVS files (with make), so I guess
I have all the necessary support programs on my machine (not here at the
office, at home).

So, the questions are:

- Is it possible to extract only one file and generate a .pdf from it? And a
.html?
- Is there a how-to somewhere? Or is the procedure so simple that none is
needed?
- Which files from CVS do I need?

Sorry if some of these questions are not python-dev specific and could be
answered with TeX knowledge alone.

Thank you very much.

Facundo Batista
Desarrollo de Red
fbatista@unifon.com.ar
(54 11) 5130-4643
Cel: 15 5097 5024

From mwh at python.net  Wed Sep 15 15:35:07 2004
From: mwh at python.net (Michael Hudson)
Date: Wed Sep 15 15:35:08 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040914144853.0230b0e0@mail.telecommunity.com>
	(Phillip J. Eby's message of "Tue, 14 Sep 2004 15:02:31 -0400")
References: <5.1.1.6.0.20040914144853.0230b0e0@mail.telecommunity.com>
Message-ID: <2m1xh3y7sk.fsf@starship.python.net>

"Phillip J. Eby" <pje@telecommunity.com> writes:

> When I say AST, I just mean "some kind of syntax representation", not
> necessarily the 'parser' module's current AST implementation.

That ... thing ... isn't an AST in any sense of the word.

I know the documentation and the method names (used to) suggest it is,
but that doesn't make it true :)

Cheers,
mwh

-- 
  It's relatively seldom that desire for sex is involved in 
  technology procurement decisions.          -- ESR at EuroPython 2002
From mwh at python.net  Wed Sep 15 15:51:55 2004
From: mwh at python.net (Michael Hudson)
Date: Wed Sep 15 15:51:57 2004
Subject: [Python-Dev] --with-tsc compile fails
In-Reply-To: <e8bf7a530409141924139b7036@mail.gmail.com> (Jeremy Hylton's
	message of "Tue, 14 Sep 2004 22:24:24 -0400")
References: <e8bf7a530409141924139b7036@mail.gmail.com>
Message-ID: <2mwtyvwsg4.fsf@starship.python.net>

Jeremy Hylton <jhylton@gmail.com> writes:

> I'm feeling pretty out of it :-).  I'm very happy to see that the
> Pentium tsc patch made it into the core; I had missed it.  I'm amused
> that the Pentium tsc patch works for PPC, too.

I did consider changing all the names but couldn't be bothered.

> Anyway, I tried to use it this evening and the compilation failed:
>
> ../Python/ceval.c:50:21: asm/msr.h: No such file or directory
> ../Python/ceval.c: In function `PyEval_EvalFrame':
> ../Python/ceval.c:575: warning: implicit declaration of function `rdtscll'
> ../Python/ceval.c:572: warning: `inst0' might be used uninitialized in
> this function
> ../Python/ceval.c:572: warning: `inst1' might be used uninitialized in
> this function
> ../Python/ceval.c:572: warning: `loop0' might be used uninitialized in
> this function
> ../Python/ceval.c:572: warning: `loop1' might be used uninitialized in
> this function
>
> It sounds like <asm/msr.h> is for Microsoft platforms, but I'm
> building on Linux.  Perhaps the change to add PPC support screwed up
> the ifdefs that were detecting a Windows compile?

Well, it failed like that for me both before and after my PPC changes.
I'm fairly sure I didn't mess this up.  Maybe there's some
kernel-headers package that's necessary.

OTOH, I think one could replace the include by

#define rdtscll(val) \
     __asm__ __volatile__("rdtsc" : "=A" (val))

if my limited googling is anything to go by.  It also seems asm/msr.h
is a "kernel internal header with absolutely no stable API
properties...." (Redhat bugzilla).

So, now I've written this email <wink>, I think we should take out the
include and put in the #define.

Anyone who cares about, e.g., Windows can find out how to make their
compiler do this.

Cheers,
mwh

-- 
  Presumably pronging in the wrong place zogs it.
                                        -- Aldabra Stoddart, ucam.chat
From theller at python.net  Wed Sep 15 17:27:24 2004
From: theller at python.net (Thomas Heller)
Date: Wed Sep 15 17:27:32 2004
Subject: [Python-Dev] PyExc_UnicodeDecodeError
Message-ID: <vfefeen7.fsf@python.net>

Can anyone explain why calling this code in a C extension

static PyObject *
test(PyObject *self, PyObject *arg)
{
        PyErr_SetString(PyExc_UnicodeDecodeError, "blah blah");
        return NULL;
}

PyMethodDef module_methods[] = {
        {"test", test, METH_NOARGS},
        {NULL, NULL}
};


does this (same in 2.3.4, and 2.4 current CVS):

>>> from somewhere import test
>>> test()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: function takes exactly 5 arguments (1 given)
>>>

Thomas

From mal at egenix.com  Wed Sep 15 17:35:36 2004
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Sep 15 17:35:40 2004
Subject: [Python-Dev] PyExc_UnicodeDecodeError
In-Reply-To: <vfefeen7.fsf@python.net>
References: <vfefeen7.fsf@python.net>
Message-ID: <41486148.7090007@egenix.com>

Thomas Heller wrote:
> Can anyone explain why calling this code in a C extension
> 
> static PyObject *
> test(PyObject *self, PyObject *arg)
> {
>         PyErr_SetString(PyExc_UnicodeDecodeError, "blah blah");
>         return NULL;
> }
> 
> PyMethodDef module_methods[] = {
>         {"test", test, METH_NOARGS},
>         {NULL, NULL}
> };
> 
> 
> does this (same in 2.3.4, and 2.4 current CVS):
> 
> 
> >>> from somewhere import test
> >>> test()
> 
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: function takes exactly 5 arguments (1 given)
> 

See Python/exceptions.c:

PyObject * PyUnicodeDecodeError_Create(
	const char *encoding, const char *object, int length,
	int start, int end, const char *reason)
{
     return PyObject_CallFunction(PyExc_UnicodeDecodeError, "ss#iis",
	encoding, object, length, start, end, reason);
}

This exception is thrown by codecs that want to signal a
decoding error. It includes the context of the problem as
well as the reason string.
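
The same behaviour can be seen from Python code. This is a sketch against
present-day Python, where the object argument must be bytes; the byte string
and reason text are arbitrary illustration values.

```python
# A single message string is not enough: the constructor wants five arguments.
try:
    UnicodeDecodeError("blah blah")
    one_arg_ok = True
except TypeError:
    one_arg_ok = False

# encoding, object, start, end, reason
err = UnicodeDecodeError("ascii", b"caf\xe9", 3, 4, "ordinal not in range(128)")

# A codec fills in the same fields when it reports a real decoding error.
try:
    b"caf\xe9".decode("ascii")
except UnicodeDecodeError as exc:
    real = exc
```

The context fields are available as attributes (encoding, object, start,
end, reason) on both the hand-built and the codec-raised instance.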

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 15 2004)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
From pje at telecommunity.com  Wed Sep 15 17:56:31 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Sep 15 17:57:19 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean
  Operators
In-Reply-To: <ci8ojo$li4$1@sea.gmane.org>
References: <200409150434.i8F4YkpL012903@cosc353.cosc.canterbury.ac.nz>
	<5.1.1.6.0.20040914225734.02433b50@mail.telecommunity.com>
	<200409150434.i8F4YkpL012903@cosc353.cosc.canterbury.ac.nz>
Message-ID: <5.1.1.6.0.20040915115358.033df630@mail.telecommunity.com>

At 11:48 PM 9/14/04 -0700, Tim Hochberg wrote:
>Again with Greg's proposal one could write 'a and b' for this. Much nicer.
>
>It's not that you couldn't make numarrays short circuit. In the expression 
>"a and b", if all the elements of a are false, then we can skip evaluating 
>b. I'm just not sure that this is a good idea.

My point is that the idea of using 'and' in order to implement something 
that's *not* short-circuiting seems like a bad idea.  I'd rather see 
array-specific operators added, or some sort of infix notation for 
functions so that you can define custom operators for such specialized usages.

From pje at telecommunity.com  Wed Sep 15 17:58:27 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Sep 15 17:59:12 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <2m1xh3y7sk.fsf@starship.python.net>
References: <5.1.1.6.0.20040914144853.0230b0e0@mail.telecommunity.com>
	<5.1.1.6.0.20040914144853.0230b0e0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040915115744.03364ae0@mail.telecommunity.com>

At 02:35 PM 9/15/04 +0100, Michael Hudson wrote:
>"Phillip J. Eby" <pje@telecommunity.com> writes:
>
> > When I say AST, I just mean "some kind of syntax representation", not
> > necessarily the 'parser' module's current AST implementation.
>
>That ... thing ... isn't an AST in any sense of the word.
>
>I know the documentation and the method names (used to) suggest it is,
>but that doesn't make it true :)

Well, it's definitely syntax and it's definitely a tree, so it's at least 
an ST.  :)

From theller at python.net  Wed Sep 15 18:01:07 2004
From: theller at python.net (Thomas Heller)
Date: Wed Sep 15 18:01:16 2004
Subject: [Python-Dev] PyExc_UnicodeDecodeError
In-Reply-To: <41486148.7090007@egenix.com> (M.'s message of "Wed, 15 Sep
	2004 17:35:36 +0200")
References: <vfefeen7.fsf@python.net> <41486148.7090007@egenix.com>
Message-ID: <k6uved30.fsf@python.net>

"M.-A. Lemburg" <mal@egenix.com> writes:

> Thomas Heller wrote:
>> Can anyone explain why calling this code in a C extension
>> static PyObject *
>> test(PyObject *self, PyObject *arg)
>> {
>>         PyErr_SetString(PyExc_UnicodeDecodeError, "blah blah");
>>         return NULL;
>> }
>> PyMethodDef module_methods[] = {
>>         {"test", test, METH_NOARGS},
>>         {NULL, NULL}
>> };
>> does this (same in 2.3.4, and 2.4 current CVS):
>>
>> >>> from somewhere import test
>> >>> test()
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in ?
>> TypeError: function takes exactly 5 arguments (1 given)
>>
>
> See Python/exceptions.c:
>
> PyObject * PyUnicodeDecodeError_Create(
> 	const char *encoding, const char *object, int length,
> 	int start, int end, const char *reason)
> {
>      return PyObject_CallFunction(PyExc_UnicodeDecodeError, "ss#iis",
> 	encoding, object, length, start, end, reason);
> }
>
> This exception is thrown by codecs that want to signal a
> decoding error. It includes the context of the problem as
> well as the reason string.

Thanks, this makes sense.  The real problem I wanted to solve is a
little bit less contrived ;-)

In this context: I find exceptions much too underdocumented.
Not only are a lot of built-in exceptions not listed in
<http://www.python.org/doc/current/api/standardExceptions.html>,
but I also find the description of the exceptions here
<http://www.python.org/dev/doc/devel/lib/module-exceptions.html>
very difficult to understand if you want to define a subclass of, for
example, WindowsError for your own code.

A much more interesting and understandable read is the exceptions.py
module, which was last used in 1.5, afaik.

I'm not sure what can be done about that; maybe keep exceptions.py
in sync (although unused) with the current code, and point to it from
the docs?

Thomas

From martin at v.loewis.de  Wed Sep 15 20:22:18 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 20:22:22 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going to be
	fixed?
In-Reply-To: <Pine.LNX.4.44.0409151024140.29513-100000@login.ecs.soton.ac.uk>
References: <Pine.LNX.4.44.0409151024140.29513-100000@login.ecs.soton.ac.uk>
Message-ID: <4148885A.5090803@v.loewis.de>

David G Mills wrote:
> And where can we get a copy of this new 'official' httplib?

As usual: In the CVS.

Regards,
Martin
From martin at v.loewis.de  Wed Sep 15 20:43:13 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 20:43:16 2004
Subject: [Python-Dev] --with-tsc compile fails
In-Reply-To: <2mwtyvwsg4.fsf@starship.python.net>
References: <e8bf7a530409141924139b7036@mail.gmail.com>
	<2mwtyvwsg4.fsf@starship.python.net>
Message-ID: <41488D41.9090905@v.loewis.de>

Michael Hudson wrote:
> Well, it failed like that for me both before and after my PPC changes.
> I'm fairly sure I didn't mess this up.  Maybe there's some
> kernel-headers package that's necessary.
> 
> OTOH, I think one could replace the include by
> 
> #define rdtscll(val) \
>      __asm__ __volatile__("rdtsc" : "=A" (val))
> 
> if my limited googling is anything to go by.  It also seems asm/msr.h
> is a "kernel internal header with absolutely no stable API
> properties...." (Redhat bugzilla).

I'd still like to understand why it fails for your system (it works
fine on mine). Do you have a definition for rdtscll in
/usr/include/asm/msr.h? Is it a define like the one you just put there?
If so, why does the macro not expand?

Regards,
Martin
From martin at v.loewis.de  Wed Sep 15 20:46:06 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 20:46:08 2004
Subject: [Python-Dev] PyExc_UnicodeDecodeError
In-Reply-To: <k6uved30.fsf@python.net>
References: <vfefeen7.fsf@python.net> <41486148.7090007@egenix.com>
	<k6uved30.fsf@python.net>
Message-ID: <41488DEE.1030506@v.loewis.de>

Thomas Heller wrote:
> In this context: I find Exceptions being much too underdocumented.
[...]
> I'm not sure what there can be done about that, maybe keep exceptions.py
> in sync (although unused) with the current code, and point to it from
> the docs?

The best solution for missing, incomplete, and incomprehensible
documentation is to add, complete, and rewrite the documentation.
Do you volunteer?

Regards,
Martin
From tim.peters at gmail.com  Wed Sep 15 20:51:16 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Sep 15 20:51:19 2004
Subject: [Python-Dev] tempfile.TemporaryFile on Windows NT
In-Reply-To: <4147D8C9.3020508@v.loewis.de>
References: <4147D8C9.3020508@v.loewis.de>
Message-ID: <1f7befae0409151151486156b@mail.gmail.com>

[Martin v. Löwis]
> The tempfile module has a wrapper class to implement
> delete on close. On NT+, this is not necessary, since
> the system supports the O_TEMPORARY flag. However
> the wrapper is still created 'so that file.name is useful
> (i.e. not "(fdopen)"'. I find this a weak argument, since
> file.name is also "fdopen" on POSIX.

File names are much more important on Windows, because Windows doesn't
allow renaming or deleting an open file, two things Unixheads are
apparently incapable of avoiding.  Without a name, debugging Unixhead
code on Windows gets harder.  Files on Windows, at least from the std
C level, always have names.

> So I would like to drop the wrapper object on Windows NT,
> and have tempfile.TemporaryFile return a proper file
> object. Any objections?

I would sorely miss knowing the name.
 
> If there are objections, would they change if file.name
> would point uniformly to the file name of the temporary
> file?

Yes, but file.name is read-only now -- you can't change file.name from
Python code.

> If so, should this be better achieved by os.fdopen grow
> a name argument, or by using builtin open() in the first
> place? On Windows, one can pass the additional "D" flag
> to open() to get a delete-on-close file.

The latter doesn't fly, because there's no flag to open() that gets
the effect of the Windows O_NOINHERIT, and O_NOINHERIT is a pragmatic
necessity for sane use of temp files on Windows.  Indeed, although the
docs don't say this, without O_NOINHERIT even O_TEMPORARY isn't
reliable (program P creates a file F w/ O_TEMPORARY but not
O_NOINHERIT; P spawns program Q; Q inherits F's file descriptor, but
does *not* inherit the "delete on close" info about F; P exits; F is
not deleted then because a handle is still open on F (in Q); Q exits;
F isn't deleted then either because Q never knew that F was a "delete
on close" file; P and Q are both gone now, but F never goes away;
specifying O_NOINHERIT too stops this).

BTW, the docs also don't say this:  a file created with O_TEMPORARY
cannot be opened by name again, not even by the process that created
the file.  That's why there's no "security risk" in having a named
O_TEMPORARY file visible in the filesystem on Windows (although, as
above, that can lose if O_NOINHERIT isn't used too, or even if the
creating process goes away without running the C runtime cleanup
code).
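
A rough sketch of opening a delete-on-close temp file with both flags, as
described above. os.O_TEMPORARY and os.O_NOINHERIT are defined only on
Windows, so the getattr fallbacks (and the manual cleanup at the end) are
assumptions added to keep the example runnable elsewhere; the file name is
arbitrary.

```python
import os
import tempfile
import uuid

# O_TEMPORARY and O_NOINHERIT exist only on Windows; fall back to 0 elsewhere.
flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
flags |= getattr(os, "O_TEMPORARY", 0) | getattr(os, "O_NOINHERIT", 0)

path = os.path.join(tempfile.gettempdir(), "demo-%s.tmp" % uuid.uuid4().hex)
fd = os.open(path, flags, 0o600)

with os.fdopen(fd, "w+b") as f:
    f.write(b"spam")
    f.seek(0)
    data = f.read()
    wrapper_name = f.name  # fdopen does not know the path

# With O_TEMPORARY the file is gone once closed; elsewhere remove it by hand.
if os.path.exists(path):
    os.remove(path)
```

Note that the object returned by os.fdopen does not carry the path in its
name attribute, which is the gap the proposed name argument would fill.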
From jhylton at gmail.com  Wed Sep 15 20:56:43 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Wed Sep 15 20:56:47 2004
Subject: [Python-Dev] --with-tsc compile fails
In-Reply-To: <2mwtyvwsg4.fsf@starship.python.net>
References: <e8bf7a530409141924139b7036@mail.gmail.com>
	<2mwtyvwsg4.fsf@starship.python.net>
Message-ID: <e8bf7a530409151156625fea55@mail.gmail.com>

On Wed, 15 Sep 2004 14:51:55 +0100, Michael Hudson <mwh@python.net> wrote:
> Jeremy Hylton <jhylton@gmail.com> writes:
> 
> > I'm feeling pretty out of it :-).  I'm very happy to see that the
> > Pentium tsc patch made it into the core; I had missed it.  I'm amused
> > that the Pentium tsc patch works for PPC, too.
> 
> I did consider changing all the names but couldn't be bothered.

There's nothing wrong with amusing names for obscure stuff like this :-).
 
> OTOH, I think one could replace the include by
> 
> #define rdtscll(val) \
>      __asm__ __volatile__("rdtsc" : "=A" (val))
> 
> if my limited googling is anything to go by.  It also seems asm/msr.h
> is a "kernel internal header with absolutely no stable API
> properties...." (Redhat bugzilla).
> 
> So, now I've written this email <wink>, I think we should take out the
> include and put in the #define.

I'll give it a try tonight.  I double-checked and my somewhat tweaked
RH Linux distro doesn't have an asm/msr.h.  I'd rather not try to find
out if there is an rdtscll() defined somewhere else.

jeremy
From barry at barrys-emacs.org  Wed Sep 15 20:56:34 2004
From: barry at barrys-emacs.org (Barry Scott)
Date: Wed Sep 15 20:57:18 2004
Subject: [Python-Dev] OT: Unicode history (was Alternative Impl. for PEP
	292)
In-Reply-To: <41476A99.9030305@v.loewis.de>
References: <091420041757.13290.414731190005E51E000033EA2200751150020E090E020E04000B970104040E@comcast.net>	<20040914191528.GA7964@alcyon.progiciels-bpi.ca>
	<904FAAD7-0697-11D9-9E6D-000A95A8705A@barrys-emacs.org>
	<41476A99.9030305@v.loewis.de>
Message-ID: <F6E2A75E-0748-11D9-9E6D-000A95A8705A@barrys-emacs.org>


On Sep 14, 2004, at 23:03, Martin v. Löwis wrote:

> I think this is historically incorrect. ISO 10646 and Unicode were
> developed in lock-step, and the very first publication of ISO 10646
> (in 1993) had precisely the same character assignments as Unicode 1.1.
> Ever since then, both standards are roughly the same.

ISO is not known for its speed. You are probably right about the
publication date.
However, I'm sure I had my draft ISO 10646 a long time before Unicode
got going. But it's all a long time ago; I'll not bet on it.

> ... assuming encodings are the only issue in creating international
> software.

Of course you are right; it's one part of the puzzle in making a piece of
software acceptable in a particular culture.

Barry

From martin at v.loewis.de  Wed Sep 15 21:06:19 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 21:06:22 2004
Subject: [Python-Dev] tempfile.TemporaryFile on Windows NT
In-Reply-To: <1f7befae0409151151486156b@mail.gmail.com>
References: <4147D8C9.3020508@v.loewis.de>
	<1f7befae0409151151486156b@mail.gmail.com>
Message-ID: <414892AB.7010403@v.loewis.de>

Tim Peters wrote:
> Yes, but file.name is read-only now -- you can't change file.name from
> Python code.
> 
> 
>>If so, should this be better achieved by os.fdopen grow
>>a name argument

So what about adding a name argument to fdopen?

Regards,
Martin
From bac at OCF.Berkeley.EDU  Wed Sep 15 20:26:49 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Sep 15 21:17:38 2004
Subject: [Python-Dev] python-dev Summary for 2004-08-16 through 2004-08-31
	[draft]
Message-ID: <41488969.70909@ocf.berkeley.edu>

OK, using the new coverage style of "whatever Brett finds interesting", here is 
the next summary.  Plan to send this out some time this upcoming weekend so 
edits need to get in between now and then.

If anyone thinks a thread should be covered (see "Skipped Threads"), write a 
summary and I will add it with an author mention.

--------------------------------------

=====================
Summary Announcements
=====================
After asking people last week about how they wanted me to change the python-dev
Summary so as to allow me to retain my free time, all respondents unanimously
went with the option of letting me choose what I wanted to cover.  That made me
happy for a couple of reasons.  One is that it makes the summaries more
enjoyable for me since I mostly cover stuff I like and thus should be less
bored at points while writing.  It is also flattering in a way that people trusted
what I would cover enough to not want to suggest what I should cover.  It also
allows me to spend more time on python-dev participating than sitting on the
sidelines dreading having to summarize some 100-email thread on whether we should
move over to VC 7 or something (can you tell I was bored out of my skull by
that thread?).

So this summary starts the new coverage style.  As you can see it is much 
shorter than normal since I didn't try to be as thorough (7 pages compared to 
the usual 10-20).  But I think this style allows what I do summarize to have 
more details than it would normally have; quality over quantity.  I have 
reintroduced the "Skipped Threads" section of the Summaries so people can see 
what I skipped in case there is something they might want to read that I just 
didn't care about.  In places where I remember something partially relevant I 
added a sentence on it so it isn't just a list of subject lines.

Enjoy.

=========
Summaries
=========
-------------
PEP movements
-------------
`PEP 3000`_ (Python 3.0 Plans) came into creation.  This text's point of 
existence is to list changes known to be planned for Python 3.0 and not 
hypothetical (Guido suggested all hypothetical talk call the version Python 
3000, partially for marketing purposes).  Plus I am a co-author and it finally 
completes my time in the `School of Hard Knocks`_. =)

`PEP 333`_ (Python Web Server Gateway Interface v1.0) proposes a "standard 
interface between web servers and Python web applications or frameworks, to 
promote web application portability across a variety of web servers".

`PEP 309`_ (Partial Function Application) was updated with some details.

`PEP 334`_ (Simple Coroutines via SuspendIteration) came into existence to 
suggest coming up with some form of lightweight coroutines.

.. _School of Hard Knocks: 
http://mail.python.org/pipermail/python-dev/2002-September/028725.html
.. _PEP 3000: http://www.python.org/peps/pep-3000.html
.. _PEP 333: http://www.python.org/peps/pep-0333.html
.. _PEP 309: http://www.python.org/peps/pep-0309.html
.. _PEP 334: http://www.python.org/peps/pep-0334.html

Contributing threads:
   - `Minimal 'stackless' PEP using generators? 
<http://mail.python.org/pipermail/python-dev/2004-August/048239.html>`__


--------------------------------
Decorators "issue" mostly solved
--------------------------------
While the hubbub over using a character for decorators was brewing, people 
began suggesting reserving a character that would never be used in Python for 
anything.  The thought was that people who wanted to use a character to 
represent application-specific information could use the reserved symbol and 
not have to worry about clashing with possible future features like Leo and 
IPython are with the use of '@'.  But no reservation of a character occurred.

Towards the end of the month, to meet the a3 deadline, a unified proposal from 
the community came forward led by Robert Brewer and Michael Sparks.  They 
pushed the J2 proposal::

   using:
       somedecorator
       staticmethod
   def func():
       pass
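
For comparison, here is a rough sketch of how the same pair of decorators is 
spelled with the '@' syntax, next to the older manual rebinding.  The body of 
``somedecorator`` and the order of application are assumptions made purely for 
illustration; this does not claim to show how J2 would have ordered them.

```python
def somedecorator(func):
    # Hypothetical decorator: tags the function and returns it unchanged.
    func.decorated = True
    return func


class WithAtSyntax:
    # Decorators apply bottom-up: somedecorator first, then staticmethod.
    @staticmethod
    @somedecorator
    def func():
        return "called"


class WithManualRebinding:
    def func():
        return "called"
    # The pre-decorator spelling of the same thing.
    func = staticmethod(somedecorator(func))
```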

Guido contemplated the proposal, saying "it got pretty darn close" to being 
accepted, but in the end decided not to accept it.  For Guido's full reasoning 
see http://mail.python.org/pipermail/python-dev/2004-September/048518.html .  
But he said he had two key issues.  One was that the indentation "suggests that 
its contents should be a sequence of statements, but in fact it is not".  Issue 
two was that using a keyword to start a line was a real attention grabber and 
that "using" did not deserve this.

The topic of how the whole decorators situation was handled was touched upon. 
Guido realized that "dramatic changes must be discussed with the community at 
large".  He was also impressed by how the community pulled together to propose 
an alternative and hopes to see more proposals of the same quality in the 
future.

So now what?  Guido said that he would be willing to change the character used 
for decorators for 2.4b1 .  That means if '@' drives you nuts but something 
else like '!' works for you then speak up and try to get the community to rally 
behind it.

Contributing threads:
   - `Decorator order implemented backwards? 
<http://mail.python.org/pipermail/python-dev/2004-August/047512.html>`__
   - `Considering decorator syntax on an individual feature 
<http://mail.python.org/pipermail/python-dev/2004-August/048081.html>`__
   - `PEP 318: Suggest we drop it 
<http://mail.python.org/pipermail/python-dev/2004-August/048025.html>`__
   - `__metaclass__ and __author__ are already decorators 
<http://mail.python.org/pipermail/python-dev/2004-August/048176.html>`__
   - `Reserved Characters 
<http://mail.python.org/pipermail/python-dev/2004-August/048166.html>`__
   - `PEP 318: Can't we all just get along? 
<http://mail.python.org/pipermail/python-dev/2004-August/048213.html>`__
   - `Multiple decorators per line 
<http://mail.python.org/pipermail/python-dev/2004-August/048227.html>`__
   - `Important decorator proposal on c.l.p. 
<http://mail.python.org/pipermail/python-dev/2004-August/048332.html>`__
   - `Re: [Python-checkins] python/nondist/peps pep-0318.txt... 
<http://mail.python.org/pipermail/python-dev/2004-August/048323.html>`__
   - `CO_FUTURE_DECORATORS 
<http://mail.python.org/pipermail/python-dev/2004-August/048354.html>`__
   - `decorators: If you go for it, go all the way!!! :) 
<http://mail.python.org/pipermail/python-dev/2004-August/048378.html>`__
   - `Re: Re: def fn (args) [dec,dec]: 
<http://mail.python.org/pipermail/python-dev/2004-August/048432.html>`__
   - `J2 proposal final 
<http://mail.python.org/pipermail/python-dev/2004-August/048428.html>`__
   - `(my) revisions to PEP318 finally done. 
<http://mail.python.org/pipermail/python-dev/2004-August/048471.html>`__
   - `Rejecting the J2 decorators proposal 
<http://mail.python.org/pipermail/python-dev/2004-September/048518.html>`__

-----------------------------------------------------------
When should something be put under the great powers of -O ?
-----------------------------------------------------------
Python has had a simple peephole optimizer in the compiler since 2.3 that 
optimized imported bytecode.  Raymond Hettinger moved it up, though, so that 
the optimization would be saved to .pyc files and thus remove the need to 
repeat the process every time.

Guido questioned this move.  He thought that since it was an optimization it 
should fall under the -O command-line option.

But then people came forward to suggest that Raymond's move was good, saying 
that the cost of the optimization was non-existent and thus it should be used.  
I brought up the point that a definition of what should be considered an 
optimization was needed: anything that changes the initial opcode, or 
something that takes extra time/memory or changes semantics?  Tim Peters 
stepped forward and said that since the optimizations were so simple he 
thought they should be kept.  David Abrahams also came forward and said they 
should be kept to get more testing on them since they were not complex and 
thus did not influence debugging of code.

In the end Raymond's change was kept in place.

Contributing threads:
   - `Re: [Python-checkins] python/dist/src/Python compile.c, 2.319, 2.320 
<http://mail.python.org/pipermail/python-dev/2004-August/048032.html>`__

----------------------------------------
2.4a3 out the doors so kick those tires!
----------------------------------------
`Python 2.4a3`__ has been released.  As usual, please download it, run the 
regression tests, and report any errors you get.  Since this will be the last 
alpha this is your last chance to get new features in before b1 comes out.

The use of priorities on the SourceForge tracker has also been clarified. 
Anything set to 9 **must** be dealt with before the next release.  Priority 8 
is to be dealt with before b1; it changes functionality so if it isn't in by b1 
it won't be in until the next version.  Priority 7 is for something that should 
get in before the final release.  Anthony Baxter also gained sole control of 
setting the priority so as to keep the settings consistent.

.. _Python 2.4a3: http://www.python.org/2.4/

Contributing threads:
   - `2.4a3 release is September 2, SF tracker keywords 
<http://mail.python.org/pipermail/python-dev/2004-August/048078.html>`__

-

-
Stemming from a conversation about moving Python over to Unicode only for 
string representation for 3.0, the discussion of a bytes type came up.  People 
were saying they used str to store binary data and that if str went away or no 
longer represented straight binary data (since Unicode has different encodings 
the values can change while meaning the same thing in terms of characters) they 
would need a way to deal with this.

The idea that the array module solved this was basically dismissed since it 
seemed more built-in support was needed for convenience.  It also meant more 
flexibility in terms of what interfaces were implemented.  There were also some 
issues with getting array to work the exact way people wanted it to.

The next question was whether literal support was needed.  Would you really 
need to write something like ``b"\x66\x6f\x6f"`` instead of ``bytes([0x66, 
0x6f, 0x6f])``?
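
As a sketch of the two spellings in question: on a present-day Python 3 
interpreter (an assumption; no bytes type existed when this was written) both 
forms are available and produce equal objects.

```python
# Literal spelling versus constructor spelling of the same three bytes.
literal = b"\x66\x6f\x6f"
constructed = bytes([0x66, 0x6f, 0x6f])
```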

How all of this would play with Unicode ended up being discussed.  In the end 
it seemed that one could encode and decode back and forth but that all work 
with characters should be in Unicode and only encoded into bytes at the I/O 
barrier (writing to disk or the network, for instance) to minimize any possible 
encoding errors and to make usage easier.

Mutability came up.  Being mutable would be handy, but it killed its usage as a 
dictionary key.  It was suggested that bytes hash to a tuple of integers 
representing the bytes, but nothing more was said.  But in general almost 
everyone agreed that having the bytes type be mutable was best.
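
The trade-off can be seen on a present-day Python 3 interpreter (an 
assumption; this shows how things eventually turned out, not anything decided 
in the thread), which ended up with both an immutable ``bytes`` and a mutable 
``bytearray``.

```python
frozen = bytes([0x66, 0x6f, 0x6f])
mutable = bytearray(frozen)

# The immutable type hashes, so it works as a dictionary key.
table = {frozen: "ok"}

# The mutable type can be changed in place...
mutable[0] = 0x62

# ...but for that reason it refuses to be hashed.
try:
    hash(mutable)
    hashable = True
except TypeError:
    hashable = False
```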

`PEP 332`_ was sketched out during the early part of this discussion, but has 
not been updated since it died down.

.. _PEP 332: http://www.python.org/peps/pep-0332.html

Contributing threads:
   - `adding a bytes sequence type to Python 
<http://mail.python.org/pipermail/python-dev/2004-August/047722.html>`__
   - `Byte string class hierarchy 
<http://mail.python.org/pipermail/python-dev/2004-August/048027.html>`__

--------------------------------------------
String substitution sure is a touchy subject
--------------------------------------------
`PEP 292`_ (Simpler String Substitutions) got a huge amount of discussion these 
past two weeks.  Ignoring the syntax discussions (the syntax was decided, with 
consensus, long before the PEP was accepted and thus was a moot point) and 
the discussion of whether a trailing ``$`` at the end of the substitution 
pattern should be considered an error or not (it is), a couple of topics were 
discussed.

To make this summary easier to follow, realize that the class that implements 
PEP 292 is named "Template" and thus I will just refer to the implementation by 
that name.

The first topic was over whether Template should return Unicode objects.  The 
side supporting it pointed out that Python 3.0 was going to be using Unicode 
for strings exclusively so it would be good to start using them now.  It also 
went with the initial design of PEP 292 which was to help with i18n where 
Unicode is constantly used.

People against, though, didn't want to suddenly be given a Unicode object when 
a str was passed in as the template string.  That would be too surprising 
and lead to inconsistent usage thanks to sudden mixing of strings and Unicode 
objects in code.  This issue was resolved by no longer subclassing unicode but 
making it easy to subclass Template so as to add direct Unicode support.

The second issue was over the design of the API.  Originally Template was a 
class that overrode __mod__ to make it work like string interpolation works now 
for str and unicode.  But then some people felt a class was too heavy-handed if 
there was no way to change the way Template worked through a subclass.  This 
obviously led to a desire for functions to do the work for both Template and 
SafeTemplate (similar class to Template that left in substitution points if 
they didn't match any values in the dict passed in).

In the end the class design was kept thanks to Tim Peters and metaclasses.  Tim 
came up with a neat way to have the regex be generated at class creation time 
through a metaclass and thus allow subclasses to change how Template matched 
substitution points and such, all without a performance hit at instance 
creation time.  Use of __mod__ and the SafeTemplate class were removed and 
Template grew substitute and safe_substitute methods.  Everyone at this point 
seems happy with the design.
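
A short sketch of the resulting API as it exists in the ``string`` module 
today.  The template text, the values, and the ``PercentTemplate`` subclass are 
made up for illustration.

```python
from string import Template

greeting = Template("$who likes ${what}")

full = greeting.substitute(who="tim", what="kung pao")

# safe_substitute leaves unmatched placeholders in place instead of raising.
partial = greeting.safe_substitute(who="tim")

# substitute raises KeyError for a missing name...
try:
    greeting.substitute(who="tim")
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# ...and ValueError for a dangling delimiter.
try:
    Template("total: $").substitute()
    dangling_ok = True
except ValueError:
    dangling_ok = False


class PercentTemplate(Template):
    # Hypothetical subclass: swapping the delimiter changes how
    # substitution points are matched; the pattern is built when the
    # class is created, not per instance.
    delimiter = "%"


custom = PercentTemplate("%who likes $5").substitute(who="tim")
```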

.. _PEP 292: http://www.python.org/peps/pep-0292.html

Contributing threads:
   - `Update PEP 292 
<http://mail.python.org/pipermail/python-dev/2004-August/048160.html>`__
   - `PEP 292 - Simpler String Substitutions 
<http://mail.python.org/pipermail/python-dev/2004-August/048236.html>`__
   - `Alternative Implementation for PEP 292: Simple String Substitutions 
<http://mail.python.org/pipermail/python-dev/2004-August/048406.html>`__
   - `Alternative placeholder delimiters for PEP 292 
<http://mail.python.org/pipermail/python-dev/2004-August/048469.html>`__

-------------------------------------------
Private names considered rude in the stdlib
-------------------------------------------
Anthony Baxter suggested banning use of mangled private names (names starting 
with ``__``) in the stdlib.  His argument was that they are a hack and the 
stdlib is supposed to act as a good example and that name mangling was not good.

Guido essentially agreed, with the caveat that some uses of private names are 
justified, such as when a private name is storing the equivalent of a 'friend' 
function from C++.
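
For readers who have not run into it, a minimal sketch of what mangling does.  
The class and attribute names are invented for the example.

```python
class Base:
    def __init__(self):
        self.__token = "base"      # stored as _Base__token

    def base_token(self):
        return self.__token


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__token = "derived"   # stored as _Derived__token, no clash

    def derived_token(self):
        return self.__token


obj = Derived()
# The instance dict only ever sees the mangled spellings.
mangled_names = sorted(vars(obj))
```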

Contributing threads:
   - `__mangled in stdlib considered poor form 
<http://mail.python.org/pipermail/python-dev/2004-August/048444.html>`__

===============
Skipped Threads
===============
Warnocked emails (i.e., emails that get essentially no response) and very 
insignificant threads are not listed.

- Find out whether PyEval_InitThreads has been called?
- Unifying Long Integers and Integers: baseint
- test_tempfile failure on Mac OSX
- Deprecate sys.exitfunc?
- multiple instances of python on XP
- Adding 'lexists()' to os.path
- #ifdeffery
- Weekly Python Bug/Patch Summary
- problem with pymalloc on the BeOS port.
- Proposed change to logging
- sre.py backward compatibility and PEP 291
- Dealing with test__locale failure on OS X before a3
- os.urandom API
- Decoding incomplete unicode
       Basically culminated in new stateful UTF-8 and UTF-16 decoders but 
that's all I know  =)
- Decimal module portability to 2.3?
       Looks like this will happen; wait for next summary for a more concrete 
answer
- Python icons
       If you think you can come up with a good icon for the Windows installer 
please let c.l.py know and it might get used
- [Python-checkins] python/dist/src/Lib/test test_string.py, 1.25, 1.26
- list += string??
From trentm at ActiveState.com  Wed Sep 15 20:59:48 2004
From: trentm at ActiveState.com (Trent Mick)
Date: Wed Sep 15 21:20:25 2004
Subject: [Python-Dev] [TARGETDIR]lib-tk added to PythonPath in MSI
Message-ID: <20040915115947.A26465@ActiveState.com>


Round about line 1088 of Tools/msi/msi.py, lib-tk is added to the
PythonPath:

    ("PythonPath", -1, prefix+r"\PythonPath", "",
    "[TARGETDIR]Lib;[TARGETDIR]DLLs;[TARGETDIR]lib-tk", "REGISTRY"),

Shouldn't that be this instead?

    ("PythonPath", -1, prefix+r"\PythonPath", "",
    "[TARGETDIR]Lib;[TARGETDIR]DLLs;[TARGETDIR]Lib\\lib-tk", "REGISTRY"),

Trent

-- 
Trent Mick
TrentM@ActiveState.com
From pje at telecommunity.com  Wed Sep 15 21:26:11 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Sep 15 21:27:28 2004
Subject: [Python-Dev] python-dev Summary for 2004-08-16 through
	2004-08-31 [draft]
In-Reply-To: <41488969.70909@ocf.berkeley.edu>
Message-ID: <5.1.1.6.0.20040915152555.025ba470@mail.telecommunity.com>

At 11:26 AM 9/15/04 -0700, Brett C. wrote:
>-
>
>-
>Stemming from a conversation about moving Python over to Unicode only for 
>string representation for 3.0, the discussion of a bytes type came 
>up.  People were saying they used str to store binary data and that if str 
>went away or no longer represented straight binary data (since Unicode has 
>different encodings the values can change while meaning the same thing in 
>terms of characters) they would need a way to deal with this.

Looks like this section was supposed to have a title, but it got lost.

From theller at python.net  Wed Sep 15 21:30:23 2004
From: theller at python.net (Thomas Heller)
Date: Wed Sep 15 21:35:59 2004
Subject: [Python-Dev] PyExc_UnicodeDecodeError
In-Reply-To: <41488DEE.1030506@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed,
	15 Sep 2004 20:46:06 +0200")
References: <vfefeen7.fsf@python.net> <41486148.7090007@egenix.com>
	<k6uved30.fsf@python.net> <41488DEE.1030506@v.loewis.de>
Message-ID: <oek7e3e8.fsf@python.net>

"Martin v. L?wis" <martin@v.loewis.de> writes:

> Thomas Heller wrote:
>> In this context: I find Exceptions being much too underdocumented.
> [...]
>> I'm not sure what there can be done about that, maybe keep exceptions.py
>> in sync (although unused) with the current code, and point to it from
>> the docs?
>
> The best solution for missing, incomplete, and incomprehensible
> documentation is to add, complete, and rewrite the documentation.
> Do you volunteer?

I know.  Maybe I'll do something about it.  So far I think it's
difficult to describe the behaviour of the exception classes - that was
my impression when I looked at (for example) the description of
EnvironmentError in the library docs, and compared that to the code in
<http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Lib/Attic/exceptions.py?rev=1.18&view=markup>

Do others share this impression, or is it me only?

As I said, one idea would be to keep exceptions.py, although unused, in
sync with the current C code, and include it in the docs.
Another idea that came to my mind is to include the Python code in the
docstring.

Thomas

From bac at OCF.Berkeley.EDU  Wed Sep 15 21:40:21 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Sep 15 21:40:32 2004
Subject: [Python-Dev] python-dev Summary for 2004-08-16 through	2004-08-31
	[draft]
In-Reply-To: <5.1.1.6.0.20040915152555.025ba470@mail.telecommunity.com>
References: <5.1.1.6.0.20040915152555.025ba470@mail.telecommunity.com>
Message-ID: <41489AA5.8090206@ocf.berkeley.edu>

Phillip J. Eby wrote:
> At 11:26 AM 9/15/04 -0700, Brett C. wrote:
> 
>> -
>>
>> -
>> Stemming from a conversation about moving Python over to Unicode only 
>> for string representation for 3.0, the discussion of a bytes type came 
>> up.  People were saying they used str to store binary data and that if 
>> str went away or no longer represented straight binary data (since 
>> Unicode has different encodings the values can change while meaning 
>> the same thing in terms of characters) they would need a way to deal 
>> with this.
> 
> 
> Looks like this section was supposed to have a title, but it got lost.
> 

Yep.  Fixed in my copy now.  Thanks, Phillip.

-Brett
From martin at v.loewis.de  Wed Sep 15 21:57:29 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 21:57:32 2004
Subject: [Python-Dev] PyExc_UnicodeDecodeError
In-Reply-To: <oek7e3e8.fsf@python.net>
References: <vfefeen7.fsf@python.net>
	<41486148.7090007@egenix.com>	<k6uved30.fsf@python.net>
	<41488DEE.1030506@v.loewis.de> <oek7e3e8.fsf@python.net>
Message-ID: <41489EA9.3040205@v.loewis.de>

Thomas Heller wrote:
> Do others share this impression, or is it me only?

I never have the need to raise any exception except for
standard errors taking a single char*, so I probably haven't
noticed, yet. If my impression is right, and people either
raise their own exceptions, or "simple" standard errors,
your usage of exceptions would count as "guru application".

Gurus are expected to read and understand the source code
of the interpreter, so I don't care much about the state
of the documentation in this area.

Regards,
Martin
From tim.peters at gmail.com  Wed Sep 15 22:30:55 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Sep 15 22:31:24 2004
Subject: [Python-Dev] tempfile.TemporaryFile on Windows NT
In-Reply-To: <414892AB.7010403@v.loewis.de>
References: <4147D8C9.3020508@v.loewis.de>
	<1f7befae0409151151486156b@mail.gmail.com>
	<414892AB.7010403@v.loewis.de>
Message-ID: <1f7befae04091513304eebed0c@mail.gmail.com>

[Martin v. Löwis]
> So what about adding a name argument to fdopen?

I suppose.  The file wrapper on Windows never bothered me, although
Windows breakage in the tempfile code has cost me plenty of grief.  So
I'm much more concerned about not breaking it again than in repairing
an inelegance that didn't violate my sense of aesthetics to begin with
<wink>.
From mike at skew.org  Wed Sep 15 23:04:16 2004
From: mike at skew.org (Mike Brown)
Date: Wed Sep 15 23:04:15 2004
Subject: [Python-Dev] urllib.urlopen() vs IDNs, percent-encoded hosts, ':'
Message-ID: <200409152104.i8FL4G14033121@chilled.skew.org>

Over the last couple of years, while implementing an RFC 2396 and RFC 2396bis 
compliant URI library for 4Suite, I've amassed a sizable list of, um, 
complaints about urllib.

Many of the issues I have run into are attributable to the age of urllib (I am 
pretty sure it predates the unicode type) and the obsolescence of the specs on 
which parts of it are based (it's essentially in RFC 1808 land, with a 
smattering of patches to bring aspects of it closer to RFC 2396). Other issues 
are matters of API entrenchment, either for the convenience of users (e.g. 
treating '/' and '\' as equivalent on Windows) or for compatibility with the 
APIs of other libraries & applications.

When I'm comfortable enough with 4Suite's Ft.Lib.Uri APIs I intend to formally 
propose incorporating updated implementations into Python core, perhaps 
distributed among urllib, urllib2, and urlparse or maybe in a new module, as 
appropriate. I'm not really ready to make such a proposal, though, as I still 
have some philosophical questions about str/unicode transparency in APIs (e.g. 
urllib.unquote, when given unicode, does not percent-decode characters above 
\u007f, and I'm wondering if that's ideal), and I am also unclear on what the 
policy is regarding using regular expressions in core Python modules -- it 
seems to be a no-no, but I don't know for sure... any comments on that 
particular matter would be appreciated.

Anyway, there's at least one part of Ft.Lib.Uri that I think could stand to be 
addressed more immediately: there is a bit of transformation that one must 
perform on a spec-conformant URI in order to get urllib.urlopen() to process 
it correctly. This should not be necessary, IMHO.

The main issues are:

1. urlopen() cannot reliably process unicode unless there are no
   percent-encoded octets above %7F and no characters above \u007f
   (I think that's the gist of it, at least).

I don't think this is necessarily a bug, as a proper URI will never contain 
non-ASCII characters. However since urlopen()'s API is unfortunately such that 
it accepts OS-specific filesystem paths, which nowadays may be unicode, it may 
be time to tighten up the API and say that the url argument *must* be a URI, 
and that if unicode is given, it will be converted to str and thus must not 
contain non-ASCII characters.

2. urlopen() (the URI scheme-specific openers it uses, actually) does not
   percent-decode the host portion of a URL before doing a DNS lookup.

This wasn't really a problem until IDNs came along; no one was using non-ASCII 
in their hostnames. But now we have to deal with URLs where the host component
is a string of percent-encoded UTF-8 octets, like

    'http://www.%E3%81%BB%E3%82%93%E3%81%A8%E3%81%86%E3%81%AB%E3%81%AA%E3%81%8C%E3%81%84%E3%82%8F%E3%81%91%E3%81%AE%E3%82%8F%E3%81%8B%E3%82%89%E3%81%AA%E3%81%84%E3%81%A9%E3%82%81%E3%81%84%E3%82%93%E3%82%81%E3%81%84%E3%81%AE%E3%82%89%E3%81%B9%E3%82%8B%E3%81%BE%E3%81%A0%E3%81%AA%E3%81%8C%E3%81%8F%E3%81%97%E3%81%AA%E3%81%84%E3%81%A8%E3%81%9F%E3%82%8A%E3%81%AA%E3%81%84.w3.mag.keio.ac.jp/'

which are supposed to be decoded back to Unicode (in this case, it's a string of
Japanese characters) and then IDNA-encoded for the DNS lookup, so that it will
be interpreted as if it were the equally-unintelligible-but-DNS-friendly

    'http://www.xn--n8jaaaaai5bhf7as8fsfk3jnknefdde3fg11amb5gzdb4wi9bya3kc6lra.w3.mag.keio.ac.jp/'

Even though IDNs are the main application for percent-encoded octets in the
host component, it is necessary in simpler cases as well, like

    'http://www.w%33.org'

which would need to be interpreted as

    'http://www.w3.org'

Python 2.3 introduced an IDNA codec, and both the socket and httplib modules 
were updated to accept unicode hostnames (e.g. the Japanese characters 
represented by, but not shown in, the examples above), automatically applying 
IDNA encoding prior to doing the DNS lookup.

urllib's urlopeners were *not* updated accordingly. This should be changed. 
The way I do it in Ft.Lib.Uri is to rewrite the hostname, regardless of its 
URI scheme (since once I pass it to urlopen it's out of my hands), to a
percent-decoded, IDNA-encoded version before passing it to urlopen. Ideally
it should be handled by each opener as necessary, I think.
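
A minimal sketch of that rewrite step, written against Python 3's urllib.parse
rather than the 2.3-era modules discussed here, and not the actual Ft.Lib.Uri
code. The function name host_for_dns and the sample hostnames are made up for
illustration.

```python
from urllib.parse import unquote


def host_for_dns(host):
    """Percent-decode a URI host, then IDNA-encode it for the DNS lookup."""
    # Percent-encoded octets are interpreted as UTF-8, as described above.
    decoded = unquote(host, encoding="utf-8", errors="strict")
    # The built-in "idna" codec converts each label to its ASCII form.
    return decoded.encode("idna").decode("ascii")
```

A plain ASCII host passes through unchanged, so the same function could be
applied regardless of whether the host contains any percent-encoding.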

3. On Windows, urlopen() only recognizes '|' as a Windows drivespec character, 
   whereas ':' is just as, if not more, common in 'file' URIs.

file:///C:/Windows/notepad.exe is a perfectly valid 'file' URI and should not 
fail to be interpreted on Windows as C:\Windows\notepad.exe. Currently the 
only way to get it to work is to replace the ':' with '|', which was 
established in the days of the Mosaic web browsers, I believe, and that has 
remained as a widely supported, but arbitrary & unnecessary convention.

I would prefer that all the APIs that expect '|' instead of ':' be updated to 
not consider '|' to be canon, but the simplest workaround for the sake of 
using ':'-containing URIs with urllib.urlopen() is just to do a simple string 
replacement in the path, e.g.

    if os.name == 'nt' and scheme == 'file':
        path = path.replace(':','|',1)

(assuming you've already got the path and scheme components of the given URI 
split out).
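
Wrapped up as a self-contained function, the same workaround might look like
the sketch below. The name drivespec_for_urllib is hypothetical, and os_name is
passed in (standing in for os.name) only so the behaviour can be shown on any
platform.

```python
def drivespec_for_urllib(scheme, path, os_name):
    """Swap the first ':' for '|' in a 'file' URI path on Windows."""
    # Only 'file' URIs on Windows need the legacy '|' drivespec character.
    if os_name == "nt" and scheme == "file":
        return path.replace(":", "|", 1)
    return path
```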


I would appreciate any comments that anyone has on the feasibility of
these suggestions.

Thanks,

Mike


P.S. If you're curious, the current version of Ft.Lib.Uri is at 

  http://cvs.4suite.org/cgi-bin/viewcvs.cgi/4Suite/Ft/Lib/Uri.py

and a test suite for it (which relies on a custom framework, not unittest,
but that should be fairly understandable anyway) is at

  http://cvs.4suite.org/cgi-bin/viewcvs.cgi/4Suite/test/Lib/test_uri.py

The function that I am currently using to massage a URI to make it safe for 
urllib.urlopen() is named MakeUrllibSafe. I wouldn't recommend it as-is, 
though, since it relies on other functions that deal with more convoluted 
unicode issues that I'm trying to avoid asking about in this post.
From martin at v.loewis.de  Wed Sep 15 23:18:10 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed Sep 15 23:18:12 2004
Subject: [Python-Dev] tempfile.TemporaryFile on Windows NT
In-Reply-To: <1f7befae04091513304eebed0c@mail.gmail.com>
References: <4147D8C9.3020508@v.loewis.de>	
	<1f7befae0409151151486156b@mail.gmail.com>	
	<414892AB.7010403@v.loewis.de>
	<1f7befae04091513304eebed0c@mail.gmail.com>
Message-ID: <4148B192.2040809@v.loewis.de>

Tim Peters wrote:
> I suppose.  The file wrapper on Windows never bothered me, although
> Windows breakage in the tempfile code has cost me plenty of grief.  So
> I'm much more concerned about not breaking it again than in repairing
> an inelegance that didn't violate my sense of aesthetics to begin with
> <wink>.

Ah, ok. This is probably the time to present my case: Somebody
complained on c.l.p that isinstance(tempfile.TemporaryFile(), file)
gives True on Linux but False on Windows. While this result is
"in principle correct", I think something can be done to make it
correct practically, too.

Regards,
Martin
From martin at v.loewis.de  Wed Sep 15 23:40:01 2004
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed Sep 15 23:40:14 2004
Subject: [Python-Dev] urllib.urlopen() vs IDNs, percent-encoded hosts, ':'
In-Reply-To: <200409152104.i8FL4G14033121@chilled.skew.org>
References: <200409152104.i8FL4G14033121@chilled.skew.org>
Message-ID: <4148B6B1.4040902@v.loewis.de>

Mike Brown wrote:
> 1. urlopen() cannot reliably process unicode unless there are no
>    percent-encoded octets above %7F and no characters above \u007f
>    (I think that's the gist of it, at least).

And that feature is by design. URLs are conceptually byte strings,
not character strings, so passing Unicode strings is mostly a
meaningless operation. Mostly - because if the Unicode string is
pure ASCII, it probably matches most implementations and user
expectations to convert it to pure ASCII first, and then treat it
as a URL.

IETF is working on resolving the issue, by introducing IRIs. It
appears that draft-duerst-iri-09.txt is what will become the relevant
RFC. Once the RFC is published, urllib and urllib2 should be updated
to support IRIs; contributions are welcome.

> I don't think this is necessarily a bug, as a proper URI will never contain 
> non-ASCII characters. However since urlopen()'s API is unfortunately such that 
> it accepts OS-specific filesystem paths, which nowadays may be unicode, it may 
> be time to tighten up the API and say that the url argument *must* be a URI, 
> and that if unicode is given, it will be converted to str and thus must not 
> contain non-ASCII characters.

No. I'd prefer to specify that if it is a Unicode string, it
must be an IRI, and is converted to a URI according to the IRI spec.

> 2. urlopen() (the URI scheme-specific openers it uses, actually) does not
>    percent-decode the host portion of a URL before doing a DNS lookup.
> 
> This wasn't really a problem until IDNs came along; no one was using non-ASCII 
> in their hostnames. But now we have to deal with URLs where the host component
> is a string of percent-encoded UTF-8 octets.

Hmm. I think there is no backup in any standard for doing that.
Applications that put URL-escaped UTF-8 bytes into host names deserve to
lose. There are two valid ways for putting non-ASCII characters into the
hostname part of an URL: use Unicode strings, or use IDNA. It may be
that IRIs add another way (I haven't checked this aspect specifically),
but unless there is some RFC supporting such a protocol, any response
by urllib is fine, exceptions preferred.

> Even though IDNs are the main application for percent-encoded octets in the
> host component, it is necessary in simpler cases as well, like
> 
>     'http://www.w%33.org'
> 
> which would need to be interpreted as
> 
>     'http://www.w3.org'

We would have to check: this might be valid usage, but I somewhat doubt
it.

> urllib's urlopeners were *not* updated accordingly. This should be changed. 

The change was deliberately deferred until the IRI RFC is published.

> 3. On Windows, urlopen() only recognizes '|' as a Windows drivespec character, 
>    whereas ':' is just as, if not more, common in 'file' URIs.

I have long ago given up trying to understand this issue. I'm happy to
change this forth and back about once or twice a year, until somebody
comes up with a clear and definitive story, backed up by standards and
product documentation, so that we might get a stable implementation some
day. Feel free to write patches.

Regards,
Martin
From gvanrossum at gmail.com  Wed Sep 15 23:46:43 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep 15 23:46:46 2004
Subject: [Python-Dev] Strawman decision: @decorator won't change
Message-ID: <ca471dc2040915144653c5dba2@mail.gmail.com>

Anthony Baxter asked me for a pronouncement on whether @decorator will
change to use some other character instead; I kept this open as a
possibility before 2.4b1 (which is tentatively scheduled for Oct 7th).
Given the near-complete silence following my rejection of the J2
alternative proposal, I don't expect there to be a massive popular
movement to change the character, but I admit I haven't looked for
responses outside python-dev.

Let's plan on doing the following. If in the next 7 days there's no
indication that some group of users wants to rally for a different
character, the decision to keep @ is made final on Sept 23. To change
the character, somebody will need to start rallying for a different
character, and be able to show signs of significant support by that
date.

The definition of "significant support" is intentionally left open for
interpretation, I'll review the evidence on the 23rd.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From s.percivall at chello.se  Thu Sep 16 01:03:20 2004
From: s.percivall at chello.se (Simon Percivall)
Date: Thu Sep 16 01:03:23 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
Message-ID: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>

The title says it all, tabs breaking installation.

Lines 537 and 538 in httplib.py
Lines 124, 129, 130, 131 in test_httplib.py

//Simon

From bac at OCF.Berkeley.EDU  Thu Sep 16 01:27:29 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Sep 16 01:27:42 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
In-Reply-To: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>
Message-ID: <4148CFE1.5010503@ocf.berkeley.edu>

Simon Percivall wrote:
> The title says it all, tabs breaking installation.
> 
> Lines 537 and 538 in httplib.py
> Lines 124, 129, 130, 131 in test_httplib.py
> 

Fixed.  Bad, Martin, bad!  =)

-Brett
From mike at skew.org  Thu Sep 16 02:10:17 2004
From: mike at skew.org (Mike Brown)
Date: Thu Sep 16 02:10:20 2004
Subject: [Python-Dev] urllib.urlopen() vs IDNs, percent-encoded hosts, ':'
In-Reply-To: <4148B6B1.4040902@v.loewis.de>
	=?UTF-8?Q?from_=22Martin_v=2E_L=C3=B6wis=22_at_Sep_15=2C_2004_11=3A40=3A01_p?=
	=?UTF-8?Q?m?=
Message-ID: <200409160010.i8G0AI78034010@chilled.skew.org>

"Martin v. L?wis" wrote:
> Mike Brown wrote:
> > 1. urlopen() cannot reliably process unicode unless there are no
> >    percent-encoded octets above %7F and no characters above \u007f
> >    (I think that's the gist of it, at least).
> 
> And that feature is by design. URLs are conceptually byte strings,
> not character strings, so passing Unicode strings is mostly a
> meaningless operation.

No. The intent is actually that a URI is (not conceptually, just *is*) a 
string of characters; the syntax is only defined in terms of bytes due to 
peculiarities of the grammar. A percent-encoded sequence conceptually 
represents an encoded character, or part of one in the case of multibyte 
encodings, that may or may not be allowed by the syntax to appear as a literal 
character in that part of the URI.

This was actually clear in RFC 2396 sections 1.5 and 2, but has been explained 
somewhat better in the rephrased section 2 of rfc2396bis, which is in Last 
Call.

As for what was by design, the fact that a unicode url arg fails relatively 
deep in the processing (generally when it gets handed to urllib.unquote) or a 
resolver, and that it isn't ASCII-fied first, and that this isn't documented, 
and that urlopen() seems to be designed to be a URL-or-filepath-opener, all 
seem to indicate to me that this 'design' isn't very deliberate.

> Mostly - because if the Unicode string is
> pure ASCII, it probably matches most implementations and user
> expectations to convert it to pure ASCII first, and then treat it
> as a URL.

Well, we can take it for granted that an object that purports to be a URI must 
consist only of characters from a limited subset of ASCII. If the object is 
unicode, then there is no ambiguity about what each item in the sequence 
means, it's just a character and it must be in the allowed set in order to be 
interpreted unambiguously, so unicode is actually the ideal type of argument 
to urlopen(). If the object is a byte str, then we can pretty much assume that 
each byte represents its ASCII equivalent and is subject to the same 
restrictions, although this should be documented, lest someone pass in a UCS-2 
or UTF-16 string expecting its characters to be magically decoded.

The question is, does the url argument to urlopen() purport to be or is it 
assumed to be a URL? The function is quite lenient about what it accepts as a 
URL -- it accepts pretty much anything you give it, be it unicode or str, with 
or without a scheme component, relative to some unknown base, and loaded with 
illegal characters, and it tries to deal with it as best it can -- yet it 
still rejects or inconsistently handles some valid URIs, and this is what I 
want to see changed.

Perhaps I should rephrase part of the issue this way: If the argument to 
urlopen() is assumed to be a URI, then %FF in the argument should not be 
interpreted any differently when the argument is a str vs when it is unicode. 
RFC 2396 left it ambiguous as to what characters are represented by %80-%FF, 
so an implementation thereof may make such interpretations as it pleases.
The current implementation doesn't do this in a consistent manner.
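
To illustrate the %80-%FF ambiguity, here is a small sketch using
present-day Python 3 (urllib.parse), not the 2004 urllib code. The octet
is well defined; the character it stands for depends on an encoding that
the caller has to assume.

```python
from urllib.parse import unquote, unquote_to_bytes

# Percent-decoding is unambiguous at the byte level.
raw = unquote_to_bytes("%FF")
print(raw)

# Turning that octet into a character requires choosing an encoding.
as_latin1 = unquote("%FF", encoding="latin-1")
as_utf8 = unquote("%FF", encoding="utf-8", errors="replace")
print(repr(as_latin1), repr(as_utf8))
```

Under Latin-1 the octet maps to a letter; under UTF-8 it is not a valid
sequence on its own and is replaced.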

> IETF is working on resolving the issue, by introducing IRIs. It
> appears that draft-duerst-iri-09.txt is what will become the relevant
> RFC. Once the RFC is published, urllib and urllib2 should be updated
> to support IRIs; contributions are welcome.
>
> > I don't think this is necessarily a bug, as a proper URI will never contain 
> > non-ASCII characters. However since urlopen()'s API is unfortunately such that 
> > it accepts OS-specific filesystem paths, which nowadays may be unicode, it may 
> > be time to tighten up the API and say that the url argument *must* be a URI, 
> > and that if unicode is given, it will be converted to str and thus must not 
> > contain non-ASCII characters.
> 
> No. I'd rather prefer to specify that it if it is a Unicode string, it
> must be an IRI, and is converted to an URI according to the IRI spec.

OK, that's probably a good way to go about it.

You should note however that percent-encoded sequences are legal in IRIs and 
pass through unchanged in the conversion to URI, so this does not solve the 
problem of how they are interpreted (i.e. the %80-%FF pass-through in certain 
situations). In an IRI that you construct yourself, you are much less likely 
to ever see a percent-encoded octet, but nevertheless, being a superset of 
URI, any IRI may contain them.

> > 2. urlopen() (the URI scheme-specific openers it uses, actually) does not
> >    percent-decode the host portion of a URL before doing a DNS lookup.
> > 
> > This wasn't really a problem until IDNs came along; no one was using non-ASCII 
> > in their hostnames. But now we have to deal with URLs where the host component
> > is a string of percent-encoded UTF-8 octets.
> 
> Hmm. I think there is no backup in any standard for doing that.

OK, you're right; it was in an IETF draft of its own (draft-uri-idn-something) 
and in February of this year was folded into rfc2396bis. How IDNs are 
represented in URIs is indeed currently restricted to IDNA (RFC 3490) only, by 
virtue of the fact that RFC 2396 forbids percent-encoding in hostnames.
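
For concreteness, a sketch of what percent-decoding a host before the
lookup could look like, written against present-day Python 3 and its
built-in "idna" codec. The helper name host_to_idna and the host
"bücher.example" are made up for illustration; this is not what urllib
does today.

```python
from urllib.parse import unquote

def host_to_idna(host):
    """Percent-decode a host as UTF-8, then apply IDNA (RFC 3490) ToASCII.

    Hypothetical helper for illustration; not part of urllib.
    """
    decoded = unquote(host, encoding="utf-8", errors="strict")
    return decoded.encode("idna").decode("ascii")

print(host_to_idna("b%C3%BCcher.example"))
print(host_to_idna("www.python.org"))
```

A plain ASCII host passes through unchanged, so the extra step would only
matter for hosts that actually carry percent-encoded octets.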

I sometimes forget which aspects of rfc2396bis are changes from RFC 2396 and 
its predecessors, and which are clarifications / bugfixes.

> Applications that put URL-escaped UTF-8 bytes into host names deserve to
> lose.

Come February or whenever rfc2396bis and the IRI draft become RFCs, that
will no longer be a position you can maintain.

> There are two valid ways for putting non-ASCII characters into the
> hostname part of an URL: use Unicode strings, or use IDNA. It may be
> that IRIs add another way (I haven't checked this aspect specifically)

They do by virtue of reference to "RFCYYYY" which is a placeholder for
the RFC that the rfc2396bis draft will become, pending approval.

> but unless there is some RFC supporting such a protocol, any response
> by urllib is fine, exceptions preferred.

Consider it a feature request then.

> > urllib's urlopeners were *not* updated accordingly. This should be changed. 
> 
> The change was deliberately deferred until the IRI RFC is published.

OK.

> > 3. On Windows, urlopen() only recognizes '|' as a Windows drivespec character, 
> >    whereas ':' is just as, if not more, common in 'file' URIs.
> 
> I have long ago given up trying to understand this issue. I'm happy to
> change this forth and back about once or twice a year, until somebody
> comes up with a clear and definitive story, backed up by standards and
> product documentation, so that we might get a stable implementation some
> day. Feel free to write patches.

OK, a few points to understand:

- There is no canonical form of 'file' URI for any OS path.
  All conventions are established by implementations.

- 'file' as a URL scheme is very vaguely specified.
  It is being revised now but the revision may not be any better,
  from what I've seen so far on the mailing list for it.

- No RFC disallows ":" in the path component of any URL,
  except when it needs to appear in the first segment of the path
  component of what is now called a relative URI reference, when
  that path component is hierarchical (as determined by the scheme).
  In that situation, the segment must be prepended with './' in order
  to ensure that it is interpreted correctly.

Thus 'C:/autoexec.bat' as a URI reference (like in an href in an HTML doc) 
must be interpreted as scheme 'C' (not 'file'), and (by RFC 2396) 
non-hierarchical path '/autoexec.bat' or (by rfc2396bis) authority/hostname 
autoexec.bat, path ''. In either case it shouldn't be resolvable.

Meanwhile, './C:/autoexec.bat' is scheme <inherited from base URI>,
authority <inherited from base URI>, path './C:/autoexec.bat',
which is much less ambiguous.

Using '|' allows one to write 'C|/autoexec.bat' as a relative URI reference,
but that is, as far as I can tell, the only advantage to using it.
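
The three spellings above can be checked with a generic splitter. This
sketch uses urlsplit from present-day Python 3 (urllib.parse), which is an
assumption on my part and not the urlopen() code path under discussion.

```python
from urllib.parse import urlsplit

# Compare how each reference splits into scheme and path.
for ref in ("C:/autoexec.bat", "./C:/autoexec.bat", "C|/autoexec.bat"):
    parts = urlsplit(ref)
    print("%-20s scheme=%r path=%r" % (ref, parts.scheme, parts.path))
```

Only the first form yields a scheme (the drive letter, lowercased); the
other two are treated as relative references with the whole string as the
path.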

Let me be clear though - I am not suggesting getting rid of support for '|'.
I am merely saying that there is no reason ':' should, on Windows, fail to
be treated the same as '|' for the purpose of representing the ':' in a
drivespec.
From nnorwitz at gmail.com  Thu Sep 16 03:27:57 2004
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu Sep 16 03:28:03 2004
Subject: [Python-Dev] --with-tsc compile fails
In-Reply-To: <e8bf7a530409151156625fea55@mail.gmail.com>
References: <e8bf7a530409141924139b7036@mail.gmail.com>
	<2mwtyvwsg4.fsf@starship.python.net>
	<e8bf7a530409151156625fea55@mail.gmail.com>
Message-ID: <ee2a432c04091518277cb7076@mail.gmail.com>

On Wed, 15 Sep 2004 14:56:43 -0400, Jeremy Hylton <jhylton@gmail.com> wrote:
> On Wed, 15 Sep 2004 14:51:55 +0100, Michael Hudson <mwh@python.net> wrote:
> >
> > if my limited googling is anything to go by.  It also seems asm/msr.h
> > is a "kernel internal header with absolutely no stable API
> > properties...." (Redhat bugzilla).
> >
> > So, now I've written this email <wink>, I think we should take out the
> > include and put in the #define.

In RedHat 9 and Fedora Core 1, msr.h is not installed under
/usr/include/.  There are only versions for x86 and amd64 in the
kernel source.

Michael's suggestion about adding the #define is probably the best way
to handle it for now.

Neal
From greg at cosc.canterbury.ac.nz  Thu Sep 16 03:46:52 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu Sep 16 03:46:58 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <34EB398C-06E1-11D9-AC9C-000A95A50FB2@fuhm.net>
Message-ID: <200409160146.i8G1kqRf014588@cosc353.cosc.canterbury.ac.nz>

> Consider also this:
>    x and 4 or 5
> which is of course a common idiom to workaround the lack of an 
> if-then-else expression.

Actually, I hope it isn't common, because it's flawed.
It doesn't always work properly even with current
Python semantics.
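
A minimal example of the flaw (the function name pick is just for
illustration): whenever the middle operand is itself false, the idiom
falls through to the last operand.

```python
def pick(x, if_true, if_false):
    # The and/or idiom under discussion.
    return x and if_true or if_false

print(pick(True, 4, 5))
print(pick(False, 4, 5))
print(pick(True, 0, 5))   # middle operand is falsy, so this falls through
```

The third call returns the "else" value even though x is true.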

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From greg at cosc.canterbury.ac.nz  Thu Sep 16 03:48:25 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu Sep 16 03:48:30 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <34EB398C-06E1-11D9-AC9C-000A95A50FB2@fuhm.net>
Message-ID: <200409160148.i8G1mPBJ014594@cosc353.cosc.canterbury.ac.nz>

> PS: Perl6 has distinct element-wise operators ("hyper" operators). I
> find that less distasteful than misusing regular operators as
> element-wise operators, when they really have vastly different
> semantics.

There was a huge discussion about that a while back.
I don't think anything came of it, though.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From greg at cosc.canterbury.ac.nz  Thu Sep 16 04:01:29 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu Sep 16 04:01:34 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <ci8ojo$li4$1@sea.gmane.org>
Message-ID: <200409160201.i8G21TCL014610@cosc353.cosc.canterbury.ac.nz>

> It's not that you couldn't make numarrays short circuit. In the
> expression "a and b", if all the elements of a are false, then we
> can skip evaluating b. I'm just not sure that this is a good idea.

Whether it would be worth it would be application-dependent,
i.e. it would only help if pre-scanning all the elements of a
were sufficiently cheaper than evaluating b. Probably not a good
idea to make it the default behaviour.
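
Roughly what I have in mind, sketched with plain lists rather than
numarray (array_and, make_b and expensive_b are invented names):

```python
def array_and(a, make_b):
    """Elementwise logical 'and' that skips building b when a is all false.

    a is a list; make_b is a zero-argument callable producing the second
    list. Both names are illustrative only.
    """
    if not any(a):                 # the pre-scan: one pass over a
        return [False] * len(a)    # b is never evaluated
    b = make_b()
    return [bool(x and y) for x, y in zip(a, b)]

calls = []

def expensive_b():
    calls.append(1)                # record that b was actually computed
    return [1, 0, 1]

print(array_and([0, 0, 0], expensive_b), len(calls))
print(array_and([1, 1, 0], expensive_b), len(calls))
```

The pre-scan is pure overhead in the second call, which is the
application-dependent trade-off.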

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From tim.peters at gmail.com  Thu Sep 16 04:34:58 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep 16 04:35:05 2004
Subject: [Python-Dev] tempfile.TemporaryFile on Windows NT
In-Reply-To: <4148B192.2040809@v.loewis.de>
References: <4147D8C9.3020508@v.loewis.de>
	<1f7befae0409151151486156b@mail.gmail.com>
	<414892AB.7010403@v.loewis.de>
	<1f7befae04091513304eebed0c@mail.gmail.com>
	<4148B192.2040809@v.loewis.de>
Message-ID: <1f7befae0409151934593ea8b4@mail.gmail.com>

[Martin v. Löwis]
> Ah, ok. This is probably the time to present my case: Somebody
> complained on c.l.p that isinstance(tempfile.TemporaryFile(), file)
> gives True on Linux but False on Windows. While this result is
> "in principle correct", I think something can be done to make it
> correct practically, too.

I'm not going to object, but writing "isinstance(..., file)" is almost
never a *practical* thing to do in Python code anyway, so I don't
personally see the attraction.  Since tons of file-like objects aren't
instances of __builtin__.file, and it doesn't make a lick of
difference that they aren't, "isinstance(..., file)" isn't in the
practical Pythoneer's vocabulary.  That doesn't mean you can't want
this change for inscrutable reasons, though <wink>.
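
A sketch of the alternative, written for present-day Python 3 (which no
longer has a builtin file type at all); copy_upper is a made-up name. The
function only relies on read() and write() being there.

```python
import io
import tempfile

def copy_upper(src, dst):
    """Works with anything that has read() and write(); no isinstance check."""
    dst.write(src.read().upper())

out = io.BytesIO()
with tempfile.TemporaryFile() as tmp:   # concrete type differs by platform
    tmp.write(b"spam")
    tmp.seek(0)
    copy_upper(tmp, out)
print(out.getvalue())
```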
From anthony at interlink.com.au  Thu Sep 16 04:48:10 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu Sep 16 04:49:09 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going to be
	fixed?
In-Reply-To: <4148885A.5090803@v.loewis.de>
References: <Pine.LNX.4.44.0409151024140.29513-100000@login.ecs.soton.ac.uk>
	<4148885A.5090803@v.loewis.de>
Message-ID: <4148FEEA.4050801@interlink.com.au>

Martin v. Löwis wrote:
> David G Mills wrote:
> 
>> And where can we get a copy of this new 'official' httplib?
> 
> 
> As usual: In the CVS.

As well as in 2.4b1, and, I assume 2.3.5, assuming the fix
gets backported.


-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From skip at pobox.com  Thu Sep 16 05:09:34 2004
From: skip at pobox.com (Skip Montanaro)
Date: Thu Sep 16 05:10:50 2004
Subject: [Python-Dev] httplib is not v6 compatible, is this going to be
	fixed?
In-Reply-To: <4148FEEA.4050801@interlink.com.au>
References: <Pine.LNX.4.44.0409151024140.29513-100000@login.ecs.soton.ac.uk>
	<4148885A.5090803@v.loewis.de> <4148FEEA.4050801@interlink.com.au>
Message-ID: <16713.1006.188325.822437@montanaro.dyndns.org>


    Anthony> ... and, I assume 2.3.5, assuming the fix gets backported.

I missed this as well.  Will backport.

Skip
From mike at skew.org  Thu Sep 16 06:20:10 2004
From: mike at skew.org (Mike Brown)
Date: Thu Sep 16 06:20:08 2004
Subject: [Python-Dev] urllib.urlopen() vs IDNs, percent-encoded hosts, ':'
In-Reply-To: <200409160010.i8G0AI78034010@chilled.skew.org> "from Mike Brown
	at Sep 15, 2004 06:10:17 pm"
Message-ID: <200409160420.i8G4KAjI035125@chilled.skew.org>

> Meanwhile, './C:/autoexec.bat' is scheme <inherited from base URI>,
> authority <inherited from base URI>, path './C:/autoexec.bat',

I meant to say, path <result of merging ./C:/autoexec.bat with the path
of the base URI>.
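
As a sketch of that merge, using urljoin from present-day Python 3
(urllib.parse) and a made-up base URI:

```python
from urllib.parse import urljoin

base = "file:///home/user/"          # made-up base URI
merged = urljoin(base, "./C:/autoexec.bat")
print(merged)
```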
From greg at cosc.canterbury.ac.nz  Thu Sep 16 06:39:19 2004
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu Sep 16 06:39:34 2004
Subject: [Python-Dev] ANN: PEP 335: Overloadable Boolean Operators
In-Reply-To: <5.1.1.6.0.20040915115744.03364ae0@mail.telecommunity.com>
Message-ID: <200409160439.i8G4dJU5014867@cosc353.cosc.canterbury.ac.nz>

> Well, it's definitely syntax and it's definitely a tree, so it's at least 
> an ST.  :)

I'd call it a VST (Verbose Syntax Tree).

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
From martin at v.loewis.de  Thu Sep 16 08:37:39 2004
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu Sep 16 08:37:41 2004
Subject: [Python-Dev] urllib.urlopen() vs IDNs, percent-encoded hosts, ':'
In-Reply-To: <200409160010.i8G0AI78034010@chilled.skew.org>
References: <200409160010.i8G0AI78034010@chilled.skew.org>
Message-ID: <414934B3.6030004@v.loewis.de>

Mike Brown wrote:
> No. The intent is actually that a URI is (not conceptually, just *is*) a 
> string of characters

You are right: URIs are meant to be written on paper. However, RFC 2396
also acknowledges that the issue of non-ASCII characters is unresolved.
It suggests (in 2.1) that the URI scheme should specify the
interpretation of byte values.

> This was actually clear in RFC 2396 sections 1.5 and 2, but has been explained 
> somewhat better in the rephrased section 2 of rfc2396bis, which is in Last 
> Call.

This suggests that new URI schemes should mandate UTF-8 in the
components, but is silent on the issue of existing schemes.

> The question is, does the url argument to urlopen() purport to be or is it 
> assumed to be a URL? The function is quite lenient about what it accepts as a 
> URL -- it accepts pretty much anything you give it, be it unicode or str, with 
> or without a scheme component, relative to some unknown base, and loaded with 
> illegal characters, and it tries to deal with it as best it can -- yet it 
> still rejects or inconsistently handles some valid URIs, and this is what I 
> want to see changed.

If something passed to it is clearly a valid URL, and there is a clear
definition of how a computer should process it, and urllib doesn't, then
this is certainly a bug and should be fixed. Can you give an example of
such a URL?

> Perhaps I should rephrase part of the issue this way: If the argument to 
> urlopen() is assumed to be a URI, then %FF in the argument should not be 
> interpreted any differently when the argument is a str vs when it is unicode. 

Certainly. Indeed, urllib makes no difference, AFAICT.
"http://localhost/%FF" and u"http://localhost/%FF" are processed in
the same way.

> RFC 2396 left it ambiguous as to what characters are represented by %80-%FF, 
> so an implementation thereof may make such interpretations as it pleases.
> The current implementation doesn't do this in a consistent manner.

No. RFC 2396 defers the specification to the specific scheme.

>>Applications that put URL-escaped UTF-8 bytes into host names deserve to
>>lose.
> 
> 
> Come February or whenever rfc2396bis and the IRI draft become RFCs, that
> will no longer be a position you can maintain.

I see. I think I could accept a patch in this direction for
Python 2.4 even if RFC2396bis isn't published, assuming the patch
arrives before 2.4b1.

> Let me be clear though - I am not suggesting getting rid of support for '|'.
> I am merely saying that there is no reason ':' should, on Windows, fail to
> be treated the same as '|' for the purpose of representing the ':' in a
> drivespec.

I know that I personally won't touch this code, except for applying
patches. So if you have a clear vision of what needs to be changed
and how, submit a patch.

As for using regular expressions in the standard library: It seems you
believe this is discouraged. I don't know why you think so - I've never
heard of such a constraint before (in general - in specific cases,
submitters may have been told that alternatives are more efficient).

Regards,
Martin
From martin at v.loewis.de  Thu Sep 16 08:43:43 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep 16 08:43:45 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
In-Reply-To: <4148CFE1.5010503@ocf.berkeley.edu>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>
	<4148CFE1.5010503@ocf.berkeley.edu>
Message-ID: <4149361F.3030906@v.loewis.de>

Brett C. wrote:
> Simon Percivall wrote:
> 
>> The title says it all, tabs breaking installation.
>>
>> Lines 537 and 538 in httplib.py
>> Lines 124, 129, 130, 131 in test_httplib.py
>>
> 
> Fixed.  Bad, Martin, bad!  =)

I should learn not to use vim for Python editing...

Regards,
Martin
From symbiont+py at berlios.de  Thu Sep 16 09:15:56 2004
From: symbiont+py at berlios.de (Jeff Pitman)
Date: Thu Sep 16 09:17:45 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
In-Reply-To: <4149361F.3030906@v.loewis.de>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>
	<4148CFE1.5010503@ocf.berkeley.edu> <4149361F.3030906@v.loewis.de>
Message-ID: <200409161515.56466.symbiont+py@berlios.de>

On Thursday 16 September 2004 14:43, "Martin v. Löwis" wrote:
> I should learn not to use vim for Python editing...

in vimrc:

set softtabstop=4
set shiftwidth=4
set expandtab

-- 
-jeff
From mwh at python.net  Thu Sep 16 13:56:38 2004
From: mwh at python.net (Michael Hudson)
Date: Thu Sep 16 13:56:43 2004
Subject: [Python-Dev] --with-tsc compile fails
In-Reply-To: <41488D41.9090905@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Wed,
	15 Sep 2004 20:43:13 +0200")
References: <e8bf7a530409141924139b7036@mail.gmail.com>
	<2mwtyvwsg4.fsf@starship.python.net> <41488D41.9090905@v.loewis.de>
Message-ID: <2mbrg6whop.fsf@starship.python.net>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Michael Hudson wrote:
>> Well, it failed like that for me both before and after my PPC changes.
>> I'm fairly sure I didn't mess this up.  Maybe there's some
>> kernel-headers package that's necessary.
>> OTOH, I think one could replace the include by
>> #define rdtscll(val) \
>>      __asm__ __volatile__("rdtsc" : "=A" (val))
>> if my limited googling is anything to go by.  It also seems
>> asm/msr.h
>> is a "kernel internal header with absolutely no stable API
>> properties...." (Redhat bugzilla).
>
> I'ld still like to understand why it fails for your system (it works
> fine on mine). Do you have a definition for rdtscll in
> /usr/include/asm/msr.h?

I don't *have* asm/msr.h!  And the impression I get is that we
shouldn't be going near it with the proverbial bargepole.

Cheers,
mwh

-- 
  (ps: don't feed the lawyers: they just lose their fear of humans)
                                         -- Peter Wood, comp.lang.lisp
From erik at heneryd.com  Thu Sep 16 14:15:31 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Thu Sep 16 14:15:38 2004
Subject: [Python-Dev] python-dev Summary for 2004-08-16 through 2004-08-31
	[draft]
In-Reply-To: <41488969.70909@ocf.berkeley.edu>
References: <41488969.70909@ocf.berkeley.edu>
Message-ID: <414983E3.9090509@heneryd.com>

Brett C. wrote:
> The second issue was over the design of the API.  Originally Template 
> was a class that overrode __mod__ to make it work like string 
> interpolation works now for str and unicode.  But then some people felt 
> a class was too heavy-handed if there was no way to change the way 
> Template worked through a subclass.  This obviously led to a desire for 
> functions to do the work for both Template and SafeTemplate (similar 
> class to Template that left in substitution points if they didn't match 
> any values in the dict passed in).
> 
> In the end the class design was kept thanks to Tim Peters and 
> metaclasses.  Tim came up with a neat way to have the regex be generated 
> at class creation time through a metaclass and thus allow subclasses to 
> change how Template matched substitution points and such, all without a 
> performance hit at instance creation time.  Use of __mod__ and the 
> SafeTemplate class were removed and Template grew substitute and 
> safe_substitute methods.  Everyone at this point seems happy with the 
> design.

Well, not *everyone*.  As expressed in the PEP 292: Method Names thread 
I (still) think that:

1) substitute() and safe_substitute() are far too long names for such 
(probably) common/frequent operations.

2) The design would be more flexible if done with the 
Template/SafeTemplate class approach.  Less code duplication, easier to 
extend and it solves the long method name problem...
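
For anyone who skipped the thread, the API in question looks roughly like
this. The sketch is written against string.Template as found in
present-day Python, so details may differ from the current checkin;
PercentTemplate is an invented subclass.

```python
from string import Template

t = Template("$who likes $what")
print(t.safe_substitute(who="tim"))        # unknown $what is left in place

try:
    t.substitute(who="tim")                # strict variant
except KeyError as exc:
    print("missing key:", exc)

class PercentTemplate(Template):
    delimiter = "%"                        # subclass hook

print(PercentTemplate("%who likes %what").substitute(who="tim", what="ham"))
```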

Didn't get that much (any) positive feedback though... <wink>


Erik
From perry at stsci.edu  Tue Sep 14 22:55:53 2004
From: perry at stsci.edu (Perry Greenfield)
Date: Thu Sep 16 15:14:18 2004
Subject: [Python-Dev] Re: ANN: PEP 335: Overloadable Boolean Operators
Message-ID: <774A39E3-0690-11D9-8495-000A95B68E50@stsci.edu>

Tim Hochberg wrote:
> Phillip J. Eby wrote:
>
> [CHOP]
> >
> > As for the numeric use cases, I'm not at all clear why &, |, and ~ (or
> > special methods/functions) aren't suitable.
>
> They often are, but sometimes you want a logical and/or/not and &/|/~
> are mapped to bitwise and/or/not, which isn't always what you want.
> Presumably, if Greg's proposal were adopted, and/or/not would get mapped
> to numarray.logical_and/or/not.
>
I'll go further than that. *Most* of the time Numeric/numarray users
want logical and/or/not. Bitwise operations are, by comparison, rarely
desired.

It is true that one can use the bitwise operators in place of the
logical ones (and currently, that's what I tell people to use), but
you better make sure the arguments are booleans or limited to 0/1 values
or the result is not what is expected. In the great majority of cases
the arguments are booleans, but there are times when that isn't true,
and using the bitwise operators causes real grief if the user has
become accustomed to using the bitwise operator mindlessly.
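
A small illustration with plain Python lists standing in for arrays (no
Numeric/numarray involved, so this is only an analogy): as soon as the
values are not restricted to 0/1, bitwise and logical results diverge.

```python
a = [1, 2, 0]
b = [2, 1, 3]

# Elementwise bitwise AND versus elementwise logical AND.
bitwise = [x & y for x, y in zip(a, b)]
logical = [bool(x and y) for x, y in zip(a, b)]

print(bitwise)
print(logical)
```

The first two elements are "true and true" logically, yet their bitwise
AND is zero.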

Furthermore, most new array users naturally expect and/or/not to operate
on the array and are usually very annoyed that it doesn't work. This is one
of the largest (if not the largest) remaining warts for using arrays
with Python. I sure would like to see the PEP accepted. No one who
has tried to write many array expressions with the functional or method
equivalents would argue for their use in place of the operators.

Perry Greenfield

From mike at skew.org  Thu Sep 16 17:50:24 2004
From: mike at skew.org (Mike Brown)
Date: Thu Sep 16 17:50:48 2004
Subject: [Python-Dev] URL processing conformance and principles (was Re:
	urllib.urlopen...)
In-Reply-To: <414934B3.6030004@v.loewis.de>
	=?UTF-8?Q?from_=22Martin_v=2E_L=C3=B6wis=22_at_Sep_16=2C_2004_08=3A37=3A39_a?=
	=?UTF-8?Q?m?=
Message-ID: <200409161550.i8GFoODM038508@chilled.skew.org>

"Martin v. Löwis" wrote:
> You are right: URIs are meant to be written on paper. However, RFC 2396
> also acknowledges that the issue of non-ASCII characters is unresolved.
> It suggests (in 2.1) that the URI scheme should specify the
> interpretation of byte values.

Right. This part of the thread was just about how the argument to 
urllib.urlopen() should be handled when given as unicode vs str. You seemed to 
be saying it should be str because a URI is fundamentally bytes and should be 
analyzed as such, whereas I'm saying no, a URI is fundamentally characters and 
should be analyzed as such. I mentioned %-encoding and the quirk of the BNF 
just because those are aspects of the syntax that are byte-oriented and are the 
source of much confusion, and because they may have influenced your assertion.

Are we in agreement on these points?

 -  A URL/URI consists of a finite sequence of Unicode characters;

 -  urlopen(), and anything else that takes a URL/URI argument,
    must accept both str and unicode;

 -  If given unicode, each character in the string directly represents
    a character in the URL/URI and needs no interpretation;

 -  If given str, each byte in the string represents a character in
    the URL/URI according to US-ASCII interpretation;

 -  Characters or bytes outside the ASCII range, and even certain
    characters in the ASCII range, are not permitted in a URL/URI,
    and thus the interpretation of a string containing them may
    result in an exception or other unpredictable results.

If even these principles can be agreed upon, then I can submit a
documentation patch, at the very least.
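
To make the points above concrete, here is a minimal sketch in
present-day Python 3 terms, where bytes plays the role of today's str and
str plays the role of unicode. The helper as_uri_text is hypothetical,
and it picks "raise an exception" as the outcome for non-ASCII input,
which is only one of the outcomes the last point allows.

```python
def as_uri_text(url):
    """Return url as text, enforcing the ASCII-only principle above.

    Hypothetical helper; bytes input is read as US-ASCII, text input is
    taken character for character.
    """
    if isinstance(url, bytes):
        return url.decode("ascii")       # UnicodeDecodeError above 0x7F
    url.encode("ascii")                  # UnicodeEncodeError above U+007F
    return url

print(as_uri_text(b"http://localhost/%FF") == as_uri_text("http://localhost/%FF"))
```

Both spellings of the same URI come out identical, which is the property
I am asking for.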

Furthermore, what about this principle?

 -  The urllib, urllib2, and urlparse modules currently do not
    claim to conform to any particular standards governing the
    interpretation of URLs; they merely acknowledge that some
    standards may be applicable. However, the intent is to provide
    standards-conformant behavior where possible, to the extent 
    that the module APIs overlap with functionality mandated by
    current standards.

    When the relevant standards become obsolete due to publication
    of updated standards (e.g. RFC 1630 -> 1738 -> 1808 -> 2396),
    the implementations *may* be updated accordingly, and users
    should expect behavior that conforms to either the current or
    obsoleted standards. Which standards are applicable to a
    particular implementation should be documented in the module
    and in its functions & classes where necessary.

And how about these?

 -  urlopen() is documented as accepting a 'url' argument that is
    the URL of 'a network object' that can be read; a file-like
    object, based on either a local file or a socket, is normally
    returned. This 'network object' may be a local file if the
    'file' scheme is used or if the URL's scheme component is omitted.

    For convenience, the 'url' argument is permitted to be given as
    a str or unicode, and may be 'absolute' or 'relative'.

    If RFC 2396 or rfc2396bis apply, then the argument is assumed to
    be what is defined in the grammar as a URI-reference. A fragment
    component, if present, is stripped (this requires a change to the
    implementation) and in all cases, the reference is resolved
    against a default base URI.

    If RFC 1808 applies (the current implementation is based largely
    on this spec, which did not clearly distinguish between a reference
    and a URI), it is what is defined in the grammar as a URL, and
    if it is relative (relativeURL in the grammar), it is considered
    to be relative to a default base URL.

    (This is essentially describing the current implementation in
    terms used by the standards).

 -  In urlopen() and the URLOpener classes it depends on, the default
    base URI is the result of resolving the result of os.getcwd(),
    converted to a URL by some undocumented means, against the base
    'file:///'. 

    (I don't think this would require a change to the implementation,
    but it is a principle that should be agreed upon and documented,
    and perhaps the nuances of getcwd vs getcwdu should be addressed).
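
    One way to picture this default base, sketched with present-day
    Python 3 (pathlib and urllib.parse); it is an illustration of the
    principle, not the undocumented conversion urlopen() performs today.

```python
from pathlib import Path
from urllib.parse import urljoin

cwd_uri = Path.cwd().as_uri()                      # 'file' URI for the cwd
base = cwd_uri if cwd_uri.endswith("/") else cwd_uri + "/"
resolved = urljoin(base, "data.txt")               # relative reference
print(resolved)
```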

 -  The resolution of URIs having the 'file' scheme is undertaken on
    the local filesystem according to conventions that should be, but
    presently aren't, documented. A preferred mapping of filesystem
    paths to URIs and back should be documented for each platform.

 -  In urlopen(), the processing of a 'url' argument that is
    syntactically absolute may be nonconformant on platforms
    that use ":" in their filesystem paths. On such platforms, if the
    first ":" in what is syntactically an absolute URL/URI appears to
    be intended for use other than as a scheme component delimiter,
    the path will be assumed to be relative. Furthermore, on Windows,
    '\', which is not allowed in a URL, or its equivalent percent-
    encoded sequence '%5C' (case-insensitive), will be interpreted as
    a '/' in the URL.

    Thus, on Windows, an argument such as r'C:\a\b\c.txt' will be
    treated as if it were 'file:///C:/a/b/c.txt' by the URLOpeners.
    This is a convenience feature for the benefit of users who do
    not have the means to convert an OS path to full 'file' URL.

    (This mostly describes current behavior, assuming we can reach
    agreement that the "C:" in the example above should be treated
    no differently than "C|").
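
    A sketch of one possible mapping, in present-day Python 3. Both
    helper names are invented, and the convention shown (three slashes,
    drive letter kept as is) is only one of several in use.

```python
import re
from urllib.parse import quote

def windows_path_to_file_url(path):
    """Map a Windows path to a 'file' URL (one possible convention)."""
    return "file:///" + quote(path.replace("\\", "/"), safe="/:")

def drive_of(file_url):
    """Return the drive letter of a file URL, accepting 'C:' or 'C|'.

    Hypothetical helper showing ':' and '|' treated alike.
    """
    m = re.match(r"file:///([A-Za-z])[:|]/", file_url)
    return m.group(1) if m else None

print(windows_path_to_file_url(r"C:\a\b\c.txt"))
print(drive_of("file:///C:/a/b/c.txt"), drive_of("file:///C|/a/b/c.txt"))
```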

> As for using regular expressions in the standard library: It seems you
> believe this is discouraged. I don't know why you think so - I've never
> heard of such a constraint before (in general - in specific cases,
> submitters may have been told that alternatives are more efficient).

I was just surprised to find that regular expressions are not used much in
urllib, urllib2, and urlparse. The implementations seem to be going to a
lot of trouble to process URLs using find() and string slices. I thought
perhaps there was a good reason for this.
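
For comparison, RFC 2396 Appendix B gives a regular expression for
breaking a URI reference into components. A sketch of using it from
Python follows; the function name split_uri_reference is made up, and
this is not how urlparse is implemented.

```python
import re

# Regular expression given in RFC 2396, Appendix B.
URI_REF = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri_reference(ref):
    # Every part of the pattern is optional, so match() always succeeds.
    m = URI_REF.match(ref)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

print(split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related"))
```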

I must attend to other things right now; will comment on the other issues 
later.

-Mike
From bac at OCF.Berkeley.EDU  Thu Sep 16 18:41:15 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Sep 16 18:41:34 2004
Subject: [Python-Dev] python-dev Summary for 2004-08-16 through 2004-08-31
	[draft]
In-Reply-To: <414983E3.9090509@heneryd.com>
References: <41488969.70909@ocf.berkeley.edu> <414983E3.9090509@heneryd.com>
Message-ID: <4149C22B.8070004@ocf.berkeley.edu>

Erik Heneryd wrote:
> Brett C. wrote:
> 
>> The second issue was over the design of the API.  Originally Template 
>> was a class that overrode __mod__ to make it work like string 
>> interpolation works now for str and unicode.  But then some people 
>> felt a class was too heavy-handed if there was no way to change the 
>> way Template worked through a subclass.  This obviously led to a 
>> desire for functions to do the work for both Template and SafeTemplate 
>> (similar class to Template that left in substitution points if they 
>> didn't match any values in the dict passed in).
>>
>> In the end the class design was kept thanks to Tim Peters and 
>> metaclasses.  Tim came up with a neat way to have the regex be 
>> generated at class creation time through a metaclass and thus allow 
>> subclasses to change how Template matched substitution points and 
>> such, all without a performance hit at instance creation time.  Use of 
>> __mod__ and the SafeTemplate class were removed and Template grew 
>> substitute and safe_substitute methods.  Everyone at this point seems 
>> happy with the design.
> 
> 
> Well, not *everyone*.

OK, it now says "practically everyone".  =)
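
For anyone who has not followed the thread, a minimal sketch of the API
as the summary describes it, assuming the string.Template class as it
ships in the standard library. The template text and the PercentTemplate
subclass are made up for illustration.

```python
from string import Template

t = Template("$name was born in $country")

# substitute() requires a value for every placeholder.
full = t.substitute(name="Guido", country="the Netherlands")

# safe_substitute() leaves unmatched placeholders in place
# instead of raising KeyError.
partial = t.safe_substitute(name="Guido")


class PercentTemplate(Template):
    # Hypothetical subclass: changing a class attribute is enough,
    # because the matching pattern is built when the class is created.
    delimiter = "%"


custom = PercentTemplate("%name uses %editor").substitute(
    name="Brett", editor="Vim"
)
```

Here partial still contains the literal text "$country", while calling
t.substitute(name="Guido") would raise KeyError.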

-Brett
From bac at OCF.Berkeley.EDU  Thu Sep 16 18:45:12 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Sep 16 18:45:30 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
In-Reply-To: <200409161515.56466.symbiont+py@berlios.de>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>	<4148CFE1.5010503@ocf.berkeley.edu>
	<4149361F.3030906@v.loewis.de>
	<200409161515.56466.symbiont+py@berlios.de>
Message-ID: <4149C318.2010902@ocf.berkeley.edu>

Jeff Pitman wrote:
> On Thursday 16 September 2004 14:43, "Martin v. Löwis" wrote:
> 
>>I should learn not to use vim for Python editing...
> 
> 
> in vimrc:
> 
> set softtabstop=4
> set shiftwidth=4
> set expandtab
> 

I don't want this to explode into a major thread, but if people think coming up 
with a good vimrc file for Python would be worth having as a separate SF 
project send me an email **personally**; DON'T CC python-dev!  Been 
contemplating doing this so that there is always up-to-date Vim config stuff 
(syntax highlighting, ai, etc.) while also leading to code that follows PEP 7 
and 8 so that all of us Vim users here can check in without having to worry 
about not following the style guidelines.

-Brett
From Jack.Jansen at cwi.nl  Thu Sep 16 21:48:09 2004
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu Sep 16 21:47:59 2004
Subject: [Python-Dev] Python/PSF at SANE 2004 - Announcement and a request
	for help
Message-ID: <5615AEDA-0819-11D9-8800-000D934FF6B4@cwi.nl>

At this year's SANE conference (System Administration and Networking 
Europe, www.sane.nl) in Amsterdam there will be a Free and Open Source 
Bazar on Wednesday evening, September 29, from 18.30 until 22.00. The 
bazar will be open to the general public (i.e. free as in beer), and 
about 20 FOSS groups will be present. In addition, Richard Stallman 
will present a talk.

Among the groups present is, you guessed it, the Python Software 
Foundation. And the person who volunteered for this is, you guessed it, 
me. The intention is to provide visitors with information on both the 
Python language and the PSF. The setting is informal: there will be a 
tabletop and a backdrop we can use to put material up. In addition 
there are rooms available to hold BOF sessions.

That concludes the announcement bit, on to the request bit: I'm looking 
for people who'd be willing to join me in manning the stand. And, 
ideally, also with preparing some material to put up on the backdrop 
and/or demonstrations we could stage (I can supply the computer, 
provided it's a Macintosh:-) But if you'd just like to loiter at the 
stand to tell people how wonderful Python is that's also very welcome.

Please let me know if you're willing to help,
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

From pje at telecommunity.com  Thu Sep 16 23:14:07 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Sep 16 23:13:05 2004
Subject: [Python-Dev] PEP 302 and 'reload()'
In-Reply-To: <5.1.1.6.0.20040908172822.020f0a40@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040916171009.039ca5a0@mail.telecommunity.com>

At 05:38 PM 9/8/04 -0400, Phillip J. Eby wrote:
>It appears to me there is an error in both PEP 302's specification and its 
>implementation concerning the correct operation of reload().  First, it says:
>
>     The load_module() method has a few responsibilities that it must
>     fulfill *before* it runs any code:
>
>     - It must create the module object.  From Python this can be done
>       via the new.module() function, the imp.new_module() function or
>       via the module type object; from C with the PyModule_New()
>       function or the PyImport_ModuleAdd() function.
>
>This should probably say that if the module already exists in sys.modules, 
>it should reuse the existing module object, rather than creating a new 
>one.  Otherwise, 'reload()' cannot fulfill its contract.
>
>Second, the actual implementation of PyImport_ReloadModule doesn't 
>actually use a loader object, so reload() doesn't work with import hooks 
>at all.  There's an SF bug report for this, and a patch to fix it (that 
>also adds a test to test_importhooks to ensure that 'reload()' actually 
>invokes the loader).
>
>Are there any objections to me fixing either/both of these, and 
>backporting the bugfix to the 2.3 maintenance branch?

Since there have been no objections, I'll undertake (schedule permitting) 
to correct PEP 302, fix PyImport_ReloadModule 
and  Lib/test/test_importhooks, and backport the changes.
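
To make the proposed PEP wording concrete, here is a rough sketch of a
load_module() that reuses an existing sys.modules entry. It is written
for a modern Python 3 interpreter, so it uses types.ModuleType instead
of new.module() or imp.new_module(). The StringLoader class and its
'sources' dict are hypothetical, and the sketch is not wired into the
import machinery.

```python
import sys
import types


class StringLoader:
    """Hypothetical PEP 302-style loader that runs source kept in a dict."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source text

    def load_module(self, fullname):
        source = self.sources[fullname]
        # Reuse the module object if one is already registered, so that
        # a reload updates the object existing references point to.
        module = sys.modules.get(fullname)
        is_reload = module is not None
        if not is_reload:
            module = types.ModuleType(fullname)
            # Register before running any code, as the PEP requires.
            sys.modules[fullname] = module
        module.__loader__ = self
        try:
            exec(source, module.__dict__)
        except BaseException:
            # Remove a half-initialised module, but only if this call
            # was the one that added it.
            if not is_reload:
                del sys.modules[fullname]
            raise
        return module
```

Calling load_module() twice for the same name then returns the same
module object, with its namespace updated by the second run.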

I'll note that there are other issues that affect reloading from e.g. 
zipfiles, but those are over my head to tackle at present.  However, until 
the PEP 302-level issues are dealt with, there's no chance of fixing 
reload-from-zip, since the underlying reload mechanism itself is broken 
with respect to PEP 302.


>Also, should PyImport_ReloadModule use the import lock?  It doesn't 
>currently, but I'm not clear on why it doesn't.

Since no one has answered this, I'll have to assume that there is a good 
reason, and won't fiddle with it.  But I'd still appreciate an answer.

From martin at v.loewis.de  Thu Sep 16 23:30:39 2004
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu Sep 16 23:30:42 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was Re:
	urllib.urlopen...)
In-Reply-To: <200409161550.i8GFoODM038508@chilled.skew.org>
References: <200409161550.i8GFoODM038508@chilled.skew.org>
Message-ID: <414A05FF.8080000@v.loewis.de>

Mike Brown wrote:
> Right. This part of the thread was just about how the argument to 
> urllib.urlopen() should be handled when given as unicode vs str. You seemed to 
> be saying it should be str because a URI is fundamentally bytes and should be 
> analyzed as such, whereas I'm saying no, a URI is fundamentally characters and 
> should be analyzed as such. I mentioned %-encoding and the quirk of the BNF 
> just because those are aspects of the syntax that are byte-oriented and are the 
> source of much confusion, and because they may have influenced your assertion.
> 
> Are we in agreement on these points?

I think I have to answer "no". The % notation is not a quirk of the BNF.
I.e. when the BNF states that an URI contains %AC (say), this does *not*
mean that the actual URI in-memory-or-on-the-wire contains the byte
\xAC. The spec actually says that the URI, in memory, on the wire, or
on paper, actually contains the three characters '%', 'A', and 'C'. So
usage of that escape mechanism is *not* a result of the BNF notation;
it is the inherent desire that URIs contain only characters in ASCII.
URIs that contain non-ASCII characters have to escape them "somehow",
typically using the % notation.

>  -  A URL/URI consists of a finite sequence of Unicode characters;

No. An URI consists of a finite sequence of characters. Whether they
are Unicode or not is not specified. The assumption certainly is that
if the characters are coded (i.e. assigned to numbers), those numbers
don't have to match Unicode code points at all. An URI that consists
of KOI-8R characters would very well be possible.

>  -  urlopen(), and anything else that takes a URL/URI argument,
>     must accept both str and unicode;

Certainly.

>  -  If given unicode, each character in the string directly represents
>     a character in the URL/URI and needs no interpretation;

No. Only ASCII characters in the string need no interpretation. For
non-ASCII characters, urllib needs to assume some escaping mechanism.

>  -  If given str, each byte in the string represents a character in
>     the URL/URI according to US-ASCII interpretation;

Yes, if the bytes are meaningful in ASCII.

>  -  Characters or bytes outside the ASCII range, and even certain
>     characters in the ASCII range, are not permitted in a URL/URI,
>     and thus the interpretation of a string containing them may
>     result in an exception or other unpredictable results.

Yes.

>  -  The urllib, urllib2, and urlparse modules currently do not
>     claim to conform to any particular standards governing the
>     interpretation of URLs; they merely acknowledge that some
>     standards may be applicable. However, the intent is to provide
>     standards-conformant behavior where possible, to the extent 
>     that the module APIs overlap with functionality mandated by
>     current standards.

Yes. For input that is out of scope of existing standards, backwards

> 
>     When the relevant standards become obsolete due to publication
>     of updated standards (e.g. RFC 1630 -> 1738 -> 1808 -> 2396),
>     the implementations *may* be updated accordingly, and users
>     should expect behavior that conforms to either the current or
>     obsoleted standards. Which standards are applicable to a
>     particular implementation should be documented in the module
>     and in its functions & classes where necessary.
> 
> And how about these?
> 
>  -  urlopen() is documented as accepting a 'url' argument that is
>     the URL of 'a network object' that can be read; a file-like
>     object, based on either a local file or a socket, is normally
>     returned. This 'network object' may be a local file if the
>     'file' scheme is used or if the URL's scheme component is omitted.
> 
>     For convenience, the 'url' argument is permitted to be given as
>     a str or unicode, and may be 'absolute' or 'relative'.
> 
>     If RFC 2396 or rfc2396bis apply, then the argument is assumed to
>     be what is defined in the grammar as a URI-reference. A fragment
>     component, if present, is stripped (this requires a change to the
>     implementation) and in all cases, the reference is resolved
>     against a default base URI.
> 
>     If RFC 1808 applies (the current implementation is based largely
>     on this spec, which did not clearly distinguish between a reference
>     and a URI), it is what is defined in the grammar as a URL, and
>     if it is relative (relativeURL in the grammar), it is considered
>     to be relative to a default base URL.
> 
>     (This is essentially describing the current implementation in
>     terms used by the standards).
> 
>  -  In urlopen() and the URLOpener classes it depends on, the default
>     base URI is the result of resolving the result of os.getcwd(),
>     converted to a URL by some undocumented means, against the base
>     'file:///'. 
> 
>     (I don't think this would require a change to the implementation,
>     but it is a principle that should be agreed upon and documented,
>     and perhaps the nuances of getcwd vs getcwdu should be addressed).
> 
>  -  The resolution of URIs having the 'file' scheme is undertaken on
>     the local filesystem according to conventions that should be, but
>     presently aren't, documented. A preferred mapping of filesystem
>     paths to URIs and back should be documented for each platform.
> 
>  -  In urlopen(), the processing of a 'url' argument that is
>     syntactically absolute may be nonconformant on platforms
>     that use ":" in their filesystem paths. On such platforms, if the
>     first ":" in what is syntactically an absolute URL/URI appears to
>     be intended for use other than as a scheme component delimiter,
>     the path will be assumed to be relative. Furthermore, on Windows,
>     '\', which is not allowed in a URL, or its equivalent percent-
>     encoded sequence '%5C' (case-insensitive), will be interpreted as
>     a '/' in the URL.
> 
>     Thus, on Windows, an argument such as r'C:\a\b\c.txt' will be
>     treated as if it were 'file:///C:/a/b/c.txt' by the URLOpeners.
>     This is a convenience feature for the benefit of users who do
>     not have the means to convert an OS path to a full 'file' URL.
> 
>     (This mostly describes current behavior, assuming we can reach
>     agreement that the "C:" in the example above should be treated
>     no differently than "C|").
> 
> 
>>As for using regular expressions in the standard library: It seems you
>>believe this is discouraged. I don't know why you think so - I've never
>>heard of such a constraint before (in general - in specific cases,
>>submitters may have been told that alternatives are more efficient).
> 
> 
> I was just surprised to find that regular expressions are not used much in
> urllib, urllib2, and urlparse. The implementations seem to be going to a
> lot of trouble to process URLs using find() and string slices. I thought
> perhaps there was a good reason for this.
> 
> I must attend to other things right now; will comment on the other issues 
> later.
> 
> -Mike
> 
> 

From martin at v.loewis.de  Thu Sep 16 23:39:00 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep 16 23:39:01 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was Re:
	urllib.urlopen...)
Message-ID: <414A07F4.9000905@v.loewis.de>

{I hit send too early; here is the rest}

Mike Brown wrote:
> Right. This part of the thread was just about how the argument to 
> urllib.urlopen() should be handled when given as unicode vs str. You seemed to 
> be saying it should be str because a URI is fundamentally bytes and should be 
> analyzed as such, whereas I'm saying no, a URI is fundamentally characters and 
> should be analyzed as such. I mentioned %-encoding and the quirk of the BNF 
> just because those are aspects of the syntax that are byte-oriented and are the 
> source of much confusion, and because they may have influenced your assertion.
> 
> Are we in agreement on these points?

I think I have to answer "no". The % notation is not a quirk of the BNF.
I.e. when the BNF states that an URI contains %AC (say), this does *not*
mean that the actual URI in-memory-or-on-the-wire contains the byte
\xAC. The spec actually says that the URI, in memory, on the wire, or
on paper, actually contains the three characters '%', 'A', and 'C'. So
usage of that escape mechanism is *not* a result of the BNF notation;
it is the inherent desire that URIs contain only characters in ASCII.
URIs that contain non-ASCII characters have to escape them "somehow",
typically using the % notation.
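
A small illustration of that distinction, assuming Python 3's
urllib.parse (which postdates this thread). The example.org URI is
made up.

```python
from urllib.parse import quote, unquote_to_bytes

uri = "http://example.org/%AC"  # hypothetical URI

# The URI itself holds three ASCII characters, not the byte 0xAC.
escape = uri[-3:]
chars = list(escape)
all_ascii = all(ord(c) < 128 for c in uri)

# Only interpreting the escape yields the octet it stands for.
octet = unquote_to_bytes(escape)

# Escaping the octet again gives back the same three characters.
roundtrip = quote(octet)
```

Here chars is ['%', 'A', 'C'] and octet is the single byte 0xAC; the
byte only exists after the escape has been decoded.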

>  -  A URL/URI consists of a finite sequence of Unicode characters;

No. An URI consists of a finite sequence of characters. Whether they
are Unicode or not is not specified. The assumption certainly is that
if the characters are coded (i.e. assigned to numbers), those numbers
don't have to match Unicode code points at all. An URI that consists
of KOI-8R characters would very well be possible.

>  -  urlopen(), and anything else that takes a URL/URI argument,
>     must accept both str and unicode;

Certainly.

>  -  If given unicode, each character in the string directly represents
>     a character in the URL/URI and needs no interpretation;

No. Only ASCII characters in the string need no interpretation. For
non-ASCII characters, urllib needs to assume some escaping mechanism.
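
To show why an assumption is needed, here is a short sketch using
Python 3's urllib.parse.quote (not the 2.x urllib under discussion):
the same non-ASCII character escapes differently depending on the
encoding that is assumed, while ASCII letters pass through untouched.

```python
from urllib.parse import quote

ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The escaped form depends entirely on the assumed encoding.
as_utf8 = quote(ch, encoding="utf-8")
as_latin1 = quote(ch, encoding="latin-1")

# ASCII letters need no interpretation at all.
ascii_only = quote("abc")
```

as_utf8 is '%C3%A9' and as_latin1 is '%E9', so a library that accepts
unicode has to pick and document one of them.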

>  -  If given str, each byte in the string represents a character in
>     the URL/URI according to US-ASCII interpretation;

Yes, if the bytes are meaningful in ASCII.

>  -  Characters or bytes outside the ASCII range, and even certain
>     characters in the ASCII range, are not permitted in a URL/URI,
>     and thus the interpretation of a string containing them may
>     result in an exception or other unpredictable results.

Yes.

>  -  The urllib, urllib2, and urlparse modules currently do not
>     claim to conform to any particular standards governing the
>     interpretation of URLs; they merely acknowledge that some
>     standards may be applicable. However, the intent is to provide
>     standards-conformant behavior where possible, to the extent 
>     that the module APIs overlap with functionality mandated by
>     current standards.

Yes. For input that is out of scope of existing standards, backwards
compatibility is desirable, unless there is a strong indication that
Python should have raised an exception for this input all along.

>     When the relevant standards become obsolete due to publication
>     of updated standards (e.g. RFC 1630 -> 1738 -> 1808 -> 2396),
>     the implementations *may* be updated accordingly, and users
>     should expect behavior that conforms to either the current or
>     obsoleted standards. Which standards are applicable to a
>     particular implementation should be documented in the module
>     and in its functions & classes where necessary.

Yes.

>  -  urlopen() is documented as accepting a 'url' argument that is
>     the URL of 'a network object' that can be read; a file-like
>     object, based on either a local file or a socket, is normally
>     returned. This 'network object' may be a local file if the
>     'file' scheme is used or if the URL's scheme component is omitted.

Yes.

>     If RFC 1808 applies (the current implementation is based largely
>     on this spec, which did not clearly distinguish between a reference
>     and a URI), it is what is defined in the grammar as a URL, and
>     if it is relative (relativeURL in the grammar), it is considered
>     to be relative to a default base URL.

This is troublesome. What is a meaningful base URL? This should be 
mentioned prominently.

>  -  In urlopen() and the URLOpener classes it depends on, the default
>     base URI is the result of resolving the result of os.getcwd(),
>     converted to a URL by some undocumented means, against the base
>     'file:///'. 
> 
>     (I don't think this would require a change to the implementation,
>     but it is a principle that should be agreed upon and documented,
>     and perhaps the nuances of getcwd vs getcwdu should be addressed).

Sounds good.
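
As a sketch of what such a documented rule could look like, assuming
Python 3's pathlib and urllib.parse rather than the 2.x modules under
discussion (default_base_uri and resolve are hypothetical names):

```python
from pathlib import Path
from urllib.parse import urljoin


def default_base_uri():
    """Return the current working directory as a 'file' URI.

    The trailing slash matters: without it, relative references would
    resolve against the parent directory.
    """
    base = Path.cwd().as_uri()
    return base if base.endswith("/") else base + "/"


def resolve(reference):
    """Resolve a possibly relative reference against the default base."""
    return urljoin(default_base_uri(), reference)


relative = resolve("data/report.txt")
absolute = resolve("http://example.org/x")
```

A relative reference ends up under the working directory, while a
reference that is already absolute comes back unchanged.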

>  -  The resolution of URIs having the 'file' scheme is undertaken on
>     the local filesystem according to conventions that should be, but
>     presently aren't, documented. A preferred mapping of filesystem
>     paths to URIs and back should be documented for each platform.

Ok.

>  -  In urlopen(), the processing of a 'url' argument that is
>     syntactically absolute may be nonconformant on platforms
>     that use ":" in their filesystem paths. On such platforms, if the
>     first ":" in what is syntactically an absolute URL/URI appears to
>     be intended for use other than as a scheme component delimiter,
>     the path will be assumed to be relative. Furthermore, on Windows,
>     '\', which is not allowed in a URL, or its equivalent percent-
>     encoded sequence '%5C' (case-insensitive), will be interpreted as
>     a '/' in the URL.

Ok.

>     (This mostly describes current behavior, assuming we can reach
>     agreement that the "C:" in the example above should be treated
>     no differently than "C|").

I have no problem with that. There are no one-letter URL schemata,
are there?
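
If that holds, a single letter followed by ":" or "|" can safely be
read as a drive. A minimal sketch of the convenience mapping, assuming
Python 3; to_file_url is a hypothetical helper, not what the URLOpeners
currently do, and it ignores the '%5C' case:

```python
import re
from urllib.parse import quote

# One letter, then ":" or "|", then a slash or backslash: treated as
# a drive, since a one-letter URL scheme is assumed not to exist.
_DRIVE_PATH = re.compile(r"^([A-Za-z])[:|][\\/]")


def to_file_url(arg):
    """Map a Windows drive path to a 'file' URL; pass anything else through."""
    match = _DRIVE_PATH.match(arg)
    if match is None:
        return arg
    drive = match.group(1)
    # Backslashes become forward slashes; other unsafe characters
    # are percent-encoded by quote().
    rest = arg[2:].replace("\\", "/")
    return "file:///" + drive + ":" + quote(rest)


from_colon = to_file_url(r"C:\a\b\c.txt")
from_pipe = to_file_url("C|/a/b/c.txt")
untouched = to_file_url("http://example.org/")
```

Both drive spellings give 'file:///C:/a/b/c.txt', and the http URL is
returned as-is because "http" is longer than one letter.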

> I must attend to other things right now; will comment on the other issues 
> later.

Take your time. This has been sitting around for many releases - one
more or less doesn't matter much in the global flow of things :-)

Regards,
Martin
From martin at v.loewis.de  Thu Sep 16 23:51:24 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu Sep 16 23:51:25 2004
Subject: [Python-Dev] --with-tsc compile fails
In-Reply-To: <2mbrg6whop.fsf@starship.python.net>
References: <e8bf7a530409141924139b7036@mail.gmail.com>	<2mwtyvwsg4.fsf@starship.python.net>
	<41488D41.9090905@v.loewis.de> <2mbrg6whop.fsf@starship.python.net>
Message-ID: <414A0ADC.4060806@v.loewis.de>

Michael Hudson wrote:
> I don't *have* asm/msr.h!  And the impression I get is that we
> shouldn't be going near it with the proverbial bargepole.

Ah, ok - I probably missed the relevant gcc error message about
the missing header file in earlier reports.

It is fine then to use a copy of the macro. It should probably
apply to all installations where both __GNUC__ and __i386__
are defined.

Regards,
Martin
From anthony at interlink.com.au  Fri Sep 17 05:19:34 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri Sep 17 05:20:22 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
In-Reply-To: <4149C318.2010902@ocf.berkeley.edu>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>	<4148CFE1.5010503@ocf.berkeley.edu>	<4149361F.3030906@v.loewis.de>	<200409161515.56466.symbiont+py@berlios.de>
	<4149C318.2010902@ocf.berkeley.edu>
Message-ID: <414A57C6.9090803@interlink.com.au>

Instead of 'softtabstop', use 'set et' (expandtab).

Anthony

-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From kbk at shore.net  Fri Sep 17 05:34:23 2004
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri Sep 17 05:34:30 2004
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200409170334.i8H3YNF7005875@h006008a7bda6.ne.client2.attbi.com>

Patch / Bug Summary
___________________

Patches :  241 open ( -6) /  2622 closed (+26) /  2863 total (+20)
Bugs    :  764 open ( +6) /  4453 closed (+38) /  5217 total (+44)
RFE     :  150 open ( +2) /   131 closed ( +0) /   281 total ( +2)

New / Reopened Patches
______________________

Use Py_CLEAR where necessary to avoid crashes  (2004-09-01)
       http://python.org/sf/1020188  reopened by  mwh

Decimal performance enhancements  (2004-09-02)
       http://python.org/sf/1020845  opened by  Nick Coghlan

topdir calculated incorrectly in bdist_rpm  (2004-09-03)
       http://python.org/sf/1022003  opened by  Anthony Tuininga

add support for the AutoReq flag in bdist_rpm  (2004-09-03)
       http://python.org/sf/1022011  opened by  Anthony Tuininga

Improve Template error detection and reporting  (2004-09-03)
CLOSED http://python.org/sf/1022173  opened by  Raymond Hettinger

test_random depends on os.urandom  (2004-09-03)
CLOSED http://python.org/sf/1022176  opened by  Raymond Hettinger

Conserve memory with list.pop()  (2004-09-06)
CLOSED http://python.org/sf/1022910  opened by  Raymond Hettinger

CodeContext - an extension to show you where you are  (2004-04-16)
       http://python.org/sf/936169  reopened by  noamr

Add arguments to RE functions  (2004-09-08)
CLOSED http://python.org/sf/1024041  opened by  Noam Raphael

Fix for 1022152   (2004-09-08)
CLOSED http://python.org/sf/1024238  opened by  Andrew Durdin

Error when int sent to PyLong_AsUnsignedLong  (2004-09-08)
       http://python.org/sf/1024670  opened by  Clinton R. Nixon

Check for NULL returns in compile.c:com_import_stmt  (2004-09-10)
       http://python.org/sf/1025636  opened by  Dima Dorfman

Add status code constants to httplib  (2004-09-10)
       http://python.org/sf/1025790  opened by  Andrew Eland

Clarify language in Data Structures chapter of tutorial  (2004-09-10)
CLOSED http://python.org/sf/1025795  opened by  Dima Dorfman

Fix TeX pasto in liboptparse.tex  (2004-09-10)
CLOSED http://python.org/sf/1025800  opened by  Dima Dorfman

typo repair  (2004-09-12)
CLOSED http://python.org/sf/1026384  opened by  George Yoshida

Add keyword arguments to Template substitutions  (2004-09-12)
CLOSED http://python.org/sf/1026859  opened by  Raymond Hettinger

building on OpenBSD 3.5  (2004-09-12)
CLOSED http://python.org/sf/1026986  opened by  Trevor Perrin

Specify a source baseurl for bdist_rpm.  (2004-09-15)
       http://python.org/sf/1028432  opened by  Chris Ottrey

Adding IPv6 host handling to httplib  (2004-09-15)
       http://python.org/sf/1028502  opened by  David Mills

Changes to cookielib.py & friends for 2.4b1  (2004-09-16)
       http://python.org/sf/1028908  opened by  John J Lee

tarfile.py longnames are truncated in getnames()  (2004-09-16)
       http://python.org/sf/1029061  opened by  Lars Gustäbel

Patches Closed
______________

Use Py_CLEAR where necessary to avoid crashes  (2004-09-01)
       http://python.org/sf/1020188  closed by  rhettinger

Py_CLEAR to implicitly cast its argument to PyObject *  (2004-09-01)
       http://python.org/sf/1020185  closed by  rhettinger

Implementation for PEP 318 using syntax J2  (2004-08-22)
       http://python.org/sf/1013835  closed by  ms_

fix for several sre escaping bugs (fixes #776311)  (2004-08-29)
       http://python.org/sf/1018386  closed by  niemeyer

Improve Template error detection and reporting  (2004-09-03)
       http://python.org/sf/1022173  closed by  rhettinger

test_random depends on os.urandom  (2004-09-03)
       http://python.org/sf/1022176  closed by  rhettinger

bsddb's DB.keys() method ignores transaction argument  (2004-08-26)
       http://python.org/sf/1017405  closed by  greg

Conserve memory with list.pop()  (2004-09-06)
       http://python.org/sf/1022910  closed by  rhettinger

PEP 292 reference implementation  (2004-03-23)
       http://python.org/sf/922115  closed by  bcannon

Multi-line strings and unittest  (2004-08-30)
       http://python.org/sf/1019220  closed by  purcell

Decoding incomplete unicode  (2004-07-27)
       http://python.org/sf/998993  closed by  doerwalter

Add arguments to RE functions  (2004-09-07)
       http://python.org/sf/1024041  closed by  rhettinger

Fix for 1022152   (2004-09-08)
       http://python.org/sf/1024238  closed by  jlgijsbers

Fix for duplicate attributes in generated HTML  (2004-08-20)
       http://python.org/sf/1013055  closed by  fdrake

Address bug 980938, add set_debug_output() function  (2004-07-03)
       http://python.org/sf/984492  closed by  jlgijsbers

make test_fcntl 64bit clean  (2003-09-13)
       http://python.org/sf/805626  closed by  loewis

NetBSD py_curses.h fix  (2003-09-15)
       http://python.org/sf/806800  closed by  loewis

Add script support to bdist_rpm.py  (2003-09-17)
       http://python.org/sf/808115  closed by  loewis

Add --force-arch=ARCH to bdist_rpm.py  (2003-09-17)
       http://python.org/sf/808120  closed by  loewis

Clarify language in Data Structures chapter of tutorial  (2004-09-10)
       http://python.org/sf/1025795  closed by  jlgijsbers

Fix TeX pasto in liboptparse.tex  (2004-09-10)
       http://python.org/sf/1025800  closed by  jlgijsbers

typo repair  (2004-09-11)
       http://python.org/sf/1026384  closed by  jlgijsbers

make Demo/scripts/primes.py usable as a module  (2004-01-04)
       http://python.org/sf/870286  closed by  jlgijsbers

reflect the removal of mpz  (2003-11-15)
       http://python.org/sf/842567  closed by  jlgijsbers

Add keyword arguments to Template substitutions  (2004-09-12)
       http://python.org/sf/1026859  closed by  bwarsaw

building on OpenBSD 3.5  (2004-09-13)
       http://python.org/sf/1026986  closed by  loewis

fix for glob with directories which contain brackets  (2003-05-15)
       http://python.org/sf/738389  closed by  progoth

New / Reopened Bugs
___________________

a wrong link from "frame object" in lib index  (2004-09-01)
CLOSED http://python.org/sf/1020540  opened by  Ilya Sandler

senddigest error  (2004-09-01)
       http://python.org/sf/1020605  opened by  James O'Kane

PyThreadState_Next not thread safe?  (2004-09-02)
       http://python.org/sf/1021318  opened by  John Ehresman

Trivial fix for obscure bug in os.urandom()  (2004-09-03)
       http://python.org/sf/1021596  opened by  Nick Mathewson

use first_name, not first, in code samples  (2004-09-02)
       http://python.org/sf/1021621  opened by  Steve R. Hastings

2.4a3: unhelpful error message from distutils  (2004-09-03)
       http://python.org/sf/1021756  opened by  Fredrik Lundh

Import random fails   (2004-09-03)
CLOSED http://python.org/sf/1021890  opened by  Paul D. Lusk

wrong options are set to python.exe  (2004-09-03)
       http://python.org/sf/1022010  reopened by  loewis

wrong options are set to python.exe  (2004-09-04)
CLOSED http://python.org/sf/1022010  opened by  George Yoshida

re.match(), re.MULTILINE and "^" broken  (2004-09-03)
CLOSED http://python.org/sf/1022030  opened by  Pat Notz

Bad examples of gettext.translation  (2004-09-03)
CLOSED http://python.org/sf/1022152  opened by  Facundo Batista

x, y in curses window object documentation  (2004-09-04)
       http://python.org/sf/1022311  opened by  Felix Wiemann

Solaris: reentrancy issues  (2004-08-29)
       http://python.org/sf/1018492  reopened by  loewis

test_xrange fails on osf1 v5.1b  (2004-09-06)
       http://python.org/sf/1022813  opened by  roadkill

random.shuffle should restrict the type of its argument   (2004-09-06)
CLOSED http://python.org/sf/1022880  opened by  Faheem Mitha

Generator exps fail with large value of range  (2004-09-06)
CLOSED http://python.org/sf/1022912  opened by  Andy Elvey

make test fails on HP-UX11i  (2004-09-06)
CLOSED http://python.org/sf/1022951  opened by  Richard Townsend

binascii.a2b_hqx("") raises SystemError  (2004-09-06)
CLOSED http://python.org/sf/1022953  opened by  Florian Bauer

Example does not match diagram.  (2004-09-06)
CLOSED http://python.org/sf/1023359  opened by  Nefarious CodeMonkey, Jr.

script which sets random.seed still returns random value  (2004-09-07)
CLOSED http://python.org/sf/1023453  opened by  Faheem Mitha

test__locale fails  (2004-09-07)
CLOSED http://python.org/sf/1023798  opened by  Michael Hudson

Include/pyport.h: Bad LONG_BIT assumption on non-glibc sys  (2004-09-07)
       http://python.org/sf/1023838  opened by  Gregor Richards

WinCVS doesn't recognize 2.4a3  (2004-09-08)
CLOSED http://python.org/sf/1024427  opened by  David W. Thomas

struct.calcsize() behaves strangely with short type  (2004-09-09)
CLOSED http://python.org/sf/1024669  opened by  Serafeim Zanikolas

shutils.rmtree() uses excessive amounts of memory  (2004-09-09)
       http://python.org/sf/1025127  opened by  James Henstridge

HTML Documentation for 2.4a3 not found  (2004-09-09)
       http://python.org/sf/1025392  opened by  Colin J. Williams

email.Utils.parseaddr fails to parse valid addresses  (2004-09-09)
       http://python.org/sf/1025395  opened by  Charles

asyncore.file_dispatcher should not take fd as argument  (2004-09-10)
       http://python.org/sf/1025525  opened by  david houlder

tkinter.py invalid number of parameter for _tkinet.create  (2004-09-10)
CLOSED http://python.org/sf/1025599  opened by  bertrandbfr

X to the power of 0 may give wrong answer  (2004-09-10)
CLOSED http://python.org/sf/1025872  opened by  Nick Coghlan

"ASCII" in doc section "String literals"  (2004-09-10)
CLOSED http://python.org/sf/1026038  opened by  Felix Wiemann

Confusing error message when subclassing from invalid base  (2004-09-11)
CLOSED http://python.org/sf/1026269  opened by  Gerrit Holl

iso-latin-1 strings and functions lower & upper  (2004-09-11)
CLOSED http://python.org/sf/1026480  opened by  Tomasz Kowaltowski

HardwareRandom should be renamed OSRandom  (2004-09-13)
CLOSED http://python.org/sf/1027105  opened by  Trevor Perrin

unicode DNS names in socket, urllib, urlopen  (2004-09-13)
       http://python.org/sf/1027206  opened by  Damjan Georgievski

socket.ssl should explain that it is a 2/3 connection  (2004-09-13)
       http://python.org/sf/1027394  opened by  adam goucher

Argument missing from calltip for new-style class init  (2004-09-13)
       http://python.org/sf/1027566  opened by  Loren Guthrie

os.stat errors when using shared drive on XP or NT  (2004-09-13)
       http://python.org/sf/1027570  opened by  zeke

In DOM Node Objects, add more explanations for insertBefore  (2004-09-14)
       http://python.org/sf/1027771  opened by  M.-A. DARCHE

Cookies without values are silently ignored (by design?)  (2004-09-14)
       http://python.org/sf/1028088  opened by  Doug Sheppard

date-datetime comparison  (2004-09-14)
CLOSED http://python.org/sf/1028306  opened by  Donnal Walter

get_installer_filename   (2004-09-15)
       http://python.org/sf/1028334  opened by  bingo

Python 2.3.4 broken?  (2004-09-15)
CLOSED http://python.org/sf/1028447  opened by  Stan

Problem linking on windows using mingw32 and C++  (2004-09-15)
       http://python.org/sf/1028697  opened by  Steve Menard

No command line args when script run without python.exe  (2004-09-16)
       http://python.org/sf/1029047  opened by  Kerim Borchaev

PEP 302 loader not carried through by reload function  (2004-09-16)
       http://python.org/sf/1029475  opened by  Stephen Haberman

test_pep277 fails  (2004-09-17)
       http://python.org/sf/1029561  opened by  Marel Baczynski

Bugs Closed
___________

Crash from Rapid Clicks  (2004-07-14)
       http://python.org/sf/990911  closed by  kbk

a wrong link from "frame object" in lib index  (2004-09-01)
       http://python.org/sf/1020540  closed by  rhettinger

httplib.HTTPConnection sends extra blank line  (2004-08-31)
       http://python.org/sf/1019956  closed by  jhylton

re.sub: two-digit group-reference hangs  (2004-08-30)
       http://python.org/sf/1018815  closed by  niemeyer

re.finditer hangs on final empty match  (2003-10-03)
       http://python.org/sf/817234  closed by  niemeyer

Make Problem on HPUX  (2004-07-14)
       http://python.org/sf/991125  closed by  plusk

Import random fails   (2004-09-03)
       http://python.org/sf/1021890  closed by  rhettinger

Regular expression failure of the sre engine  (2003-07-23)
       http://python.org/sf/776311  closed by  niemeyer

wrong options are set to python.exe  (2004-09-03)
       http://python.org/sf/1022010  closed by  loewis

re.match(), re.MULTILINE and "^" broken  (2004-09-03)
       http://python.org/sf/1022030  closed by  effbot

Bad examples of gettext.translation  (2004-09-04)
       http://python.org/sf/1022152  closed by  jlgijsbers

Solaris: reentrancy issues  (2004-08-29)
       http://python.org/sf/1018492  closed by  loewis

including Python.h redefines _POSIX_C_SOURCE  (2004-08-27)
       http://python.org/sf/1017450  closed by  loewis

inspect.getmodule symlink-related failur  (2002-06-18)
       http://python.org/sf/570300  closed by  jlgijsbers

__metaclass__ in locals is ignored  (2004-08-30)
       http://python.org/sf/1019048  closed by  bcannon

split method documentation can be improved  (2004-02-21)
       http://python.org/sf/901654  closed by  rhettinger

random.shuffle should restrict the type of its argument   (2004-09-05)
       http://python.org/sf/1022880  closed by  rhettinger

Generator exps fail with large value of range  (2004-09-06)
       http://python.org/sf/1022912  closed by  rhettinger

make test fails on HP-UX11i  (2004-09-06)
       http://python.org/sf/1022951  closed by  rhettinger

binascii.a2b_hqx("") raises SystemError  (2004-09-06)
       http://python.org/sf/1022953  closed by  rhettinger

mimetypes add_type has bogus self parameter  (2004-08-23)
       http://python.org/sf/1014022  closed by  doerwalter

Example does not match diagram.  (2004-09-06)
       http://python.org/sf/1023359  closed by  akuchling

"rich comparison'' methods hide stack overflow  (2004-08-30)
       http://python.org/sf/1019129  closed by  rhettinger

script which sets random.seed still returns random value  (2004-09-07)
       http://python.org/sf/1023453  closed by  rhettinger

test__locale fails  (2004-09-07)
       http://python.org/sf/1023798  closed by  bcannon

WinCVS doesn't recognize 2.4a3  (2004-09-08)
       http://python.org/sf/1024427  closed by  loewis

os.system segmentation fault   (2004-08-25)
       http://python.org/sf/1015937  closed by  nnorwitz

struct.calcsize() behaves strangely with short type  (2004-09-08)
       http://python.org/sf/1024669  closed by  mwh

RE engine internal error with LARGE RE: scalability bug  (2003-12-10)
       http://python.org/sf/857676  closed by  effbot

"build" target doesn't check umask  (2004-06-22)
       http://python.org/sf/977937  closed by  melicertes

tkinter.py invalid number of parameter for _tkinet.create  (2004-09-10)
       http://python.org/sf/1025599  closed by  loewis

X to the power of 0 may give wrong answer  (2004-09-10)
       http://python.org/sf/1025872  closed by  tim_one

Unspecific errors with metaclass  (2004-08-23)
       http://python.org/sf/1014215  closed by  rhettinger

"ASCII" in doc section "String literals"  (2004-09-10)
       http://python.org/sf/1026038  closed by  loewis

Confusing error message when subclassing from invalid base  (2004-09-11)
       http://python.org/sf/1026269  closed by  mwh

iso-latin-1 strings and functions lower & upper  (2004-09-11)
       http://python.org/sf/1026480  closed by  kowaltowski

crash error in glob.glob; directories with brackets  (2003-05-15)
       http://python.org/sf/738361  closed by  progoth

HardwareRandom should be renamed OSRandom  (2004-09-13)
       http://python.org/sf/1027105  closed by  rhettinger

date-datetime comparison  (2004-09-14)
       http://python.org/sf/1028306  closed by  tim_one

Python 2.3.4 broken?  (2004-09-15)
       http://python.org/sf/1028447  closed by  mwh

New / Reopened RFE
__________________

proposed struct module format code addition  (2004-09-06)
       http://python.org/sf/1023290  opened by  Josiah Carlson

urllib2 http auth  (2004-09-10)
       http://python.org/sf/1025540  opened by  Tim Nelson

From symbiont+py at berlios.de  Fri Sep 17 06:35:39 2004
From: symbiont+py at berlios.de (Jeff Pitman)
Date: Fri Sep 17 06:37:50 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
In-Reply-To: <4149C318.2010902@ocf.berkeley.edu>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>
	<200409161515.56466.symbiont+py@berlios.de>
	<4149C318.2010902@ocf.berkeley.edu>
Message-ID: <200409171235.39871.symbiont+py@berlios.de>

On Friday 17 September 2004 00:45, Brett C. wrote:
> would be worth having as a separate SF
> project

I believe the facilities at www.vim.org are sufficient for such an 
effort. Additionally, such standards-compliance should be pushed into 
the upstream vim tarball as well.  Account holders on vim.org can 
develop vim scripts here: http://www.vim.org/scripts/index.php, which 
allows for a simple release mechanism.  For discussion, c.l.python 
could work, albeit a bit noisy.  Maybe a keyword in the subject line or 
something for those with filtering technologies would be beneficial.

take care,
-- 
-jeff
From symbiont+py at berlios.de  Fri Sep 17 06:39:42 2004
From: symbiont+py at berlios.de (Jeff Pitman)
Date: Fri Sep 17 06:41:46 2004
Subject: [Python-Dev] tabs in httplib.py and test_httplib.py
In-Reply-To: <414A57C6.9090803@interlink.com.au>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>
	<4149C318.2010902@ocf.berkeley.edu>
	<414A57C6.9090803@interlink.com.au>
Message-ID: <200409171239.42375.symbiont+py@berlios.de>

On Friday 17 September 2004 11:19, Anthony Baxter wrote:
> Instead of 'softtabstop', use 'set et' (expandtabs).

I use both.  et will make it so you don't mess with \t insanity. sts 
makes it nice when you want to mass indent things using a region and 
the "=" indenter.  (Ctrl-V,j,j,j,j,j,=) <-- Something to this effect.

have fun,
-- 
-jeff
From mike at skew.org  Fri Sep 17 07:39:11 2004
From: mike at skew.org (Mike Brown)
Date: Fri Sep 17 07:39:09 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was Re:
	urllib.urlopen...)
In-Reply-To: <414A05FF.8080000@v.loewis.de>
	=?UTF-8?Q?from_=22Martin_v=2E_L=C3=B6wis=22_at_Sep_16=2C_2004_11=3A30=3A39_p?=
	=?UTF-8?Q?m?=
Message-ID: <200409170539.i8H5dBfF042148@chilled.skew.org>

"Martin v. Löwis" wrote:
> > Are we in agreement on these points?
> 
> I think I have to answer "no". The % notation is not a quirk of the BNF.

That's not what I said at *all*.  The quirk of the BNF is a completely 
separate issue, and is this: BNF mandates that its terminals are integers, 
e.g. character ":" in a particular BNF-based grammar represents the value 58 
(in decimal). RFC 2396 makes use of the grammar to define the generic syntax, 
but stipulates (well, rfc2396bis clarifies that the intent was to stipulate) 
that the intent is to actually define the syntax in terms of characters, so 
the ":" in the grammar really does mean the colon character, in that spec.
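
A minimal sketch of the terminal-versus-character distinction, assuming
ASCII/Unicode code points (the variable names are only illustrative): the
integer 58 and the colon character are tied together only by a character-set
mapping, which is why the spec has to say which of the two it means.

```python
# The BNF terminal is formally the integer 58; the spec intends the
# character ":" itself. The two coincide only via the ASCII/Unicode mapping.
colon_value = ord(":")   # integer value of the colon character
colon_char = chr(58)     # character whose code point is 58
```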

So there is no disagreement there, really.

> >  -  A URL/URI consists of a finite sequence of Unicode characters;
> 
> No. An URI contains of a finite sequence of characters.

You are correct. This is stated in RFC 2396, and Martin Duerst and I pushed 
for rfc2396bis to settle upon a definition of character just to make it extra 
clear, so I should have known better.

> >  -  If given unicode, each character in the string directly represents
> >     a character in the URL/URI and needs no interpretation;
> 
> No. Only ASCII characters in the string need no interpretation. For
> non-ASCII characters, urllib needs to assume some escaping mechanism.

Err, no. Let me start over. The question is: what do we do with a unicode 
object given as the 'url' argument in urllib.urlopen(), etc.?

Assumption 1:
  Resolution to absolute form and subsequent dereferencing of a
  character sequence that is intended to identify a resource,
  in order to be performed in a manner that is conformant with
  [pick one: RFC 1630, RFC 1738, RFC 1808, RFC 2396, the RFC that
  rfc2396bis will likely become, or the RFC that the IRIs draft will
  likely become], requires that the character sequence actually *be*
  [depending on which spec you chose] a URL, a URI reference, or 
  an IRI reference. Those standards do not define how to resolve &
  dereference other types of resource identifiers, be they character
  sequences or otherwise.

Assumption 2:
  The aforementioned standards unambiguously define the syntax to which a
  resource-identifying character sequence must conform in order to be
  considered a URL, a URI reference, or an IRI reference. The standards
  do not define how character sequences that do not conform to the syntax
  can be processed (but they do not forbid such processing; they just say
  that they aren't applicable to those situations).

Assumption 3:
  When an argument is given to an RFC 1808-era URL resolution function
  that is documented as requiring that the argument be [an object that
  represents] a 'URL', then the caller implicitly asserts that whatever
  object passed indeed represents a URL.

Assumption 4:
  The object passed into the function, of course, is going to manifest
  relatively concretely, as, say, a Python str or unicode object, so
  the function, if it intends to perform standards-conformant resolution,
  must behave as if it has interpreted the object as a resource-identifying
  sequence of abstract characters, and must verify somehow that the sequence
  adheres to the syntax requirements of a URL / URI ref / IRI ref. This
  verification can either be an explicit syntax check, or can be a feature
  of the conversion of the object as resource-identifying characters.

In either case, we need to define the mechanics of that conversion. This 
is what I am attempting to unambiguously do for str and unicode arguments
by saying how each item in a str or unicode object maps to the characters
that are going to be treated as a URL/URI ref.

It is true that we are under no obligation in our API to assume a one-to-one 
mapping between the characters in a unicode argument and the characters in the 
resource-identifying string that, in turn, may or may not be a URL, but to do 
otherwise seems a bit unintuitive, to me. You seem to be suggesting that a 
one-to-one mapping be assumed until a syntax error is found. Then, if the 
syntax error is of a certain type (like the character is > U+007F), then you 
seem to be saying that you want some kind of cleanup to be performed in order 
to ensure that the resulting string is conformant to the URL syntax.

I feel that since urllib is under no obligation to assume anything about what 
the syntax-violating characters are intended to mean, it would be within its 
rights to reject the argument altogether, and I would rather see it do that 
than try to guess what the user intended -- especially in this domain, where 
such guesses, if wrong, only lead developers to be even more confused about 
topics that are already barely understood as it is.

For example, some specs (HTML, XHTML, XSLT) suggest that processors of those 
types of documents perform UTF-8 based percent-encoding of any non-ASCII 
characters that mistakenly appear in attribute values that are normally 
supposed to contain URI references (hrefs and the like). Users who rely on 
this then wonder why many widely-deployed HTTP servers/CGI/PHP apps, etc. -- 
the ones that assume %-encoded octets in the Request-URI are iso-8859-1 based 
-- misinterpret the characters. To me, convenience afforded by the automatic
percent-encoding is outweighed by the harm introduced by the wrong guesses
and the reinforcement of the belief in the document author or developer that
a URI reference is whatever string of characters they want it to be.
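
The mismatch described above can be sketched as follows. This uses
urllib.parse from present-day Python 3, which postdates this thread, and the
sample string is only an illustration: a client percent-encodes on a UTF-8
basis, and a receiver that assumes iso-8859-1 recovers different characters.

```python
from urllib.parse import quote, unquote

original = "l\u00f6wis"                        # 'loewis' spelled with o-umlaut
escaped = quote(original, encoding="utf-8")    # UTF-8 based percent-encoding

# A receiver that assumes UTF-8 gets the original text back ...
as_utf8 = unquote(escaped, encoding="utf-8")
# ... while one that assumes iso-8859-1 sees two characters per octet pair.
as_latin1 = unquote(escaped, encoding="iso-8859-1")
```

Here escaped is 'l%C3%B6wis', and the iso-8859-1 reading turns the two octets
into two separate characters instead of one.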

I have a feeling this is a matter of personal philosophy. I've never been a 
huge fan of the "be lenient in what you accept, strict in what you produce" 
mantra. URLs/URIs have a strict syntax, and IMHO we should enforce it so that 
developers can learn about and code to standards, rather than becoming reliant 
upon the crutch of lenient-yet-convenient APIs.

But if we are going to accept arbitrary strings and then attempt to make 'em 
fit the URL syntax, then we should, IMHO, acknowledge (in API documentation) 
that this is behavior provided for the sake of having a convenient API, and is 
not within the scope of the standards. Hopefully the marginal percentage of 
developers who actually read the API docs can then learn that 
u'http://m.v.l\xf6wis/' is not a URL, even if urllib happens to convert it to 
one, and in my perfect fantasy-world, they'd be less inclined to give us any
reason to make lenient APIs. Actually, in a perfect world I probably would
not be inclined to obsess over such things :)

-Mike
From martin at v.loewis.de  Fri Sep 17 08:07:30 2004
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri Sep 17 08:07:31 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was Re:
	urllib.urlopen...)
In-Reply-To: <200409170539.i8H5dBfF042148@chilled.skew.org>
References: <200409170539.i8H5dBfF042148@chilled.skew.org>
Message-ID: <414A7F22.3070006@v.loewis.de>

Mike Brown wrote:
> It is true that we are under no obligation in our API to assume a one-to-one 
> mapping between the characters in a unicode argument and the characters in the 
> resource-identifying string that, in turn, may or may not be a URL, but to do 
> otherwise seems a bit unintuitive, to me.

Not at all. If the URI contains the sequence '%A0', does that constitute
one or three characters? You suggested earlier that the host part of an
URI could be UTF-8 encoded. In that case, a single character translates
into, say, 2 octets, which then get %-escaped, translating into 6 ASCII
characters. So a single Unicode character may end up in multiple ASCII
characters during processing.
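
A small sketch of that expansion, using urllib.parse.quote from present-day
Python 3 (which postdates this thread) and assuming UTF-8 as the basis:

```python
from urllib.parse import quote

ch = "\u00f6"                  # one Unicode character (o-umlaut)
octets = ch.encode("utf-8")    # two octets: C3 B6
escaped = quote(ch)            # quote() percent-encodes on a UTF-8 basis by default
```

One character becomes two octets, which become the six ASCII characters
'%C3%B6'.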

> You seem to be suggesting that a 
> one-to-one mapping be assumed until a syntax error is found. Then, if the 
> syntax error is of a certain type (like the character is > U+007F, then you 
> seem to be saying that you want some kind of cleanup to be performed in order 
> to ensure that the resulting string is conformant to the URL syntax.
 >
> I feel that since urllib is under no obligation to assume anything about what 
> the syntax-violating characters are intended to mean, it would be within its 
> rights to reject the argument altogether, and I would rather see it do that 
> than try to guess what the user intended -- especially in this domain, where 
> such guesses, if wrong, only lead developers to be even more confused about 
> topics that are already barely understood as it is.

Either is fine. It appears that the future URI RFC and the IRI RFC will
suggest that the "cleanup" is the right action, and that the
implementation should indeed process the string.

> To me, convenience afforded by the automatic
> percent-encoding is outweighed by the harm introduced by the wrong guesses
> and the reinforcement of the belief in the document author or developer that
> a URI reference is whatever string of characters they want it to be.

I agree. However, I hope that the IRI RFC will resolve the issue for
good, at least when the input is a Python Unicode string. When the input
is a Python byte string, it seems natural to %-escape the non-ASCII
bytes.

> But if we are going to accept arbitrary strings and then attempt to make 'em 
> fit the URL syntax, then we should, IMHO, acknowledge (in API documentation) 
> that this is behavior provided for the sake of having a convenient API, and is 
> not within the scope of the standards. Hopefully the marginal percentage of 
> developers who actually read the API docs can then learn that 
> u'http://m.v.l\xd6wis/' is not a URL, even if urllib happens to convert it to 
> one, and in my perfect fantasy-world, they'd be less inclined to give us any
> reason to make lenient APIs. 

But it is an IRI reference, isn't it? I think urllib then should process
it as such.

Regards,
Martin
From mike at skew.org  Fri Sep 17 09:54:21 2004
From: mike at skew.org (Mike Brown)
Date: Fri Sep 17 09:54:21 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was Re:
	urllib.urlopen...)
In-Reply-To: <414A07F4.9000905@v.loewis.de>
	=?UTF-8?Q?from_=22Martin_v=2E_L=C3=B6wis=22_at_Sep_16=2C_2004_11=3A39=3A00_p?=
	=?UTF-8?Q?m?=
Message-ID: <200409170754.i8H7sLWr042680@chilled.skew.org>

"Martin v. Löwis" wrote:
> >     If RFC 1808 applies (the current implementation is based largely
> >     on this spec, which did not clearly distinguish between a reference
> >     and a URI), it is what is defined in the grammar as a URL, and
> >     if it is relative (relativeURL in the grammar), it is considered
> >     to be relative to a default base URL.
> 
> This is troublesome. What is a meaningful base URL? This should be 
> mentioned prominently.

In effect, this is what happens in the current implementation, but I don't 
think it was ever anyone's intent to think of it in terms of standards-based 
resolution-to-absolute-form against a base URL, and in any event, it's not 
as well-documented as it should be.

User expectation in most contexts, even when it doesn't apply (as in the most 
prominent use of relative references: HTML/XML document processing) is that 
relative references are relative to a base having something to do with the 
current working directory of the URL processor. Wrong as it often is to make 
such an assumption, in the case of urlopen() we have no context that would 
define a base URL. The documented precedent is that the 'file' scheme is 
assumed, and the implementation, IIRC, is such that the relative path is run 
through url2pathname which does very little to it, and it is then passed right 
to open(), so in effect the current working directory is assumed.
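
Expressed in standards terms, this behavior amounts to resolving the relative
reference against a 'file' base URI derived from the current working
directory. A rough sketch in present-day Python 3 (pathlib and urllib.parse
postdate this thread; the relative path 'data/input.txt' is made up for
illustration):

```python
from pathlib import Path
from urllib.parse import urljoin

# Assumption: the default base is the current working directory, written
# as a 'file' URI with a trailing slash so it is treated as a directory.
base = Path.cwd().as_uri()
if not base.endswith("/"):
    base += "/"

# Resolve a relative reference against that base.
resolved = urljoin(base, "data/input.txt")
```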

For the sake of having a sane policy going forward, I would rather see the 
behavior expressed in terms that would be governed by standards, which is what 
I attempted to do. Luckily, the behavior is such that it is possible.

There is an issue though: if disallowed/non-ASCII characters or bytes are in 
the urlopen() argument, and it's a relative URL, then right now the 
implementation is (I think) such that those characters or bytes pass through 
unchanged to the open() call. So if we do anything to these characters/bytes 
beforehand, such as %-encoding them as I think you were suggesting (see 
previous email), then for compatibility we'd have to specify that we're 
%-decoding them again in a way that results in the original characters/bytes 
being passed to open().

> >     (This mostly describes current behavior, assuming we can reach
> >     agreement that the "C:" in the example above should be treated
> >     no differently than "C|").
> 
> I have no problem with that. There are no one-letter URL schemata,
> are there?

There aren't, although in principle I wish the API weren't lenient;
people would quickly learn that C:\x\y\z is not a URL and C:/x/y/z is
only allowed by the standards to be interpreted in one way: the one
they probably don't want, and what they really need to do is learn to
use file:///blahblahblah.
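
For illustration, this is how a generic parser reads such strings. The sketch
uses urllib.parse.urlsplit from present-day Python 3, which postdates this
thread; the paths are the ones from the paragraph above.

```python
from urllib.parse import urlsplit

# Read strictly, "C:" is a one-letter scheme and the rest is the path.
parts = urlsplit("C:/x/y/z")

# The form that actually names a local file.
proper = urlsplit("file:///C:/x/y/z")
```

parts.scheme comes out as 'c' with path '/x/y/z', while the file: form yields
scheme 'file', an empty authority, and path '/C:/x/y/z'.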


In 4Suite's Ft.Lib.Uri we needed to conduct strictly conformant processing of 
URI references in our DOM, XPath, XSLT, and HTTP implementations. I found that 
we could hardly use urllib for anything of this sort without a great deal 
of working around / closing up the holes opened by all these 'conveniences'.

Tightening up the conformance issues meant that we needed to help users 
produce valid URIs from filesystem paths and vice-versa. Once again, the core 
Python libs were of little use -- pathname2url and url2pathname are 
platform-dependent, and are so full of bugs^H^H^H^Hfeatures that I had to 
start from scratch and roll my own functions. I think what I've got at this 
point would make great additions to urllib2, but I'll save them for another 
day...

At least with all the "OKs" you've given so far, I can submit a patch or three 
to get some of the documentation updated.

> > I must attend to other things right now; will comment on the other issues 
> > later.
> 
> Take your time. This has been sitting around for many releases - one
> more or less doesn't matter much in the global flow of things :-)

Heh, agreed. I wish rfc2396bis and IRIs would hurry on through the IETF's 
machinery. I've only been actively paying attention to the former, but they
both have a lot going for them.
From mike at skew.org  Fri Sep 17 11:02:39 2004
From: mike at skew.org (Mike Brown)
Date: Fri Sep 17 11:02:40 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was Re:
	urllib.urlopen...)
In-Reply-To: <414A7F22.3070006@v.loewis.de>
	=?UTF-8?Q?from_=22Martin_v=2E_L=C3=B6wis=22_at_Sep_17=2C_2004_08=3A07=3A30_a?=
	=?UTF-8?Q?m?=
Message-ID: <200409170902.i8H92dgF042997@chilled.skew.org>

"Martin v. Löwis" wrote:
> > It is true that we are under no obligation in our API to assume a one-to-one 
> > mapping between the characters in a unicode argument and the characters in the 
> > resource-identifying string that, in turn, may or may not be a URL, but to do 
> > otherwise seems a bit unintuitive, to me.
> 
> Not at all. If the URI contains the sequence '%A0',
> does that constitute one or three characters?

Yes, it does. :)

I think I've got this right:

%A0 in a URI is three characters in the URI. Together they are representing 
one octet (byte A0) in much the same way that the 6 characters &#232; 
represent a single small-e-with-grave character in ISO/IEC 10646-based markup 
languages.

If the sequence were %00-%7F, then the octet represented by that sequence 
would in turn represent a single character in the ASCII range, and you would 
be allowed to use equivalence rules and knowledge of the syntax in order to 
ascertain whether the sequence is interchangeable with the raw character at 
that position in the URI.

But since in this example it is %80-%FF, the octet represented by the sequence 
does not automatically represent a character; it represents, at best, a 
scheme- or implementation-defined code unit which may or may not be an encoded 
character or portion thereof.
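
A short sketch of the distinction, using urllib.parse from present-day
Python 3 (which postdates this thread). The choice of '%7E' as the ASCII-range
example is mine.

```python
from urllib.parse import unquote, unquote_to_bytes

seq = "%A0"                       # three characters in the URI
octet = unquote_to_bytes(seq)     # ... representing a single octet, A0

# An escape in the %00-%7F range maps to exactly one ASCII character.
ascii_char = unquote("%7E")

# An octet in the 80-FF range is not a character by itself; for example,
# a lone A0 is not a valid UTF-8 sequence.
try:
    octet.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```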

> You suggested earlier that the host part of an
> URI could be UTF-8 encoded. In that case, a single character translates
> into, say, 2 octets, which then get %-escaped, translating into 6 ASCII
> characters. So a single Unicode character may end up in multiple ASCII
> characters during processing.

That sounds right, but I think I need an example to understand where the 
disagreement is. It's not a URI at the point where it contains a non-ASCII
character.

Theoretical resolution procedure of argument u'http://m.v.l\xf6wis/':

  arg         u'http://m.v.l\xf6wis/'
  => IRI ref  u'http://m.v.l\xf6wis/'
  => URI ref  u'http://m.v.l%C3%B6wis/'

and likewise, just for example,

  arg         u'http://m.v.l%C3%B6wis/'
  => IRI ref  u'http://m.v.l%C3%B6wis/'
  => URI ref  u'http://m.v.l%C3%B6wis/'
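
The IRI-to-URI step in the two examples above could look roughly like this.
It is a simplified sketch in present-day Python 3; iri_to_uri is a
hypothetical helper, not an existing urllib function, and it only
percent-encodes non-ASCII characters on a UTF-8 basis (a full IRI-to-URI
conversion has more rules than this).

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode each non-ASCII character as its UTF-8 octets."""
    out = []
    for ch in iri:
        if ord(ch) < 128:
            # ASCII characters, including existing %XX escapes, pass through.
            out.append(ch)
        else:
            out.extend("%%%02X" % b for b in ch.encode("utf-8"))
    return "".join(out)


from_iri = iri_to_uri("http://m.v.l\u00f6wis/")
already_uri = iri_to_uri("http://m.v.l%C3%B6wis/")
```

Both calls produce 'http://m.v.l%C3%B6wis/', matching the two cases shown.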


In any event, the argument has become the URI reference
u'http://m.v.l%C3%B6wis/' (which we don't need to necessarily store
as unicode, but I prefer to write it as such for clarity):

  1. Resolve to absolute form (necessary even with absolute refs
     in order to eliminate dot segments in the path; the rfc2396bis
     algorithm is preferable to the buggy ones in older specs for this).

     The base URI will be based on os.getcwd(). We'll say cwd is
     '/home/mike/test' to keep it simple. Base URI then is
     u'file:///home/mike/test'. Resolution to absolute form results
     in, in this case, no change: the URI represented by the URI ref
     is the same as the ref itself: u'http://m.v.l%C3%B6wis/'.

  2. URI is decomposed into its components:
       scheme: u'http'
       authority: u'm.v.l%C3%B6wis'
       path: u'/'
       query: undefined
       fragment: undefined

  3. Fragment, if any, is stripped prior to dereference, per specs.

  4. For http scheme, authority is split into:
       user: undefined
       pass: undefined
       host: u'm.v.l%C3%B6wis'
       port: u'80' (default)

  5. host is percent-decoded with a UTF-8 basis:
       host: u'm.v.l\xf6wis'

  6. socket object is obtained for host
     u'm.v.l\xf6wis' and port 80 (int);
     socket module applies IDNA encoding and does DNS lookup of
     'm.v.xn--lwis-5qa', connects to corresponding IP address on port 80

  7. properly formatted HTTP request message (a byte string)
     is sent for Request-URI '/' with Host header 'Host: m.v.xn--lwis-5qa'
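
The steps above can be sketched with urllib.parse and the 'idna' codec from
present-day Python 3 (this API layout postdates the thread). The sketch stops
short of opening a socket, and it treats the whole authority as the host,
which assumes no userinfo or port is present.

```python
from urllib.parse import urldefrag, urljoin, urlsplit, unquote

base = "file:///home/mike/test"               # from the example cwd
ref = "http://m.v.l%C3%B6wis/"

uri = urljoin(base, ref)                      # step 1: resolve to absolute form
uri, _fragment = urldefrag(uri)               # step 3: strip any fragment
parts = urlsplit(uri)                         # step 2: decompose

# Step 4: assumption: the authority holds only a host here.
port = parts.port or 80
# Step 5: percent-decode the host on a UTF-8 basis.
host = unquote(parts.netloc, encoding="utf-8")
# Step 6: IDNA-encode the host name for the DNS lookup.
dns_name = host.encode("idna")
# Step 7: the Host header carries the IDNA form.
host_header = "Host: " + dns_name.decode("ascii")
```

With the example input, dns_name is b'm.v.xn--lwis-5qa', agreeing with the
name given in step 6.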


If the initial argument were a byte string, I agree that any non-ASCIIs
should be percent-encoded directly. Processing would then be conducted
exactly as above.
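
For the byte-string case, a minimal sketch using urllib.parse.quote from
present-day Python 3 (which postdates this thread). The input bytes are an
assumed iso-8859-1 spelling of the same example.

```python
from urllib.parse import quote

raw = b"http://m.v.l\xf6wis/"       # byte string containing one non-ASCII octet
escaped = quote(raw, safe=":/")     # each non-ASCII byte becomes %XX directly
```

This gives 'http://m.v.l%F6wis/'. Note that the octet F6 on its own is not
valid UTF-8, so a later UTF-8-based decode of the host would need some error
policy for input like this.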

-Mike
From mike at skew.org  Fri Sep 17 11:04:27 2004
From: mike at skew.org (Mike Brown)
Date: Fri Sep 17 11:04:25 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was Re:
	urllib.urlopen...)
In-Reply-To: <200409170902.i8H92dgF042997@chilled.skew.org> "from Mike Brown
	at Sep 17, 2004 03:02:39 am"
Message-ID: <200409170904.i8H94RUC043035@chilled.skew.org>

I wrote:
> %A0 in a URI is three characters in the URI. Together they are representing 
> one octet (byte A0) in much the same way that the 6 characters &#232; 
> represent a single small-e-with-grave character in ISO/IEC 10646-based markup 
> languages.

Actually I suppose it's not 'much the same way' since &#232; does not at any
point represent bytes, but you know what I mean, I think.
From FBatista at uniFON.com.ar  Fri Sep 17 16:10:21 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Fri Sep 17 16:14:48 2004
Subject: [Python-Dev] Decimal, copyright and license
Message-ID: <A128D751272CD411BC9200508BC2194D053C7980@escpl.tcp.com.ar>

People:

I'm creating a decimal installer (for Py2.3 users), making tarball, .rpm and
.exe versions available.

What I don't know is what to put about license and copyright.

Regarding copyright, my first draft says:

    Copyright (c) 2004 Python Software Foundation.
    All rights reserved.
    
Regarding the license, I haven't put anything yet. Should I write something
like the following and include the file?

    See the file "LICENSE" for information on the history of this
    software, terms & conditions for usage, and a DISCLAIMER OF ALL
    WARRANTIES.

Remember that the "decimal installer" will be available for download not in
a Python location.

Thanks!

Facundo Batista
Desarrollo de Red
fbatista@unifon.com.ar
(54 11) 5130-4643
Cel: 15 5097 5024


From skip at pobox.com  Fri Sep 17 16:20:51 2004
From: skip at pobox.com (Skip Montanaro)
Date: Fri Sep 17 16:20:59 2004
Subject: [Python-Dev] Re: Weekly Python Patch/Bug Summary
In-Reply-To: <200409170414.i8H48pF8022459@h006008a7bda6.ne.client2.attbi.com>
References: <200409170414.i8H48pF8022459@h006008a7bda6.ne.client2.attbi.com>
Message-ID: <16714.62147.810666.375480@montanaro.dyndns.org>


    Kurt> Patch / Bug Summary
    Kurt> ___________________

    Kurt> Patches :  241 open ( -6) /  2622 closed (+26) /  2863 total (+20)
    Kurt> Bugs    :  764 open ( +6) /  4453 closed (+38) /  5217 total (+44)
    Kurt> RFE     :  150 open ( +2) /   131 closed ( +0) /   281 total ( +2)

Let me take the opportunity to thank Kurt for providing this excellent
summary (much better than my original hack) and invite the larger Python
community to participate in Python's development by reviewing patches and
bug reports.  If you're new to Python development, I urge you to read

    http://www.python.org/dev/dev_intro.html

especially the "Helping Out" section.

Skip
From symbiont+py at berlios.de  Fri Sep 17 17:16:18 2004
From: symbiont+py at berlios.de (Jeff Pitman)
Date: Fri Sep 17 17:18:33 2004
Subject: [Python-Dev] Decimal, copyright and license
In-Reply-To: <A128D751272CD411BC9200508BC2194D053C7980@escpl.tcp.com.ar>
References: <A128D751272CD411BC9200508BC2194D053C7980@escpl.tcp.com.ar>
Message-ID: <200409172316.18261.symbiont+py@berlios.de>

On Friday 17 September 2004 22:10, Batista, Facundo wrote:
> Remember that the "decimal installer" will be available for download
> not in a Python location.

For the ignorant (me): what's a "decimal installer"?

thanks,
-- 
-jeff
From FBatista at uniFON.com.ar  Fri Sep 17 17:50:16 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Fri Sep 17 17:54:50 2004
Subject: [Python-Dev] Decimal, copyright and license
Message-ID: <A128D751272CD411BC9200508BC2194D053C7981@escpl.tcp.com.ar>

#- > Remember that the "decimal installer" will be available
#- for download
#- > not in a Python location.
#-
#- For the ignorant (me): what's a "decimal installer"?

An installer which puts the decimal module in site-packages.

Sorry for the before-turn-on-neurons-written mail.

.	Facundo
From fdrake at acm.org  Fri Sep 17 19:09:47 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri Sep 17 19:10:04 2004
Subject: [Python-Dev] Planning to drop gzip compression for future releases.
Message-ID: <200409171309.48011.fdrake@acm.org>

At this point, I'm planning to drop the gzip-compressed archives for all 
future Python releases.  The bzip2 archives are much smaller (saving 
bandwidth, disk space, and download time), and supporting software seems to 
have become widely available in both free and commercial tools.

I'm still planning to make ZIP archives available.  If anyone would like to 
argue that I should drop that as well, feel free.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From skip at pobox.com  Fri Sep 17 19:43:19 2004
From: skip at pobox.com (Skip Montanaro)
Date: Fri Sep 17 19:43:38 2004
Subject: [Python-Dev] Decimal, copyright and license
In-Reply-To: <200409172316.18261.symbiont+py@berlios.de>
References: <A128D751272CD411BC9200508BC2194D053C7980@escpl.tcp.com.ar>
	<200409172316.18261.symbiont+py@berlios.de>
Message-ID: <16715.8759.13545.439652@montanaro.dyndns.org>


    Jeff> On Friday 17 September 2004 22:10, Batista, Facundo wrote:
    >> Remember that the "decimal installer" will be available for download
    >> not in a Python location.

    Jeff> For the ignorant (me): what's a "decimal installer"?

It's sort of like an impressionist's tattoo machine.  After using it you have
lots of little dots all over. ;-)

More seriously, I suspect it's an installer for the new Decimal class for
use with older versions of Python.

Skip
From jcarlson at uci.edu  Fri Sep 17 20:00:16 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Sep 17 20:06:54 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <200409171309.48011.fdrake@acm.org>
References: <200409171309.48011.fdrake@acm.org>
Message-ID: <20040917105329.F123.JCARLSON@uci.edu>

> At this point, I'm planning to drop the gzip-compressed archives for all 
> future Python releases.  The bzip2 archives are much smaller (saving 
> bandwidth, disk space, and download time), and supporting software seems to 
> have become widely available in both free and commercial tools.

Sounds good.  When are we going to start offering a bzip2 library in
Python?

> I'm still planning to make ZIP archives available.  If anyone would like to 
> argue that I should drop that as well, feel free.  ;-)

Zip has been the de-facto standard for compression in the windows world
for around 10 years.  While other formats are making inroads (rar, ace,
bzip2, etc.), they are not supported by the most popular windows
archiver, WinZip: http://www.download.com/sort/3150-2250-0-1-4.html?

When the most popular compression tool for Windows starts offering bzip2
compression, then it seems like a good idea to toss the zip file format.

 - Josiah

From lalo at laranja.org  Fri Sep 17 20:14:09 2004
From: lalo at laranja.org (Lalo Martins)
Date: Fri Sep 17 20:17:15 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <200409171309.48011.fdrake@acm.org>
References: <200409171309.48011.fdrake@acm.org>
Message-ID: <20040917181409.GN21135@laranja.org>

On Fri, Sep 17, 2004 at 01:09:47PM -0400, Fred L. Drake, Jr. wrote:
> I'm still planning to make ZIP archives available.  If anyone would like to 
> argue that I should drop that as well, feel free.  ;-)

1. the main archive software packages for all OSes support
tar.bz2 in their current releases.  (This includes WinZip,
WinRAR and whatnot.)

2. if you can't be bothered to know what is a tar.bz2 and how to
open it, you won't be getting the ZIP, but rather the EXE installer.

[]s,
                                               |alo
                                               +----
--
            Those who trade freedom for security
               lose both and deserve neither.
--
http://www.laranja.org/                mailto:lalo@laranja.org
 pgp key: http://garfield.laranja.org/~lalo/gpgkey-signed.asc

GNU: never give up freedom                 http://www.gnu.org/
From jcarlson at uci.edu  Fri Sep 17 20:27:08 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Sep 17 20:33:55 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040917105329.F123.JCARLSON@uci.edu>
References: <200409171309.48011.fdrake@acm.org>
	<20040917105329.F123.JCARLSON@uci.edu>
Message-ID: <20040917112644.F126.JCARLSON@uci.edu>

> Sounds good.  When are we going to start offering a bzip2 library in
> Python?

Nevermind, it is already there.
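
For readers following along, a minimal sketch of the bz2 module mentioned here, written for current Python 3 rather than the 2.3/2.4 releases under discussion. The sample payload is made up, and the printed sizes depend entirely on the input, so no particular ratio should be read into them.

```python
import bz2
import zlib

# Hypothetical sample payload; real source tarballs will compress differently.
data = b"def spam(eggs):\n    return eggs * 2\n" * 2000

bz2_blob = bz2.compress(data)
deflate_blob = zlib.compress(data, 9)  # DEFLATE, the algorithm gzip uses

# Both round-trip back to the original bytes.
restored_bz2 = bz2.decompress(bz2_blob)
restored_deflate = zlib.decompress(deflate_blob)

print(len(data), len(deflate_blob), len(bz2_blob))
```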

 - Josiah

From tim.peters at gmail.com  Fri Sep 17 20:35:41 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Fri Sep 17 20:35:57 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040917181409.GN21135@laranja.org>
References: <200409171309.48011.fdrake@acm.org>
	<20040917181409.GN21135@laranja.org>
Message-ID: <1f7befae04091711353b5538a8@mail.gmail.com>

[Lalo Martins]
> 1. the main archive software packages for all OSes support
> tar.bz2 in their current releases.  (This includes WinZip,
> WinRAR and whatnot.)

WinZip 9.0 SR-1 (which is the current release) does not support bz2.
From mike at skew.org  Fri Sep 17 22:37:53 2004
From: mike at skew.org (Mike Brown)
Date: Fri Sep 17 22:38:00 2004
Subject: [Python-Dev] Re: URL processing conformance and principles
In-Reply-To: <200409170902.i8H92dgF042997@chilled.skew.org>
References: <200409170902.i8H92dgF042997@chilled.skew.org>
Message-ID: <414B4B21.2060004@skew.org>

Oops, found another little mistake in my last email:
>      The base URI will be based on os.getcwd(). We'll say cwd is
>      '/home/mike/test' to keep it simple. Base URI then is
>      u'file:///home/mike/test'.

I meant to say u'file:///home/mike/test/' (with trailing slash). Even 
though the filesystem does not care, the resolution-to-absolute-form 
algorithm does.
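
A small illustration of why the trailing slash matters, using urljoin from the Python 3 standard library (in 2004 this function lived in the urlparse module). The relative reference "doc.xml" is a made-up name for the example.

```python
from urllib.parse import urljoin

# Without the trailing slash, "test" is treated as a file name and replaced.
without_slash = urljoin("file:///home/mike/test", "doc.xml")

# With the trailing slash, "test/" is treated as a directory and kept.
with_slash = urljoin("file:///home/mike/test/", "doc.xml")

print(without_slash)  # file:///home/mike/doc.xml
print(with_slash)     # file:///home/mike/test/doc.xml
```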
From Scott.Daniels at Acm.Org  Sat Sep 18 00:33:38 2004
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sat Sep 18 00:32:30 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040917181409.GN21135@laranja.org>
References: <200409171309.48011.fdrake@acm.org>
	<20040917181409.GN21135@laranja.org>
Message-ID: <cifolq$iof$1@sea.gmane.org>

Lalo Martins wrote:
> On Fri, Sep 17, 2004 at 01:09:47PM -0400, Fred L. Drake, Jr. wrote:
>>I'm still planning to make ZIP archives available.  If anyone would like to 
>>argue that I should drop that as well, feel free.  ;-)
> 
> 1. the main archive software packages for all OSes support
> tar.bz2 in their current releases.  (This includes WinZip,
> WinRAR and whatnot.)
> 
> 2. if you can't be bothered to know what is a tar.bz2 and how to
> open it, you won't be getting the ZIP, but rather the EXE installer.

.zip is the only one of these 3 formats that allows you to decompress a
few files without expanding the entire archive.  This feature is useful
to me at least (and makes up for the larger size).
-- 
-- Scott David Daniels
Scott.Daniels@Acm.Org

From python at discworld.dyndns.org  Sat Sep 18 00:46:15 2004
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Sat Sep 18 00:39:11 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <cifolq$iof$1@sea.gmane.org>
References: <200409171309.48011.fdrake@acm.org>
	<20040917181409.GN21135@laranja.org> <cifolq$iof$1@sea.gmane.org>
Message-ID: <20040917224615.GC8367@discworld.dyndns.org>

Scott David Daniels <Scott.Daniels@Acm.Org> wrote:
> 
> .zip is the only one of these 3 formats that allows you to decompress a
> few files without expanding the entire archive.  This feature is useful
> to me at least (and makes up for the larger size).

tar supports that as well, and with better compression when paired with bzip2.
Hint:  tar xzf archive [file] [...]

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python@discworld.dyndns.org>
GPL'ed software available at:     http://www.qcc.ca/~charlesc/software/
-----------------------------------------------------------------------
From nbastin at opnet.com  Sat Sep 18 00:54:44 2004
From: nbastin at opnet.com (Nick Bastin)
Date: Sat Sep 18 00:55:12 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040917181409.GN21135@laranja.org>
References: <200409171309.48011.fdrake@acm.org>
	<20040917181409.GN21135@laranja.org>
Message-ID: <91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>


On Sep 17, 2004, at 2:14 PM, Lalo Martins wrote:

> On Fri, Sep 17, 2004 at 01:09:47PM -0400, Fred L. Drake, Jr. wrote:
>> I'm still planning to make ZIP archives available.  If anyone would 
>> like to
>> argue that I should drop that as well, feel free.  ;-)
>
> 1. the main archive software packages for all OSes support
> tar.bz2 in their current releases.  (This includes WinZip,
> WinRAR and whatnot.)

If we're only talking binary releases, then I don't really care, but 
please don't make this change for the source releases.  There are 
several platforms on which Python is supported which do not support 
bzip2 out of the box (Solaris, as a prime example).  It adds just that 
much more heartache to get python installed on such a system.

--
Nick

From Scott.Daniels at Acm.Org  Sat Sep 18 01:51:18 2004
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sat Sep 18 01:50:13 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040917224615.GC8367@discworld.dyndns.org>
References: <200409171309.48011.fdrake@acm.org>	<20040917181409.GN21135@laranja.org>
	<cifolq$iof$1@sea.gmane.org>
	<20040917224615.GC8367@discworld.dyndns.org>
Message-ID: <cift7c$rnm$1@sea.gmane.org>

Charles Cazabon wrote:
> Scott David Daniels <Scott.Daniels@Acm.Org> wrote:
> 
>>.zip is the only one of these 3 formats that allows you to decompress a
>>few files without expanding the entire archive.  This feature is useful
>>to me at least (and makes up for the larger size).
> 
> tar supports that as well, and with better compression when paired with bzip2.
> Hint:  tar xzf archive [file] [...]

Right, but the only way it can extract the last file of the tar archive is
to expand the entire archive (in order to determine the bytes at the end
of the archive).  .zip looks in the directory for the file, reads the
bytes representing the compressed file (and only that file), and uses
them to expand the file to its original version.
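
The difference can be sketched with the standard zipfile and tarfile modules (Python 3 syntax). The member names and contents below are invented for the example, and both archives are built in memory only to keep the sketch self-contained.

```python
import io
import tarfile
import zipfile

# Hypothetical archive contents.
files = {"a.txt": b"first", "b.txt": b"second", "last.txt": b"the last member"}

# Build a .zip in memory.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
    for name, payload in files.items():
        zf.writestr(name, payload)

# Build a .tar.bz2 in memory.
tar_buf = io.BytesIO()
with tarfile.open(fileobj=tar_buf, mode="w:bz2") as tf:
    for name, payload in files.items():
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))

# zip: the central directory locates the member, and only that member's
# compressed bytes are read and decompressed.
zip_buf.seek(0)
with zipfile.ZipFile(zip_buf) as zf:
    zip_last = zf.read("last.txt")

# tar.bz2: there is no index, so the compressed stream is decompressed
# from the start until the requested member's header is found.
tar_buf.seek(0)
with tarfile.open(fileobj=tar_buf, mode="r:bz2") as tf:
    tar_last = tf.extractfile("last.txt").read()

print(zip_last == tar_last)
```

Both calls return the same bytes; the difference is only in how much of the archive has to be read and decompressed to get there.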

> 
> Charles


-- 
-- Scott David Daniels
Scott.Daniels@Acm.Org

From fredrik at pythonware.com  Sat Sep 18 09:10:02 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 18 09:08:07 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for
	futurereleases.
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>
	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>
Message-ID: <cigmsj$4fk$1@sea.gmane.org>

Nick Bastin wrote:

> If we're only talking binary releases, then I don't really care, but please don't make this change 
> for the source releases.  There are several platforms on which Python is supported which do not 
> support bzip2 out of the box (Solaris, as a prime example).  It adds just that much more heartache 
> to get python installed on such a system.

agreed.  it may come as a surprise to some people, but Linux
is not the only Unix system out there.  Python works extremely
well on non-Linux systems too...

</F> 


From martin at v.loewis.de  Sat Sep 18 10:42:18 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Sep 18 10:42:18 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression
	for	futurereleases.
In-Reply-To: <cigmsj$4fk$1@sea.gmane.org>
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>
	<cigmsj$4fk$1@sea.gmane.org>
Message-ID: <414BF4EA.4050600@v.loewis.de>

Fredrik Lundh wrote:
> agreed.  it may come as a surprise to some people, but Linux
> is not the only Unix system out there.  Python works extremely
> well on non-Linux systems too...

But then, a Unix system does not have gzip, either. So we probably
should use compress(1), or, better yet, distribute uncompressed
tar files. Perhaps we should use cpio instead, or pax, because
we need to avoid GNU tar extensions. Maybe IP isn't available, either,
so we should ship QIC tapes.

On Solaris, bzip2 is in the SUNWbzipS package, and installs into
/usr/bin.

Regards,
Martin

P.S. Just found this on compress(1) of Solaris 9:

NOTES
      Although compressed files are  compatible  between  machines
      with large memory, -b 12 should be used for file transfer to
      architectures with a  small  process  data  space  (64KB  or
      less).

Solaris 9 requires a 512MB swap partition for installation, and the
installer makes heavy use of Java...
From fredrik at pythonware.com  Sat Sep 18 13:00:18 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 18 12:58:28 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><cigmsj$4fk$1@sea.gmane.org>
	<414BF4EA.4050600@v.loewis.de>
Message-ID: <cih4cb$sr2$1@sea.gmane.org>

Martin v. Löwis wrote:

>> agreed.  it may come as a surprise to some people, but Linux
>> is not the only Unix system out there.  Python works extremely
>> well on non-Linux systems too...
>
> But then, a Unix system does not have gzip, either.

Of the build systems I checked, all had gunzip, most had unzip, but
only the Linux systems had bunzip2.

The bzip2 homepage contains 1.0.2 binaries for exactly three plat-
forms, compared to over 20 systems for gzip and 30 systems for
unzip.  I suppose older bzip2 versions (0.9.5) are compatible, but
someone should verify that they work before you pull the gzip
archives.

> Maybe IP isn't available, either, so we should ship QIC tapes.

That's a really helpful comment.

</F> 


From martin at v.loewis.de  Sat Sep 18 13:27:02 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Sep 18 13:27:02 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <cih4cb$sr2$1@sea.gmane.org>
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><cigmsj$4fk$1@sea.gmane.org>	<414BF4EA.4050600@v.loewis.de>
	<cih4cb$sr2$1@sea.gmane.org>
Message-ID: <414C1B86.7070302@v.loewis.de>

Fredrik Lundh wrote:
> Of the build systems I checked, all had gunzip, most had unzip, but
> only the Linux systems had bunzip2.

Sure, there are systems that don't have bunzip2 installed. However,
what is the problem of installing it? All you need is a C compiler,
and I'm sure you have one - how else are you going to install Python?

And if building bzip2 yourself is a problem for some reason I cannot
imagine, then what is the problem with using a prebuilt binary?

As I said, Solaris (at least Solaris 9) comes with bzip2. If you have
an older Solaris release, you can get a binary from sunfreeware.com.

For HP-UX, you can get it from the HP porting center, e.g.

http://hpux.asknet.de/hppd/hpux/Misc/bzip2-1.0.2/
(both PA-RISC and Itanium binaries, for 10.20, 11.00, 11.20,
  and 11.22)

For AIX, you can get it from http://www.bullfreeware.com/.

What other systems have you been looking at?

Regards,
Martin
From erik at heneryd.com  Sat Sep 18 14:07:54 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 18 14:08:00 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C1B86.7070302@v.loewis.de>
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><cigmsj$4fk$1@sea.gmane.org>	<414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org>
	<414C1B86.7070302@v.loewis.de>
Message-ID: <414C251A.80108@heneryd.com>

Martin v. Löwis wrote:
> Fredrik Lundh wrote:
> 
>> Of the build systems I checked, all had gunzip, most had unzip, but
>> only the Linux systems had bunzip2.
> 
> 
> Sure, there are systems that don't have bunzip2 installed. However,
> what is the problem of installing it? All you need is a C compiler,
> and I'm sure you have one - how else are you going to install Python?

Yes, those with older, bzip2less systems can probably figure out how to 
get it and build it, but why force them when it's practically no work 
keeping it?  It's one (sic) extra command for the release manager and 
~9M extra disk space per release on www.python.org.

And besides that... only GNU tar supports the j flag. <wink>


Erik
From fredrik at pythonware.com  Sat Sep 18 15:13:47 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Sep 18 15:20:31 2004
Subject: [Python-Dev] Re: Re: Planning to drop gzip compression for
	futurereleases.
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><cigmsj$4fk$1@sea.gmane.org>	<414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org><414C1B86.7070302@v.loewis.de>
	<414C251A.80108@heneryd.com>
Message-ID: <cihc6k$chp$1@sea.gmane.org>

Erik Heneryd wrote:

> It's one (sic) extra command for the release manager and  ~9M extra
> disk space per release on www.python.org.

but at 50 cents a gigabyte, and an endless stream of alphas and release
candidates, that might turn out to be rather expensive.

oh wait, you wrote megabytes, not gigabytes.

</F> 


From martin at v.loewis.de  Sat Sep 18 15:52:05 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat Sep 18 15:52:12 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C251A.80108@heneryd.com>
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><cigmsj$4fk$1@sea.gmane.org>	<414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org>
	<414C1B86.7070302@v.loewis.de> <414C251A.80108@heneryd.com>
Message-ID: <414C3D85.20706@v.loewis.de>

Erik Heneryd wrote:
> Yes, those with older, bzip2less systems can probably figure out how to 
> get it and build it, but why force them when it's practically no work 
> keeping it?  It's one (sic) extra command for the release manager and 
> ~9M extra disk space per release on www.python.org.

Fred wouldn't have asked if it was no effort in keeping it. There is
certainly more than one command to it - you have to md5sum the file,
and copy the md5sum into the release notes. You have to upload the file
from your workstation to python.org. I don't know how you do that, but
I need to use my DSL link for uploading the MSI files; it takes roughly
30min to upload. Fortunately, I have a DSL flatrate.
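
As an illustration of the checksum step only, here is a sketch that computes an MD5 digest of a file with hashlib (Python 3). The helper name md5_of_file and the temporary demo file are assumptions made for the example; this is not the release manager's actual procedure.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=64 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file standing in for a release archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_digest = md5_of_file(demo_path)
os.remove(demo_path)

print(demo_digest)
```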

Regards,
Martin
From erik at heneryd.com  Sat Sep 18 20:00:12 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 18 20:00:26 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C3D85.20706@v.loewis.de>
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><cigmsj$4fk$1@sea.gmane.org>	<414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org>
	<414C1B86.7070302@v.loewis.de> <414C251A.80108@heneryd.com>
	<414C3D85.20706@v.loewis.de>
Message-ID: <414C77AC.1070507@heneryd.com>

Martin v. Löwis wrote:
> Erik Heneryd wrote:
> 
>> Yes, those with older, bzip2less systems can probably figure out how 
>> to get it and build it, but why force them when it's practically no 
>> work keeping it?  It's one (sic) extra command for the release manager 
>> and ~9M extra disk space per release on www.python.org.
> 
> 
> Fred wouldn't have asked if it was no effort in keeping it. There is
> certainly more than one command to it - you have to md5sum the file,
> and copy the md5sum into the release notes. You have to upload the file
> from your workstation to python.org. I don't know how you do that, but
> I need to use my DSL link for uploading the MSI files; it takes roughly
> 30min to upload. Fortunately, I have a DSL flatrate.
> 
> Regards,
> Martin

Yeah, I was a bit hasty.  Sure, it's more than one command:

* pack it
* unpack it
* diff it against the known-to-be-good bzip2 tree
* md5sum it and add that to the release notes
* add another link on the download page
* ...something else?

but I still think my point stands - it's not that much work, really, and 
it'd be a nice service to those with bzip2less systems.  Regarding 
upload times I guess I'm just another spoiled Swede;  I've been on 
ethernet for so long I can barely remember what 5k/s was like...

That said, I do realise that it all adds up and that doing a release 
takes some work, so whether you decide to keep it or not: thanks.


Erik
From aahz at pythoncraft.com  Sat Sep 18 20:10:07 2004
From: aahz at pythoncraft.com (Aahz)
Date: Sat Sep 18 20:10:10 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <cih4cb$sr2$1@sea.gmane.org>
References: <414BF4EA.4050600@v.loewis.de> <cih4cb$sr2$1@sea.gmane.org>
Message-ID: <20040918181007.GA7132@panix.com>

On Sat, Sep 18, 2004, Fredrik Lundh wrote:
>
> Of the build systems I checked, all had gunzip, most had unzip, but
> only the Linux systems had bunzip2.
> 
> The bzip2 homepage contains 1.0.2 binaries for exactly three plat-
> forms, compared to over 20 systems for gzip and 30 systems for
> unzip.  I suppose older bzip2 versions (0.9.5) are compatible, but
> someone should verify that they work before you pull the gzip
> archives.

Granted that bz2-only isn't a viable option, what does gz give us over
bz2/zip that makes it worthwhile to keep?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From pythondev at bitfurnace.com  Sat Sep 18 20:43:03 2004
From: pythondev at bitfurnace.com (damien morton)
Date: Sat Sep 18 20:47:45 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040918181007.GA7132@panix.com>
References: <414BF4EA.4050600@v.loewis.de> <cih4cb$sr2$1@sea.gmane.org>
	<20040918181007.GA7132@panix.com>
Message-ID: <414C81B7.10901@bitfurnace.com>

An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20040918/eed68330/attachment.htm
From bac at OCF.Berkeley.EDU  Sat Sep 18 21:57:56 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sat Sep 18 21:58:04 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C81B7.10901@bitfurnace.com>
References: <414BF4EA.4050600@v.loewis.de>
	<cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com>
	<414C81B7.10901@bitfurnace.com>
Message-ID: <414C9344.5020501@ocf.berkeley.edu>

damien morton wrote:
> Umm, gzip compression is also one of the possible http compression 
> algorithms. bz2 isn't.
> 

What does HTTP compression have to do with whether we have a gzipped release of 
Python?

My personal take on all of this is that we make the release manager's job as 
simple as possible.  That means either ditch gzip files or ditch bzip2 files. 
If we stick with gzip we basically eat the bandwidth cost.  If we go with bzip2 
we need to link to where to get the source to compile, if not host a copy of 
the bzip2 source ourselves.  But either way I completely sympathize with the 
release managers and I am all for making people's lives easier at release time.

So I say we should go with bzip2.  While we might get our bandwidth for free 
thanks to the good graces of XS4ALL and Thomas, I don't think we should view it 
as infinite since they are still footing the bill.  If we can do something 
easily that would reduce their cost enough to buy Thomas a soda I think we 
should do it.  If that means some people need to go download some free 
software, then so be it.  Considering Python has practically no required tools 
beyond a C compiler we have rather low dependency requirements for UNIX in my eyes.

Hell, bzip2's source is less than the difference between 2.4's bzip2 source 
package compared to the gzip one.  We could have a copy of the latest bzip2 on 
our server for people to download and we would still save on bandwidth even 
when people need both Python and bzip2.

Plus, without starting a flame war, bzip2 is under a BSD license so it gets a 
gold star from me.  =)

-Brett
From erik at heneryd.com  Sat Sep 18 22:46:10 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 18 22:46:16 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C9344.5020501@ocf.berkeley.edu>
References: <414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com>	<414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu>
Message-ID: <414C9E92.2030803@heneryd.com>

Brett C. wrote:
> But either way I completely sympathize with the release managers and I 
> am all for making people's lives easier at release time.

Yep.  I suppose that's what this is all about.  Should we add 5 minutes 
of work for:

1) the release manager
2) the n (small integer) people with bzip2less systems

Think 1) is the way to go, at least for finals.  Oh, whatever, I don't 
even really care.  I'll shut up now.


Erik
From barry at python.org  Sat Sep 18 22:52:27 2004
From: barry at python.org (Barry Warsaw)
Date: Sat Sep 18 22:52:33 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C9344.5020501@ocf.berkeley.edu>
References: <414BF4EA.4050600@v.loewis.de> <cih4cb$sr2$1@sea.gmane.org>
	<20040918181007.GA7132@panix.com> <414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu>
Message-ID: <1095540746.29261.1.camel@geddy.wooz.org>

On Sat, 2004-09-18 at 15:57, Brett C. wrote:

> My personal take on all of this is that we make the release manager's job as 
> simple as possible.

Although if someone from the community wanted to volunteer to build tgz
files, that might go a long way toward keeping this option available. 
Disk space on python.org isn't (or shouldn't be) an issue.

-Barry

From phd at mail2.phd.pp.ru  Sat Sep 18 22:57:35 2004
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Sat Sep 18 22:57:45 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C9E92.2030803@heneryd.com>
References: <414BF4EA.4050600@v.loewis.de> <cih4cb$sr2$1@sea.gmane.org>
	<20040918181007.GA7132@panix.com> <414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu> <414C9E92.2030803@heneryd.com>
Message-ID: <20040918205735.GA24237@phd.pp.ru>

On Sat, Sep 18, 2004 at 10:46:10PM +0200, Erik Heneryd wrote:
> Yep.  I suppose that's what this is all about.  Should we add 5 minutes 
> of work for:
> 
> 1) the release manager

   Add 5 minutes for EVERY release.

> 2) the n (small integer) people with bzip2less systems

   Add 5 minutes to install bzip2 ONCE and forever.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.
From nhodgson at bigpond.net.au  Sat Sep 18 23:01:07 2004
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Sat Sep 18 23:01:14 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for
	futurereleases.
References: <414BF4EA.4050600@v.loewis.de> <cih4cb$sr2$1@sea.gmane.org>
	<20040918181007.GA7132@panix.com> <414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu> <414C9E92.2030803@heneryd.com>
Message-ID: <008601c49dc2$9e257e10$a44a8890@neil>

   Are there site statistics that show the current relative demand for .gz
versus .bz2?

   Neil

From gvanrossum at gmail.com  Sat Sep 18 23:19:21 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat Sep 18 23:19:30 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for
	futurereleases.
In-Reply-To: <008601c49dc2$9e257e10$a44a8890@neil>
References: <414BF4EA.4050600@v.loewis.de> <cih4cb$sr2$1@sea.gmane.org>
	<20040918181007.GA7132@panix.com> <414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu> <414C9E92.2030803@heneryd.com>
	<008601c49dc2$9e257e10$a44a8890@neil>
Message-ID: <ca471dc204091814194fe9212e@mail.gmail.com>

"When in doubt, don't pass."

If there was all around agreement to drop gzip, I'd say go for it. But
since there isn't, let's keep supporting it and test the waters again
in a year or two.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From erik at heneryd.com  Sat Sep 18 23:49:40 2004
From: erik at heneryd.com (Erik Heneryd)
Date: Sat Sep 18 23:49:44 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <1095540746.29261.1.camel@geddy.wooz.org>
References: <414BF4EA.4050600@v.loewis.de>
	<cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com>
	<414C81B7.10901@bitfurnace.com>	<414C9344.5020501@ocf.berkeley.edu>
	<1095540746.29261.1.camel@geddy.wooz.org>
Message-ID: <414CAD74.3030108@heneryd.com>

Barry Warsaw wrote:
> On Sat, 2004-09-18 at 15:57, Brett C. wrote:
> 
> 
>>My personal take on all of this is that we make the release manager's job as 
>>simple as possible.
> 
> 
> Although if someone from the community wanted to volunteer to build tgz
> files, that might go a long way toward keeping this option available. 
> Disk space on python.org isn't (or shouldn't be) an issue.

Sure.  I could build the tar.gz if given a tar.bz2/cvs pointer, though I 
personally think even the coordination overhead wouldn't make it 
worthwhile.  If nothing else, just to end this IMHO silly thread.
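
For what it's worth, the recompression itself is small. A sketch under stated assumptions: Python 3's bz2 and gzip modules, a helper name (bz2_to_gz) invented here, and a dummy payload in place of a real tar stream. It only repacks the bytes; it does not cover verification or upload.

```python
import bz2
import gzip
import os
import shutil
import tempfile


def bz2_to_gz(src_path, dst_path):
    """Stream-decompress a .bz2 file and recompress it as .gz."""
    with bz2.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


# Demo with a stand-in payload instead of a real tar.bz2 release.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "demo.tar.bz2")
dst = os.path.join(workdir, "demo.tar.gz")
payload = b"pretend this is a tar stream\n" * 100

with bz2.open(src, "wb") as fh:
    fh.write(payload)

bz2_to_gz(src, dst)

with gzip.open(dst, "rb") as fh:
    roundtrip = fh.read()

shutil.rmtree(workdir)
print(roundtrip == payload)
```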


Erik
From python at rcn.com  Sun Sep 19 00:34:08 2004
From: python at rcn.com (Raymond Hettinger)
Date: Sun Sep 19 00:35:25 2004
Subject: [Python-Dev] Noam's open regex requests
In-Reply-To: <41436BF6.6080903@myrealbox.com>
Message-ID: <004d01c49dcf$9cff88c0$e841fea9@oemcomputer>

[Noam Raphael]
> I've suggested three things that I think should be done in that
> case, and nobody objected.
>
> 1. Add a prominent note in the module contents page or in the module's
> main page, stating that some functionality can only be achieved by using
> compiled REs.

I would make that read "The methods of compiled regular expressions
allow more options than their simplified function counterparts.  Most
non-trivial applications always use the compiled form."


> 2. Document the optional parameters which let you specify the start and
> end pos in the findall and finditer methods of a compiled RE object.

This seems reasonable to me.  The API is already exposed and is useful.
Why not document it?  AFAICT, there are no plans to take away the
functionality.


> 3. Add the optional parameter "flags" to the findall and finditer
> functions. Then, the four functions match, search, findall and finditer
> would have the same interface: function(pattern, string[, flags]).

This also seems reasonable to me.  It is marginally useful and it may
reduce the learning curve ever so slightly.  There is nothing special
about findall() and finditer() that makes them different from match()
and search() with respect to flags.
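
To make items 2 and 3 concrete, a short sketch against the re module as it behaves in current Python 3, where the module-level findall() and finditer() accept flags and the compiled-pattern methods accept pos and endpos. The sample string is invented for the example.

```python
import re

text = "Spam spam SPAM eggs"

# Item 3: module-level function with a flags argument.
all_spam = re.findall(r"spam", text, re.IGNORECASE)

# Item 2: compiled-pattern methods take optional pos and endpos.
pattern = re.compile(r"spam", re.IGNORECASE)
tail_spam = pattern.findall(text, 5)  # start searching at index 5
window_starts = [m.start() for m in pattern.finditer(text, 5, 14)]

print(all_spam)       # ['Spam', 'spam', 'SPAM']
print(tail_spam)      # ['spam', 'SPAM']
print(window_starts)  # [5, 10]
```

The module-level functions still have no pos/endpos parameters, which is the "more options" point made in item 1.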


Raymond Hettinger

From nbastin at opnet.com  Sun Sep 19 04:12:58 2004
From: nbastin at opnet.com (Nick Bastin)
Date: Sun Sep 19 04:13:29 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040918205735.GA24237@phd.pp.ru>
References: <414BF4EA.4050600@v.loewis.de> <cih4cb$sr2$1@sea.gmane.org>
	<20040918181007.GA7132@panix.com> <414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu> <414C9E92.2030803@heneryd.com>
	<20040918205735.GA24237@phd.pp.ru>
Message-ID: <6CDA27F9-09E1-11D9-BB3D-000D932927FE@opnet.com>


On Sep 18, 2004, at 4:57 PM, Oleg Broytmann wrote:

> On Sat, Sep 18, 2004 at 10:46:10PM +0200, Erik Heneryd wrote:
>> Yep.  I suppose that's what this is all about.  Should we add 5 
>> minutes
>> of work for:
>>
>> 1) the release manager
>
>    Add 5 minutes for EVERY release.
>
>> 2) the n (small integer) people with bzip2less systems
>
>    Add 5 minutes to install bzip2 ONCE and forever.

Sure, on every machine that you need to install python on (and it isn't 
5 minutes either - most solaris machines aren't that fast).  That's 
assuming that it's acceptable to your corporation to just be adding 
software to your unix machines that hasn't gone through a qualification 
process.

--
Nick

From bac at OCF.Berkeley.EDU  Sun Sep 19 07:46:25 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Sun Sep 19 07:46:33 2004
Subject: [Python-Dev] vimrc file in Misc (was: tabs in httplib.py and
	test_httplib.py)
In-Reply-To: <4149C318.2010902@ocf.berkeley.edu>
References: <700FAFA4-076B-11D9-B3F2-0003934AD54A@chello.se>	<4148CFE1.5010503@ocf.berkeley.edu>	<4149361F.3030906@v.loewis.de>	<200409161515.56466.symbiont+py@berlios.de>
	<4149C318.2010902@ocf.berkeley.edu>
Message-ID: <414D1D31.9050107@ocf.berkeley.edu>

I just checked in a vimrc file in Misc that attempts to set the proper settings 
to follow PEPs 7 & 8.  You can safely source it in your own vimrc file in order 
to get the proper settings for Python and C files.

Hope it proves useful.

-Brett
From paul at pfdubois.com  Sun Sep 19 08:05:28 2004
From: paul at pfdubois.com (Paul F. Dubois)
Date: Sun Sep 19 08:05:31 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases
In-Reply-To: <20040918204637.B934C1E4016@bag.python.org>
References: <20040918204637.B934C1E4016@bag.python.org>
Message-ID: <414D21A8.4090708@pfdubois.com>

Some of this discussion has wandered into the 'we are all competent 
computer people here so what is the problem going to get x and 
installing it' line of reasoning.

Many Python users are not very good at computing. If they know how to 
install Python now why make that disappear? The fact that in
principle they could manage somehow if we didn't provide a zip file 
doesn't mean they should have to. It just isn't a big deal on our end.

Not only are batteries included, but we have several sizes to fit your 
equipment.
From fredrik at pythonware.com  Sun Sep 19 09:25:04 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Sep 19 09:23:09 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
References: <414BF4EA.4050600@v.loewis.de><cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com><414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu>
Message-ID: <cijc4o$kr6$1@sea.gmane.org>

Brett C wrote:

> My personal take on all of this is that we make the release manager's job as simple as possible.

first the "no abbreviations in the standard library" and now "who cares
about users; releases are for the release manager".  have you even seen
a Python user in real life?

</F> 


From chris.cavalaria at free.fr  Sun Sep 19 11:15:25 2004
From: chris.cavalaria at free.fr (Christophe Cavalaria)
Date: Sun Sep 19 11:15:34 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases
References: <20040918204637.B934C1E4016@bag.python.org>
	<414D21A8.4090708@pfdubois.com>
Message-ID: <cijind$us5$1@sea.gmane.org>

Paul F. Dubois wrote:

> Some of this discussion has wandered into the 'we are all competent
> computer people here so what is the problem going to get x and
> installing it' line of reasoning.
> 
> Many Python users are not very good at computing. If they know how to
> install Python now why make that disappear? The fact that in
> principle they could manage somehow if we didn't provide a zip file
> doesn't mean they should have to. It just isn't a big deal on our end.
> 
> Not only are batteries included, but we have several sizes to fit your
> equipment.

Well, we are not talking about a simple
click-the-shiny-exe-and-the-yes-button install here. We are talking about
source code install of Python on Unix-like computers. The number of users
who can install Python that way but can't install bzip2, even with the
source code must be very very small.

From martin at v.loewis.de  Sun Sep 19 11:29:41 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Sep 19 11:29:41 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression
	for	futurereleases.
In-Reply-To: <008601c49dc2$9e257e10$a44a8890@neil>
References: <414BF4EA.4050600@v.loewis.de>
	<cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com>
	<414C81B7.10901@bitfurnace.com>	<414C9344.5020501@ocf.berkeley.edu>
	<414C9E92.2030803@heneryd.com>
	<008601c49dc2$9e257e10$a44a8890@neil>
Message-ID: <414D5185.5010002@v.loewis.de>

Neil Hodgson wrote:
>    Are there site statistics that show the current relative demand for .gz
> versus .bz2?

Within the last few days (since the logs rotated on Sep 13), there have
been 1095 accesses to Python-2.3.4.tar.bz2, and 5168 to Python-2.3.4.tgz.

Regards,
Martin
From fredrik at pythonware.com  Sun Sep 19 14:00:11 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Sep 19 14:00:18 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases
References: <414BF4EA.4050600@v.loewis.de><cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com><414C81B7.10901@bitfurnace.com>	<414C9344.5020501@ocf.berkeley.edu><414C9E92.2030803@heneryd.com><008601c49dc2$9e257e10$a44a8890@neil>
	<414D5185.5010002@v.loewis.de>
Message-ID: <cijscb$h51$1@sea.gmane.org>

Martin v. Löwis wrote:

> Within the last few days (since the logs rotated on Sep 13), there have
> been 1095 accesses to Python-2.3.4.tar.bz2, and 5168 to Python-2.3.4.tgz.

so given the "we'll save 5 minutes for each release, and users stuck with
gzip only lose 5 minutes each" rationale, I assume this means that
someone's planning to make 314400 Python releases over the next year?

</F> 
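
One way to arrive at a number close to 314400 is to extrapolate the
quoted .tgz download count to a full year.  The sketch below assumes the
log window is about six days (Sep 13 to Sep 19) and, as the rhetorical
argument does, that every .tgz download stands for a user who would lose
5 minutes; neither assumption is stated in the thread.

```python
# Download counts quoted from the server logs.
bz2_hits = 1095
gz_hits = 5168

# Assumption: the logs cover roughly six days (Sep 13 to Sep 19).
window_days = 6

# Extrapolate the gzip downloads to one year.
gz_per_year = gz_hits / window_days * 365
print(round(gz_per_year))  # 314387

# 5 minutes saved per release versus 5 minutes lost per gzip download:
# break-even needs one release per gzip download.
minutes_saved_per_release = 5
minutes_lost_per_download = 5
break_even_releases = (
    gz_per_year * minutes_lost_per_download / minutes_saved_per_release
)
print(round(break_even_releases, -2))  # 314400.0

# Share of bz2 relative to gz downloads in the same window.
print(bz2_hits / gz_hits < 0.25)  # True
```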


From jjl at pobox.com  Sun Sep 19 16:03:53 2004
From: jjl at pobox.com (John J Lee)
Date: Sun Sep 19 16:04:34 2004
Subject: [Python-Dev] Re: URL processing conformance and principles (was
	Re:	urllib.urlopen...)
In-Reply-To: <200409170754.i8H7sLWr042680@chilled.skew.org>
References: <200409170754.i8H7sLWr042680@chilled.skew.org>
Message-ID: <Pine.LNX.4.58.0409191414530.5568@alice>

On Fri, 17 Sep 2004, Mike Brown wrote:
[...]
> Tightening up the conformance issues meant that we needed to help users 
> produce valid URIs from filesystem paths and vice-versa. Once again, the core 
> Python libs were of little use -- pathname2url and url2pathname are 
> platform-dependent, and are so full of bugs^H^H^H^Hfeatures that I had to 
> start from scratch and roll my own functions. I think what I've got at this 
> point would make great additions to urllib2, but I'll save them for another 
> day...

You must be worn out after those posts :-), but:

Would certainly be nice to have some more compliant, perhaps less
forgiving functions for those tasks, so +1 for adding your OsPathToUri()  
and UriToOsPath() somewhere in the stdlib.  Maybe urllib2 is as good a
place as any.  I suppose somebody knowledgeable about both Macs and URIs
must volunteer to do the Mac work first, though.
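
To make the task concrete, here is a minimal sketch of strict path/URI
conversion for absolute POSIX paths only.  It is not Mike's
OsPathToUri()/UriToOsPath() code; the function names below are
hypothetical, it assumes a current Python 3 (urllib.parse), and it
ignores Windows and Mac path conventions entirely.

```python
from urllib.parse import quote, unquote, urlsplit


def posix_path_to_uri(path):
    """Hypothetical helper: absolute POSIX path -> file URI."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute POSIX path: %r" % path)
    # quote() percent-encodes everything except unreserved characters
    # and "/", so spaces become %20.
    return "file://" + quote(path)


def uri_to_posix_path(uri):
    """Hypothetical helper: file URI -> POSIX path, rejecting other input."""
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError("not a file URI: %r" % uri)
    if parts.netloc not in ("", "localhost"):
        raise ValueError("remote host in file URI: %r" % uri)
    return unquote(parts.path)


uri = posix_path_to_uri("/tmp/my file.txt")
print(uri)                     # file:///tmp/my%20file.txt
print(uri_to_posix_path(uri))  # /tmp/my file.txt
```

Being "less forgiving" here means raising ValueError on relative paths
and non-file URIs rather than guessing.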


John
From martin at v.loewis.de  Sun Sep 19 20:05:53 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Sep 19 20:05:50 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases
In-Reply-To: <cijscb$h51$1@sea.gmane.org>
References: <414BF4EA.4050600@v.loewis.de><cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com><414C81B7.10901@bitfurnace.com>	<414C9344.5020501@ocf.berkeley.edu><414C9E92.2030803@heneryd.com><008601c49dc2$9e257e10$a44a8890@neil>	<414D5185.5010002@v.loewis.de>
	<cijscb$h51$1@sea.gmane.org>
Message-ID: <414DCA81.90007@v.loewis.de>

Fredrik Lundh wrote:
> so given the "we'll save 5 minutes for each release, and users stuck with
> gzip only lose 5 minutes each" rationale, I assume this means that
> someone's planning to make 314400 Python releases over the next year?

Talking about helpful comments...

Regards,
Martin
From martin at v.loewis.de  Sun Sep 19 20:37:54 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Sep 19 20:37:52 2004
Subject: [Python-Dev] [TARGETDIR]lib-tk added to PythonPath in MSI
In-Reply-To: <20040915115947.A26465@ActiveState.com>
References: <20040915115947.A26465@ActiveState.com>
Message-ID: <414DD202.5060002@v.loewis.de>

Trent Mick wrote:
> Shouldn't that be this instead?
> 
>     ("PythonPath", -1, prefix+r"\PythonPath", "",
>     "[TARGETDIR]Lib;[TARGETDIR]DLLs;[TARGETDIR]Lib\\lib-tk", "REGISTRY"),

Indeed it should; thanks for pointing that out. Fixed in 1.12.

Regards,
Martin
From anthony at interlink.com.au  Mon Sep 20 04:06:01 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Sep 20 04:07:03 2004
Subject: [Python-Dev] Planning to drop gzip compression for
	future	releases.
In-Reply-To: <91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>
References: <200409171309.48011.fdrake@acm.org>	<20040917181409.GN21135@laranja.org>
	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>
Message-ID: <414E3B09.9070407@interlink.com.au>

Nick Bastin wrote:
> If we're only talking binary releases, then I don't really care, but 
> please don't make this change for the source releases.  There are 
> several platforms on which Python is supported which do not support 
> bzip2 out of the box (Solaris, as a prime example).  It adds just that 
> much more heartache to get python installed on such a system.

I have no intention of dropping tar.gz source releases. I think Fred
was talking about the documentation tarballs. Even then, I think there's
some advantages to keeping both, and I don't really see the advantage
to dropping the tar.gz format. But hey, that's up to Fred - he's the
one who makes the doc releases.

Anthony


-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From anthony at interlink.com.au  Mon Sep 20 04:08:52 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Sep 20 04:09:10 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C3D85.20706@v.loewis.de>
References: <200409171309.48011.fdrake@acm.org><20040917181409.GN21135@laranja.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><cigmsj$4fk$1@sea.gmane.org>	<414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org>	<414C1B86.7070302@v.loewis.de>
	<41
Message-ID: <414E3BB4.60505@interlink.com.au>

Martin v. Löwis wrote:
> Fred wouldn't have asked if it was no effort in keeping it. There is
> certainly more than one command to it - you have to md5sum the file,
> and copy the md5sum into the release notes. You have to upload the file
> from your workstation to python.org. I don't know how you do that, but
> I need to use my DSL link for uploading the MSI files; it takes roughly
> 30min to upload. Fortunately, I have a DSL flatrate.

Again, we're only talking about the documentation tarballs. I'm still
going to be making both tar.gz and tar.bz2 format source releases -
yes, it's a bit more work (gotta gpg sign both, upload both) but I'm
completely unconvinced that forcing people to install bzip2 everywhere
is a useful approach.
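
The per-file checksum step mentioned above could be scripted along these
lines.  This is a sketch only, assuming a current Python 3 with hashlib
(not the tools actually used for the releases); the function names are
made up, and the gpg signing step is not covered.

```python
import hashlib


def md5_hex(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_line(path):
    """Format one line in the style of md5sum output (hypothetical helper)."""
    return "%s  %s" % (md5_hex(path), path)
```

Running checksum_line() over each tarball gives lines that can be pasted
into the release notes, one per file format.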

Anthony

-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From anthony at interlink.com.au  Mon Sep 20 05:31:08 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Sep 20 05:31:37 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414C9344.5020501@ocf.berkeley.edu>
References: <414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com>	<414C81B7.10901@bitfurnace.com>
	<414C9344.5020501@ocf.berkeley.edu>
Message-ID: <414E4EFC.1080003@interlink.com.au>

Brett C. wrote:
> My personal take on all of this is that we make the release manager's 
> job as simple as possible.  That means either ditch gzip files or ditch 
> bzip2 files. 

I disagree, almost 100%. The job of release management is to make
it as easy as possible for people to get and use Python. The language
isn't being organised for _my_ benefit.

Last I looked, tar.bz2 was less than 1/4 of tar.gz in terms of number
of downloads (see http://www.python.org/wwwstats/usage_200409.html)
That's hardly a case for switching.


-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From martin at v.loewis.de  Mon Sep 20 08:25:16 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Sep 20 08:25:31 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for future
	releases.
In-Reply-To: <414E4EFC.1080003@interlink.com.au>
References: <414BF4EA.4050600@v.loewis.de>	<cih4cb$sr2$1@sea.gmane.org>	<20040918181007.GA7132@panix.com>	<414C81B7.10901@bitfurnace.com>	<414C9344.5020501@ocf.berkeley.edu>
	<414E4EFC.1080003@interlink.com.au>
Message-ID: <414E77CC.5030108@v.loewis.de>

Anthony Baxter wrote:
> Last I looked, tar.bz2 was less than 1/4 of tar.gz in terms of number
> of downloads (see http://www.python.org/wwwstats/usage_200409.html)
> That's hardly a case for switching.

Although that may partly result from http://www.python.org/download/
referring to the .tgz only.

Regards,
Martin
From FBatista at uniFON.com.ar  Mon Sep 20 18:17:02 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Sep 20 18:21:42 2004
Subject: [Python-Dev] Copyright and license texts for distributing Decimal
Message-ID: <A128D751272CD411BC9200508BC2194D053C7987@escpl.tcp.com.ar>

I'll reformulate my question.

I've got a project, SiGeFi, which needs the decimal module, so to be
usable with Py2.3 the project must let the user install the decimal
module "easily". Also, Alex Martelli asked me to create a "decimal
module" installer, for reasons related to a recipe for his new book.

So I got into distutils and prepared tarball, .rpm and .exe packages to
install the decimal module on your system if you have Py2.3. What I haven't
solved is which license and copyright to put in the package.

Regarding copyright, my first draft says: 

    Copyright (c) 2004 Python Software Foundation. 
    All rights reserved. 
    
Regarding the license, I haven't put anything yet. Should I write something
like the following and include the file? 

    See the file "LICENSE" for information on the history of this 
    software, terms & conditions for usage, and a DISCLAIMER OF ALL 
    WARRANTIES. 

What license should I include in the packages?  And the copyright text?

Thank you!

Facundo Batista
Desarrollo de Red
fbatista@unifon.com.ar
(54 11) 5130-4643
Cel: 15 5097 5024


-----Original Message-----
From: Batista, Facundo 
Sent: Friday, September 17, 2004 11:10
To: Python Dev (E-mail)
Subject: [Python-Dev] Decimal, copyright and license


People: 
I'm creating a decimal installer (for Py2.3 users), making tarball, .rpm and
.exe versions available. 

What I don't know is what to put about license and copyright. 

Regarding copyright, my first draft says: 
    Copyright (c) 2004 Python Software Foundation. 
    All rights reserved. 
    
Regarding the license, I haven't put anything yet. Should I write something
like the following and include the file? 
    See the file "LICENSE" for information on the history of this 
    software, terms & conditions for usage, and a DISCLAIMER OF ALL 
    WARRANTIES. 

Remember that the "decimal installer" will be available for download not in
a Python location. 

Thanks! 

Facundo Batista 
Desarrollo de Red 
fbatista@unifon.com.ar 
(54 11) 5130-4643 
Cel: 15 5097 5024 
From tim.peters at gmail.com  Mon Sep 20 22:34:55 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Mon Sep 20 22:35:28 2004
Subject: [Python-Dev] Decimal, copyright and license
In-Reply-To: <A128D751272CD411BC9200508BC2194D053C7980@escpl.tcp.com.ar>
References: <A128D751272CD411BC9200508BC2194D053C7980@escpl.tcp.com.ar>
Message-ID: <1f7befae04092013342051997e@mail.gmail.com>

[Batista, Facundo]
> I'm creating a decimal installer (for Py2.3 users), making tarball, .rpm and .exe
> versions available.
>
> What I don't know is what to put about license and copyright.

In the language of the PSF license, you're making "a derivative work"
then.  Your derivative work is *your* work, and you can license it
however you like (although, as the PSF license says, you must
*include* the PSF license and copyright notice).  The copyright is
yours, since it's your work.

> Regarding copyright, my first draft says:
>
>    Copyright (c) 2004 Python Software Foundation. 
>    All rights reserved. 

You hold copyright whether you say so or not.  You won't get into
trouble by claiming the PSF holds copyright, though.

> Regarding the license, I haven't put anything yet. Should I write something
> like the following and include the file?

No matter what else you do, you must include the PSF license and
copyright.  The license you want to use for your part of the work is
entirely up to you; the PSF license imposes no restrictions there.

>    See the file "LICENSE" for information on the history of this 
>    software, terms & conditions for usage, and a DISCLAIMER OF ALL 
>    WARRANTIES.

That would be suitable if you want to leave the impression that you're
licensing your work under the terms of the PSF license.  That's fine,
if that's what you want to do.  If you want to write a license saying
people have to pay you a million dollars each time they use your
installer, that's also fine.

> Remember that the "decimal installer" will be available for download not in a
> Python location.

That part doesn't really matter.  What you suggest above is all fine.
From tommy at ilm.com  Mon Sep 20 23:34:31 2004
From: tommy at ilm.com (Tommy Burnette)
Date: Mon Sep 20 23:45:40 2004
Subject: [Python-Dev] built on beer?
Message-ID: <16719.19687.44576.866934@evoke.lucasdigital.com>

hey team,

in a completely un-python-related thread about mobile phones last
week, a friend (who I did not know knew anything about python), when
asked what made a certain nokia phone stand out above one from another
company, replied:


	"... nokia runs python, the language built on beer."

	
does the PSF have any t-shirts that advertise this fact? :)


From tommy at ilm.com  Tue Sep 21 02:28:05 2004
From: tommy at ilm.com (Tommy Burnette)
Date: Tue Sep 21 02:28:11 2004
Subject: [Python-Dev] built on beer?
In-Reply-To: <16719.19687.44576.866934@evoke.lucasdigital.com>
References: <16719.19687.44576.866934@evoke.lucasdigital.com>
Message-ID: <16719.30101.8590.767412@evoke.lucasdigital.com>


apologies for replying to my own posting- the "fact" I wished to
advertise was the beer one, not the nokia one!

Tommy Burnette writes:
| hey team,
| 
| in a completely un-python-related thread about mobile phones last
| week, a friend (who I did not know knew anything about python), when
| asked what made a certain nokia phone stand out above one from another
| company, replied:
| 
| 
| 	"... nokia runs python, the language built on beer."
| 
| 	
| does the PSF have any t-shirts that advertise this fact? :)
| 
| 
| _______________________________________________
| Python-Dev mailing list
| Python-Dev@python.org
| http://mail.python.org/mailman/listinfo/python-dev
| Unsubscribe: http://mail.python.org/mailman/options/python-dev/tommy%40ilm.com


From fdrake at acm.org  Tue Sep 21 16:36:15 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Sep 21 16:36:33 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <414E3B09.9070407@interlink.com.au>
References: <200409171309.48011.fdrake@acm.org>
	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>
	<414E3B09.9070407@interlink.com.au>
Message-ID: <200409211036.15861.fdrake@acm.org>

[Responding to an absolutely enormous deluge of emails on python-dev...]

On Sunday 19 September 2004 10:06 pm, Anthony Baxter wrote:
 > I have no intention of dropping tar.gz source releases. I think Fred
 > was talking about the documentation tarballs. Even then, I think there's
 > some advantages to keeping both, and I don't really see the advantage
 > to dropping the tar.gz format. But hey, that's up to Fred - he's the
 > one who makes the doc releases.

Dang, it doesn't pay to be away from email for three days, does it?

Yes, I was only talking about documentation releases.  It never occurred to me 
anyone would think I was talking about Python source releases.  Maybe I 
shouldn't have added python-dev to the recipients list for my original email, 
but too often objections get heard quite late if I don't include python-dev.

For the documentation, there's a much longer history of providing the bz2 
versions of the archives.  There are also many more archives we can drop per 
release.

While in theory disk space isn't supposed to be an issue, it seems to be 
something our sysadmin group is dealing with on a regular basis (mostly 
cleaning up old webserver logs).  So while the space itself may not be an 
issue, it certainly generates tedious work for volunteers.

My motivation in dropping the bz2 archives is two-fold:

1.  Reduce disk space consumed per release, mostly to ease the burden on the 
sysadmin group.

2.  Reduce the number of files posted for the documentation per release, so 
that choices for end-users are easier to pick through.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From fdrake at acm.org  Tue Sep 21 17:07:02 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Sep 21 17:07:29 2004
Subject: [Doc-SIG] Re: [Python-Dev] Planning to drop gzip compression for
	future releases.
In-Reply-To: <200409211036.15861.fdrake@acm.org>
References: <200409171309.48011.fdrake@acm.org>
	<414E3B09.9070407@interlink.com.au>
	<200409211036.15861.fdrake@acm.org>
Message-ID: <200409211107.02590.fdrake@acm.org>

This morning, I wrote:
 > My motivation in dropping the bz2 archives is two-fold:

That should be the *gz* archives, not the bz2 archives!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From psoberoi at gmail.com  Tue Sep 21 17:31:43 2004
From: psoberoi at gmail.com (Paramjit Oberoi)
Date: Tue Sep 21 17:31:45 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <200409211036.15861.fdrake@acm.org>
References: <200409171309.48011.fdrake@acm.org>
	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>
	<414E3B09.9070407@interlink.com.au>
	<200409211036.15861.fdrake@acm.org>
Message-ID: <e443ad0e0409210831a725a6f@mail.gmail.com>

I'm not sure how easy or difficult it would be---but it would be very
convenient for me if the documentation was also downloadable in
windows help (CHM) format.  Currently CHM files are only available in
windows installers, but I use them on Linux (easier to search, etc).

-param

On Tue, 21 Sep 2004 10:36:15 -0400, Fred L. Drake, Jr. <fdrake@acm.org> wrote:
> [Responding to an absolutely enormous deluge of emails on python-dev...]
> 
> On Sunday 19 September 2004 10:06 pm, Anthony Baxter wrote:
>  > I have no intention of dropping tar.gz source releases. I think Fred
>  > was talking about the documentation tarballs. Even then, I think there's
>  > some advantages to keeping both, and I don't really see the advantage
>  > to dropping the tar.gz format. But hey, that's up to Fred - he's the
>  > one who makes the doc releases.
> 
> Dang, it doesn't pay to be away from email for three days, does it?
> 
> Yes, I was only talking about documentation releases.  It never occurred to me
> anyone would think I was talking about Python source releases.  Maybe I
> shouldn't have added python-dev to the recipients list for my original email,
> but too often objections get heard quite late if I don't include python-dev.
> 
> For the documentation, there's a much longer history of providing the bz2
> versions of the archives.  There are also many more archives we can drop per
> release.
> 
> While in theory disk space isn't supposed to be an issue, it seems to be
> something our sysadmin group is dealing with on a regular basis (mostly
> cleaning up old webserver logs).  So while the space itself may not be an
> issue, it certainly generates tedious work for volunteers.
> 
> My motivation in dropping the bz2 archives is two-fold:
> 
> 1.  Reduce disk space consumed per release, mostly to ease the burden on the
> sysadmin group.
> 
> 2.  Reduce the number of files posted for the documentation per release, so
> that choices for end-users are easier to pick through.
> 
> 
> 
> 
>   -Fred
> 
> --
> Fred L. Drake, Jr.  <fdrake at acm.org>
> 
From cce at clarkevans.com  Tue Sep 21 17:58:00 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Tue Sep 21 17:58:05 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <200409171309.48011.fdrake@acm.org>
References: <200409171309.48011.fdrake@acm.org>
Message-ID: <20040921155759.GB91940@prometheusresearch.com>

Fred,

From what I understand, the algorithmic behavior of bz2 and gz is
completely different -- while gzip is incremental, bz2 requires
memory proportional to the size of the source information.
Furthermore, most browsers now support gzip compression for their
web pages; it will be quite some time before bz2 support is ubiquitous.
Unless these two issues are different than I understand them, I'd
prefer if gzip remain in the standard Python distribution.

Best,

Clark
From cce at clarkevans.com  Tue Sep 21 18:00:54 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Tue Sep 21 18:00:58 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <20040921155759.GB91940@prometheusresearch.com>
References: <200409171309.48011.fdrake@acm.org>
	<20040921155759.GB91940@prometheusresearch.com>
Message-ID: <20040921160053.GC91940@prometheusresearch.com>

*blush*  I read the post wrong, please disregard my comment.
From fredrik at pythonware.com  Tue Sep 21 18:02:57 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Sep 21 18:01:12 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for
	futurereleases.
References: <200409171309.48011.fdrake@acm.org><91489DBB-08FC-11D9-A518-000D932927FE@opnet.com><414E3B09.9070407@interlink.com.au>
	<200409211036.15861.fdrake@acm.org>
Message-ID: <cipj7o$kq3$1@sea.gmane.org>

Fred Drake wrote:

> While in theory disk space isn't supposed to be an issue, it seems to be
> something our sysadmin group is dealing with on a regular basis (mostly
> cleaning up old webserver logs).  So while the space itself may not be an
> issue, it certainly generates tedious work for volunteers.

I thought everyone knew that logs always fill up until the disk is almost
full, no matter how much disk you have.

</F> 


From fdrake at acm.org  Tue Sep 21 18:24:37 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Sep 21 18:24:50 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <e443ad0e0409210831a725a6f@mail.gmail.com>
References: <200409171309.48011.fdrake@acm.org>
	<200409211036.15861.fdrake@acm.org>
	<e443ad0e0409210831a725a6f@mail.gmail.com>
Message-ID: <200409211224.37497.fdrake@acm.org>

On Tuesday 21 September 2004 11:31 am, Paramjit Oberoi wrote:
 > I'm not sure how easy or difficult it would be---but it would be very
 > convenient for me if the documentation was also downloadable in
 > windows help (CHM) format.  Currently CHM files are only available in
 > windows installers, but I use them on Linux (easier to search, etc).

You do?  What software supports them?  It would be cool to have a decent 
single-file documentation browser on Linux.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From gvanrossum at gmail.com  Tue Sep 21 18:11:55 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Sep 21 18:25:32 2004
Subject: [Python-Dev] built on beer?
In-Reply-To: <16719.30101.8590.767412@evoke.lucasdigital.com>
References: <16719.19687.44576.866934@evoke.lucasdigital.com>
	<16719.30101.8590.767412@evoke.lucasdigital.com>
Message-ID: <ca471dc204092109117036e004@mail.gmail.com>

I don't know where that quote comes from, but it's true! During the
early days, when hacking on Python, I often lived on stroopwafels and
beer. (If you've never visited the Netherlands,  you *must* Google for
stroopwafels. :-)

On Mon, 20 Sep 2004 17:28:05 -0700, Tommy Burnette <tommy@ilm.com> wrote:
> 
> apologies for replying to my own posting- the "fact" I wished to
> advertise was the beer one, not the nokia one!
> 
> Tommy Burnette writes:
> | hey team,
> 
> 
> |
> | in a completely un-python-related thread about mobile phones last
> | week, a friend (who I did not know knew anything about python), when
> | asked what made a certain nokia phone stand out above one from another
> | company, replied:
> |
> |
> |       "... nokia runs python, the language built on beer."
> |
> |
> | does the PSF have any t-shirts that advertise this fact? :)
> |
> |


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From fredrik at pythonware.com  Tue Sep 21 18:36:47 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Sep 21 18:40:43 2004
Subject: [Python-Dev] Re: Planning to drop gzip compression for
	futurereleases.
References: <200409171309.48011.fdrake@acm.org><200409211036.15861.fdrake@acm.org><e443ad0e0409210831a725a6f@mail.gmail.com>
	<200409211224.37497.fdrake@acm.org>
Message-ID: <cipl77$r9r$1@sea.gmane.org>

Fred wrote:

> You do?  What software supports them?  It would be cool to have a decent
> single-file documentation browser on Linux.

http://xchm.sourceforge.net/

</F> 


From barry at python.org  Tue Sep 21 18:47:42 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Sep 21 18:47:48 2004
Subject: [Python-Dev] built on beer?
In-Reply-To: <ca471dc204092109117036e004@mail.gmail.com>
References: <16719.19687.44576.866934@evoke.lucasdigital.com>
	<16719.30101.8590.767412@evoke.lucasdigital.com>
	<ca471dc204092109117036e004@mail.gmail.com>
Message-ID: <1095785262.8357.62.camel@geddy.wooz.org>

On Tue, 2004-09-21 at 12:11, Guido van Rossum wrote:
> I don't know where that quote comes from, but it's true! During the
> early days, when hacking on Python, I often lived on stroopwafels and
> beer. (If you've never visited the Netherlands,  you *must* Google for
> stroopwafels. :-)

http://www.amazingstroopwafels.nl/

I /thought/ that guy in the back right looked familiar!

-B

From psoberoi at gmail.com  Tue Sep 21 19:14:59 2004
From: psoberoi at gmail.com (Paramjit Oberoi)
Date: Tue Sep 21 19:15:01 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <200409211224.37497.fdrake@acm.org>
References: <200409171309.48011.fdrake@acm.org>
	<200409211036.15861.fdrake@acm.org>
	<e443ad0e0409210831a725a6f@mail.gmail.com>
	<200409211224.37497.fdrake@acm.org>
Message-ID: <e443ad0e04092110141c7ba51c@mail.gmail.com>

> You do?  What software supports them?  It would be cool to have a decent
> single-file documentation browser on Linux.

/F pointed out xchm - http://xchm.sourceforge.net/ - that's what I use.

There is also GnoCHM - http://gnochm.sourceforge.net/ - which wasn't
completely stable when I last tried it.  But it's written in
Python/PyGTK.

Both these readers use CHMLIB -
http://66.93.236.84/~jedwin/projects/chmlib/ - to read CHM files. 
Python bindings for this library are available from the GnoCHM
project:

http://gnochm.sourceforge.net/pychm.html

-param
From martin at v.loewis.de  Tue Sep 21 20:20:15 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue Sep 21 20:20:40 2004
Subject: [Python-Dev] Planning to drop gzip compression for
	future	releases.
In-Reply-To: <e443ad0e0409210831a725a6f@mail.gmail.com>
References: <200409171309.48011.fdrake@acm.org>	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>	<414E3B09.9070407@interlink.com.au>	<200409211036.15861.fdrake@acm.org>
	<e443ad0e0409210831a725a6f@mail.gmail.com>
Message-ID: <415070DF.40201@v.loewis.de>

Paramjit Oberoi wrote:
> I'm not sure how easy or difficult it would be---but it would be very
> convenient for me if the documentation was also downloadable in
> windows help (CHM) format.  Currently CHM files are only available in
> windows installers, but I use them on Linux (easier to search, etc).

I could do this along with the Windows installer releases. However,
I don't think I can do this whenever Fred makes a snapshot release;
I also doubt that Fred can easily do this on his own, since the
documentation is built on Unix, and the CHM file on Windows.

Regards,
Martin
From aahz at pythoncraft.com  Tue Sep 21 20:25:28 2004
From: aahz at pythoncraft.com (Aahz)
Date: Tue Sep 21 20:25:32 2004
Subject: [Python-Dev] built on beer?
In-Reply-To: <ca471dc204092109117036e004@mail.gmail.com>
References: <16719.19687.44576.866934@evoke.lucasdigital.com>
	<16719.30101.8590.767412@evoke.lucasdigital.com>
	<ca471dc204092109117036e004@mail.gmail.com>
Message-ID: <20040921182528.GA24930@panix.com>

On Tue, Sep 21, 2004, Guido van Rossum wrote:
>
> I don't know where that quote comes from, but it's true! During the
> early days, when hacking on Python, I often lived on stroopwafels and
> beer. (If you've never visited the Netherlands,  you *must* Google for
> stroopwafels. :-)

Apparently they get imported to the US now....
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From psoberoi at gmail.com  Tue Sep 21 21:02:47 2004
From: psoberoi at gmail.com (Paramjit Oberoi)
Date: Tue Sep 21 21:02:49 2004
Subject: [Python-Dev] Planning to drop gzip compression for future
	releases.
In-Reply-To: <415070DF.40201@v.loewis.de>
References: <200409171309.48011.fdrake@acm.org>
	<91489DBB-08FC-11D9-A518-000D932927FE@opnet.com>
	<414E3B09.9070407@interlink.com.au>
	<200409211036.15861.fdrake@acm.org>
	<e443ad0e0409210831a725a6f@mail.gmail.com>
	<415070DF.40201@v.loewis.de>
Message-ID: <e443ad0e0409211202ff77ac@mail.gmail.com>

> I could do this along with the Windows installer releases. However,
> I don't think I can do this whenever Fred makes a snapshot release;

That would be perfectly adequate.  The documentation doesn't change
that much between snapshots, and anyway, I usually don't use snapshot
releases...  As far as I am concerned, just having the CHM files
corresponding to the official releases would be fine.

Thanks,
-param
From dw-python.org at botanicus.net  Wed Sep 22 00:16:08 2004
From: dw-python.org at botanicus.net (David Wilson)
Date: Wed Sep 22 00:16:17 2004
Subject: [Python-Dev] [Patch 1032206] Add API to logging package to allow
	intercooperation.
Message-ID: <20040921221608.GA71441@thailand.botanicus.net>

Hi there,

There are two alternative patches provided to add a single extra API
item for this package, which would allow developers the ability to
extend the logging package to a certain extent without clobbering each
other's work.

At present, it isn't possible for a package to customise the
logging.Logger class, without running the risk of having its changes
clobbered by an application using the package, or another package.

This small change allows each customiser to inherit changes from the
last customiser. Any chance of getting one of these solutions in for
2.4?

The "loggerClass" option provides more respectable declaration syntax,
but the "getLoggerClass" option provides symmetry.

http://sourceforge.net/tracker/index.php?func=detail&aid=1032206&group_id=5470&atid=305470
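
A minimal sketch of the cooperative pattern described above, assuming
the "getLoggerClass" variant (today's standard library does provide
logging.getLoggerClass() alongside logging.setLoggerClass()).  The two
customiser functions and their method names are hypothetical; the point
is only that each one subclasses whatever class is currently installed
instead of logging.Logger directly:

```python
import logging

def add_trace_method():
    """Hypothetical customiser 1: adds a trace() convenience method."""
    base = logging.getLoggerClass()      # whatever is installed right now

    class TraceLogger(base):
        def trace(self, msg, *args, **kwargs):
            self.log(5, msg, *args, **kwargs)

    logging.setLoggerClass(TraceLogger)
    return TraceLogger

def add_audit_method():
    """Hypothetical customiser 2: inherits customiser 1's changes."""
    base = logging.getLoggerClass()      # TraceLogger, if it ran first

    class AuditLogger(base):
        def audit(self, msg, *args, **kwargs):
            self.info("AUDIT: " + msg, *args, **kwargs)

    logging.setLoggerClass(AuditLogger)
    return AuditLogger

trace_cls = add_trace_method()
audit_cls = add_audit_method()

# Loggers created from now on carry both extensions.
log = logging.getLogger("example.cooperative")
```

Neither customiser needs to know about the other; the order of the calls
only determines the inheritance order.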

Thanks,


David.

-- 
The next great adventure of mankind is not for people who ask,
"What exactly is the point?" They will never get it.
    -- http://news.bbc.co.uk/1/hi/sci/tech/3302375.stm
From kbk at shore.net  Wed Sep 22 05:22:11 2004
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Sep 22 05:22:17 2004
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200409220322.i8M3MBF7005521@h006008a7bda6.ne.client2.attbi.com>

Patch / Bug Summary
___________________

Patches :  235 open ( -6) /  2633 closed (+11) /  2868 total ( +5)
Bugs    :  767 open ( +3) /  4463 closed (+10) /  5230 total (+13)
RFE     :  151 open ( +1) /   131 closed ( +0) /   282 total ( +1)

New / Reopened Patches
______________________

   (2004-09-18)
CLOSED http://python.org/sf/1030422  opened by  Jeff Connelly aka shellreef

Patch for bug #780725   (2004-09-20)
       http://python.org/sf/1031213  opened by  atsuo ishimoto

Clean up discussion of new C thread idiom  (2004-09-20)
       http://python.org/sf/1031233  opened by  Greg Chapman

atexit decorator  (2004-09-21)
       http://python.org/sf/1031687  opened by  Raymond Hettinger

Add API to logging package to allow intercooperation.  (2004-09-21)
       http://python.org/sf/1032206  opened by  Dave Wilson

Patches Closed
______________

Decimal performance enhancements  (2004-09-01)
       http://python.org/sf/1020845  closed by  rhettinger

topdir calculated incorrectly in bdist_rpm  (2004-09-03)
       http://python.org/sf/1022003  closed by  jafo

add support for the AutoReq flag in bdist_rpm  (2004-09-03)
       http://python.org/sf/1022011  closed by  jafo

Adding IPv6 host handling to httplib  (2004-09-15)
       http://python.org/sf/1028502  closed by  loewis

Add status code constants to httplib  (2004-09-10)
       http://python.org/sf/1025790  closed by  loewis

tarfile.py longnames are truncated in getnames()  (2004-09-16)
       http://python.org/sf/1029061  closed by  loewis

Patch for bug 933795. term.h and curses on Solaris  (2004-08-19)
       http://python.org/sf/1012280  closed by  loewis

fix bug 807871 : tkMessageBox.askyesno wrong result  (2004-08-29)
       http://python.org/sf/1018509  closed by  loewis

Error when int sent to PyLong_AsUnsignedLong  (2004-09-08)
       http://python.org/sf/1024670  closed by  loewis

WinSock 2 support on Win32 w/ MSVC++ 6 (fix #860134)  (2004-03-03)
       http://python.org/sf/908631  closed by  loewis

   (2004-09-18)
       http://python.org/sf/1030422  closed by  jeffconnelly

New / Reopened Bugs
___________________

email.Utils not mentioned  (2004-09-17)
       http://python.org/sf/1030118  opened by  Jeff Blaine

rfc822 __iter__ problem  (2004-09-17)
       http://python.org/sf/1030125  opened by  Mike Foord

socket is not garbage-collected under specific circumstances  (2004-09-18)
CLOSED http://python.org/sf/1030249  opened by  Matthias Klose

distutils' dry-run wants to create some real build dirs  (2004-09-18)
       http://python.org/sf/1030250  opened by  Matthias Klose

os.system exhausts file descriptors  (2004-09-18)
CLOSED http://python.org/sf/1030388  opened by  Eray Ozkural

os.path.join() does not raise TypeError  (2004-09-18)
       http://python.org/sf/1030499  opened by  Pierre Fortin

PyMapping_Check crashes when argument is NULL  (2004-09-19)
CLOSED http://python.org/sf/1030557  opened by  Michiel de Hoon

PyOS_InputHook broken  (2004-09-19)
       http://python.org/sf/1030629  opened by  Michiel de Hoon

Email message croaks the new email pkg parser  (2004-09-19)
       http://python.org/sf/1030941  opened by  Skip Montanaro

tarfile: dirsize is not zero  (2004-09-20)
CLOSED http://python.org/sf/1031148  opened by  Bertram Scharpf

decimal module inconsistent with integers and floats  (2004-09-20)
CLOSED http://python.org/sf/1031480  opened by  Anthony Tuininga

Fold tuples of constants into a single constant  (2004-09-20)
       http://python.org/sf/1031667  opened by  Raymond Hettinger

Conflicting descriptions of application order of decorators  (2004-09-21)
       http://python.org/sf/1031897  opened by  Hamish Lawson

Bugs Closed
___________

help() does not check for chm file  (2004-09-09)
       http://python.org/sf/1025392  closed by  loewis

socket is not garbage-collected under specific circumstances  (2004-09-18)
       http://python.org/sf/1030249  closed by  loewis

configure not able to find ncurses/curses in Solaris   (2004-04-12)
       http://python.org/sf/933795  closed by  loewis

tkMessageBox.askyesno wrong result  (2003-09-17)
       http://python.org/sf/807871  closed by  loewis

Trivial fix for obscure bug in os.urandom()  (2004-09-03)
       http://python.org/sf/1021596  closed by  loewis

os.system exhausts file descriptors  (2004-09-18)
       http://python.org/sf/1030388  closed by  loewis

PyMapping_Check crashes when argument is NULL  (2004-09-18)
       http://python.org/sf/1030557  closed by  rhettinger

tarfile: dirsize is not zero  (2004-09-20)
       http://python.org/sf/1031148  closed by  loewis

decimal module inconsistent with integers and floats  (2004-09-20)
       http://python.org/sf/1031480  closed by  rhettinger

get_installer_filename   (2004-09-15)
       http://python.org/sf/1028334  closed by  theller

New / Reopened RFE
__________________

Update unicodedata to version 4.0.1  (2004-09-20)
       http://python.org/sf/1031288  opened by  Oliver Horn

From python at rcn.com  Wed Sep 22 22:32:00 2004
From: python at rcn.com (Raymond Hettinger)
Date: Wed Sep 22 22:33:08 2004
Subject: FW: [Python-Dev] Noam's open regex requests
Message-ID: <000901c4a0e3$36c1afe0$e841fea9@oemcomputer>

Haven't heard a peep on this one.
Is anyone going to be miffed if I accept Noam's requests?


Raymond Hettinger

-----Original Message-----
From: python-dev-bounces+python=rcn.com@python.org
[mailto:python-dev-bounces+python=rcn.com@python.org] On Behalf Of
Raymond Hettinger
Sent: Saturday, September 18, 2004 6:34 PM
To: python-dev@python.org
Cc: 'Noam Raphael'
Subject: [Python-Dev] Noam's open regex requests

[Noam Raphael]
> I've suggested three things that I think should be done in that
> case, and nobody objected.
>
> 1. Add a prominent note in the module contents page or in the module's
> main page, stating that some functionality can only be achieved by using
> compiled REs.

I would make that read "The methods of compiled regular expressions
allow more options than their simplified function counterparts.  Most
non-trivial applications always use the compiled form."


> 2. Document the optional parameters which let you specify the start and
> end pos in the findall and finditer methods of a compiled RE object.

This seems reasonable to me.  The API is already exposed and is useful.
Why not document it?  AFAICT, there are no plans to take away the
functionality.


> 3. Add the optional parameter "flags" to the findall and finditer
> functions. Then, the four functions match, search, findall and finditer
> would have the same interface: function(pattern, string[, flags]).

This also seems reasonable to me.  It is marginally useful and it may
reduce the learning curve ever so slightly.  There is nothing special
about findall() and finditer() that makes them different from match()
and search() with respect to flags.
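
As a rough illustration of points 2 and 3, using the re module as it
behaves in current Python releases (where the module-level findall()
and finditer() do accept flags); the sample strings are made up:

```python
import re

text = "a1 b22 c333"
number = re.compile(r"\d+")

# Compiled-pattern methods accept optional pos/endpos arguments.
all_numbers = number.findall(text)          # whole string
from_pos = number.findall(text, 3)          # start scanning at index 3
windowed = number.findall(text, 3, 6)       # scan only text[3:6]
spans = [m.span() for m in number.finditer(text, 3)]

# Module-level functions take flags, but no pos/endpos.
words = re.findall(r"[a-z]+", "Ab CD", re.IGNORECASE)
```

The pos/endpos window is available only on the compiled object, which is
the asymmetry the proposed documentation note is meant to spell out.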


Raymond Hettinger


From skip at pobox.com  Wed Sep 22 23:43:42 2004
From: skip at pobox.com (Skip Montanaro)
Date: Wed Sep 22 23:43:47 2004
Subject: FW: [Python-Dev] Noam's open regex requests
In-Reply-To: <000901c4a0e3$36c1afe0$e841fea9@oemcomputer>
References: <000901c4a0e3$36c1afe0$e841fea9@oemcomputer>
Message-ID: <16721.61966.97526.546831@montanaro.dyndns.org>


    Raymond> Haven't heard a peep on this one.  Is anyone going to be miffed
    Raymond> if I accept Noam's requests?

I thought most of the opinion (certainly from Fredrik and Guido) ran counter
to the request.

Skip
From gvanrossum at gmail.com  Wed Sep 22 23:50:15 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Sep 22 23:50:26 2004
Subject: FW: [Python-Dev] Noam's open regex requests
In-Reply-To: <16721.61966.97526.546831@montanaro.dyndns.org>
References: <000901c4a0e3$36c1afe0$e841fea9@oemcomputer>
	<16721.61966.97526.546831@montanaro.dyndns.org>
Message-ID: <ca471dc204092214506f4198c7@mail.gmail.com>

We're not against #1 and #2, which are just fixing the docs!

I don't know what /F thinks of #3, which is a small subset of the
original proposal (to add options that are already present for other
APIs), but I'm +0.5 on it. FWIW.


On Wed, 22 Sep 2004 16:43:42 -0500, Skip Montanaro <skip@pobox.com> wrote:
> 
>     Raymond> Haven't heard a peep on this one.  Is anyone going to be miffed
>     Raymond> if I accept Noam's requests?
> 
> I thought most of the opinion (certainly from Fredrik and Guido) ran counter
> to the request.
> 
> Skip


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From python at rcn.com  Thu Sep 23 02:15:30 2004
From: python at rcn.com (Raymond Hettinger)
Date: Thu Sep 23 02:16:38 2004
Subject: FW: [Python-Dev] Noam's open regex requests
In-Reply-To: <16721.61966.97526.546831@montanaro.dyndns.org>
Message-ID: <002c01c4a102$6ff12e20$e841fea9@oemcomputer>

>     Raymond> Haven't heard a peep on this one.  Is anyone going to be miffed
>     Raymond> if I accept Noam's requests?

[Skip]
> I thought most of the opinion (certainly from Fredrik and Guido) ran counter
> to the request.

IIRC, this is the part of the request that wasn't shot down.
Originally, the OP wanted the function API to fully duplicate the method
API.  There were several reasons for not doing that:  API stability;
where to put the flags argument relative to the start/stop arguments;
the functions were supposed to be kept simple; and there were
unresolvable argument order conflicts.

So, the remaining part of the request is more humble: document that the
functions are not supposed to be full featured, fully document the
existing API, and give findall() and finditer() the same interface as
the other functions.

I sent Fred a note on the third part and will stick with whatever he
says if there is a reply.


Raymond

From tim.peters at gmail.com  Thu Sep 23 10:26:58 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep 23 10:27:13 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
Message-ID: <1f7befae040923012645bc07f8@mail.gmail.com>

>>> x = [1]
>>> x.extend(-y for y in x)
From arigo at tunes.org  Thu Sep 23 11:45:02 2004
From: arigo at tunes.org (Armin Rigo)
Date: Thu Sep 23 11:50:19 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects
	floatobject.c, 2.132, 2.133
In-Reply-To: <E1CAOcm-0007yb-Ir@sc8-pr-cvs1.sourceforge.net>
References: <E1CAOcm-0007yb-Ir@sc8-pr-cvs1.sourceforge.net>
Message-ID: <20040923094502.GA10207@vicky.ecs.soton.ac.uk>

Hello Tim,

Your float.richcompare patch, trying to map the C semantics at the Python
level, introduces artificial results when comparing NaN's with longs:

>>> float('nan') > 0
False
>>> float('nan') > 0L
True

I am not aware of all the problems and various platforms, but clearly in the
patch 'vsign' by itself doesn't make much sense if 'v' is a NaN.

Wouldn't all compilers and platforms compare NaNs "strangely", for some
detectable definition of "strange"?  Something along the lines of:

#define Py_IS_NAN(v)  (!Py_IS_INFINITY(v)  &&           \
                       ( ((v) < 0.0 && (v) > 0.0) ||    \
                         !((v) < 1.0 || (v) > -1.0) ))


Armin
From imbaczek at gmail.com  Thu Sep 23 14:46:35 2004
From: imbaczek at gmail.com (Marek "Baczek" Baczyński)
Date: Thu Sep 23 14:46:38 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae040923012645bc07f8@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
Message-ID: <5f3d2c310409230546693ced87@mail.gmail.com>

On Thu, 23 Sep 2004 04:26:58 -0400, Tim Peters <tim.peters@gmail.com> wrote:
> >>> x = [1]
> >>> x.extend(-y for y in x)

Doesn't it leak memory when Ctrl+C'd (on Windows at least?)

-- 
{ Marek Baczyński :: UIN 57114871 :: GG 161671 :: JID imbaczek@jabber.gda.pl  }
{ http://www.vlo.ids.gda.pl/ | imbaczek at poczta fm | http://www.promode.org }
.. .. .. .. ... ... ...... evolve or face extinction ...... ... ... .. .. .. ..
From jhylton at gmail.com  Thu Sep 23 15:01:17 2004
From: jhylton at gmail.com (Jeremy Hylton)
Date: Thu Sep 23 15:01:30 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae040923012645bc07f8@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
Message-ID: <e8bf7a5304092306013bc6801b@mail.gmail.com>

On Thu, 23 Sep 2004 04:26:58 -0400, Tim Peters <tim.peters@gmail.com> wrote:
> >>> x = [1]
> >>> x.extend(-y for y in x)

It is perhaps surprising that something lazy can work so hard.

Jeremy
From python at rcn.com  Thu Sep 23 17:33:00 2004
From: python at rcn.com (Raymond Hettinger)
Date: Thu Sep 23 17:34:14 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae040923012645bc07f8@mail.gmail.com>
Message-ID: <001901c4a182$9c44ee00$e841fea9@oemcomputer>

> >>> x = [1]
> >>> x.extend(-y for y in x)

In comparison, the classic form doesn't seem as magical:

    x = [1]
    for y in x:
        x.append(-y)


Raymond

From FBatista at uniFON.com.ar  Thu Sep 23 17:58:39 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Thu Sep 23 18:03:21 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
Message-ID: <A128D751272CD411BC9200508BC2194D053C79C4@escpl.tcp.com.ar>

[Raymond Hettinger]

#- In comparison, the classic form doesn't seem as magical:
#- 
#-     x = [1]
#-     for y in x:
#-         x.append(-y)
#- 

The eternal, inherent risk of modifying the iterable being iterated over.
Who hasn't fallen into this at least once?

.	Facundo
From goodger at python.org  Thu Sep 23 18:32:38 2004
From: goodger at python.org (David Goodger)
Date: Thu Sep 23 18:32:58 2004
Subject: [Python-Dev] Re: A cute new way to get an infinite loop
In-Reply-To: <1f7befae040923012645bc07f8@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
Message-ID: <4152FAA6.70201@python.org>

[Tim Peters]
> >>> x = [1]
> >>> x.extend(-y for y in x)

Not quite infinite, since eventually it will raise a MemoryError.
So "while 1:" still rules that roost.  ;-)

-- 
David Goodger <http://python.net/~goodger>

From tim.peters at gmail.com  Thu Sep 23 19:39:56 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep 23 19:40:01 2004
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects
	floatobject.c, 2.132, 2.133
In-Reply-To: <20040923094502.GA10207@vicky.ecs.soton.ac.uk>
References: <E1CAOcm-0007yb-Ir@sc8-pr-cvs1.sourceforge.net>
	<20040923094502.GA10207@vicky.ecs.soton.ac.uk>
Message-ID: <1f7befae040923103916aba29b@mail.gmail.com>

[Armin Rigo]
> Your float.richcompare patch, trying to map the C semantics at the Python
> level, introduces artificial results when comparing NaN's with longs:

Not really.  All Python behavior in the presence of NaNs was
accidental before.  That it remains accidental was noted in the
checkin comment, and in an XXX block in the new code.  The specific
form of accidents may or may not have changed, depending on platform.

> >>> float('nan') > 0

And it remains an accident that float('nan') didn't raise ValueError
on whatever box you're using (it does, e.g., on mine).

> False
> >>> float('nan') > 0L
> True
>
> I am not aware of all the problems and various platforms, but clearly in the
> patch 'vsign' by itself doesn't make much sense if 'v' is a NaN.

Right, it makes no sense.

> Wouldn't all compilers and platforms compare NaNs "strangely", for some
> detectable definition of "strange"?

Yes.  Some may even raise SIGFPE if you try; that was also true before
the patch.

>  Something along the lines of:
>
> #define Py_IS_NAN(v)  (!Py_IS_INFINITY(v)  &&          \
>                       ( ((v) < 0.0 && (v) > 0.0) ||   \
>                         !((v) < 1.0 || (v) > -1.0) ))

As the new code says,

		/* XXX If we had a reliable way to check whether i is a
		 * XXX NaN, it would belong in this branch too.
		 */

The best candidate for 2.4 may be:

    #define Py_IS_NAN(v) ((v) != (v))

That works under MS VC 7.1, but didn't work under VC 6.0 (which is why
the "for 2.4" qualifier -- Python on Windows is switching to 7.1 for
2.4).  If someone can confirm that it works under recent gcc too,
let's do that.

Nothing exists that will work on all platforms, but all platforms
claiming to support 754 have *some* way to spell "true iff a NaN, and
don't raise SIGFPE just because I'm asking".  C99 spells that
isnan(x), from math.h.  MS C doesn't have that, but does have
_isnan(x), from float.h.  That's the maddening part -- it's easy to
spell on any specific platform, but nothing about the spelling
(neither name nor header file) is the same across platforms.
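
The self-inequality test is easy to try at the Python level.  This is
only a sketch, and it assumes a modern interpreter, where float('nan')
is accepted on every platform and math.isnan() is the portable
spelling; neither was true of 2.4:

```python
import math

def is_nan(v):
    """Python-level analogue of the proposed ((v) != (v)) macro."""
    return v != v

nan = float("nan")

# Ordered comparisons and equality involving a NaN are all false,
# whether the other operand is a float, a small int or a huge int.
comparisons = (nan > 0, nan < 0, nan == nan, nan > 10**400)

agrees_with_math = is_nan(nan) == math.isnan(nan)
```

An infinity compares equal to itself, so is_nan() does not need the
separate Py_IS_INFINITY guard used in the longer macro.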
From tim.peters at gmail.com  Thu Sep 23 20:11:34 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep 23 20:11:42 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <5f3d2c310409230546693ced87@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<5f3d2c310409230546693ced87@mail.gmail.com>
Message-ID: <1f7befae0409231111171029d2@mail.gmail.com>

[Marek Baczek Baczyński]
> Doesn't it leak memory when Ctrl+C'd (on Windows at least?)

Not really.  "Leak" is reserved for cases where memory is unaccounted
for.  In this case, the memory is consumed by the ever-growing list:

>>> x = [1]
>>> x.extend(-y for y in x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <generator expression>
KeyboardInterrupt
>>> len(x)
67090195
>>> x[:10]
[1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
>>>

At that point, doing

>>> del x[:]

reclaimed a few hundred megabytes.
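
For anyone who wants to watch the growth without reaching for Ctrl+C,
a bounded variant can be sketched with itertools.islice (the cap of 9
items is arbitrary):

```python
from itertools import islice

x = [1]
# The generator reads from x while extend() appends to x, so it never
# runs dry on its own; islice() cuts it off after 9 items.
x.extend(islice((-y for y in x), 9))
```

After this, x holds the same alternating prefix as the x[:10] shown
above.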
From cben at users.sf.net  Thu Sep 23 19:24:35 2004
From: cben at users.sf.net (Beni Cherniavsky)
Date: Thu Sep 23 21:10:30 2004
Subject: [Python-Dev] Re: A cute new way to get an infinite loop
In-Reply-To: <1f7befae040923012645bc07f8@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
Message-ID: <civ0t1$s77$1@sea.gmane.org>

Tim Peters wrote:
> >>> x = [1]
> >>> x.extend(-y for y in x)
>
A simpler way:

 >>> x = [1, -1]
 >>> x.extend(iter(x))

Curiously, this didn't "work" before 2.4 either:

 >>> x = [1]
 >>> x.extend(iter(x))
 >>> x
[1, 1]

The iterator did see the new elements after the extend call but not
during it:

 >>> x = [1]
 >>> i = iter(x)
 >>> x.extend(x)
 >>> list(i)
[1, 1]
 >>> x = [1]
 >>> i = iter(x)
 >>> x.extend([list(i)])
 >>> x
[1, [1]]

The reason is that in 2.3 `listextend()` passed the right argument
through `PySequence_Fast` which copied it before beginning to extend
the list.

It's much better now.  I mean it!  Bugs should be predictable.
Infinite loops should never terminate silently.  Unless explicitly
terminated.
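
The 2.3 behaviour described above can still be had on request by taking
the snapshot explicitly.  A small sketch (the variable names are
illustrative only):

```python
# Build the right-hand side completely before extend() starts appending.
x = [1, -1]
x.extend([-y for y in x])      # list comprehension: finished up front

y = [1]
y.extend(tuple(y))             # an explicit copy works the same way

# Extending a list with itself is handled specially and is also safe.
z = [1, 2]
z.extend(z)
```

Only a lazy iterator over the list being extended (a generator
expression or iter(x)) keeps seeing the newly appended items.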

From tjreedy at udel.edu  Fri Sep 24 02:31:04 2004
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Sep 24 02:31:16 2004
Subject: [Python-Dev] Re: A cute new way to get an infinite loop
References: <1f7befae040923012645bc07f8@mail.gmail.com>
Message-ID: <civpsc$plr$1@sea.gmane.org>


"Tim Peters" <tim.peters@gmail.com> wrote in message 
news:1f7befae040923012645bc07f8@mail.gmail.com...
> >>> x = [1]
> >>> x.extend(-y for y in x)

Very similar to this old way (2.2 and I presume before):
>>> l=[1]
>>> for i in l: l.append(i)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyboardInterrupt
>>> len(l)
1623613

but admittedly a bit more baroque ;-)

So, are things like this a programming bug, interpreter bug, or language 
definition bug?  or just a 'gotcha'?

Terry J. Reedy


From tdelaney at avaya.com  Fri Sep 24 03:17:06 2004
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Fri Sep 24 03:17:12 2004
Subject: [Python-Dev] Re: A cute new way to get an infinite loop
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE4A901D@au3010avexu1.global.avaya.com>

Terry Reedy wrote:

> So, are things like this a programming bug, interpreter bug, or
> language definition bug?  or just a 'gotcha'?

Gotcha. In pretty much every language, you have to be careful about
modifying what you're iterating over. I don't see that Python should be
any different ;)

However, in Tim's example it is a bit less obvious that you are modifying
the thing you're iterating over ... Hence the "cuteness" IMO.

Tim Delaney
From tim.peters at gmail.com  Fri Sep 24 06:14:56 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Fri Sep 24 06:14:58 2004
Subject: [Python-Dev] Re: A cute new way to get an infinite loop
In-Reply-To: <civpsc$plr$1@sea.gmane.org>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<civpsc$plr$1@sea.gmane.org>
Message-ID: <1f7befae040923211420e5905c@mail.gmail.com>

[Terry Reedy]
> Very similar to this old way (2.2 and I presume before):

Been there forever, yes.

> >>> l=[1]
> >>> for i in l: l.append(i)
> ...
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
> KeyboardInterrupt
> >>> len(l)
> 1623613
>
> but admittedly a bit more baroque ;-)
>
> So, are things like this a programming bug, interpreter bug, or language
> definition bug?  or just a 'gotcha'?

They're features, provoked into revealing their dark sides by pilot
error.  It's not an accident that I posted my note right after
checking in a new test, in test_long.py, containing:

       cases.extend([-x for x in cases])

I will not admit that it didn't always contain the square brackets. 
And if I won't admit that, I *sure* won't admit that I initially
feared hairy new code for mixed float-vs-long comparison contained an
infinite loop <wink>.

never-getting-an-infinite-loop-is-a-symptom-of-not-trying-hard-enough-ly
y'rs  - tim
From imbaczek at gmail.com  Fri Sep 24 11:41:50 2004
From: imbaczek at gmail.com (Marek "Baczek" Baczyński)
Date: Fri Sep 24 11:41:53 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae0409231111171029d2@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<5f3d2c310409230546693ced87@mail.gmail.com>
	<1f7befae0409231111171029d2@mail.gmail.com>
Message-ID: <5f3d2c3104092402412b891a11@mail.gmail.com>

On Thu, 23 Sep 2004 14:11:34 -0400, Tim Peters <tim.peters@gmail.com> wrote:
> [Marek Baczek Baczyński]
> > Doesn't it leak memory when Ctrl+C'd (on Windows at least?)
> 
> Not really.  "Leak" is reserved for cases where memory is unaccounted
> for.  In this case, the memory is consumed by the ever-growing list:
[...]

I realized that the moment after I pressed 'Send'; felt so embarrassed
that I hoped no one would see that post :)

Next time I'll think. Twice.

-- 
{ Marek Baczyński :: UIN 57114871 :: GG 161671 :: JID imbaczek@jabber.gda.pl  }
{ http://www.vlo.ids.gda.pl/ | imbaczek at poczta fm | http://www.promode.org }
.. .. .. .. ... ... ...... evolve or face extinction ...... ... ... .. .. .. ..
From lists at hlabs.spb.ru  Fri Sep 24 16:10:21 2004
From: lists at hlabs.spb.ru (Dmitry Vasiliev)
Date: Fri Sep 24 12:02:19 2004
Subject: [Python-Dev] Methods identity...?
Message-ID: <41542ACD.5080307@hlabs.spb.ru>

Is this intended? Seems like a bug...

(Python 2.1.3, 2.2.2, 2.3.4, 2.4a3, both old- and new- style classes.)

 >>> class Test(object):
...     def test(self): pass
...
 >>> Test.test is Test.test
False
 >>> t = Test()
 >>> t.test is t.test
False

-- 
Dmitry Vasiliev (dima at hlabs.spb.ru)
     http://hlabs.spb.ru
From aahz at pythoncraft.com  Fri Sep 24 16:11:05 2004
From: aahz at pythoncraft.com (Aahz)
Date: Fri Sep 24 16:11:07 2004
Subject: [Python-Dev] Methods identity...?
In-Reply-To: <41542ACD.5080307@hlabs.spb.ru>
References: <41542ACD.5080307@hlabs.spb.ru>
Message-ID: <20040924141105.GA3062@panix.com>

On Fri, Sep 24, 2004, Dmitry Vasiliev wrote:
>
> Is this intended? Seems like a bug...
> 
> (Python 2.1.3, 2.2.2, 2.3.4, 2.4a3, both old- and new- style classes.)
> 
> >>> class Test(object):
> ...     def test(self): pass
> ...
> >>> Test.test is Test.test
> False
> >>> t = Test()
> >>> t.test is t.test
> False

Not a bug.  For more discussion, please post to comp.lang.python
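
The short version, as a sketch: every attribute access on an instance
builds a fresh bound-method object, so identity differs while equality
holds.  The code assumes Python 3, where Test.test is the plain
function (so the class-level comparison is true there, unlike in the
2.x session quoted above):

```python
class Test:
    def test(self):
        pass

t = Test()
first, second = t.test, t.test           # two separate attribute lookups

distinct_objects = first is not second   # each lookup made a new wrapper
equal_methods = first == second          # same function, same instance
same_function = first.__func__ is second.__func__
same_instance = first.__self__ is second.__self__

# Python 3 only: no unbound-method wrapper is created on class access.
class_level_identity = Test.test is Test.test
```

Comparing with == (or comparing __func__ and __self__ directly) is the
reliable way to ask whether two bound methods refer to the same thing.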
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From gerrit at nl.linux.org  Fri Sep 24 17:00:56 2004
From: gerrit at nl.linux.org (Gerrit)
Date: Fri Sep 24 17:02:01 2004
Subject: [Python-Dev] Methods identity...?
In-Reply-To: <20040924141105.GA3062@panix.com>
References: <41542ACD.5080307@hlabs.spb.ru> <20040924141105.GA3062@panix.com>
Message-ID: <20040924150056.GA4343@nl.linux.org>

Aahz wrote:
> On Fri, Sep 24, 2004, Dmitry Vasiliev wrote:
> >
> > Is this intended? Seems like a bug...
> > 
> > (Python 2.1.3, 2.2.2, 2.3.4, 2.4a3, both old- and new- style classes.)
> > 
> > >>> class Test(object):
> > ...     def test(self): pass
> > ...
> > >>> Test.test is Test.test
> > False
> > >>> t = Test()
> > >>> t.test is t.test
> > False
> 
> Not a bug.  For more discussion, please post to comp.lang.python

Or search the archives, I recall having brought this up on c.l.py once.

Gerrit.

-- 
Weather in Twenthe, Netherlands 24/09 16:25:
	13.0°C light rain showers; Cumulonimbus clouds observed mostly cloudy wind 5.8 m/s WNW (57 m above NAP)
-- 
In the councils of government, we must guard against the acquisition of
unwarranted influence, whether sought or unsought, by the
military-industrial complex. The potential for the disastrous rise of
misplaced power exists and will persist.
    -Dwight David Eisenhower, January 17, 1961
From ndbecker2 at verizon.net  Fri Sep 24 21:59:44 2004
From: ndbecker2 at verizon.net (Neal D. Becker)
Date: Fri Sep 24 21:59:49 2004
Subject: [Python-Dev] python.sty conflict with \newcommand\url
Message-ID: <cj1ubg$53e$1@sea.gmane.org>

I hope this is the correct place to post this question.

I'm trying to use python.sty to write some docs for my modules.  If I try to
use the hyperref package, I get this:
(/usr/share/texmf/tex/latex/html/url.sty

! LaTeX Error: Command \url already defined.
               Or name \end... illegal, see p.192 of the manual.


This is what python.sty says:

% Use this def/redef approach for \url{} since hyperref defined this already,
% but only if we actually used hyperref:
\ifpdf
  \newcommand{\url}[1]{{%

The comment suggests a workaround for hyperref, but it doesn't look like the
code actually matches the comment.  Any ideas?


From python at dynkin.com  Sat Sep 25 05:33:39 2004
From: python at dynkin.com (George Yoshida)
Date: Sat Sep 25 05:32:46 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae040923012645bc07f8@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
Message-ID: <4154E713.3090206@dynkin.com>

Tim Peters wrote:
 >>>>x = [1]
 >>>>x.extend(-y for y in x)

It does not always go into an infinite loop. I was bitten by this:

   >>> x = []
   >>> x.extend(-y for y in x)
   Segmentation fault


George
From bob at redivi.com  Sat Sep 25 05:36:10 2004
From: bob at redivi.com (Bob Ippolito)
Date: Sat Sep 25 05:36:52 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <4154E713.3090206@dynkin.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<4154E713.3090206@dynkin.com>
Message-ID: <0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>

On Sep 24, 2004, at 11:33 PM, George Yoshida wrote:

> Tim Peters wrote:
> >>>>x = [1]
> >>>>x.extend(-y for y in x)
>
> It does not always go into an infinite loop. I was bitten by this:
>
>   >>> x = []
>   >>> x.extend(-y for y in x)
>   Segmentation fault

No algorithm that requires infinite memory will run for an infinite 
amount of time on a finite computer.  Of course it should raise an 
exception instead of segfaulting though.. could it be blowing the 
stack?

-bob
From tim.peters at gmail.com  Sat Sep 25 06:41:37 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sat Sep 25 06:41:40 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<4154E713.3090206@dynkin.com>
	<0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>
Message-ID: <1f7befae0409242141ebdcf83@mail.gmail.com>

[George Yoshida]
>> It does not always go into an infinite loop. I was bitten by this:
>>
>>  >>> x = []
>>  >>> x.extend(-y for y in x)
>>  Segmentation fault

[Bob Ippolito]
> No algorithm that requires infinite memory will run for an infinite
> amount of time on a finite computer.  Of course it should raise an
> exception instead of segfaulting though.. could it be blowing the
> stack?

No, its stack use is bounded (and small) no matter how long it runs. 
On Windows it eventually raises MemoryError.  My guess is that George
is using Linux.  "It's a feature" that the Linux malloc() can lie (==
malloc(n) can return a non-NULL value p even if you're going to get a
segfault if you try to write to p+i for some i in range(n)).  Linus
likens this to airlines over-selling seats, based on the likelihood
that someone will miss their flight.  Argue with him <wink>.  When
malloc() claims to return memory that can't actually be used, there's
not much Python can do about that (other than blow up when trying to
use it).
From python at dynkin.com  Sat Sep 25 07:09:56 2004
From: python at dynkin.com (George Yoshida)
Date: Sat Sep 25 07:09:04 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae0409242141ebdcf83@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<4154E713.3090206@dynkin.com>
	<0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>
	<1f7befae0409242141ebdcf83@mail.gmail.com>
Message-ID: <4154FDA4.7090401@dynkin.com>

Tim Peters wrote:

> On Windows it eventually raises MemoryError.  My guess is that George
> is using Linux.  

That's right!

$ uname -a
Linux linux 2.6.5-7.108-smp #1 SMP Wed Aug 25 13:34:40 UTC 2004 i686 i686 i386 GNU/Linux


George
From ronaldoussoren at mac.com  Sat Sep 25 13:56:52 2004
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sat Sep 25 13:57:58 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae0409242141ebdcf83@mail.gmail.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<4154E713.3090206@dynkin.com>
	<0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>
	<1f7befae0409242141ebdcf83@mail.gmail.com>
Message-ID: <FD8F5350-0EE9-11D9-8EF1-000A95C77748@mac.com>


On 25-sep-04, at 6:41, Tim Peters wrote:

> [George Yoshida]
>>> It does not always go into an infinite loop. I was bitten by this:
>>>
>>>>>> x = []
>>>>>> x.extend(-y for y in x)
>>>  Segmentation fault
>
> [Bob Ippolito]
>> No algorithm that requires infinite memory will run for an infinite
>> amount of time on a finite computer.  Of course it should raise an
>> exception instead of segfaulting though.. could it be blowing the
>> stack?
>
> No, its stack use is bounded (and small) no matter how long it runs.

I get a bus error on OSX (although with a slightly out of date 
python2.4 from CVS).

Why should this loop at all? x is the empty list, and the generator 
comprehension should therefore end up with an empty sequence. It's not 
like your initial example where the list was non-empty at the start.

It crashes because of a Py_INCREF(item) at line 2727 in listobject.c 
where item is NULL:

2722            assert(PyList_Check(seq));
2723
2724            if (it->it_index < PyList_GET_SIZE(seq)) {
2725                    item = PyList_GET_ITEM(seq, it->it_index);
2726                    ++it->it_index;
2727                    Py_INCREF(item);
2728                    return item;
2729            }
2730
2731            Py_DECREF(seq);

BTW, seq is NULL as well.

From arigo at tunes.org  Sat Sep 25 16:07:23 2004
From: arigo at tunes.org (Armin Rigo)
Date: Sat Sep 25 16:12:38 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<4154E713.3090206@dynkin.com>
	<0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>
Message-ID: <20040925140723.GA13511@vicky.ecs.soton.ac.uk>

Hi Bob,

On Fri, Sep 24, 2004 at 11:36:10PM -0400, Bob Ippolito wrote:
> >  >>> x = []
> >  >>> x.extend(-y for y in x)
> >  Segmentation fault
> 
> No algorithm that requires infinite memory will run for an infinite 
> amount of time on a finite computer.

The segfault is immediate.  And the example is different, as Ronald pointed
out: the list 'x' is empty!

Uh oh.  We have a real bug in listextend(): the list being extended is in a
semi-invalid state when it's calling tp_iternext() on the 2nd iterable.  This
might call back Python code, which can inspect the list.  The above example
does just that.  Crash.

"Semi-invalid" means that all invariants are respected but the final items in
the list are NULL.  Reading them crashes.  And I'm not even talking about the
nasty things you can do if you modify the list while it's being extended :-)

The safest solution would be to use a regular app1() to add each item as the
iterable produces them instead of optimizing this case.  I'm not sure we need
the high-flying optimization of listextend() in this case (this is the case
where the iterable we extend the list with is neither a list nor a tuple).  I
believe that the speed of app1() would be acceptable, given the fixed bug and
the overall decrease of code complexity (though that should be measured).


Armin
From tim.peters at gmail.com  Sat Sep 25 18:23:57 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sat Sep 25 18:24:00 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <20040925140723.GA13511@vicky.ecs.soton.ac.uk>
References: <1f7befae040923012645bc07f8@mail.gmail.com>
	<4154E713.3090206@dynkin.com>
	<0AC4DE7F-0EA4-11D9-A344-000A95686CD8@redivi.com>
	<20040925140723.GA13511@vicky.ecs.soton.ac.uk>
Message-ID: <1f7befae040925092332989585@mail.gmail.com>

[Armin Rigo, on
 >>> x = []
 >>> x.extend(-y for y in x)
 Segmentation fault
]

> The segfault is immediate.  And the example is different, as Ronald pointed
> out: the list 'x' is empty!

Good eye!  I overlooked that too.

> Uh oh.  We have a real bug in listextend(): the list being extended is in a
> semi-invalid state when it's calling tp_iternext() on the 2nd iterable.  This
> might call back Python code, which can inspect the list.  The above example
> does just that.  Crash.
>
> "Semi-invalid" means that all invariants are respected but the final items in
> the list are NULL.  Reading them crashes.  And I'm not even talking about the
> nasty things you can do if you modify the list while it's being extended :-)

Yup.  The code doesn't check for C int overflow of m+n either.

> The safest solution would be to use a regular app1() to add each item as the
> iterable produces them instead of optimizing this case.  I'm not sure we need
> the high-flying optimization of listextend() in this case (this is the case
> where the iterable we extend the list with is neither a list nor a tuple).  I
> believe that the speed of app1() would be acceptable, given the fixed bug and
> the overall decrease of code complexity (though that should be measured).

I think it's easy to fix.  "The usual rule" applies:  you can't assume
anything about a mutable object after potentially calling back into
Python.  So trying to save info in "i", "m", or "n" across loop
iterations can't work, and the list can never be left in an insane
state ("semi" or not) at any time user code may get invoked.  But
since we have both "num allocated" and "num used" members in the list
struct now, it's easy to use those instead of trying to carry info in
locals.
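
A toy model of that idea in Python 3, offered as a sketch rather than as a transcription of the C code. ModelList, slots, used and size_hint are names invented for this illustration; None stands in for a NULL slot. The point is that the "num used" count only ever covers filled slots, and that both the iterator and extend() re-read it instead of trusting values saved before user code ran:

```python
class ModelList:
    """Toy model of a C list: a slot array plus a separate count of used slots."""

    def __init__(self, items=()):
        self.slots = list(items)      # allocated storage; None models a NULL slot
        self.used = len(self.slots)   # number of valid items ("num used")

    def __iter__(self):
        i = 0
        # Re-read self.used on every step instead of caching it up front.
        while i < self.used:
            yield self.slots[i]
            i += 1

    def extend(self, iterable, size_hint=0):
        # Make room up front, but leave self.used alone so the list never
        # advertises slots that have not been filled yet.
        self.slots.extend([None] * size_hint)
        for item in iterable:             # may run arbitrary Python code
            if self.used < len(self.slots):
                self.slots[self.used] = item
            else:
                self.slots.append(item)
            self.used += 1
        # Cut the storage back if the initial guess was too large.
        del self.slots[self.used:]


x = ModelList()
x.extend((-y for y in x), size_hint=8)   # the crashing example, in model form
print(x.used, x.slots)

y = ModelList([1, 2])
y.extend(iter([3, 4]), size_hint=5)
print(list(y))
```

In the model, the generator sees an empty list even while eight slots are reserved, so the first call leaves x empty.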

Patch attached.  Anyone object?  Of course in the example at the start
of this msg, it leaves x empty.
-------------- next part --------------
Index: Objects/listobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/listobject.c,v
retrieving revision 2.223
diff -c -u -r2.223 listobject.c
--- Objects/listobject.c	12 Sep 2004 19:53:07 -0000	2.223
+++ Objects/listobject.c	25 Sep 2004 16:14:33 -0000
@@ -769,12 +769,20 @@
 	}
 	m = self->ob_size;
 	mn = m + n;
-	if (list_resize(self, mn) == -1)
-		goto error;
-	memset(&(self->ob_item[m]), 0, sizeof(*self->ob_item) * n);
+	if (mn >= m) {
+		/* Make room. */
+		if (list_resize(self, mn) == -1)
+			goto error;
+		/* Make the list sane again. */
+		self->ob_size = m;
+	}
+	/* Else m + n overflowed; on the chance that n lied, and there really
+	 * is enough room, ignore it.  If n was telling the truth, we'll
+	 * eventually run out of memory during the loop.
+	 */
 
 	/* Run iterator to exhaustion. */
-	for (i = m; ; i++) {
+	for (;;) {
 		PyObject *item = iternext(it);
 		if (item == NULL) {
 			if (PyErr_Occurred()) {
@@ -785,8 +793,11 @@
 			}
 			break;
 		}
-		if (i < mn)
-			PyList_SET_ITEM(self, i, item); /* steals ref */
+		if (self->ob_size < self->allocated) {
+			/* steals ref */
+			PyList_SET_ITEM(self, self->ob_size, item);
+			++self->ob_size;
+		}
 		else {
 			int status = app1(self, item);
 			Py_DECREF(item);  /* append creates a new ref */
@@ -796,10 +807,9 @@
 	}
 
 	/* Cut back result list if initial guess was too large. */
-	if (i < mn && self != NULL) {
-		if (list_ass_slice(self, i, mn, (PyObject *)NULL) != 0)
-			goto error;
-	}
+	if (self->ob_size < self->allocated)
+		list_resize(self, self->ob_size);  /* shrinking can't fail */
+
 	Py_DECREF(it);
 	Py_RETURN_NONE;
 
From python at rcn.com  Sat Sep 25 20:21:39 2004
From: python at rcn.com (Raymond Hettinger)
Date: Sat Sep 25 20:23:20 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae040925092332989585@mail.gmail.com>
Message-ID: <000301c4a32c$809bd000$e841fea9@oemcomputer>

> Patch attached.  Anyone object?  Of course in the example at the start
> of this msg, it leaves x empty.

If you can hold off one day, I'll have time to review it in detail
tomorrow morning.

And, I'll check to see if other parts of the code base are similarly
afflicted.


Raymond

From python at rcn.com  Sat Sep 25 21:17:15 2004
From: python at rcn.com (Raymond Hettinger)
Date: Sat Sep 25 21:18:57 2004
Subject: [Python-Dev] More data points
In-Reply-To: <20040925140723.GA13511@vicky.ecs.soton.ac.uk>
Message-ID: <000a01c4a334$44b29e40$e841fea9@oemcomputer>

[Bob Ippolito]
> > >  >>> x = []
> > >  >>> x.extend(-y for y in x)
> > >  Segmentation fault

I get a MemoryError.

To help get a comprehensive view when I look at this more closely
tomorrow, can you try out variations on the theme with other mutables:

  myset.update 
  deque.extend
  dict.update
  dict.fromkeys
  array.extend


Raymond

From ncoghlan at iinet.net.au  Sun Sep 26 01:47:26 2004
From: ncoghlan at iinet.net.au (ncoghlan@iinet.net.au)
Date: Sun Sep 26 01:47:32 2004
Subject: [Python-Dev] More data points
In-Reply-To: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
Message-ID: <1096156046.4156038e396cd@mail.iinet.net.au>

Quoting Raymond Hettinger <python@rcn.com>:

> [Bob Ippolito]
> > > >  >>> x = []
> > > >  >>> x.extend(-y for y in x)
> > > >  Segmentation fault
> 
> I get a MemoryError.
> 
> To help get a comprehensive view when I look at this more closely
> tomorrow, can you try out variations on the theme with other mutables:
> 
>   myset.update 
>   deque.extend
>   dict.update
>   dict.fromkeys
>   array.extend

Short answer: all of these work OK for me (i.e. do nothing). Only list.extend
suffers from the segmentation fault.

Session transcripts (with bonus X's to trick mailreaders):

[...@localhost src]$ ./python
Python 2.4a3 (#16, Sep 21 2004, 17:33:57)
[GCC 3.4.1 20040702 (Red Hat Linux 3.4.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
X>> x = []
X>> x.extend(-y for y in x)
Segmentation fault
[...@localhost src]$ ./python
Python 2.4a3 (#16, Sep 21 2004, 17:33:57)
[GCC 3.4.1 20040702 (Red Hat Linux 3.4.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
X>> x = set()
X>> x.update(-y for y in x)
X>> x
set([])
X>> from collections import deque
X>> x = deque()
X>> x.extend(-y for y in x)
X>> x
deque([])
X>> x = {}
X>> x.update(-y for y in x)
X>> x
{}
X>> x.fromkeys(-y for y in x)
{}
X>> from array import array
X>> x = array('B')
X>> x.extend(-y for y in x)
X>> x
array('B')


From ncoghlan at iinet.net.au  Sun Sep 26 07:28:28 2004
From: ncoghlan at iinet.net.au (ncoghlan@iinet.net.au)
Date: Sun Sep 26 07:28:34 2004
Subject: [Python-Dev] More data points
In-Reply-To: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
Message-ID: <1096176508.4156537c95cca@mail.iinet.net.au>

Quoting Raymond Hettinger <python@rcn.com>:
> To help get a comprehensive view when I look at this more closely
> tomorrow, can you try out variations on the theme with other mutables:
> 
>   myset.update 
>   deque.extend
>   dict.update
>   dict.fromkeys
>   array.extend

Returning to Tim's original infinite loop, the behaviour is interestingly variable.

List and array go into the infinite loop. Deque and dictionary both detect that
the loop variable has been mutated and throw a specific exception. Set throws
the same exception as dictionary does (presumably, the main container inside
'set' is a dictionary)

Details of behaviour:

Python 2.4a3 (#16, Sep 21 2004, 17:33:57)
[GCC 3.4.1 20040702 (Red Hat Linux 3.4.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
X>> x = [1]
X>> x.extend(-y for y in x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <generator expression>
KeyboardInterrupt
X>> len(x)
73727215
X>> x = set([1])
X>> x
set([1])
X>> x.update(-y for y in x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <generator expression>
RuntimeError: dictionary changed size during iteration
X>> x
set([1, -1])
X>> from collections import deque
X>> x = deque([1])
X>> x.extend(-y for y in x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <generator expression>
RuntimeError: deque changed size during iteration
X>> x
deque([1, -1])
X>> from array import array
X>> x = array('b', '1')
X>> x.extend(-y for y in x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <generator expression>
KeyboardInterrupt
X>> len(x)
6327343
X>> x = dict.fromkeys([1])
X>> x
{1: None}
X>> x.update((-y, None) for y in x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 1, in <generator expression>
RuntimeError: dictionary changed size during iteration
X>> x
{1: None, -1: None}
X>> x.fromkeys(-y for y in x)
{-1: None}
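
The same survey can be sketched as a script. This version assumes Python 3 (the transcript above is 2.4a3) and uses itertools.islice to cap the list and array cases at five items so that nothing needs a KeyboardInterrupt; the outcome() helper is made up for this note. Exception messages differ between versions, so only the exception type is recorded:

```python
from array import array
from collections import deque
from itertools import islice


def outcome(action):
    """Run action() and report whether it raised RuntimeError."""
    try:
        action()
    except RuntimeError:
        return "RuntimeError"
    return "no error"


s = {1}
d = {1: None}
q = deque([1])
results = {
    "set": outcome(lambda: s.update(-y for y in s)),
    "dict": outcome(lambda: d.update((-y, None) for y in d)),
    "deque": outcome(lambda: q.extend(-y for y in q)),
}

# list and array keep growing, so islice() caps the generator at five items.
lst = [1]
lst.extend(islice((-y for y in lst), 5))
arr = array("b", [1])
arr.extend(islice((-y for y in arr), 5))

print(results)
print(lst, arr.tolist())
```

In each of the three failing cases one new item is added before the container's iterator notices the change.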


From tim.peters at gmail.com  Sun Sep 26 07:50:50 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Sun Sep 26 07:50:54 2004
Subject: [Python-Dev] More data points
In-Reply-To: <1096176508.4156537c95cca@mail.iinet.net.au>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
Message-ID: <1f7befae040925225047b6d3f3@mail.gmail.com>

[ncoghlan@iinet.net.au]
> Returning to Tim's original infinite loop, the behaviour is interestingly variable.
>
> List and array go into the infinite loop.

What happens when you mutate a list while iterating over it is
defined, and an infinite loop is expected for that.  Ditto for array.

> Deque and dictionary both detect that the loop variable has been mutated and
> throw a specific exception.

That's because they never suffered from list's ill-advised
documentation effectively blessing mutation while iterating <0.5
wink>.

> Set throws the same exception as dictionary does (presumably, the main
> container inside 'set' is a dictionary)
>
> Details of behaviour:

The last one is extremely surprising:

> Python 2.4a3 (#16, Sep 21 2004, 17:33:57)
> [GCC 3.4.1 20040702 (Red Hat Linux 3.4.1-2)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.

...

> >>> x
> {1: None, -1: None}
> >>> x.fromkeys(-y for y in x)
> {-1: None}

Are you sure you get that?  I get this:

>>> x
{1: None, -1: None}
>>> x.fromkeys(-y for y in x)
{1: None, -1: None}

"x.fromkeys()" doesn't have anything to do with x.  Any dict works the same there:

>>> {}.fromkeys(-y for y in x)
{1: None, -1: None}
>>> {'a': 'b', 'c': 'd', 'e': 'f'}.fromkeys(-y for y in x)
{1: None, -1: None}
>>>
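
A short Python 3 sketch of the same point, with variable names chosen for this note: fromkeys() is a classmethod, so the dict it is called on contributes nothing to the result. Key order in the printed dicts may differ from the 2.4 session, which is why the comparison uses equality:

```python
x = {1: None, -1: None}

# fromkeys() is a classmethod: the receiving dict's contents are ignored.
from_empty = {}.fromkeys(-y for y in x)
from_other = {"a": "b", "c": "d"}.fromkeys(-y for y in x)
from_class = dict.fromkeys(-y for y in x)

print(from_empty == from_other == from_class == {1: None, -1: None})
print(x)   # x itself is only read, never modified
```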
From ncoghlan at iinet.net.au  Sun Sep 26 08:22:02 2004
From: ncoghlan at iinet.net.au (ncoghlan@iinet.net.au)
Date: Sun Sep 26 08:22:08 2004
Subject: [Python-Dev] More data points
In-Reply-To: <1f7befae040925225047b6d3f3@mail.gmail.com>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
Message-ID: <1096179722.4156600a2c219@mail.iinet.net.au>

Quoting Tim Peters <tim.peters@gmail.com>:

> That's because they never suffered from list's ill-advised
> documentation effectively blessing mutation while iterating <0.5
> wink>.

Ah. Interesting to know. So catching this is recommended when it's feasible?

> > Set throws the same exception as dictionary does (presumably, the main
> > container inside 'set' is a dictionary)
> >
> > Details of behaviour:
> 
> The last one is extremely surprising:

And it never actually happened, either. It's a transcription error on my part. I
made a mistake when testing the dict.update version (I wrote "-y for y in x",
instead of "(-y, None) for y in x"). When deleting that from the transcript, I
also accidentally deleted the x.fromkeys() example. When I added that example
back in, I put it in the wrong spot (after the x.update example, instead of
before it).

So, no, dict.update isn't randomly eating dictionary entries. Sorry 'bout the
false alarm. . .

Cheers,
Nick.
From raynorj at mn.rr.com  Sun Sep 26 08:44:22 2004
From: raynorj at mn.rr.com (J Raynor)
Date: Sun Sep 26 08:28:18 2004
Subject: [Python-Dev] using openssh's pty code
Message-ID: <41566546.7020601@mn.rr.com>


Since openssh must handle pty allocation, its support for pty operations 
across various platforms is more robust than python's.  I'd like to use 
openssh's code to improve on python's pty handling.

I know the licenses for openssh and python are different.  Can anyone 
tell me if it's legal to mix openssh code into python?  Assuming it is, 
are the python maintainers willing to accept a python patch that 
contains some openssh code?


From ncoghlan at iinet.net.au  Sun Sep 26 08:28:51 2004
From: ncoghlan at iinet.net.au (ncoghlan@iinet.net.au)
Date: Sun Sep 26 08:28:57 2004
Subject: [Python-Dev] More data points
In-Reply-To: <1096179722.4156600a2c219@mail.iinet.net.au>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
Message-ID: <1096180131.415661a362685@mail.iinet.net.au>

Quoting "ncoghlan@iinet.net.au" <ncoghlan@iinet.net.au>:
> So, no, dict.update isn't randomly eating dictionary entries.

And neither is dict.fromkeys, for that matter (which was what my copy-and-paste
error actually showed).

Cheers,
Nick.
With this sort of error rate, it's a good thing I'm not coding right now. . .


From python at rcn.com  Sun Sep 26 12:17:34 2004
From: python at rcn.com (Raymond Hettinger)
Date: Sun Sep 26 12:18:45 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <1f7befae040925092332989585@mail.gmail.com>
Message-ID: <000501c4a3b2$0ab29b40$e841fea9@oemcomputer>

> I think it's easy to fix.  "The usual rule" applies:  you can't assume
> anything about a mutable object after potentially calling back into
> Python.  So trying to save info in "i", "m", or "n" across loop
> iterations can't work, and the list can never be left in an insane
> state ("semi" or not) at any time user code may get invoked.  But
> since we have both "num allocated" and "num used" members in the list
> struct now, it's easy to use those instead of trying to carry info in
> locals.

FWIW, I've searched the codebase and found no other variants on this
problem.  None of the other update/extend methods try to remember self
data between iterations.  Other calls to list_resize immediately fill-in
the NULLS before calling arbitrary Python code.  And, other places that
use the over-allocation trick, map() for example, are working with a
brand new list or tuple that has not been exposed to the rest of the
application.

One situation did look suspect.  _PySequence_IterSearch() remembers an
index/count across calls to PyIter_Next() -- it looks like the worst
that could happen is the index or count would be wrong, but no crashers.


> Patch attached.  Anyone object?  Of course in the example at the start
> of this msg, it leaves x empty.

Looks good.  Reads well. Solves the problem.  The timings are still
fast.  The test suite runs w/o exception.  Please apply.


Raymond

From jepler at unpythonic.net  Sun Sep 26 15:32:57 2004
From: jepler at unpythonic.net (Jeff Epler)
Date: Sun Sep 26 15:33:02 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <41566546.7020601@mn.rr.com>
References: <41566546.7020601@mn.rr.com>
Message-ID: <20040926133257.GA2645@unpythonic.net>

A year or so ago, it was suggested that we take some code from glib for
string-to-float conversion(?).  As far as I remember, after the license
issues were resolved, the remaining issue was that the contributor was
not himself familiar with the code.  I don't know what eventually
happened.  You might look for this thread in python-dev archives.  I
think this is an entry point into that thread:
    http://mail.python.org/pipermail/python-dev/2003-August/037744.html

Jeff
From martin at v.loewis.de  Sun Sep 26 17:17:38 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun Sep 26 17:17:37 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <41566546.7020601@mn.rr.com>
References: <41566546.7020601@mn.rr.com>
Message-ID: <4156DD92.2040300@v.loewis.de>

J Raynor wrote:
> 
> Since openssh must handle pty allocation, its support for pty operations 
> across various platforms is more robust than python's.  I'd like to use 
> openssh's code to improve on python's pty handling.
> 
> I know the licenses for openssh and python are different.  Can anyone 
> tell me if it's legal to mix openssh code into python?  Assuming it is, 
> are the python maintainers willing to accept a python patch that 
> contains some openssh code?

Could you change Python's pty module to more closely follow the
procedures in OpenSSH, in particular those parts where OpenSSH
is more robust?

Regards,
Martin
From fredrik at pythonware.com  Sun Sep 26 14:06:30 2004
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Sep 26 18:47:57 2004
Subject: [Python-Dev] Re: python/dist/src/Lib httplib.py,1.88,1.89
Message-ID: <200409261204.i8QC4Qk29680@pythonware.com>


> +++ httplib.py	14 Sep 2004 17:55:21 -0000	1.89
> @@ -525,7 +525,8 @@
>      def _set_hostport(self, host, port):
>          if port is None:
>              i = host.rfind(':')
> -            if i >= 0:
> +            j = host.rfind(']')         # ipv6 addresses have [...]
> +            if i > j:

one-line alternative:

		i = host.find(":", host.rfind("]"))

</F>
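
A sketch of the check from the quoted patch, pulled out into a standalone Python 3 function for illustration. split_hostport and default_port are hypothetical names, not httplib's API, and the real method does more than this. The last two assignments probe the one-line alternative: str.find() treats its start argument like a slice index, so when there is no "]" the start of -1 appears to restrict the search to the final character, and adding 1 to the rfind() result avoids that:

```python
def split_hostport(host, default_port=80):
    """Split 'host:port', leaving bracketed IPv6 literals intact."""
    i = host.rfind(":")
    j = host.rfind("]")          # ipv6 addresses have [...]
    if i > j:
        # The last colon comes after any closing bracket: it separates the port.
        return host[:i], int(host[i + 1:])
    return host, default_port


print(split_hostport("www.python.org:8080"))
print(split_hostport("[::1]:8080"))
print(split_hostport("[::1]"))
print(split_hostport("www.python.org"))

host = "www.python.org:8080"
one_liner = host.find(":", host.rfind("]"))       # start == -1 here
adjusted = host.find(":", host.rfind("]") + 1)    # start == 0 here
print(one_liner, adjusted == host.index(":"))
```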

From raynorj at mn.rr.com  Sun Sep 26 20:46:41 2004
From: raynorj at mn.rr.com (J Raynor)
Date: Sun Sep 26 20:30:28 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <4156DD92.2040300@v.loewis.de>
References: <41566546.7020601@mn.rr.com> <4156DD92.2040300@v.loewis.de>
Message-ID: <41570E91.2070503@mn.rr.com>


I think I could improve the pty module by having it follow openssh's 
procedures, but I would wind up rewriting several configure checks in 
python, and I imagine some of them can only reliably be checked by 
compiling a small C program, like configure does.

I think the better solution would be to modify the C code in 
posixmodule.c, or to provide an alternate module (written in C).  For 
the alternate module idea, the pty module could import it and check to 
see if it provides openpty() (for example), just as the pty module 
currently tries to use os.openpty() before it tries its own 
implementation of openpty().
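
Roughly, the lookup could go like this. This is a Python 3 sketch under stated assumptions: find_openpty is a made-up helper, and the two SimpleNamespace objects are stand-ins for "a module without openpty()" and "a module with it", not real modules:

```python
import types


def find_openpty(providers):
    """Return the first callable openpty() found among providers, else None."""
    for provider in providers:
        candidate = getattr(provider, "openpty", None)
        if callable(candidate):
            return candidate
    return None


# Hypothetical stand-ins for an alternate C module and for os.
without_pty = types.SimpleNamespace()
with_pty = types.SimpleNamespace(openpty=lambda: (3, 4))

chosen = find_openpty([without_pty, with_pty])
print(chosen())                      # the stand-in's fake (master, slave) pair
print(find_openpty([without_pty]))   # nothing suitable: caller falls back
```

In real use the list would hold the alternate module first and os second, with the pure-Python implementation as the final fallback when None comes back.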


Martin v. Löwis wrote:
> J Raynor wrote:
> 
>>
>> Since openssh must handle pty allocation, its support for pty 
>> operations across various platforms is more robust than python's.  I'd 
>> like to use openssh's code to improve on python's pty handling.
>>
>> I know the licenses for openssh and python are different.  Can anyone 
>> tell me if it's legal to mix openssh code into python?  Assuming it 
>> is, are the python maintainers willing to accept a python patch that 
>> contains some openssh code?
> 
> 
> Could you change Python's pty module to more closely follow the
> procedures in OpenSSH, in particular those parts where OpenSSH
> is more robust?
> 
> Regards,
> Martin
> 
From raynorj at mn.rr.com  Sun Sep 26 20:48:01 2004
From: raynorj at mn.rr.com (J Raynor)
Date: Sun Sep 26 20:31:45 2004
Subject: [Fwd: Re: [Python-Dev] using openssh's pty code]
Message-ID: <41570EE1.1010404@mn.rr.com>

I forgot to CC the list with my response.
-------------- next part --------------
An embedded message was scrubbed...
From: J Raynor <raynorj@mn.rr.com>
Subject: Re: [Python-Dev] using openssh's pty code
Date: Sun, 26 Sep 2004 13:28:38 -0500
Size: 1596
Url: http://mail.python.org/pipermail/python-dev/attachments/20040926/34315f35/Python-Devusingopensshsptycode.mht
From martin at v.loewis.de  Sun Sep 26 20:49:55 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Sep 26 20:49:53 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <41570E91.2070503@mn.rr.com>
References: <41566546.7020601@mn.rr.com> <4156DD92.2040300@v.loewis.de>
	<41570E91.2070503@mn.rr.com>
Message-ID: <41570F53.1010203@v.loewis.de>

J Raynor wrote:
> I think the better solution would be to modify the C code in 
> posixmodule.c, or to provide an alternate module (written in C).  For 
> the alternate module idea, the pty module could import it and check to 
> see if it provides openpty() (for example), just as the pty module 
> currently tries to use os.openpty() before it tries its own 
> implementation of openpty().

Either would be fine. For the separate-module approach, I strongly
advise that you publish this separately first, and collect user
feedback. If a sufficient number of users would like to see it included
in Python, and if you volunteer to maintain the module within Python
for an extended period of time, we can include it.

Regards,
Martin

From gvanrossum at gmail.com  Sun Sep 26 23:01:47 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun Sep 26 23:01:50 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <41570E91.2070503@mn.rr.com>
References: <41566546.7020601@mn.rr.com> <4156DD92.2040300@v.loewis.de>
	<41570E91.2070503@mn.rr.com>
Message-ID: <ca471dc2040926140140b98de@mail.gmail.com>

On Sun, 26 Sep 2004 13:46:41 -0500, J Raynor <raynorj@mn.rr.com> wrote:
> 
> I think I could improve the pty module by having it follow openssh's
> procedures, but I would wind up rewriting several configure checks in
> python, and I imagine some of them can only reliably be checked by
> compiling a small C program, like configure does.
> 
> I think the better solution would be to modify the C code in
> posixmodule.c, or to provide an alternate module (written in C).  For
> the alternate module idea, the pty module could import it and check to
> see if it provides openpty() (for example), just as the pty module
> currently tries to use os.openpty() before it tries its own
> implementation of openpty().

Agreed that this would be best served by writing C code. I hope that
it can be done without violating someone else's license *and* without
weighing down future Python distributions with someone else's license.
No matter how sensible the other license is, adding licenses to the
stack of licenses is not a good idea at this point.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From tim.peters at gmail.com  Mon Sep 27 00:06:00 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Mon Sep 27 00:06:02 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
In-Reply-To: <000501c4a3b2$0ab29b40$e841fea9@oemcomputer>
References: <1f7befae040925092332989585@mail.gmail.com>
	<000501c4a3b2$0ab29b40$e841fea9@oemcomputer>
Message-ID: <1f7befae040926150618b76b21@mail.gmail.com>

[Raymond Hettinger]
> ...
> One situation did look suspect.  _PySequence_IterSearch() remembers an
> index/count across calls to PyIter_Next() -- it looks like the worst
> that could happen is the index or count would be wrong, but no crashers.

If the operation is PY_ITERSEARCH_INDEX, n is the 0-based count of the
number of times the iterator got poked before the object was found. 
That's always correct, by definition (given that there's no guarantee
the iterator can be rewound and restarted, or even that it would yield
the same objects if it could be restarted, what else could "the index
of the first occurrence" mean?).

If the operation is PY_ITERSEARCH_COUNT, then n is the number of times
poking the iterator returned the object in question.  That's also
correct by definition of what PySequence_Count() means, although
there's again no guarantee that the user passes a sensible iterable
object (== one that would produce the same objects if crawled over a
second time).

So those are fine.
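
A minimal Python sketch of the two modes described above, not the C
implementation itself: iter_search, ITERSEARCH_COUNT and ITERSEARCH_INDEX
are hypothetical stand-ins for _PySequence_IterSearch() and its
PY_ITERSEARCH_* constants, and plain == is assumed for the comparison.

```python
ITERSEARCH_COUNT = 1   # hypothetical stand-ins for the C-level constants
ITERSEARCH_INDEX = 2


def iter_search(iterable, obj, operation):
    """Walk the iterable once; return a hit count or the first hit's index."""
    n = 0
    for item in iterable:
        if item == obj:
            if operation == ITERSEARCH_INDEX:
                return n      # number of pokes made before obj showed up
            n += 1            # COUNT mode: one more hit
        elif operation == ITERSEARCH_INDEX:
            n += 1            # INDEX mode: one more miss before the hit
    if operation == ITERSEARCH_INDEX:
        raise ValueError("object not found")
    return n


hits = iter_search(iter([5, 7, 5, 9]), 5, ITERSEARCH_COUNT)
first = iter_search(iter([5, 7, 5, 9]), 9, ITERSEARCH_INDEX)
```

In both modes n only ever describes what this single pass over the
iterator produced, which is why it is correct by definition.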

Thanks for checking the others, and for checking in a test and the fix!
From tim.peters at gmail.com  Mon Sep 27 00:33:43 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Mon Sep 27 00:33:47 2004
Subject: [Python-Dev] More data points
In-Reply-To: <1096179722.4156600a2c219@mail.iinet.net.au>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
Message-ID: <1f7befae040926153369d0d50e@mail.gmail.com>

[Tim]
>> That's because they never suffered from list's ill-advised
>> documentation effectively blessing mutation while iterating <0.5 wink>.

[Nick]
> Ah. Interesting to know. So catching this is recommended when it's feasible?

According to me, but perhaps not according to all.  You can work very
hard to provide predictable semantics for mutation while iterating, by
defining cursor objects that somehow retain sensible guarantees even
if the object they point into mutates.  In effect, "the current index"
is a cursor in this respect when iterating over a list, and the
semantics are that "the current index", on each iteration, goes up by
one, and is an offset from the start of whatever state the list
happens to have at that time.  So, e.g., this behavior is guaranteed:

>>> x = range(10)
>>> for elt in x:
...     x.remove(elt)
>>> x
[1, 3, 5, 7, 9]
>>>

"Guaranteed" doesn't necessarily mean unsurprising, or even useful,
though.  I do have uses for this behavior, but I'd be happy to give
them up.
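
The "current index as cursor" semantics can be sketched with an explicit
generator.  This is an illustration only, assuming the behaviour described
above; iterate_like_a_list is a hypothetical name, and list(range(10)) is
used so the sketch also runs where range() does not return a list.

```python
def iterate_like_a_list(seq):
    """Yield seq[0], seq[1], ... re-checking the current length each step."""
    index = 0
    while index < len(seq):
        yield seq[index]      # offset from the start of the list as it is now
        index += 1            # the cursor always advances by exactly one


x = list(range(10))
for elt in iterate_like_a_list(x):
    x.remove(elt)
print(x)   # [1, 3, 5, 7, 9]
```

Each removal shifts the remaining items left by one while the cursor moves
right by one, so every other element is skipped.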

The "natural" behavior of dicts when mutating while iterating is
effectively unexplainable -- it "does whatever it does", based on
internal details of the hashed distribution of keys into buckets, and
even on the history of insertions (which affects hash collision
resolution).  I'm glad Python gripes about that now (it didn't
always).

It would also be possible, but difficult, to implement "sane"
iteration+mutation semantics for dicts.  A dict cursor object would
need to be aware of which objects had and hadn't already been passed
out by the iteration, and would even need to be robust against the
dict reorganizing itself completely when it changes size.

It's a lot easier all around to say "if you have to, iterate over a
snapshot of the keys".  In some cases, we're reduced to saying that
with no way to catch violations.  ZODB's BTrees are a good example
here.  People routinely get in trouble by mutating them while
iterating over them, but the implementation is such that it would be
very difficult to detect such a thing.
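
A small sketch of the "iterate over a snapshot of the keys" advice for
dicts.  The function names are hypothetical, and the RuntimeError shown is
what current CPython raises when a dict changes size during iteration.

```python
def mutate_while_iterating(d):
    """Delete keys while iterating the dict itself; report if Python griped."""
    try:
        for key in d:
            del d[key]
    except RuntimeError:
        return True
    return False


def drop_odd_values(d):
    """Delete entries safely by iterating over a snapshot of the keys."""
    for key in list(d):       # list(d) copies the keys before any mutation
        if d[key] % 2:
            del d[key]
    return d


griped = mutate_while_iterating({"a": 1, "b": 2, "c": 3})
cleaned = drop_odd_values({"a": 1, "b": 2, "c": 3})
```

The snapshot costs one extra list of keys, but the loop no longer depends
on how the dict reorganizes itself.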
From ilya at bluefir.net  Mon Sep 27 03:46:00 2004
From: ilya at bluefir.net (Ilya Sandler)
Date: Mon Sep 27 03:47:47 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <1f7befae040926153369d0d50e@mail.gmail.com>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
	<1f7befae040926153369d0d50e@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0409261732490.13308@bagira>


A problem:

a number of standard python modules come with command line interfaces,
e.g.  pydoc.py, pdb.py , unittest.py, timeit.py, uu.py
But it appears that there is no convenient out-of-the-box way to invoke
these tools from command line...

Basically one either has to write wrappers or to
invoke them like this: python /usr/lib/python2.3/pdb.py

Neither approach is convenient...

Am I missing something obvious? If not, then would the following make
sense?

When a script specified from command line is not found and the script name
does not end with py, treat the script as a module name and execute
that module as __main__

So
python pdb
would be equivalent to
python /usr/lib/python2.3/pdb.py

A possible variation of the same idea would be to have an explicit
command line option -m (or -M). More typing, but less magic...
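
The lookup step of this idea (turning a module name into the file that
would be run) can be sketched as follows.  This assumes a present-day
Python with importlib.util, which did not exist when this was written;
module_script_path is a hypothetical helper name.

```python
import importlib.util


def module_script_path(name):
    """Return the source file that a module name resolves to on sys.path."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ImportError("no module named %r" % name)
    return spec.origin


# The path "python pdb" (or "python -m pdb") would have to locate and run.
print(module_script_path("pdb"))
```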

Ilya

PS. An obvious alternative would be to install wrapper
scripts/symlinks next to python, but I don't understand python packaging
well enough to make a judgement here. One obvious problem with wrapper
scripts would be versioning: I wouldn't want to have
pydoc2.2, pydoc2.3.1, pydoc2.3, etc. in my /usr/bin


From skip at pobox.com  Mon Sep 27 03:59:58 2004
From: skip at pobox.com (Skip Montanaro)
Date: Mon Sep 27 04:00:09 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <Pine.LNX.4.58.0409261732490.13308@bagira>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
	<1f7befae040926153369d0d50e@mail.gmail.com>
	<Pine.LNX.4.58.0409261732490.13308@bagira>
Message-ID: <16727.29726.557153.219020@montanaro.dyndns.org>


    Ilya> a number of standard python modules come with command line
    Ilya> interfaces, e.g.  pydoc.py, pdb.py , unittest.py, timeit.py, uu.py
    Ilya> But it appears that there is no convenient out-of-the-box way to
    Ilya> invoke these tools from command line...

    Ilya> Basically one either has to write wrappers or to
    Ilya> invoke them like this: python /usr/lib/python2.3/pdb.py

    Ilya> Neither approach is convenient...

    Ilya> Am I missing something obvious?

Search for "Scripts to install" in the setup.py file that comes with the
Python distribution.  If there are other scripts you'd like to see
installed, just submit a patch for setup.py.

Skip

From martin at v.loewis.de  Mon Sep 27 06:48:50 2004
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon Sep 27 06:48:48 2004
Subject: [Fwd: Re: [Python-Dev] using openssh's pty code]
In-Reply-To: <41570EE1.1010404@mn.rr.com>
References: <41570EE1.1010404@mn.rr.com>
Message-ID: <41579BB2.4020709@v.loewis.de>

> Well, I'm not sure how that applies.  I didn't see any mention of 
> licenses in the thread you pointed me to, but even if that thread (or 
> some other one) showed that it was ok to use glib code in python, that 
> doesn't mean I can put openssh code into python because the glib and 
> openssh licenses are different.

My personal view is that we can only accept code contributions from
the original author, in general. There have been exceptions in the
past, when we have shipped a wrapper for a library along with the
source code of the library, however, there should be a very good reason
for that, e.g. that the functionality in the library is unique.

In the specific case, I do believe it would be better to write
a pty module from scratch instead of adjusting openssh code.

Regards,
Martin
From raynorj at mn.rr.com  Mon Sep 27 08:57:24 2004
From: raynorj at mn.rr.com (J Raynor)
Date: Mon Sep 27 08:41:08 2004
Subject: [Fwd: Re: [Python-Dev] using openssh's pty code]
In-Reply-To: <41579BB2.4020709@v.loewis.de>
References: <41570EE1.1010404@mn.rr.com> <41579BB2.4020709@v.loewis.de>
Message-ID: <4157B9D4.2060401@mn.rr.com>


The code that I would borrow from openssh basically states that you can 
use it if you include in your derived work the copyright notice and 
disclaimer found in the file you want to borrow from.  This sounds like 
it would pose no problems for incorporating into python, but I'm no 
expert on this, so I thought I'd ask.

Looking at some of the python source, I can see that there are several 
files that contain just such notices.  For example, from the Modules 
directory:

addrinfo.h
md5.h
regexpr.h
timing.h
_bsddb.c
getaddrinfo.c
_localemodule.c
parsermodule.c
syslogmodule.c


Perhaps my original question led you to believe that the openssh license 
was unusual, or had problematic clauses in it.  Given the somewhat 
clarified description above of what's required to borrow openssh code, 
do you still have reservations about receiving patches containing it?


Martin v. Löwis wrote:
>> Well, I'm not sure how that applies.  I didn't see any mention of 
>> licenses in the thread you pointed me to, but even if that thread (or 
>> some other one) showed that it was ok to use glib code in python, that 
>> doesn't mean I can put openssh code into python because the glib and 
>> openssh licenses are different.
> 
> 
> My personal view is that we can only accept code contributions from
> the original author, in general. There have been exceptions in the
> past, when we have shipped a wrapper for a library along with the
> source code of the library, however, there should be a very good reason
> for that, e.g. that the functionality in the library is unique.
> 
> In the specific case, I do believe it would be better to write
> a pty module from scratch instead of adjusting openssh code.
> 
> Regards,
> Martin
> 
From jim at zope.com  Mon Sep 27 10:59:44 2004
From: jim at zope.com (Jim Fulton)
Date: Mon Sep 27 10:59:49 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <Pine.LNX.4.58.0409261732490.13308@bagira>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>	<1096176508.4156537c95cca@mail.iinet.net.au>	<1f7befae040925225047b6d3f3@mail.gmail.com>	<1096179722.4156600a2c219@mail.iinet.net.au>	<1f7befae040926153369d0d50e@mail.gmail.com>
	<Pine.LNX.4.58.0409261732490.13308@bagira>
Message-ID: <4157D680.4040809@zope.com>

Ilya Sandler wrote:
> A problem:
> 
> a number of standard python modules come with command line interfaces,
> e.g.  pydoc.py, pdb.py , unittest.py, timeit.py, uu.py
> But it appears that there is no convenient out-of-the-box way to invoke
> these tools from command line...
> 
> Basically one either has to write wrappers or to
> invoke them like this: python /usr/lib/python2.3/pdb.py
> 
> Neither approach is convenient...
> 
> Am I missing something obvious? If not, then would the following make
> sense?
> 
> When a script specified from command line is not found and the script name
> does not end with py, treat the script as a module name and execute
> that module as __main__
> 
> So
> python pdb
> would be equivalent to
> python /usr/lib/python2.3/pdb.py
> 
> A possible variation of the same idea would be to have an explicit
> command line option -m (or -M). More typing, but less magic...

+1 on the -m command-line variation, with the following change:

   I'd like Python to import the module and then run its main function.

I've been meaning to suggest something like this myself.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From jim at zope.com  Mon Sep 27 11:01:42 2004
From: jim at zope.com (Jim Fulton)
Date: Mon Sep 27 11:01:52 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <16727.29726.557153.219020@montanaro.dyndns.org>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>	<1096176508.4156537c95cca@mail.iinet.net.au>	<1f7befae040925225047b6d3f3@mail.gmail.com>	<1096179722.4156600a2c219@mail.iinet.net.au>	<1f7befae040926153369d0d50e@mail.gmail.com>	<Pine.LNX.4.58.0409261732490.13308@bagira>
	<16727.29726.557153.219020@montanaro.dyndns.org>
Message-ID: <4157D6F6.1000307@zope.com>

Skip Montanaro wrote:
...

> Search for "Scripts to install" in the setup.py file that comes with the
> Python distribution.  If there are other scripts you'd like to see
> installed, just submit a patch for setup.py.

But then the same file gets installed twice.  I'd really like something like
what Ilya suggested for the common case of files that are usually used as
modules but that also have a command-line interface.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From anthony at interlink.com.au  Mon Sep 27 11:22:31 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Sep 27 11:23:26 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <4157D680.4040809@zope.com>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>	<1096176508.4156537c95cca@mail.iinet.net.au>	<1f7befae040925225047b6d3f3@mail.gmail.com>	<1096179722.4156600a2c219@mail.iinet.net.au>	<1f7befae040926153369d0d50e@mail.gmail.com>	<Pine.LNX.4.58.04092
Message-ID: <4157DBD7.1090205@interlink.com.au>


> +1 on the -m command-line variation, with the following change:
> 
>   I'd like Python to import the module and then run its main function.
> 
> I've been meaning to suggest something like this myself.

I'd prefer it import the module, with __name__ == "__main__",
because it's compatible with what we do now for a module that's
also a script. But I like the idea, nonetheless.
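
The difference between a plain import and running with
__name__ == "__main__" can be sketched like this.  It assumes a present-day
Python where the runpy module exists (it did not at the time of this
thread); demo_tool is a hypothetical module written to a temporary
directory just for the demonstration.

```python
import importlib
import os
import runpy
import sys
import tempfile

# A tiny module that records whether it was run as a script.
SOURCE = 'ran_as_script = (__name__ == "__main__")\n'

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_tool.py"), "w") as f:
        f.write(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Plain import: __name__ is "demo_tool", so a script guard stays off.
        imported = importlib.import_module("demo_tool")
        # Run as a script: the same code sees __name__ == "__main__".
        executed = runpy.run_module("demo_tool", run_name="__main__")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_tool", None)

print(imported.ran_as_script)      # False
print(executed["ran_as_script"])   # True
```

Running the module as __main__ reuses the existing
"if __name__ == '__main__':" convention, so no separate main() protocol is
needed.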

Question: should python -m foo.bar.baz work? I'd say "yes".

Anthony

From jim at zope.com  Mon Sep 27 11:32:24 2004
From: jim at zope.com (Jim Fulton)
Date: Mon Sep 27 11:32:27 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <4157DBD7.1090205@interlink.com.au>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>	<1096176508.4156537c95cca@mail.iinet.net.au>	<1f7befae040925225047b6d3f3@mail.gmail.com>	<1096179722.4156600a2c219@mail.iinet.net.au>	<1f7befae040926153369d0d50e@mail.gmail.com>	<Pine.LNX.4.58.04092
	<4157DBD7.1090205@interlink.com.au>
Message-ID: <4157DE28.5020209@zope.com>

Anthony Baxter wrote:
> 
>> +1 on the -m command-line variation, with the following change:
>>
>>   I'd like Python to import the module and then run its main function.
>>
>> I've been meaning to suggest something like this myself.
> 
> 
> I'd prefer it import the module, with __name__ == "__main__",
> because it's compatible with what we do now for a module that's
> also a script. But I like the idea, nonetheless.
> 
> Question: should python -m foo.bar.baz work? I'd say "yes".

Me too.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From lists at hlabs.spb.ru  Mon Sep 27 16:12:32 2004
From: lists at hlabs.spb.ru (Dmitry Vasiliev)
Date: Mon Sep 27 12:04:24 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <Pine.LNX.4.58.0409261732490.13308@bagira>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>	<1096176508.4156537c95cca@mail.iinet.net.au>	<1f7befae040925225047b6d3f3@mail.gmail.com>	<1096179722.4156600a2c219@mail.iinet.net.au>	<1f7befae040926153369d0d50e@mail.gmail.com>
	<Pine.LNX.4.58.0409261732490.13308@bagira>
Message-ID: <41581FD0.7090501@hlabs.spb.ru>

Ilya Sandler wrote:
> A problem:
> 
> a number of standard python modules come with command line interfaces,
> e.g.  pydoc.py, pdb.py , unittest.py, timeit.py, uu.py
> But it appears that there is no convenient out-of-the-box way to invoke
> these tools from command line...
> 
> Basically one either has to write wrappers or to
> invoke them like this: python /usr/lib/python2.3/pdb.py
> 
> Neither approach is convenient...
> 
> Am I missing something obvious? If not, then would the following make
> sense?
> 
> When a script specified from command line is not found and the script name
> does not end with py, treat the script as a module name and execute
> that module as __main__
> 
> So
> python pdb
> would be equivalent to
> python /usr/lib/python2.3/pdb.py
> 
> A possible variation of the same idea would be to have an explicit
> command line option -m (or -M). More typing, but less magic...

There has already been some discussion about importing from the command line:
http://mail.python.org/pipermail/python-dev/2003-December/041240.html

I suggested the following:

1. python -p package

     Equivalent to:

     import package

2. python -p package.zip

     Equivalent to:

     import sys
     sys.path.insert(0, "package.zip")
     import package
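
A runnable sketch of the second variant, importing a package straight from
a zip archive placed on sys.path.  The archive and package name (demo_pkg)
are hypothetical and are created in a temporary directory only for the
demonstration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "demo_pkg.zip")
    # Build a zip that contains a one-file package.
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("demo_pkg/__init__.py", "greeting = 'hello from the zip'\n")
    sys.path.insert(0, archive)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("demo_pkg")
        greeting = package.greeting
    finally:
        sys.path.remove(archive)
        sys.modules.pop("demo_pkg", None)

print(greeting)
```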

-- 
Dmitry Vasiliev (dima at hlabs.spb.ru)
     http://hlabs.spb.ru
From alex.nanou at gmail.com  Mon Sep 27 15:19:13 2004
From: alex.nanou at gmail.com (Alex A. Naanou)
Date: Mon Sep 27 15:19:16 2004
Subject: Fwd: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <36f8892204092706186f8a277f@mail.gmail.com>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
	<1f7befae040926153369d0d50e@mail.gmail.com>
	<Pine.LNX.4.58.0409261732490.13308@bagira>
	<41581FD0.7090501@hlabs.spb.ru>
	<36f8892204092706186f8a277f@mail.gmail.com>
Message-ID: <36f88922040927061974e56a66@mail.gmail.com>

---------- Forwarded message ----------
From: Alex A. Naanou <alex.nanou@gmail.com>
Date: Mon, 27 Sep 2004 17:18:20 +0400
Subject: Re: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
To: dima@hlabs.spb.ru

On Mon, 27 Sep 2004 14:12:32 +0000, Dmitry Vasiliev <lists@hlabs.spb.ru> wrote:
> Ilya Sandler wrote:
> > A problem:
> >
> > a number of standard python modules come with command line interfaces,
> > e.g.  pydoc.py, pdb.py , unittest.py, timeit.py, uu.py
> > But it appears that there is no convenient out-of-the-box way to invoke
> > these tools from command line...
> >
> > Basically one either has to write wrappers or to
> > invoke them like this: python /usr/lib/python2.3/pdb.py
> >
> > Neither approach is convenient...
> >
> > Am I missing something obvious? If not, then would the following make
> > sense?
> >
> > When a script specified from command line is not found and the script name
> > does not end with py, treat the script as a module name and execute
> > that module as __main__
> >
> > So
> > python pdb
> > would be equivalent to
> > python /usr/lib/python2.3/pdb.py
> >
> > A possible variation of the same idea would be to have an explicit
> > command line option -m (or -M). More typing, but less magic...
>
> There has already been some discussion about importing from the command line:
> http://mail.python.org/pipermail/python-dev/2003-December/041240.html
>
> I suggested the following:
>
> 1. python -p package
>
>     Equivalent to:
>
>     import package
>
> 2. python -p package.zip
>
>     Equivalent to:
>
>     import sys
>     sys.path.insert(0, "package.zip")
>     import package
>

this might not be good (IMHO), as:
1) this makes an implicit import (from the point of view of the
code: the import happens outside the code that uses it)...
2) it does not solve the problem at hand, as when a module is
imported its __name__ is no longer "__main__", thus its command-line
handler will not start...

--
Alex.
From alex.nanou at gmail.com  Mon Sep 27 15:20:00 2004
From: alex.nanou at gmail.com (Alex A. Naanou)
Date: Mon Sep 27 15:20:03 2004
Subject: Fwd: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <36f88922040927061163019d6@mail.gmail.com>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
	<1f7befae040926153369d0d50e@mail.gmail.com>
	<4157DBD7.1090205@interlink.com.au> <4157DE28.5020209@zope.com>
	<36f88922040927061163019d6@mail.gmail.com>
Message-ID: <36f8892204092706205d0a13a3@mail.gmail.com>

oops... forgot to CC the messages to python-dev ^_^


---------- Forwarded message ----------
From: Alex A. Naanou <alex.nanou@gmail.com>
Date: Mon, 27 Sep 2004 17:11:45 +0400
Subject: Re: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
To: jim@zope.com

On Mon, 27 Sep 2004 05:32:24 -0400, Jim Fulton <jim@zope.com> wrote:
> Anthony Baxter wrote:
> >
> >> +1 on the -m command-line variation, with the following change:
> >>
> >>   I'd like Python to import the module and then run its main function.
> >>
> >> I've been meaning to suggest something like this myself.
> >
> >
> > I'd prefer it import the module, with __name__ == "__main__",
> > because it's compatible with what we do now for a module that's
> > also a script. But I like the idea, nonetheless.
> >
> > Question: should python -m foo.bar.baz work? I'd say "yes".
>
> Me too.

Count me in too!

..though I must say I am against the variant with the main function,
as there is an accepted and widely used mechanism in python already
(__name__ == '__main__'), so why add another one or make the existing
mechanism more complex...

--
Alex.
From FBatista at uniFON.com.ar  Mon Sep 27 16:13:31 2004
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Sep 27 16:18:22 2004
Subject: [Python-Dev] A cute new way to get an infinite loop
Message-ID: <A128D751272CD411BC9200508BC2194D053C79DB@escpl.tcp.com.ar>

[Raymond Hettinger]

#- Looks good.  Reads well. Solves the problem.  The timings are still
#- fast.  The test suite runs w/o exception.

These should be remembered like "The 5 conditions for a good patch" (or
something).

.	Facundo
From niemeyer at conectiva.com  Mon Sep 27 16:13:19 2004
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Mon Sep 27 16:24:20 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <4156DD92.2040300@v.loewis.de>
References: <41566546.7020601@mn.rr.com> <4156DD92.2040300@v.loewis.de>
Message-ID: <20040927141319.GA3105@burma.localdomain>

Hello Martin,

> >Since openssh must handle pty allocation, its support for pty
> >operations across various platforms is more robust than python's.
> >I'd like to use openssh's code to improve on python's pty handling.
> >
> >I know the licenses for openssh and python are different.  Can anyone
> >tell me if it's legal to mix openssh code into python?  Assuming it
> >is, are the python maintainers willing to accept a python patch that
> >contains some openssh code?
> 
> Could you change Python's pty module to more closely follow the
> procedures in OpenSSH, in particular those parts where OpenSSH
> is more robust?

If he's going to copy/base his work on openssh, the openssh license
must surely be respected.

FWIW, that's the issue I was talking about when we discussed the
contributor agreement on the PSF list, regarding inclusion of code
with foreign licenses. On that occasion, you said a contributor
must not include code not authored by him, and cannot sign an
agreement on such code.

-- 
Gustavo Niemeyer
http://niemeyer.net
From ncoghlan at iinet.net.au  Mon Sep 27 16:32:36 2004
From: ncoghlan at iinet.net.au (ncoghlan@iinet.net.au)
Date: Mon Sep 27 16:32:47 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <4157DBD7.1090205@interlink.com.au>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
	<1f7befae040926153369d0d50e@mail.gmail.com>
	<Pine.LNX.4.58.04092 <4157DBD7.1090205@interlink.com.au>
Message-ID: <1096295556.4158248495acf@mail.iinet.net.au>

Quoting Anthony Baxter <anthony@interlink.com.au>:

> 
> > +1 on the -m command-line variation, with the following change:
> > 
> >   I'd like Python to import the module and then run its main function.
> > 
> > I've been meaning to suggest something like this myself.
> 
> I'd prefer it import the module, with __name__ == "__main__",
> because it's compatible with what we do now for a module that's
> also a script. But I like the idea, nonetheless.
> 
> Question: should python -m foo.bar.baz work? I'd say "yes".

I was curious how hard this would be to implement. Minus Anthony's addition, the
answer is "Not very". So those who are interested in the idea might want to take
a look at SF Patch #1035498.

The patch tries to make "./python -m pdb" mean the same thing as "./python
Lib/pdb.py" on a development build. (I use that example, because I have only a
very vague idea of where the pdb script ends up for an installed version of
Python - which is why I think this option would be very useful!)

Cheers,
Nick.

From wiedeman at gmx.net  Mon Sep 27 09:39:30 2004
From: wiedeman at gmx.net (Christoph Wiedemann)
Date: Mon Sep 27 16:46:37 2004
Subject: [Python-Dev] Py_NewInterpreter and PyGILState API
Message-ID: <29383.1096270770@www48.gmx.net>

Hello,

first of all, my apologies for sending this message to python-dev, but I
tried comp.lang.python and python-help and didn't get helpful answers.

My problem is that Py_NewInterpreter and the PyGILState API introduced in
2.3 don't play nicely with each other, especially in multithreaded embedding
applications. I think this is because the PyGILState functions assume there is
exactly one PyThreadState instance per thread, and this is violated when
using Py_NewInterpreter, which creates a new PyThreadState instance.

I've tried to use 

state = PyGILState_Ensure();
PyThreadState_Get()->interp = interpreterIWantToUse;
/* code using Python API */
PyGILState_Release(state);

which seems to work if called from one thread only, but fails if used
by multiple threads, with a "Fatal Python error: PyThreadState_Delete:
invalid tstate."

Now, most of you would say: "Don't use the PyGILState API, use the 2.2 way of
dealing with thread states". Unfortunately, I want to use PyQt, which uses
the PyGILState API, and I found that it's not easy (or even impossible?) to
mix PyGILState calls with PyEval_SaveThread / PyEval_RestoreThread.

I'm lost with this and would appreciate any help. I'm using Python 2.3 on
Linux x86.

Christoph

From barry at python.org  Mon Sep 27 16:51:28 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Sep 27 16:51:35 2004
Subject: [Python-Dev] a simpler way to invoke pydoc, pdb, unittest, etc
In-Reply-To: <Pine.LNX.4.58.0409261732490.13308@bagira>
References: <000a01c4a334$44b29e40$e841fea9@oemcomputer>
	<1096176508.4156537c95cca@mail.iinet.net.au>
	<1f7befae040925225047b6d3f3@mail.gmail.com>
	<1096179722.4156600a2c219@mail.iinet.net.au>
	<1f7befae040926153369d0d50e@mail.gmail.com>
	<Pine.LNX.4.58.0409261732490.13308@bagira>
Message-ID: <1096296687.23222.142.camel@geddy.wooz.org>

On Sun, 2004-09-26 at 21:46, Ilya Sandler wrote:

> When a script specified from command line is not found and the script name
> does not end with py, treat the script as a module name and execute
> that module as __main__
> 
> So
> python pdb
> would be equivalent to
> python /usr/lib/python2.3/pdb.py
> 
> A possible variation of the same idea would be to have an explicit
> command line option -m (or -M). More typing, but less magic...

With the command line switch, +1.

One problem with the "just install it" approach is that you often get
Python from downstream packagers that make their own decisions about
which additional scripts to include.  There's also namespace collision
issues in bin directories to deal with.  So Ilya's suggestion avoids
both of those problems.

-Barry

From raymond.hettinger at verizon.net  Mon Sep 27 18:35:07 2004
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon Sep 27 18:36:20 2004
Subject: [Python-Dev] Socket/Asyncore bug needs attention
Message-ID: <002001c4a4af$f3246fe0$e841fea9@oemcomputer>

Anyone who has worked on sockets or asyncore should take a look at SF
bug #1010098:  CPU usage shoots up with asyncore.  Since Py2.3, the
behavior changed for the worse.  The bug report has been around for
about five weeks and doesn't look like it is actively being solved.  If
you worked on those modules, please review your check-ins to see if they
were the cause:

www.python.org/sf/1010098


Thx,


Raymond

From tim.peters at gmail.com  Mon Sep 27 18:45:35 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Mon Sep 27 18:45:41 2004
Subject: [Python-Dev] Socket/Asyncore bug needs attention
In-Reply-To: <002001c4a4af$f3246fe0$e841fea9@oemcomputer>
References: <002001c4a4af$f3246fe0$e841fea9@oemcomputer>
Message-ID: <1f7befae040927094522854326@mail.gmail.com>

[Raymond Hettinger]
> Anyone who has worked on sockets or asyncore should take a look at SF
> bug #1010098:  CPU usage shoots up with asyncore.  Since Py2.3, the
> behavior changed for the worse.  The bug report has been around for
> about five weeks and doesn't look like it is actively being solved.  If
> you worked on those modules, please review your check-ins to see if they
> were the cause:
>
> www.python.org/sf/1010098

FYI, as I noted in a comment there, I stared at that one long enough
to determine that asyncore almost certainly wasn't to blame (despite
the OP's natural belief that it must be).  Instead something "above"
asyncore changed so that a socket *always* shows up as "ready to
write" in 2.4, but hardly ever in 2.3.  CPU usage is nailed to 100% as
a consequence in 2.4, while in 2.3 asyncore's select() times out
instead, consuming almost no CPU.
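
A small sketch of the mechanism, not of the asyncore code itself: a
connected socket with room in its send buffer is reported writable
immediately, while a socket with nothing to read makes select() wait out
its timeout.  socket.socketpair() is used here only to get a connected
pair without a network.

```python
import select
import socket

a, b = socket.socketpair()
try:
    # Nothing was sent to `a`, so it is not readable: select() sleeps for
    # the timeout and returns empty lists (almost no CPU used).
    readable, _, _ = select.select([a], [], [], 0.1)

    # Asking whether `a` is writable returns at once, so a polling loop
    # that always asks this never sleeps and pins the CPU.
    _, writable, _ = select.select([], [a], [], 0.1)
finally:
    a.close()
    b.close()

idle_when_only_reading = (readable == [])
busy_when_always_writing = (writable == [a])
```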
From arigo at tunes.org  Mon Sep 27 19:53:19 2004
From: arigo at tunes.org (Armin Rigo)
Date: Mon Sep 27 19:58:24 2004
Subject: [Python-Dev] Socket/Asyncore bug needs attention
In-Reply-To: <002001c4a4af$f3246fe0$e841fea9@oemcomputer>
References: <002001c4a4af$f3246fe0$e841fea9@oemcomputer>
Message-ID: <20040927175319.GA32385@vicky.ecs.soton.ac.uk>

Hello Raymond,

On Mon, Sep 27, 2004 at 12:35:07PM -0400, Raymond Hettinger wrote:
> Anyone who has worked on sockets or asyncore should take a look at SF
> bug #1010098:  CPU usage shoots up with asyncore.  (...)   If
> you worked on those modules, please review your check-ins to see if they
> were the cause:

Funnily enough, the check-in to blame is from you :-)  You replaced
asynchat.py's usage of fifo lists with collections.deque()s, but you overlooked
the test for emptiness, which was 'self.list == []'.  This is fine if
'self.list' is really a list, but not if it is a deque :-)

Fixed, checked in.
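
The pitfall is easy to reproduce.  The sketch below is an illustration,
not the actual asynchat.py change; is_empty is a hypothetical helper
showing an emptiness test that works for both container types.

```python
from collections import deque


def is_empty(fifo):
    """Truth-value test: works for lists, deques and other sized containers."""
    return not fifo


as_list = []
as_deque = deque()

print(as_list == [])        # True
print(as_deque == [])       # False: a deque never compares equal to a list
print(is_empty(as_list))    # True
print(is_empty(as_deque))   # True
```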


Armin
From python at rcn.com  Mon Sep 27 21:17:14 2004
From: python at rcn.com (Raymond Hettinger)
Date: Mon Sep 27 21:19:02 2004
Subject: [Python-Dev] Socket/Asyncore bug needs attention
In-Reply-To: <20040927175319.GA32385@vicky.ecs.soton.ac.uk>
Message-ID: <000e01c4a4c6$ad3c3320$e841fea9@oemcomputer>

> Funnily enough, the check-in to blame is from you 

Oh, for shame!


> Fixed, checked in.

Thanks a million.  


Raymond

From arigo at tunes.org  Mon Sep 27 22:05:33 2004
From: arigo at tunes.org (Armin Rigo)
Date: Mon Sep 27 22:10:39 2004
Subject: [Python-Dev] open('/dev/null').read() -> MemoryError
Message-ID: <20040927200533.GA29621@vicky.ecs.soton.ac.uk>

Hi,

On my system, which is admittedly an old Linux box (2.2 kernel), one test
fails:

>>> file('/dev/null').read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError

This is because:

>>> os.stat('/dev/null').st_size
4540321280L

This looks very broken indeed.  I have no idea where this number comes from.  
I'd also complain if I was asked to allocate a buffer large enough to hold
that many bytes.  If we cared, we could "enhance" the file.read() method to
account for the possibility that maybe stat() lied; maybe it is desirable,
instead of allocating huge amounts of memory, to revert to something like the
following above some large threshold:

result = []
while 1:
  buf = f.read(16384)
  if not buf:
    return ''.join(result)
  result.append(buf)

Of course for genuinely large reads it's a disaster to have to allocate twice
as much memory.  Anyway I'm not sure we care about going around broken
behaviour.  I'm just wondering if os.stat() could lie in other situations too.
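
A self-contained sketch of that fallback, assuming a binary-mode file
object.  read_all, SANE_LIMIT and CHUNK are hypothetical names, and the
threshold value is arbitrary.

```python
import io
import os

SANE_LIMIT = 64 * 1024 * 1024   # hypothetical threshold, chosen arbitrarily
CHUNK = 16384


def read_all(f):
    """Read f to EOF, trusting the reported size only below SANE_LIMIT."""
    try:
        size_hint = os.fstat(f.fileno()).st_size
    except (OSError, AttributeError):
        size_hint = None          # no usable file descriptor: do not trust
    if size_hint is not None and size_hint <= SANE_LIMIT:
        return f.read()           # plausible size: ordinary one-shot read
    # Suspicious or unknown size: grow the result chunk by chunk instead of
    # allocating one huge buffer up front.
    chunks = []
    while True:
        buf = f.read(CHUNK)
        if not buf:
            return b"".join(chunks)
        chunks.append(buf)


data = read_all(io.BytesIO(b"x" * 50000))
```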


Armin
From bob at redivi.com  Mon Sep 27 22:21:03 2004
From: bob at redivi.com (Bob Ippolito)
Date: Mon Sep 27 22:21:10 2004
Subject: [Python-Dev] open('/dev/null').read() -> MemoryError
In-Reply-To: <20040927200533.GA29621@vicky.ecs.soton.ac.uk>
References: <20040927200533.GA29621@vicky.ecs.soton.ac.uk>
Message-ID: <C1385FF5-10C2-11D9-80DA-000A95686CD8@redivi.com>


On Sep 27, 2004, at 4:05 PM, Armin Rigo wrote:

> On my system, which is admittedly an old Linux box (2.2 kernel), one 
> test
> fails:
>
>>>> file('/dev/null').read()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> MemoryError
>
> This is because:
>
>>>> os.stat('/dev/null').st_size
> 4540321280L
>
> This looks very broken indeed.  I have no idea where this number comes 
> from.
> I'd also complain if I was asked to allocate a buffer large enough to 
> hold
> that many bytes.  If we cared, we could "enhance" the file.read() 
> method to
> account for the possibility that maybe stat() lied; maybe it is 
> desirable,
> instead of allocating huge amounts of memory, to revert to something 
> like the
> following above some large threshold:
>
> result = []
> while 1:
>   buf = f.read(16384)
>   if not buf:
>     return ''.join(result)
>   result.append(buf)
>
> Of course for genuinely large reads it's a disaster to have to 
> allocate twice
> as much memory.  Anyway I'm not sure we care about going around broken
> behaviour.  I'm just wondering if os.stat() could lie in other 
> situations too.

file(path).read() is never really a good idea in the general case - 
especially for a device node.  It might never terminate and it will get 
a MemoryError for genuinely large files anyway, especially on 32-bit 
architectures.  People should be reading files in chunks or using mmap.
Is there really anything the runtime can or should do about this?

In other words, it sounds like the test should be fixed, not the 
implementation.
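
As a rough illustration of the mmap route, here is a sketch in Python 3
syntax (an assumption on my part; the function name count_newlines and
the temporary file are made up for the example).  It scans a regular
file without ever holding its contents in one string.  Note that this
only applies to regular, non-empty files, not to device nodes.

```python
import mmap
import os
import tempfile


def count_newlines(path):
    # Map the file read-only and scan it in place.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            count = 0
            pos = view.find(b"\n")
            while pos != -1:
                count += 1
                pos = view.find(b"\n", pos + 1)
            return count


# Demonstrate on a small throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"one\ntwo\nthree\n")
try:
    newline_count = count_newlines(tmp.name)
finally:
    os.unlink(tmp.name)
```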

-bob
From lunz at falooley.org  Mon Sep 27 23:33:45 2004
From: lunz at falooley.org (Jason Lunz)
Date: Tue Sep 28 00:00:55 2004
Subject: [Python-Dev] upcoming stable release?
Message-ID: <cja0vp$g5$1@sea.gmane.org>

I'm not a regular around here, so forgive me if this is obvious:

Can anyone give me an idea of when to expect the next 2.3 point release?
2.3.4 dates from May, and I have a vague idea that I need some
more-recent fix from the release23-maint branch.

[background: I have a python-gtk app that may be affected by this bug:
http://bugzilla.gnome.org/show_bug.cgi?id=149845. The changelog for the
debian unstable python2.3 package says:

python2.3 (2.3.4-5) unstable; urgency=medium

  * Updated to CVS release23-maint 20040705.
    - Remove threading patch, integrated upstream.

I have a vague idea that this may address the same issue, and if so, the
2.3.5 release (being based on release23-maint, I assume) will be safe to
use on all platforms, not just Debian sid.

If I were on debian unstable everything would be fine. Unfortunately, I
wish to also support windows, and on that platform I prefer to use
official python.org releases, which makes me wonder whether 2.3.5 is
imminent.]

thanks, Jason

From anthony at interlink.com.au  Tue Sep 28 02:40:09 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue Sep 28 02:40:39 2004
Subject: [Python-Dev] upcoming stable release?
In-Reply-To: <cja0vp$g5$1@sea.gmane.org>
References: <cja0vp$g5$1@sea.gmane.org>
Message-ID: <4158B2E9.9040204@interlink.com.au>

Jason Lunz wrote:
> I'm not a regular around here, so forgive me if this is obvious:
> 
> Can anyone give me an idea of when to expect the next 2.3 point release?
> 2.3.4 dates from May, and I have a vague idea that I need some
> more-recent fix from the release23-maint branch.

After 2.4 final.


-- 
Anthony Baxter     <anthony@interlink.com.au>
It's never too late to have a happy childhood.
From greg at electricrain.com  Tue Sep 28 03:33:20 2004
From: greg at electricrain.com (Gregory P. Smith)
Date: Tue Sep 28 03:33:32 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <41566546.7020601@mn.rr.com>
References: <41566546.7020601@mn.rr.com>
Message-ID: <20040928013320.GC1530@zot.electricrain.com>

On Sun, Sep 26, 2004 at 01:44:22AM -0500, J Raynor wrote:
> 
> Since openssh must handle pty allocation, its support for pty operations 
> across various platforms is more robust than python's.  I'd like to use 
> openssh's code to improve on python's pty handling.
> 
> I know the licenses for openssh and python are different.  Can anyone 
> tell me if it's legal to mix openssh code into python?  Assuming it is, 
> are the python maintainers willing to accept a python patch that 
> contains some openssh code?

Look at the openssh license.  Yes, you can use it.  It's BSD or better.

http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/LICENCE?rev=HEAD

From gvanrossum at gmail.com  Tue Sep 28 04:51:06 2004
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Sep 28 04:51:09 2004
Subject: [Python-Dev] Fwd: Python binding to Rendezvous
In-Reply-To: <B61692B2-10E2-11D9-B15D-000A95E596DC@mac.com>
References: <B61692B2-10E2-11D9-B15D-000A95E596DC@mac.com>
Message-ID: <ca471dc2040927195168783a6@mail.gmail.com>

I've been asked to help get Rendezvous (AKA zeroconf, I believe)
bindings for Python implemented. Anyone interested in helping out?
(The person to contact if you're interested is Daniel Steinberg.
Please cc me on the initial emails so I know contact is being made.)

--Guido

---------- Forwarded message ----------
From: Joey Trevino
Date: Mon, 27 Sep 2004 17:09:48 -0700
Subject: Python binding to Rendezvous
To: Guido van Rossum
Cc: Daniel H Steinberg

Hey Guido,

    Here's the description of the task from my friend Daniel Steinberg
(CC'd here) at O'Reilly.  I understand that you don't have time to do
the work yourself, but I'm hopeful that you know of someone who would.

>>>> Apple has just provided a daemon for Rendezvous for Mac, for
>>>> Windows, for Linux, and for UNIX. Rich Kilmer spent an afternoon
>>>> with Stuart Cheshire and wrote Ruby bindings for this daemon and so
>>>> Ruby developers can easily Rendezvous enable their application.
>>>> There are also Parrot and Perl bindings on the way. The question
>>>> was whether someone had an afternoon free to spend with Stuart to
>>>> write the Python bindings. Stuart is not a Python expert but he is
>>>> good at teaming up with someone who is to help them see what hooks
>>>> are needed.

Thanks,
Joey

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
From martin at v.loewis.de  Tue Sep 28 07:08:21 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Sep 28 07:08:18 2004
Subject: [Fwd: Re: [Python-Dev] using openssh's pty code]
In-Reply-To: <4157B9D4.2060401@mn.rr.com>
References: <41570EE1.1010404@mn.rr.com> <41579BB2.4020709@v.loewis.de>
	<4157B9D4.2060401@mn.rr.com>
Message-ID: <4158F1C5.8030300@v.loewis.de>

J Raynor wrote:
> The code that I would borrow from openssh basically states that you can 
> use it if you include in your derived work the copyright notice and 
> disclaimer found in the file you want to borrow from.  This sounds like 
> it would pose no problems for incorporating into python, but I'm no 
> expert on this, so I thought I'd ask.

It still very much depends on the *precise* wording. For example,
if we assemble binary releases, what are our obligations wrt. copyright
notice? If Python users embed Python into their applications, what will
be their obligations?

> Looking at some of the python source, I can see that there are several 
> files that contain just such notices.  For example, from the Modules 
> directory:

Yes. Each of these cases is somewhat worrisome, and we are working on
eliminating them wherever possible. Some of them are harder to resolve
than others. I'm certain that users of Python break some of these
licenses, by not incorporating the proper clauses into the proper
locations. Some users are worried about doing that and have asked us to
simplify their lives.

> Perhaps my original question led you to believe that the openssh license 
> was unusual, or had problematic clauses in it.  Given the somewhat 
> clarified description above of what's required to borrow openssh code, 
> do you still have reservations about receiving patches containing it?

I understood from the beginning that the openssh license is not unusual,
and that is what worries me. It doesn't worry me so much that I would
object outright to including the code, but I would accept it only if a
suitable replacement is too hard to write.

In any case, if you distribute the module separately first, none of
this needs to concern you.

Regards,
Martin
From martin at v.loewis.de  Tue Sep 28 07:13:17 2004
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue Sep 28 07:13:14 2004
Subject: [Python-Dev] using openssh's pty code
In-Reply-To: <20040927141319.GA3105@burma.localdomain>
References: <41566546.7020601@mn.rr.com> <4156DD92.2040300@v.loewis.de>
	<20040927141319.GA3105@burma.localdomain>
Message-ID: <4158F2ED.5080502@v.loewis.de>

Gustavo Niemeyer wrote:
> If he's going to copy/base his work on openssh, the openssh license
> must surely be respected.

I wasn't suggesting literal copying, but rewriting the C code in
Python.

> FWIW, that's the issue I was talking about when we discussed the
> contributor agreement in the PSF list, regarding inclusion of code
> with foreign licenses. On that occasion, you said a contributor
> must not include code not authored by him, and cannot sign an
> agreement on such code.

Yes, and I still stand by this. A contributor can, of course, suggest
that code with a different license is included, explaining what the
consequences of doing so would be, and why we are then permitted to
still distribute the derived work in the way we want.

As this typically involves putting some sort of notice in some place,
I'm concerned that the list of notices grows longer and longer over
time, and becomes unmanageable for us. So a solution of the original
author contributing the code with permission to distribute it under
our own licenses is much preferable.

Regards,
Martin
From arigo at tunes.org  Tue Sep 28 11:37:58 2004
From: arigo at tunes.org (Armin Rigo)
Date: Tue Sep 28 11:43:03 2004
Subject: [Python-Dev] open('/dev/null').read() -> MemoryError
In-Reply-To: <C1385FF5-10C2-11D9-80DA-000A95686CD8@redivi.com>
References: <20040927200533.GA29621@vicky.ecs.soton.ac.uk>
	<C1385FF5-10C2-11D9-80DA-000A95686CD8@redivi.com>
Message-ID: <20040928093758.GA21112@vicky.ecs.soton.ac.uk>

Hi Bob,

On Mon, Sep 27, 2004 at 04:21:03PM -0400, Bob Ippolito wrote:
> file(path).read() is never really a good idea in the general case - 
> especially for a device node.

> In other words, it sounds like the test should be fixed, not the 
> implementation.

Sounds good.  Does anyone object to the following patch?

Index: test_os.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_os.py,v
retrieving revision 1.27
diff -c -r1.27 test_os.py
*** test_os.py  29 Aug 2004 18:47:31 -0000      1.27
--- test_os.py  28 Sep 2004 09:42:26 -0000
***************
*** 340,346 ****
          f.write('hello')
          f.close()
          f = file(os.devnull, 'r')
!         self.assertEqual(f.read(), '')
          f.close()
  
  class URandomTests (unittest.TestCase):
--- 340,351 ----
          f.write('hello')
          f.close()
          f = file(os.devnull, 'r')
!         self.assertEqual(f.read(1), '')
!         self.assertEqual(f.read(10), '')
!         self.assertEqual(f.read(100), '')
!         self.assertEqual(f.read(1000), '')
!         self.assertEqual(f.read(10000), '')
!         self.assertEqual(f.read(100000), '')
          f.close()
  
  class URandomTests (unittest.TestCase):
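
For anyone who wants to try the idea outside the test suite, the same
check as a standalone sketch in Python 3 syntax (an assumption; the
patch above targets the 2.4 test_os.py, and the class and method names
here are made up).  The point is that every read is bounded, so the
test never depends on what stat() claims about the device.

```python
import os
import unittest


class DevNullTests(unittest.TestCase):
    def test_devnull_reads_empty(self):
        # Writing to the null device must succeed and be discarded.
        with open(os.devnull, "wb") as f:
            f.write(b"hello")
        # Bounded reads only: no read() that sizes its buffer from stat().
        with open(os.devnull, "rb") as f:
            for size in (1, 10, 100, 1000, 10000, 100000):
                self.assertEqual(f.read(size), b"")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DevNullTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```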


-+- Armin
From amk at amk.ca  Tue Sep 28 17:37:25 2004
From: amk at amk.ca (A.M. Kuchling)
Date: Tue Sep 28 17:38:45 2004
Subject: [Python-Dev] Fwd: Python binding to Rendezvous
In-Reply-To: <ca471dc2040927195168783a6@mail.gmail.com>
References: <B61692B2-10E2-11D9-B15D-000A95E596DC@mac.com>
	<ca471dc2040927195168783a6@mail.gmail.com>
Message-ID: <20040928153725.GB27126@rogue.amk.ca>

On Mon, Sep 27, 2004 at 07:51:06PM -0700, Guido van Rossum wrote:
> I've been asked to help getting Rendezvous (AKA zeroconf I believe)
> bindings for Python implemented. Anyone interested in helping out?

I've volunteered for this.

> >>>> There are also Parrot and Perl bindings on the way. The question

They're working on Ruby and Parrot bindings before Python ones?  What
colour is the sky in their world?

--amk

From judson at mcs.anl.gov  Tue Sep 28 17:52:53 2004
From: judson at mcs.anl.gov (Ivan R. Judson)
Date: Tue Sep 28 17:54:29 2004
Subject: [Python-Dev] Fwd: Python binding to Rendezvous
In-Reply-To: <ca471dc2040927195168783a6@mail.gmail.com>
Message-ID: <200409281553.i8SFr0r92026@mcs.anl.gov>


We have interest in this here on a project and have been investigating using
SWIG to generate multiple language bindings of the Apple Rendezvous code.
We'd be interested in either helping, or just doing this as we need it for
various things. Suggestions for alternative approaches are welcome.

--Ivan 

> -----Original Message-----
> From: python-dev-bounces+judson=mcs.anl.gov@python.org 
> [mailto:python-dev-bounces+judson=mcs.anl.gov@python.org] On 
> Behalf Of Guido van Rossum
> Sent: Monday, September 27, 2004 9:51 PM
> To: Python-Dev; Daniel H Steinberg
> Subject: [Python-Dev] Fwd: Python binding to Rendezvous
> 
> I've been asked to help getting Rendezvous (AKA zeroconf I 
> believe) bindings for Python implemented. Anyone interested 
> in helping out?
> (The person to contact if you're interested is Daniel Steinberg.
> Please cc me on the initial emails so I know contact is being made.)
> 
> --Guido
> 
> ---------- Forwarded message ----------
> From: Joey Trevino
> Date: Mon, 27 Sep 2004 17:09:48 -0700
> Subject: Python binding to Rendezvous
> To: Guido van Rossum
> Cc: Daniel H Steinberg
> 
> Hey Guido,
> 
>     Here's the description of the task from my friend Daniel 
> Steinberg (CC'd here) at O'Reilly.  I understand that you 
> don't have time to do the work yourself, but I'm hopeful that 
> you know of someone who would.
> 
> >>>> Apple has just provided a daemon for Rendezvous for Mac, for 
> >>>> Windows, for Linux, and for UNIX. Rich Kilmer spent an afternoon 
> >>>> with Stuart Cheshire and wrote Ruby bindings for this 
> daemon and so 
> >>>> Ruby developers can easily Rendezvous enable their application.
> >>>> There are also Parrot and Perl bindings on the way. The question 
> >>>> was whether someone had an afternoon free to spend with 
> Stuart to 
> >>>> write the Python bindings. Stuart is not a Python expert 
> but he is 
> >>>> good at teaming up with someone who is to help them see 
> what hooks 
> >>>> are needed.
> 
> Thanks,
> Joey
> 
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/) 

From kbk at shore.net  Wed Sep 29 07:17:48 2004
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed Sep 29 07:17:53 2004
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200409290517.i8T5Hm0L017921@h006008a7bda6.ne.client2.attbi.com>

Patch / Bug Summary
___________________

Patches :  235 open ( +0) /  2637 closed ( +4) /  2872 total ( +4)
Bugs    :  768 open ( +1) /  4480 closed (+17) /  5248 total (+18)
RFE     :  152 open ( +1) /   131 closed ( +0) /   283 total ( +1)

New / Reopened Patches
______________________

unittest.py patch: add skipped test functionality  (2004-09-24)
       http://python.org/sf/1034053  opened by  Remy Blank

Remove CoreServices / CoreFoundation dependencies in core  (2004-09-26)
       http://python.org/sf/1035255  opened by  Bob Ippolito

-m option to run a module as a script  (2004-09-28)
       http://python.org/sf/1035498  opened by  Nick Coghlan

Add New RPM-friendly record option to setup.py  (2004-09-28)
       http://python.org/sf/1035576  opened by  Jeff Pitman

Patches Closed
______________

Add API to logging package to allow intercooperation.  (2004-09-21)
       http://python.org/sf/1032206  closed by  vsajip

SystemError generated by struct.pack('P', 'notanumber')  (2004-08-18)
       http://python.org/sf/1011240  closed by  arigo

(bug 952953) execve empty 2nd arg fix  (2004-08-14)
       http://python.org/sf/1009075  closed by  arigo

atexit decorator  (2004-09-21)
       http://python.org/sf/1031687  closed by  rhettinger

New / Reopened Bugs
___________________

idle -n crashes  (2004-09-22)
CLOSED http://python.org/sf/1032395  opened by  Matthias Klose

Odd behavior with unicode.translate on OSX.  (2004-09-22)
       http://python.org/sf/1032615  opened by  Jeremy Fincher

ftplib has incomplete transfer when sending files in Windows  (2004-09-22)
       http://python.org/sf/1032875  opened by  Ed Sanville

Confusing description of strict option for email.Parser  (2004-09-23)
       http://python.org/sf/1032960  opened by  Andrew Bennetts

Misleading error message in random.choice  (2004-09-22)
CLOSED http://python.org/sf/1033038  opened by  Nefarious CodeMonkey, Jr.

build doesn't pick up bsddb w/Mandrake 9.2  (2004-09-23)
       http://python.org/sf/1033390  opened by  Alex Martelli

buffer() object broken.  (2004-09-23)
CLOSED http://python.org/sf/1033720  opened by  James Y Knight

Can't inherit slots from new-style classes implemented in C  (2004-09-24)
       http://python.org/sf/1034178  opened by  Phil Thompson

More buffer object brokenness  (2004-09-24)
CLOSED http://python.org/sf/1034242  opened by  James Y Knight

Why does Python link to Foundation?  (2004-09-24)
       http://python.org/sf/1034277  opened by  Bob Ippolito

Configure uses GNU ld flags with non-GNU compilers/linkers  (2004-09-25)
CLOSED http://python.org/sf/1034496  opened by  Drew Schatt

hex() and oct() documentation is incorrect  (2004-09-27)
       http://python.org/sf/1035279  opened by  Nick Coghlan

distutils.util.get_platform() should include sys.version[:2]  (2004-09-27)
       http://python.org/sf/1035703  opened by  Bob Ippolito

Tix.Grid widgets not implemented yet  (2004-09-28)
       http://python.org/sf/1036406  opened by  Christos Georgiou

unicode strings cannot be dictionary keys  (2004-09-28)
       http://python.org/sf/1036490  opened by  Morten Kjeldgaard

Email module's feed parser  (2004-09-28)
CLOSED http://python.org/sf/1036506  opened by  Matthew Cowles

file.next() info hidden  (2004-09-28)
       http://python.org/sf/1036626  opened by  Nick Jacobson

printf() in dynload_shlib.c should be PySys_WriteStderr  (2004-09-28)
       http://python.org/sf/1036752  opened by  Jp Calderone

Bugs Closed
___________

rfc822 __iter__ problem  (2004-09-17)
       http://python.org/sf/1030125  closed by  rhettinger

Fold tuples of constants into a single constant  (2004-09-20)
       http://python.org/sf/1031667  closed by  rhettinger

Misleading error message in random.choice  (2004-09-22)
       http://python.org/sf/1033038  closed by  rhettinger

PEP 302 loader not carried through by reload function  (2004-09-16)
       http://python.org/sf/1029475  closed by  pje

Float/long comparison anomaly  (2002-02-06)
       http://python.org/sf/513866  closed by  tim_one

buffer() object broken.  (2004-09-23)
       http://python.org/sf/1033720  closed by  nascheme

ConfigParser's get method gives utf-8 for a utf-16 config...  (2004-01-10)
       http://python.org/sf/874354  closed by  goodger

More buffer object brokenness  (2004-09-24)
       http://python.org/sf/1034242  closed by  nascheme

embedding in multi-threaded & multi sub-interpreter environ  (2004-03-22)
       http://python.org/sf/921077  closed by  bcannon

Configure uses GNU ld flags with non-GNU compilers/linkers  (2004-09-25)
       http://python.org/sf/1034496  closed by  loewis

2.4 asyncore breaks Zope  (2004-08-18)
       http://python.org/sf/1011606  closed by  tim_one

CPU usage shoots up with asyncore  (2004-08-16)
       http://python.org/sf/1010098  closed by  arigo

execve rejects empty argument list  (2004-05-13)
       http://python.org/sf/952953  closed by  arigo

email.Message.Message.__getitem__ doc string wrong  (2004-06-25)
       http://python.org/sf/979924  closed by  bwarsaw

Email module's feed parser  (2004-09-28)
       http://python.org/sf/1036506  closed by  bwarsaw

idle -n crashes  (2004-09-22)
       http://python.org/sf/1032395  closed by  kbk

IDLE hangs when inactive more than 2 hours  (2004-08-02)
       http://python.org/sf/1001869  closed by  kbk

From nbastin at opnet.com  Wed Sep 29 16:52:32 2004
From: nbastin at opnet.com (Nick Bastin)
Date: Wed Sep 29 16:52:51 2004
Subject: [Python-Dev] Finding the module from PyTypeObject?
Message-ID: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com>

Is there any way to (reliably) find the module that defined the class 
represented by a given PyTypeObject in C?

--
Nick

From mwh at python.net  Wed Sep 29 18:56:40 2004
From: mwh at python.net (Michael Hudson)
Date: Wed Sep 29 18:56:42 2004
Subject: [Python-Dev] Finding the module from PyTypeObject?
In-Reply-To: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com> (Nick Bastin's
	message of "Wed, 29 Sep 2004 10:52:32 -0400")
References: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com>
Message-ID: <2m4qlht3nb.fsf@starship.python.net>

Nick Bastin <nbastin@opnet.com> writes:

> Is there any way to (reliably) find the module that defined the class
> represented by a given PyTypeObject in C?

Not especially appropriate for python-dev...

I think the answer depends on what you mean by "reliably".  __module__
is a good first bet, but can be defeated with sufficient malice (or
mere inattention, in the case of types defined by C).
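
A quick Python-level sketch of both halves of that statement, in
Python 3 syntax (an assumption; the helper module_of and the class
Impostor are made-up names).  It resolves __module__ through
sys.modules and then shows how easily the answer can be steered.

```python
import collections
import functools
import sys


def module_of(cls):
    # Look the type's __module__ string up in sys.modules; None if absent.
    name = getattr(cls, "__module__", None)
    return sys.modules.get(name)


# The common case: the name leads back to the module exposing the type.
ordered_dict_module = module_of(collections.OrderedDict)

# "Malice": nothing stops a class from claiming some other module.
Impostor = type("Impostor", (), {"__module__": "collections"})
impostor_module = module_of(Impostor)

# A type that CPython may implement in a C helper module can still
# report the public module's name rather than the helper's.
partial_module_name = functools.partial.__module__
```

So the answer is a name the type chose to report, which is usually but
not necessarily where it was defined.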

Cheers,
mwh

-- 
  ... the U.S. Department of Transportation today disclosed that its
  agents have recently cleared airport security checkpoints with an 
  M1 tank, a beluga whale, and a fully active South American volcano.
             -- http://www.satirewire.com/news/march02/screeners.shtml
From nbastin at opnet.com  Wed Sep 29 19:24:23 2004
From: nbastin at opnet.com (Nick Bastin)
Date: Wed Sep 29 19:24:38 2004
Subject: [Python-Dev] Finding the module from PyTypeObject?
In-Reply-To: <2m4qlht3nb.fsf@starship.python.net>
References: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com>
	<2m4qlht3nb.fsf@starship.python.net>
Message-ID: <680B417C-123C-11D9-8F21-000D932927FE@opnet.com>


On Sep 29, 2004, at 12:56 PM, Michael Hudson wrote:

> Nick Bastin <nbastin@opnet.com> writes:
>
>> Is there any way to (reliably) find the module that defined the class
>> represented by a given PyTypeObject in C?
>
> Not especially appropriate for python-dev...
>
> I think the answer depends on what you mean by "reliably".  __module__
> is a good first bet, but can be defeated with sufficient malice (or
> mere inattention, in the case of types defined by C).

Ok, maybe more appropriately, what do people think of adding a 
PyType_GetModule (PyTypeObject *) which basically functions like 
type_module(PyTypeObject *, void *) (in Objects/typeobject.c) to the 
public C API, rather than having callers dig around in the object 
themselves?

--
Nick

From arigo at tunes.org  Wed Sep 29 22:17:02 2004
From: arigo at tunes.org (Armin Rigo)
Date: Wed Sep 29 22:22:10 2004
Subject: [Python-Dev] Finding the module from PyTypeObject?
In-Reply-To: <680B417C-123C-11D9-8F21-000D932927FE@opnet.com>
References: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com>
	<2m4qlht3nb.fsf@starship.python.net>
	<680B417C-123C-11D9-8F21-000D932927FE@opnet.com>
Message-ID: <20040929201702.GA19671@vicky.ecs.soton.ac.uk>

Hello Nick,

On Wed, Sep 29, 2004 at 01:24:23PM -0400, Nick Bastin wrote:
> Ok, maybe more appropriately, what do people think of adding a 
> PyType_GetModule (PyTypeObject *) which basically functions like 
> type_module(PyTypeObject *, void *) (in Objects/typeobject.c) to the 
> public C API, rather than having to dig around in the object 
> themselves?

It looks overkill, when you can do instead:

  PyObject* module_name = PyObject_GetAttrString(type, "__module__");


Armin
From nbastin at opnet.com  Wed Sep 29 22:29:39 2004
From: nbastin at opnet.com (Nick Bastin)
Date: Wed Sep 29 22:30:15 2004
Subject: [Python-Dev] Finding the module from PyTypeObject?
In-Reply-To: <20040929201702.GA19671@vicky.ecs.soton.ac.uk>
References: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com>
	<2m4qlht3nb.fsf@starship.python.net>
	<680B417C-123C-11D9-8F21-000D932927FE@opnet.com>
	<20040929201702.GA19671@vicky.ecs.soton.ac.uk>
Message-ID: <49A2279C-1256-11D9-8F21-000D932927FE@opnet.com>


On Sep 29, 2004, at 4:17 PM, Armin Rigo wrote:

> Hello Nick,
>
> On Wed, Sep 29, 2004 at 01:24:23PM -0400, Nick Bastin wrote:
>> Ok, maybe more appropriately, what do people think of adding a
>> PyType_GetModule (PyTypeObject *) which basically functions like
>> type_module(PyTypeObject *, void *) (in Objects/typeobject.c) to the
>> public C API, rather than having to dig around in the object
>> themselves?
>
> It looks overkill, when you can do instead:
>
>   PyObject* module_name = PyObject_GetAttrString(type, "__module__");

That only works most of the time, I think.  To be honest, I didn't try 
that, but it doesn't seem that type_module would jump through the hoops 
it does if that worked all of the time, unless parsing tp_name is 
legacy code.

--
Nick

From tim.peters at gmail.com  Wed Sep 29 22:30:10 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Sep 29 22:30:33 2004
Subject: [Python-Dev] Odd compile errors for bad genexps
Message-ID: <1f7befae04092913305132017c@mail.gmail.com>

>>> (i for i in x) = 2
SystemError: assign to generator expression not possible

1. Why is that a SystemError instead of a SyntaxError?  SystemError
   doesn't make sense.
2. Why didn't it echo the offending line?

>>> (i for i in x) += 2
SyntaxError: augmented assign to tuple literal not possible

3. That's not a tuple literal.
4. See #2 <wink>.
From python at rcn.com  Wed Sep 29 23:30:52 2004
From: python at rcn.com (Raymond Hettinger)
Date: Wed Sep 29 23:32:59 2004
Subject: [Python-Dev] Odd compile errors for bad genexps
In-Reply-To: <1f7befae04092913305132017c@mail.gmail.com>
Message-ID: <000601c4a66b$994d2da0$e841fea9@oemcomputer>

[Tim]
> >>> (i for i in x) = 2
> SystemError: assign to generator expression not possible
> 
> 1. Why is that a SystemError instead of a SyntaxError?  SystemError
>    doesn't make sense.

It, of course, should be a SyntaxError.
The fix is easy.  Put in PyExc_SyntaxError on line 3206 in compile.c


> 2. Why didn't it echo the offending line?

I don't follow this part.  The output is no different from:

    >>> str(x) = 2
    SyntaxError: can't assign to function call


> >>> (i for i in x) += 2
> SyntaxError: augmented assign to tuple literal not possible
> 
> 3. That's not a tuple literal.

The code for that one was modeled after broken code for list comps:

    >>> [i for i in x] += 2
    SyntaxError: augmented assign to list literal not possible

That's not a list literal either.

For both genexps and listcomps, the test for augmented assignment should
likely be moved before the same test for tuple literals and list
literals (they only check for LPAR or LSQB to trigger their message).
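
To see what a given interpreter actually reports for these cases, a
small probe along these lines can help.  This is a sketch in Python 3
syntax (an assumption, as is the helper name error_for); the message
wording differs between versions, so only the exception type is worth
relying on.

```python
SNIPPETS = [
    "(i for i in x) = 2",
    "(i for i in x) += 2",
    "[i for i in x] += 2",
    "str(x) = 2",
]


def error_for(source):
    # Compile without running; report the exception type and message.
    try:
        compile(source, "<example>", "exec")
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None


results = {source: error_for(source) for source in SNIPPETS}
```

Each entry in results is a (type name, message) pair, which makes it
easy to spot a stray SystemError or a misleading "literal" message.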

Is there a compiler weenie in the house who knows how to reliably fix
this one?  Though I can see the problem clearly enough, I'm just enough
out of my element that I don't want to touch it.


Raymond

From python at rcn.com  Thu Sep 30 02:40:42 2004
From: python at rcn.com (Raymond Hettinger)
Date: Thu Sep 30 02:43:00 2004
Subject: [Python-Dev] Odd compile errors for bad genexps
In-Reply-To: <000601c4a66b$994d2da0$e841fea9@oemcomputer>
Message-ID: <000001c4a686$1e4fef00$e841fea9@oemcomputer>

> [Tim]
> > >>> (i for i in x) = 2
> > SystemError: assign to generator expression not possible
> >
> > 1. Why is that a SystemError instead of a SyntaxError?  SystemError
> >    doesn't make sense.
 . . .
> > >>> (i for i in x) += 2
> > SyntaxError: augmented assign to tuple literal not possible
> >
> > 3. That's not a tuple literal

Okay, those two are fixed.


> > 2. Why didn't it echo the offending line?

The code for com_error() screens out the line numbering when in the
interactive mode.  You get the full echo when running a script.

What is interesting is that some syntax errors ("2 & * 3" for example)
by-pass com_error() and echo the line with a caret pointing at the
offending token.

These are both probably as they should be.
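
The position information behind that caret can also be inspected
directly.  A sketch in Python 3 syntax (an assumption; the helper name
describe_syntax_error is made up) that compiles a snippet and returns
what the exception carries:

```python
def describe_syntax_error(source):
    # Return the location details attached to the SyntaxError, if any.
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return {"lineno": exc.lineno, "offset": exc.offset, "text": exc.text}
    return None


info = describe_syntax_error("2 & * 3\n")
```

When text is filled in, the traceback machinery can echo the line; when
it is None, only the message is shown.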


Raymond 

From tim.peters at gmail.com  Thu Sep 30 04:58:38 2004
From: tim.peters at gmail.com (Tim Peters)
Date: Thu Sep 30 04:58:43 2004
Subject: [Python-Dev] Odd compile errors for bad genexps
In-Reply-To: <000001c4a686$1e4fef00$e841fea9@oemcomputer>
References: <000601c4a66b$994d2da0$e841fea9@oemcomputer>
	<000001c4a686$1e4fef00$e841fea9@oemcomputer>
Message-ID: <1f7befae04092919587fcb9eb4@mail.gmail.com>

[Raymond Hettinger]
> Okay, those two are fixed.

Thank you!

>> 2. Why didn't it echo the offending line?

> The code for com_error() screens out the line numbering when in the
> interactive mode.  You get the full echo when running a script.
> 
> What is interesting is that some syntax errors ("2 & * 3" for example)
> by-pass com_error() and echo the line with a caret pointing at the
> offending token.

Or plain "+" or plain "if" or "2 &" or "*3" or "from math import sin
as" etc etc.  That's why I asked.  I almost always see an echo echo
echo.  But those are actually syntax errors, in the sense that they can't
be derived from the formal grammar.

> These are both probably as they should be.

Not if it makes life harder for doctest <wink>.
From ncoghlan at email.com  Thu Sep 30 12:11:06 2004
From: ncoghlan at email.com (Nick Coghlan)
Date: Thu Sep 30 12:11:14 2004
Subject: [Python-Dev] Running a module as a script
Message-ID: <1096539066.415bdbbaaed43@mail.iinet.net.au>

Patch # 1035498 attempts to implement the semantics suggested by Ilya and
Anthony and co.

"python -m module"

Means: 
- find the source file for the relevant module (using the standard locations for
module import)

- run the located script as __main__ (note that containing packages are NOT
imported first - it's as if the relevant module was executed directly from the
command line)

- as with '-c' anything before the option is an argument to the interpreter,
anything after is an argument to the script

The allowed modules are those whose associated source files meet the normal rules
for a command line script. I believe that means .py and .pyc files only (e.g.
"python -m profile" works, but "python -m hotshot" does not).

Special import hooks (such as zipimport) almost certainly won't work (since I
don't believe they work with the current command line script mechanism).
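
For experimenting with these semantics from inside Python, the sketch
below locates a module by name and executes it with __name__ set to
"__main__".  It leans on the standard library's runpy module, which is
not part of this patch (it appeared in later Python releases), and is
written in Python 3 syntax; the module name demo_mod and the helper
run_as_script are made up for the example.

```python
import os
import runpy
import sys
import tempfile


def run_as_script(module_name):
    # Find the module on sys.path and execute it as if it were __main__.
    return runpy.run_module(module_name, run_name="__main__")


with tempfile.TemporaryDirectory() as tmp:
    # A throwaway module that records the name it was run under.
    with open(os.path.join(tmp, "demo_mod.py"), "w") as fh:
        fh.write("seen_name = __name__\n")
    sys.path.insert(0, tmp)
    try:
        namespace = run_as_script("demo_mod")
    finally:
        sys.path.remove(tmp)
```

The returned namespace is the executed module's globals, so
namespace["seen_name"] shows which name the code saw.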

Cheers,
Nick.

-- 
Nick Coghlan
Brisbane, Australia
From ncoghlan at email.com  Thu Sep 30 12:21:47 2004
From: ncoghlan at email.com (Nick Coghlan)
Date: Thu Sep 30 12:21:53 2004
Subject: [Python-Dev] Proposing a sys.special_exceptions tuple
Message-ID: <1096539707.415bde3ba1425@mail.iinet.net.au>

I spent some time the other day looking at the use of bare except statements in
the standard library.

Many of them seemed to fall into the category of 'need to catch anything user
code is likely to throw, but shouldn't be masking SystemExit, StopIteration,
KeyboardInterrupt, MemoryError, etc'.

Changing them to "except Exception:" doesn't help, since all of the above still
fit into that category (Tim posted a message recently about rearranging the
Exception hierarchy to fix this. Backwards compatibility woes pretty much killed
the discussion though).

However, another possibility occurred to me:

try:
  # Do stuff
except sys.special_exceptions:
  raise
except:
  # Deal with all the mundane stuff

With an appropriately defined tuple, that makes it easy for people to "do the
right thing" with regards to critical exceptions. Such a tuple could also be
useful for invoking isinstance() and issubclass().
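
A self-contained sketch of the pattern, assuming the tuple contains the four
exceptions named earlier in this message. No sys.special_exceptions exists, so
a module-level SPECIAL_EXCEPTIONS stands in for it, and the helper
call_guarded is hypothetical.

```python
# Stand-in for the proposed tuple; the exact membership is an assumption
# based on the exceptions listed in the message.
SPECIAL_EXCEPTIONS = (SystemExit, StopIteration, KeyboardInterrupt,
                      MemoryError)


def call_guarded(func, default=None):
    """Call func(), swallowing mundane errors but not the special ones."""
    try:
        return func()
    except SPECIAL_EXCEPTIONS:
        # Critical exceptions always propagate to the caller.
        raise
    except:
        # Everything else is treated as a mundane failure.
        return default


def is_special(exc):
    """The same tuple also works directly with isinstance()."""
    return isinstance(exc, SPECIAL_EXCEPTIONS)
```

With this in place, call_guarded(lambda: 1 / 0, "handled") returns "handled",
while a KeyboardInterrupt raised inside func still reaches the caller.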

Who knows? If something like this caught on, it might some day be possible to
kill a Python script with a single press of Ctrl-C };>

Cheers,
Nick.

-- 
Nick Coghlan
Brisbane, Australia
From mwh at python.net  Thu Sep 30 13:46:59 2004
From: mwh at python.net (Michael Hudson)
Date: Thu Sep 30 13:47:00 2004
Subject: [Python-Dev] Finding the module from PyTypeObject?
In-Reply-To: <49A2279C-1256-11D9-8F21-000D932927FE@opnet.com> (Nick Bastin's
	message of "Wed, 29 Sep 2004 16:29:39 -0400")
References: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com>
	<2m4qlht3nb.fsf@starship.python.net>
	<680B417C-123C-11D9-8F21-000D932927FE@opnet.com>
	<20040929201702.GA19671@vicky.ecs.soton.ac.uk>
	<49A2279C-1256-11D9-8F21-000D932927FE@opnet.com>
Message-ID: <2mfz50rnbg.fsf@starship.python.net>

Nick Bastin <nbastin@opnet.com> writes:

> On Sep 29, 2004, at 4:17 PM, Armin Rigo wrote:
>
>> Hello Nick,
>>
>> On Wed, Sep 29, 2004 at 01:24:23PM -0400, Nick Bastin wrote:
>>> Ok, maybe more appropriately, what do people think of adding a
>>> PyType_GetModule (PyTypeObject *) which basically functions like
>>> type_module(PyTypeObject *, void *) (in Objects/typeobject.c) to the
>>> public C API, rather than having to dig around in the object
>>> themselves?
>>
>> It looks overkill, when you can do instead:
>>
>>   PyObject* module_name = PyObject_GetAttrString(type, "__module__");
>
> That only works most of the time, I think.  To be honest, I didn't try
> that, but it doesn't seem that type_module would jump through the
> hoops it does if that worked all of the time, unless parsing tp_name
> is legacy code.

Huh?  The code above *winds up* calling type_module!

Cheers,
mwh

-- 
  <dash> i am trying to get Asterisk to work
  <dash> it is stabbing me in the face
  <dreid> yes ... i seem to recall that feature in the documentation
                                                -- from Twisted.Quotes
From nbastin at opnet.com  Thu Sep 30 15:43:01 2004
From: nbastin at opnet.com (Nick Bastin)
Date: Thu Sep 30 15:43:21 2004
Subject: [Python-Dev] Finding the module from PyTypeObject?
In-Reply-To: <2mfz50rnbg.fsf@starship.python.net>
References: <3182857D-1227-11D9-8F21-000D932927FE@opnet.com>
	<2m4qlht3nb.fsf@starship.python.net>
	<680B417C-123C-11D9-8F21-000D932927FE@opnet.com>
	<20040929201702.GA19671@vicky.ecs.soton.ac.uk>
	<49A2279C-1256-11D9-8F21-000D932927FE@opnet.com>
	<2mfz50rnbg.fsf@starship.python.net>
Message-ID: <A5E44C8A-12E6-11D9-8F21-000D932927FE@opnet.com>


On Sep 30, 2004, at 7:46 AM, Michael Hudson wrote:

> Nick Bastin <nbastin@opnet.com> writes:
>
>> On Sep 29, 2004, at 4:17 PM, Armin Rigo wrote:
>>
>>> Hello Nick,
>>>
>>> On Wed, Sep 29, 2004 at 01:24:23PM -0400, Nick Bastin wrote:
>>>> Ok, maybe more appropriately, what do people think of adding a
>>>> PyType_GetModule (PyTypeObject *) which basically functions like
>>>> type_module(PyTypeObject *, void *) (in Objects/typeobject.c) to the
>>>> public C API, rather than having to dig around in the object
>>>> themselves?
>>>
>>> It looks overkill, when you can do instead:
>>>
>>>   PyObject* module_name = PyObject_GetAttrString(type, "__module__");
>>
>> That only works most of the time, I think.  To be honest, I didn't try
>> that, but it doesn't seem that type_module would jump through the
>> hoops it does if that worked all of the time, unless parsing tp_name
>> is legacy code.
>
> Huh?  The code above *winds up* calling type_module!

Doh, nevermind...I missed the getter def.

--
Nick  (::slinks off back under his rock now::)

From aahz at pythoncraft.com  Thu Sep 30 15:57:18 2004
From: aahz at pythoncraft.com (Aahz)
Date: Thu Sep 30 15:57:21 2004
Subject: [Python-Dev] Running a module as a script
In-Reply-To: <1096539066.415bdbbaaed43@mail.iinet.net.au>
References: <1096539066.415bdbbaaed43@mail.iinet.net.au>
Message-ID: <20040930135718.GA208@panix.com>

On Thu, Sep 30, 2004, Nick Coghlan wrote:
>
> The allowed modules are those whose associated source files meet the
> normal rules for a command line script. I believe that means .py
> and .pyc files only (e.g. "python -m profile" works, but "python -m
> hotshot" does not).

Not positive, but if you're allowing .pyc, you should probably allow
.pyo if optimize mode is on.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"A foolish consistency is the hobgoblin of little minds, adored by little
statesmen and philosophers and divines."  --Ralph Waldo Emerson
From pje at telecommunity.com  Thu Sep 30 16:19:22 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Sep 30 16:19:22 2004
Subject: [Python-Dev] Proposing a sys.special_exceptions tuple
In-Reply-To: <1096539707.415bde3ba1425@mail.iinet.net.au>
Message-ID: <5.1.1.6.0.20040930101453.0244e8f0@mail.telecommunity.com>

At 08:21 PM 9/30/04 +1000, Nick Coghlan wrote:
>However, another possibility occurred to me:
>
>try:
>   # Do stuff
>except sys.special_exceptions:
>   raise
>except:
>   # Deal with all the mundane stuff
>
>With an appropriately defined tuple, that makes it easy for people to "do the
>right thing" with regards to critical exceptions. Such a tuple could also be
>useful for invoking isinstance() and issubclass().

+1.  This would be a big help for developers, if only in that it will tell 
us what exceptions we ought to do this with.

IMO, this is probably important enough to make it a builtin; maybe call it 
CriticalExceptions or some such.

Also, maybe in 2.5 we could begin warning about bare excepts that aren't 
preceded by non-bare exceptions.

From barry at python.org  Thu Sep 30 16:27:10 2004
From: barry at python.org (Barry Warsaw)
Date: Thu Sep 30 16:27:17 2004
Subject: [Python-Dev] Proposing a sys.special_exceptions tuple
In-Reply-To: <5.1.1.6.0.20040930101453.0244e8f0@mail.telecommunity.com>
References: <5.1.1.6.0.20040930101453.0244e8f0@mail.telecommunity.com>
Message-ID: <1096554430.20270.23.camel@geddy.wooz.org>

On Thu, 2004-09-30 at 10:19, Phillip J. Eby wrote:
> At 08:21 PM 9/30/04 +1000, Nick Coghlan wrote:
> >However, another possibility occurred to me:
> >
> >try:
> >   # Do stuff
> >except sys.special_exceptions:
> >   raise
> >except:
> >   # Deal with all the mundane stuff

+0, except that I'd rather see it put in the exceptions module and given
a name in builtins.

-Barry

From theller at python.net  Thu Sep 30 16:31:22 2004
From: theller at python.net (Thomas Heller)
Date: Thu Sep 30 16:31:21 2004
Subject: [Python-Dev] Running a module as a script
In-Reply-To: <20040930135718.GA208@panix.com> (aahz@pythoncraft.com's
	message of "Thu, 30 Sep 2004 09:57:18 -0400")
References: <1096539066.415bdbbaaed43@mail.iinet.net.au>
	<20040930135718.GA208@panix.com>
Message-ID: <wtybzv45.fsf@python.net>

Aahz <aahz@pythoncraft.com> writes:

> On Thu, Sep 30, 2004, Nick Coghlan wrote:
>>
>> The allowed modules are those whose associated source files meet the
>> normal rules for a command line script. I believe that means .py
>> and .pyc files only (e.g. "python -m profile" works, but "python -m
>> hotshot" does not).
>
> Not positive, but if you're allowing .pyc, you should probably allow
> .pyo if optimize mode is on.

Plus .pyw, on Windows.

Thomas

From pje at telecommunity.com  Thu Sep 30 17:16:12 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Sep 30 17:16:11 2004
Subject: [Python-Dev] Running a module as a script
In-Reply-To: <wtybzv45.fsf@python.net>
References: <20040930135718.GA208@panix.com>
	<1096539066.415bdbbaaed43@mail.iinet.net.au>
	<20040930135718.GA208@panix.com>
Message-ID: <5.1.1.6.0.20040930111348.038beb60@mail.telecommunity.com>

At 04:31 PM 9/30/04 +0200, Thomas Heller wrote:
>Aahz <aahz@pythoncraft.com> writes:
>
> > On Thu, Sep 30, 2004, Nick Coghlan wrote:
> >>
> >> The allowed modules are those whose associated source files meet the
> >> normal rules for a command line script. I believe that means .py
> >> and .pyc files only (e.g. "python -m profile" works, but "python -m
> >> hotshot" does not).
> >
> > Not positive, but if you're allowing .pyc, you should probably allow
> > .pyo if optimize mode is on.
>
>Plus .pyw, on Windows.

Using the C equivalent of 'imp.find_module()' should cover all these cases, 
and any new forms of PY_SOURCE or PY_COMPILED that come up in future.
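
To make the idea concrete at the Python level, here is a hedged sketch that
asks the import machinery where a module lives and what kind of file backs
it. It uses importlib.util.find_spec from later Python releases rather than
imp.find_module, and the function name classify_module is invented for this
example.

```python
import importlib.util
from importlib.machinery import BYTECODE_SUFFIXES, SOURCE_SUFFIXES


def classify_module(module_name):
    """Return (kind, origin) for a top-level module, or None if not found.

    kind is "source", "bytecode" or "other"; the suffix lists come from
    the import system itself rather than being hard-coded here.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    origin = spec.origin
    if origin.endswith(tuple(SOURCE_SUFFIXES)):
        return ("source", origin)
    if origin.endswith(tuple(BYTECODE_SUFFIXES)):
        return ("bytecode", origin)
    # Built-in and extension modules end up here.
    return ("other", origin)
```

A runner built on this could accept the "source" and "bytecode" kinds and
reject "other", without listing file extensions by hand.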

From cce at clarkevans.com  Thu Sep 30 17:47:22 2004
From: cce at clarkevans.com (Clark C. Evans)
Date: Thu Sep 30 17:47:25 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040909215548.GB61544@prometheusresearch.com>
References: <20040908014845.GA52384@prometheusresearch.com>
	<0F1308BF-01C2-11D9-AC9C-000A95A50FB2@fuhm.net>
	<413F564D.2070708@bluewin.ch>
	<20040908192056.GB62848@prometheusresearch.com>
	<20040909101444.GA2877@vicky.ecs.soton.ac.uk>
	<20040909215548.GB61544@prometheusresearch.com>
Message-ID: <20040930154722.GA79121@prometheusresearch.com>

To distill this request to a sentence:

   I would like syntax-level support in Python for a Continuation
   Passing Style (CPS) of code execution.

It is important to note that Ruby, Parrot (next-generation Perl),
and SML-NJ all support this async programming style.  In Python
land, the Twisted framework uses this style via its Deferred
mechanism. This isn't an off-the-wall request.  I currently think
that a generator syntax would be the best, and this proposal is for
further work via defining a SuspendIterator semantics.  However, I'm
not tied to this implementation.  A pre-parser which made Deferred
object handling nicer could also work, or any other option that
provides an intuitive syntax for CPS in Python.

The hoops that Twisted has to jump-through to wrap Exceptions for
use in a Deferred processing chain, and also the (completely
necessary but yet) convoluted ways of combining Deferreds is, IMHO,
a direct result of lack of support for CPS in Python.  These items
have a huge impact on application program readability and maintenance.
Clean syntax-level support for CPS in Python would be a boon for
application developers.

Best,

Clark
From lalo at laranja.org  Thu Sep 30 17:52:48 2004
From: lalo at laranja.org (Lalo Martins)
Date: Thu Sep 30 17:56:37 2004
Subject: [Python-Dev] Proposing a sys.special_exceptions tuple
In-Reply-To: <5.1.1.6.0.20040930101453.0244e8f0@mail.telecommunity.com>
References: <1096539707.415bde3ba1425@mail.iinet.net.au>
	<5.1.1.6.0.20040930101453.0244e8f0@mail.telecommunity.com>
Message-ID: <20040930155247.GU13993@laranja.org>

On Thu, Sep 30, 2004 at 10:19:22AM -0400, Phillip J. Eby wrote:
> 
> Also, maybe in 2.5 we could begin warning about bare excepts that aren't 
> preceded by non-bare exceptions.

try:
  foo()
except:
  print_or_log_exception_in_a_way_that_is_meaningful()
  raise

doesn't seem to be incorrect to me.  For example, if the program
is a daemon, I want the exception logged somewhere so that I can
see it later, because I won't be watching stderr.
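
The same idea as a runnable sketch, assuming the standard logging module is
the "meaningful" destination. The logger name and the wrapper run_logged are
placeholders chosen for this example.

```python
import logging

# Placeholder logger name; a real daemon would configure handlers for it.
LOGGER = logging.getLogger("daemon_sketch")


def run_logged(func):
    """Call func(), logging any exception with its traceback, then re-raise."""
    try:
        return func()
    except:
        # Logger.exception records the message at ERROR level and attaches
        # the traceback of the exception currently being handled.
        LOGGER.exception("unhandled error in %r", func)
        raise
```

Because the handler ends with a bare raise, nothing is masked: the caller
still sees the original exception, including KeyboardInterrupt.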

[]s,
                                               |alo
                                               +----
--
            Those who trade freedom for security
               lose both and deserve neither.
--
http://www.laranja.org/                mailto:lalo@laranja.org
 pgp key: http://garfield.laranja.org/~lalo/gpgkey-signed.asc

GNU: never give up freedom                 http://www.gnu.org/
From jcarlson at uci.edu  Thu Sep 30 18:32:37 2004
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu Sep 30 18:39:56 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040930154722.GA79121@prometheusresearch.com>
References: <20040909215548.GB61544@prometheusresearch.com>
	<20040930154722.GA79121@prometheusresearch.com>
Message-ID: <20040930090510.FE87.JCARLSON@uci.edu>


> It is important to note that Ruby, Parrot (next-generation Perl),
> and SML-NJ all support this async programming style.  In Python

For those of us who aren't current on the latest happenings of Ruby,
Parrot and SML/NJ, it may be convenient for us to hear precisely how
"async programming style" is done in those languages, so we have a
reference, and so that we can agree (or disagree) with you about whether
they are equivalent to your PEP.

It would also be nice if you were to do a bit of research on the
internals of those languages, to discover how it is actually implemented. 
This would allow Python interpreter hackers to say, "Yes, that kind of
thing is possible," "Maybe with a bit of work," "It is not possible with
the current interpreter," or even "It wouldn't be usable on Jython."


With that said, I believe there is a general consensus that this kind of
thing would be useful.  For me, if I had greenlets everywhere I'd be
happy (though I understand that this may not be technically possible on
Jython).


 - Josiah

From lt at toetsch.at  Thu Sep 30 21:30:17 2004
From: lt at toetsch.at (Leopold Toetsch)
Date: Thu Sep 30 21:29:52 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via SuspendIteration
In-Reply-To: <20040930090510.FE87.JCARLSON@uci.edu>
References: <20040909215548.GB61544@prometheusresearch.com>	<20040930154722.GA79121@prometheusresearch.com>
	<20040930090510.FE87.JCARLSON@uci.edu>
Message-ID: <415C5EC9.8070308@toetsch.at>

Josiah Carlson wrote:
>>It is important to note that Ruby, Parrot (next-generation Perl),
>>and SML-NJ all support this async programming style.  In Python
> 
> 
> For those of us who aren't current on the latest happenings of Ruby,
> Parrot and SML/NJ, it may be convenient for us to hear precisely how
> "async programming style" is done in those languages,

Some clarifications WRT Parrot. Parrot isn't a language, Parrot isn't 
"next-generation Perl". Parrot is a virtual machine that will run Perl6. 
And Parrot currently runs languages like Python, tcl, m4, forth, 
and others more or less completely[1].

Parrot's function calling scheme is CPS. A Python generator function 
gets automatically translated to a coroutine. Returning from a plain 
function is done by invoking a continuation. And you can of course (in 
Parrot assembly) create a continuation, store it away, and invoke it at 
any time later, which will continue program execution at that point, 
where it should continue.
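
As a language-neutral illustration of continuation passing style (this is
not Parrot code, and the function names are invented for the example), each
function below takes an extra argument k and "returns" by calling it:

```python
def add_cps(a, b, k):
    """Add two numbers and hand the result to the continuation k."""
    return k(a + b)


def square_cps(x, k):
    """Square a number and hand the result to the continuation k."""
    return k(x * x)


def sum_of_squares_cps(a, b, k):
    """Compute a*a + b*b, threading the rest of the work through lambdas."""
    return square_cps(
        a, lambda a2: square_cps(
            b, lambda b2: add_cps(a2, b2, k)))


# A stored continuation can be invoked later, like any other value.
saved = []
sum_of_squares_cps(3, 4, saved.append)
```

After the last line, saved holds the single result 25. In CPython every such
call still grows the C and Python stacks, which is one reason real CPS
support needs interpreter changes.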

Please note that this has nothing to do with "async programming". It's 
just like a GOTO, but w/o limitation where you'll branch to - or almost 
no limitations: you can't cross C-stack boundaries, or in other words you 
can't branch to other incarnations of the run-loop. (Exceptions are a 
bit more flexible though, but they still can only jump "up" the C-stack)

Using CPS for function calls implies therefore a non-trivial rewrite of 
CPython, which OTOH and AFAIK is already available as Stackless Python.

Making continuations usable at the language level is a different thing, 
though.

leo

[1] http://www.parrotcode.org - in CVS languages/python. The test b2.py
from the Pie-thon benchmark has two generators (izip, Pi.__iter__), 
which are Parrot coroutines, that's working fine.

From pje at telecommunity.com  Thu Sep 30 21:37:17 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Sep 30 21:37:18 2004
Subject: [Python-Dev] Proposing a sys.special_exceptions tuple
In-Reply-To: <20040930155247.GU13993@laranja.org>
References: <5.1.1.6.0.20040930101453.0244e8f0@mail.telecommunity.com>
	<1096539707.415bde3ba1425@mail.iinet.net.au>
	<5.1.1.6.0.20040930101453.0244e8f0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20040930153454.02bc04c0@mail.telecommunity.com>

At 12:52 PM 9/30/04 -0300, Lalo Martins wrote:
>On Thu, Sep 30, 2004 at 10:19:22AM -0400, Phillip J. Eby wrote:
> >
> > Also, maybe in 2.5 we could begin warning about bare excepts that aren't
> > preceded by non-bare exceptions.
>
>try:
>   foo()
>except:
>   print_or_log_exception_in_a_way_that_is_meaningful()
>   raise
>
>doesn't seem to be incorrect to me.  For example, if the program
>is a daemon, I want the exception logged somewhere so that I can
>see it later, because I won't be watching stderr.

1. If the exception raised is a MemoryError, your daemon is in trouble.

2. I said *warn*, and it'd be easy to suppress the warning using 'except 
Exception:', if that's what you really mean

3. But I suppose this could be considered a job for pychecker.

From bac at OCF.Berkeley.EDU  Thu Sep 30 22:02:34 2004
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Sep 30 22:02:45 2004
Subject: [Python-Dev] [OT] stats on type-inferencing atomic types in local
 variables in the stdlib
Message-ID: <415C665A.4060706@ocf.berkeley.edu>

My thesis (which, for those who don't know, was to come up with a way to do 
type inferencing in the compiler without requiring any semantic changes; 
basically type inferencing atomic types assigned to local variables) is now far 
enough along that I have the algorithm done and I can generate statistics on 
what opcodes are called with the most common types that I can specifically 
infer (can also do static type checking on occasion; only triggers were actual 
unit tests making sure TypeError was raised for certain things like ``~4.2`` 
and such).  Thought some of you might get a kick out of this since the numbers 
are rather blatant for certain opcodes and methods.

To read the stats, the number to the left is the number of times the opcode was 
compiled (not executed!) with the specific type(s) known for the opcode (if it 
took two args, then both types are listed; order was considered irrelevant). 
Now they are listed as integers, so here is the conversion::

   Basestring 4
   IntegralType 8
   FloatType 16
   ImagType 32
   DictType 64
   ListType 128
   TupleType 256

For the things named "meth_<something>" that is the method being called 
immediately on the type.

Now realize these numbers are only for opcodes where I could definitely infer 
the type; ones where it could be more than one type, regardless if those 
possibilities were very specific, I just ignored it and did not include in the 
stats.

I also tweaked some opcodes knowing how they are more often used.  So, for 
instance, BINARY_MODULO checks specifically for the case of when the left side 
is a basestring and then just doesn't worry about the other args.  Other ones I 
just didn't bother with all the args since it was not interesting to me in 
terms of deciding what type-specific opcodes I want to come up with.

Anyway, here are the numbers on Lib sans Lib/test (129,814 lines according to 
SLOCCount) for the ones above 100::

  [(101, ('BINARY_MULTIPLY', (8, 4))),
   (106, ('BINARY_SUBSCR', 128)),
   (118, ('GET_ITER', 128)),
   (124, ('BINARY_MODULO', None)),
   (195, ('meth_join', 4)),
   (204, ('BINARY_ADD', (8, 8))),
   (331, ('BINARY_ADD', (4, 4))),
   (513, ('BINARY_LSHIFT', (8, 8))),
   (840, ('meth_append', 128)),
   (1270, ('PRINT_ITEM', 4)),
   (1916, ('BINARY_MODULO', 4)),
   (12302, ('STORE_SUBSCR', 64))]

We sure like our dictionaries (for those that don't know, dictionaries are 
created by making an empty dict and then basically doing an individual assignment 
for each value).  We also seem to love to use string interpolation, and 
printing stuff.  Using list.append is also popular.  Now the BINARY_LSHIFT is 
rather interesting, and that ties into the whole issue of how much I can 
actually infer; since binary work tends to be with all constants I can infer it 
really easily and so its frequency is rather high.  Its actual frequency of 
use, though, compared to other things probably is not high, though.  Plus I 
doubt Plone, for instance, uses ``<<`` very often so I suspect the opcode will 
get weeded out when I incorporate stats from the other apps I am taking stats from.
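
For anyone who would rather not decode the integers by hand, here is a small
sketch that turns a stats entry into readable type names using the
conversion table given earlier. The helper decode_entry is hypothetical, and
an entry whose type field is None is passed through unchanged because the
message does not define what None denotes.

```python
# Conversion table copied from the message above.
TYPE_NAMES = {
    4: "Basestring",
    8: "IntegralType",
    16: "FloatType",
    32: "ImagType",
    64: "DictType",
    128: "ListType",
    256: "TupleType",
}


def decode_entry(entry):
    """Turn (count, (opcode, types)) into (count, opcode, type_names)."""
    count, (opcode, types) = entry
    if types is None:
        names = None                      # meaning not stated in the source
    elif isinstance(types, tuple):
        names = tuple(TYPE_NAMES[t] for t in types)
    else:
        names = (TYPE_NAMES[types],)
    return count, opcode, names


# Two entries copied from the list above.
sample = [(204, ('BINARY_ADD', (8, 8))), (12302, ('STORE_SUBSCR', 64))]
decoded = [decode_entry(e) for e in sample]
```

Here decoded pairs BINARY_ADD with two IntegralType operands and
STORE_SUBSCR with DictType.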

As for the stuff I cut out, the surprising thing from those numbers was how few 
mathematical expressions could be inferred.  I checked my numbers with grep and 
there really are only 3 times where a float constant is divided by a float 
constant (and they are all in colorsys).  I was not expecting that at all. 
Guess global variables or object attributes tend to have them or I just can't 
infer the values.  Either way I just wasn't expecting that.

Anyway, as I said I just thought some people might find this interesting. 
Don't read into this too much since I am just using these numbers as guidelines 
for type-specific opcodes to write for use as a quantifiable measurement of the 
usefulness of type inferencing like this.

-Brett

P.S.: anyone who is *really* interested I can send you the full stats for the 
apps I have run my modified version of compile.c against.
From pje at telecommunity.com  Thu Sep 30 22:40:18 2004
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu Sep 30 22:40:19 2004
Subject: [Python-Dev] PEP 334 - Simple Coroutines via
  SuspendIteration
In-Reply-To: <415C5EC9.8070308@toetsch.at>
References: <20040930090510.FE87.JCARLSON@uci.edu>
	<20040909215548.GB61544@prometheusresearch.com>
	<20040930154722.GA79121@prometheusresearch.com>
	<20040930090510.FE87.JCARLSON@uci.edu>
Message-ID: <5.1.1.6.0.20040930153803.02bc0140@mail.telecommunity.com>

At 09:30 PM 9/30/04 +0200, Leopold Toetsch wrote:
>Please note that this has nothing to do with "async programming". It's just 
>like a GOTO, but w/o limitation where you'll branch to - or almost no 
>limitations: you can't cross C-stack boundaries, or in other words you 
>can't branch to other incarnations of the run-loop. (Exceptions are a bit 
>more flexible though, but they still can only jump "up" the C-stack)
>
>Using CPS for function calls implies therefore a non-trivial rewrite of 
>CPython, which OTOH and AFAIK is already available as Stackless Python.

Clark is talking about a limited subset of CPS, where continuations are 
only single-use.  That is, a very limited form of continuations roughly 
equivalent in power to either Greenlets or a stack of generator-iterators.
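
To show what "a stack of generator-iterators" can mean in practice, here is
a minimal trampoline sketch. It is not the 'peak.events' implementation; the
name run_stack and the convention that a yielded generator means "call this
and resume me when it finishes" are assumptions made for the example.

```python
import inspect


def run_stack(root):
    """Drive nested generators with an explicit stack.

    When the generator on top of the stack yields another generator, that
    one is pushed and run to completion before the outer one resumes.
    Any other yielded value is passed through to the caller.
    """
    stack = [root]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()          # finished: resume whoever called it
            continue
        if inspect.isgenerator(item):
            stack.append(item)   # "call" the nested generator
        else:
            yield item


def inner():
    yield 1
    yield 2


def outer():
    yield 0
    yield inner()                # suspend until inner() is exhausted
    yield 3
```

Iterating over run_stack(outer()) produces 0, 1, 2, 3, with each generator
resumed exactly where it left off.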


>Making continuations usable at the language level is a different thing, 
>though.

Indeed, and luckily it isn't needed for PEP 334.  PEP 334 just needs the 
interpreter to be able to resume evaluation of a generator frame at any 
CALL opcode or "for" looping that invokes a generator-iterator's next() 
method, if SuspendIteration was raised.  I don't know if a corresponding 
operation for Jython is possible.

(In the case of CPython, this could be implemented via a type slot to check 
whether a callable object is "resumable", so that you actually *could* 
decorate suitable functions as being resumable, not just generator-iterator 
next() methods.)

Personally, I'm +0 (at most) on the PEP at the moment, as it doesn't IMO 
add much over using a generator stack, such as what I use in 
'peak.events'.  I'd be much more interested in a way to pass values and 
exceptions *into* generators, which would be more in line with what I'd 
consider "simple coroutines".

A mechanism to pass values or exceptions into generators would let me 
replace the hackish bits of 'peak.events' with clean language features, 
but I'm not sure PEP 
334 would give me enough to be worth reorganizing my code, as it's 
presently defined.
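
For context, later Python releases did grow this capability as the send()
and throw() methods on generators. A sketch of what passing values and
exceptions *into* a generator looks like with them follows; the
running_average example and its "ValueError means reset" convention are
invented for illustration.

```python
def running_average():
    """Yield the average of all values sent in so far.

    Throwing ValueError into the generator resets the running state;
    that convention is an assumption of this sketch.
    """
    total = 0.0
    count = 0
    average = None
    while True:
        try:
            value = yield average    # a value arrives via send()
        except ValueError:           # an exception arrives via throw()
            total, count, average = 0.0, 0, None
            continue
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)                       # advance to the first yield
first = averager.send(10)
second = averager.send(20)
after_reset = averager.throw(ValueError("reset"))
third = averager.send(4)
```

The yield expression is what makes this two-way: its value is whatever the
caller sent, and an exception thrown in is raised at the same point.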

Also, I find the current PEP a confusing mishmash of references to various 
technologies (that are all harder to implement than what's actually 
desired) and unmotivating implementations of things I can't see wanting 
to do.  It would be helpful for it to focus on motivating usage examples 
(such as suspending a report while waiting for a database) *early* in the 
PEP, rather than burying them at the end.  And, most of the sample Python 
code looks to me like examples of how an implementation might work, but 
they don't illustrate the intended semantics well, nor do they really help 
with designing an implementation.  Finally, the PEP shouldn't call these 
co-routines, as co-routines are able to "return" values to other 
co-routines.  The title should be something more like "Resuming Generators 
after SuspendIteration", which much more accurately describes the scope of 
the desired result.