From tim at pollenation.net Wed Feb 1 00:09:46 2006 From: tim at pollenation.net (Tim Parkin) Date: Tue, 31 Jan 2006 23:09:46 +0000 Subject: [Python-Dev] YAML (was Re: Extension to ConfigParser) In-Reply-To: References: <43D930E0.6070805@voidspace.org.uk> <43DDE661.4020808@voidspace.org.uk> <43DE51F7.4010007@colorstudy.com> <17374.33384.775422.175195@montanaro.dyndns.org> <43DE9066.2040100@voidspace.org.uk> Message-ID: <43DFEE3A.8030809@pollenation.net> Georg Brandl wrote: > Guido van Rossum wrote: >>Ah. This definitely isn't what ConfigParser was meant to do. I'd think >>for this you should use some kind of XML pickle though. That's >>horrible if end users must edit it, but great for saving >>near-arbitrary persistent data in a readable and occasionally editable >>(for the developer) form. > > > While we're at it, is the Python library going to incorporate some YAML > parser in the future? YAML seems like a perfectly matching data format > for Python. Unfortunately, YAML still doesn't have a fully featured pure Python parser (pyyaml works on simple YAML documents). The specification also doesn't have a blueprint implementation (there was talk about one at some point), and the fact that the specification has a context-sensitive grammar and quite a large lookahead means that writing parsers with standard components is a little tricky (I know; I tried for some time). The de facto standard implementation is 'syck', a C library that is used in the Ruby distribution and works very well. Up until recently the only Python wrapper for syck that didn't segfault was our own Pyrex wrapper. Fortunately, Kirill Simonov has written an excellent wrapper (which handles load and dump) which is available at http://xitology.org/pysyck/. Although we make extensive use of YAML and it is definitely the best human-editable data format I've used - and our non-techy clients agree that it's pretty simple to use - it is a lot more complicated than INI files. 
Our opinion is that it undoubtedly has its bad points but that it makes complex configuration files easy to write, read and edit. If you want a human-readable serialisation format, it's way, way better than XML. If you want to create config files that have some nesting and typing, have a look and see what you think. Tim Parkin p.s. JSON is 'nearly' a subset of YAML (the 'nearly' point is being considered by various parties). From bob at redivi.com Wed Feb 1 00:21:15 2006 From: bob at redivi.com (Bob Ippolito) Date: Tue, 31 Jan 2006 15:21:15 -0800 Subject: [Python-Dev] YAML (was Re: Extension to ConfigParser) In-Reply-To: <43DFEE3A.8030809@pollenation.net> References: <43D930E0.6070805@voidspace.org.uk> <43DDE661.4020808@voidspace.org.uk> <43DE51F7.4010007@colorstudy.com> <17374.33384.775422.175195@montanaro.dyndns.org> <43DE9066.2040100@voidspace.org.uk> <43DFEE3A.8030809@pollenation.net> Message-ID: On Jan 31, 2006, at 3:09 PM, Tim Parkin wrote: > Georg Brandl wrote: >> Guido van Rossum wrote: >>> Ah. This definitely isn't what ConfigParser was meant to do. I'd >>> think >>> for this you should use some kind of XML pickle though. That's >>> horrible if end users must edit it, but great for saving >>> near-arbitrary persistent data in a readable and occasionally >>> editable >>> (for the developer) form. >> >> >> While we're at it, is the Python library going to incorporate some >> YAML >> parser in the future? YAML seems like a perfectly matching data >> format >> for Python. > > Unfortunately, YAML still doesn't have a fully featured pure python > parser (pyyaml works on simple yaml documents). That's the killer for me. I wanted to try it out once, but since there wasn't a good implementation I tossed it. > p.s. JSON is 'nearly' a subset of YAML (the nearly point is being > considered by various parties). There's a subset of JSON that is valid YAML. The output of simplejson is intentionally valid JSON and YAML, for example. 
Basically, the JSON serializer just needs to put whitespace in the right places. JSON isn't a great human editable format... Better than XML I guess, but it's not terribly natural. However, it is simple to implement, and the tools to deal with it are very widely available. -bob From bokr at oz.net Wed Feb 1 01:05:11 2006 From: bokr at oz.net (Bengt Richter) Date: Wed, 01 Feb 2006 00:05:11 GMT Subject: [Python-Dev] Octal literals References: <011c01c626b4$2d6a0750$6402a8c0@arkdesktop> Message-ID: <43dff5e0.759259757@news.gmane.org> On Tue, 31 Jan 2006 17:17:22 -0500, "Andrew Koenig" wrote: >> Apart from making 0640 a syntax error (which I think is wrong too), >> could this be solved by *requiring* the argument to be a string? (Or >> some other data type, but that's probably overkill.) > >That solves the problem only in that particular context. > >I would think that if it is deemed undesirable for a leading 0 to imply >octal, then it would be best to decide on a different syntax for octal >literals and use that syntax consistently everywhere. > >I am personally partial to allowing an optional radix (in decimal) followed >by the letter r at the beginning of a literal, so 19, 8r23, and 16r13 would >all represent the same value. In that case, could I also make a pitch for the letter c which would similarly follow a radix (in decimal) but would introduce the rest of the number as a radix-complement signed number, e.g., -2, 16cfe, 8c76, 2c110, 10c98 would all have the same value, and the sign-digit could be arbitrarily repeated to the left without changing the value, e.g., -2, 16cfffe, 8c776, 2c1110, 10c99998 would all have the same value. Likewise the positive values, where the "sign-digit" would be 0 instead of radix-1 (in the particular digit set for the radix). E.g., 2, 16c02, 16c0002, 8c02, 8c0002, 2c010, 2c0010, 10c02, 10c00002, etc. Of course you can put a unary minus in front of any of those, so -16f7 == 1609, and -2c0110 == -6 == 2c1010 etc. 
This permits negative literal constants to be expressed "showing the bits" as they are in two's complement or with the bits grouped to show as hex or octal digits etc. And 16cf80000000 would become a 32-bit int, not a long as would -0x80000000 (being a unary minus on a positive value that is promoted to long). Regards, Bengt Richter From facundobatista at gmail.com Wed Feb 1 02:01:31 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 31 Jan 2006 22:01:31 -0300 Subject: [Python-Dev] Extension to ConfigParser In-Reply-To: References: <43D930E0.6070805@voidspace.org.uk> <43DDE661.4020808@voidspace.org.uk> <43DE51F7.4010007@colorstudy.com> <17374.33384.775422.175195@montanaro.dyndns.org> <43DE9066.2040100@voidspace.org.uk> Message-ID: 2006/1/30, Fredrik Lundh : > fwiw, I've *never* used INI files to store program state, and I've > never used the save support in ConfigParser. As a SiGeFi development decision, we obliged ourselves to keep the program state between executions (hey, if I set the window this big, I want the window this big next time!). It was natural to us to save it in the user home directory, in a ".sigefi" file. And we thought it was impolite, to say the least, to put a pickled dictionary in a user's home directory. That's how we ended up keeping program state in a .INI, :s. Regards, . 
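The pattern Facundo describes, persisting simple window state in an INI file under the user's home directory, can be sketched roughly as follows. The file name, section and option names here are illustrative (not SiGeFi's actual layout), and the modern `configparser` spelling is used; in the Python of this thread the module was `ConfigParser`:

```python
# A minimal sketch of keeping program state (e.g. window size) in an
# INI file in the user's home directory.  All names are illustrative.
import configparser
import os

STATE_FILE = os.path.expanduser("~/.sigefi")  # hypothetical path

def save_state(width, height, path=STATE_FILE):
    config = configparser.ConfigParser()
    config["window"] = {"width": str(width), "height": str(height)}
    with open(path, "w") as f:
        config.write(f)

def load_state(path=STATE_FILE, default=(800, 600)):
    config = configparser.ConfigParser()
    if not config.read(path):          # missing file: fall back to defaults
        return default
    window = config["window"]
    return window.getint("width"), window.getint("height")
```

On first run `load_state` returns the defaults; afterwards the window comes back "this big", as Facundo wants.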
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From facundobatista at gmail.com Wed Feb 1 02:11:24 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 31 Jan 2006 22:11:24 -0300 Subject: [Python-Dev] Octal literals In-Reply-To: <43dff5e0.759259757@news.gmane.org> References: <011c01c626b4$2d6a0750$6402a8c0@arkdesktop> <43dff5e0.759259757@news.gmane.org> Message-ID: 2006/1/31, Bengt Richter : > In that case, could I also make a pitch for the letter c which would similarly > follow a radix (in decimal) but would introduce the rest of the number as > a radix-complement signed number, e.g., -2, 16cfe, 8c76, 2c110, 10c98 would > all have the same value, and the sign-digit could be arbitrarily repeated to > the left without changing the value, e.g., -2, 16cfffe, 8c776, 2c1110, 10c99998 > would all have the same value. Likewise the positive values, where the "sign-digit" > would be 0 instead of radix-1 (in the particular digit set for the radix). E.g., > 2, 16c02, 16c0002, 8c02, 8c0002, 2c010, 2c0010, 10c02, 10c00002, etc. Of course > you can put a unary minus in front of any of those, so -16f7 == 1609, and > -2c0110 == -6 == 2c1010 etc. This is getting too complicated. I don't want to read code and have to pause for 5 minutes doing math to understand a number. I think that the whole point of modifying something is to simplify it. I'm +0 on removing 0-leading literals. But only if we create "d", "h" and "o" suffixes to represent decimal, hex and octal literals (2.35d, 3Fh, 660o). And +0 on keeping the "0x" prefix for hex (c'mon, it seems so natural....). Regards, . 
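For readers decoding the quoted examples, the radix-complement notation Bengt proposes can be illustrated with a small helper. This is a sketch of the proposal only; no such literal syntax exists in Python, and the "leading digit of radix-1 means negative" rule is inferred from the examples in the thread:

```python
def radix_complement(literal):
    """Decode a string like '16cfe' or '10c98' under the proposed
    <radix>c<digits> radix-complement notation: the digits are read
    in the given radix, and a leading sign digit of radix-1 marks a
    negative value (so 16cfe is 0xfe - 16**2 == -2)."""
    radix_text, digits = literal.split("c", 1)
    radix = int(radix_text)
    magnitude = int(digits, radix)
    if int(digits[0], radix) == radix - 1:   # negative sign digit
        return magnitude - radix ** len(digits)
    return magnitude

# All of these decode to -2, matching the quoted examples:
# 16cfe, 8c76, 2c110, 10c98, 16cfffe, 8c776, 2c1110, 10c99998
```

Repeating the sign digit ("16cfffe") does not change the value, exactly as Bengt describes, because each extra sign digit adds radix-1 times a power that the subtracted `radix ** len(digits)` grows to match.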
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From tim.peters at gmail.com Wed Feb 1 02:16:21 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 31 Jan 2006 20:16:21 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <20060131105920.GQ18916@xs4all.nl> References: <20060131105920.GQ18916@xs4all.nl> Message-ID: <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> [Thomas Wouters] > I noticed a few compiler warnings, when I compile Python on my amd64 with > gcc 4.0.3: > > Objects/longobject.c: In function 'PyLong_AsDouble': > Objects/longobject.c:655: warning: 'e' may be used uninitialized in this function Well, that's pretty bizarre. There's _obviously_ no way to get to a reference to `e` without going through x = _PyLong_AsScaledDouble(vv, &e); first. That isn't a useful warning. > Objects/longobject.c: In function 'long_true_divide': > Objects/longobject.c:2263: warning: 'aexp' may be used uninitialized in this function > Objects/longobject.c:2263: warning: 'bexp' may be used uninitialized in this function Same thing, really, complaining about vrbls whose values are always set by _PyLong_AsScaledDouble(). > Modules/linuxaudiodev.c: In function 'lad_obuffree': > Modules/linuxaudiodev.c:392: warning: 'ssize' may be used uninitialized in this function > Modules/linuxaudiodev.c: In function 'lad_bufsize': > Modules/linuxaudiodev.c:348: warning: 'ssize' may be used uninitialized in this function > Modules/linuxaudiodev.c: In function 'lad_obufcount': > Modules/linuxaudiodev.c:369: warning: 'ssize' may be used uninitialized in this function Those are Linux bugs ;-) > ... > Should these warnings be fixed? I don't know. Is this version of gcc broken in some way relative to other gcc versions, or newer, or ... ? We certainly don't want to see warnings under gcc, since it's heavily used, but I'm not clear on why other versions of gcc aren't producing these warnings (or are they, and people have been ignoring that?). 
> I know Tim has always argued to fix them, in the past (and I agree,) and it > doesn't look like doing so, by initializing the variables, would be too big a > performance hit. We shouldn't see any warnings under a healthy gcc. > I also noticed test_logging is spuriously failing, and not just on my > machine (according to buildbot logs.) Is anyone (Vinay?) looking at that > yet? FWIW, I've never seen this fail on Windows. The difference is probably that sockets on Windows work . From guido at python.org Wed Feb 1 02:59:54 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 31 Jan 2006 17:59:54 -0800 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: On 1/31/06, Tim Peters wrote: > [Thomas Wouters] > > Objects/longobject.c:655: warning: 'e' may be used uninitialized in this function > > Well, that's pretty bizarre. There's _obviously_ no way to get to a > reference to `e` without going through > > x = _PyLong_AsScaledDouble(vv, &e); > > first. That isn't a useful warning. But how can the compiler know that it is an output-only argument? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Wed Feb 1 03:19:41 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 31 Jan 2006 21:19:41 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: <1f7befae0601311819h73cb8188j563b5b3cad88313a@mail.gmail.com> [Tim] >> Well, that's pretty bizarre. There's _obviously_ no way to get to a >> reference to `e` without going through >> >> x = _PyLong_AsScaledDouble(vv, &e); >> >> first. That isn't a useful warning. [Guido] > But how can the compiler know that it is an output-only argument? 
In the absence of interprocedural analysis, it cannot -- and neither can it know that it's not an output argument. It can't know anything non-trivial, and because it can't, a reasonable compiler would avoid raising a red flag at "warning" level. "info", maybe, if it has such a concept. It's as silly to me as seeing, e.g., """ double recip(double z) { return 1.0 / z; } "warning: possible division by 0 or signaling NaN" """ Perhaps, but not useful because there's no reason to presume it's a _likely_ error. From evdo.hsdpa at gmail.com Wed Feb 1 03:42:27 2006 From: evdo.hsdpa at gmail.com (Robert Kim Wireless Internet Advisor) Date: Tue, 31 Jan 2006 18:42:27 -0800 Subject: [Python-Dev] Compiler warnings In-Reply-To: References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: <1ec620e90601311842l228fe84ayd5cd06d231fd9ff1@mail.gmail.com> u guys are way over my head :) bob -- Robert Kim 2611s Highway 101 suite 102 San diego CA 92007 206 984 0880 http://evdo-coverage.com/cellular-repeater.html On 1/31/06, Guido van Rossum wrote: > On 1/31/06, Tim Peters wrote: > > [Thomas Wouters] > > > Objects/longobject.c:655: warning: 'e' may be used uninitialized in this > function > > > > Well, that's pretty bizarre. There's _obviously_ no way to get to a > > reference to `e` without going through > > > > x = _PyLong_AsScaledDouble(vv, &e); > > > > first. That isn't a useful warning. > > But how can the compiler know that it is an output-only argument? > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/evdo.hsdpa%40gmail.com > -- Robert Q Kim, Wireless Internet Advisor http://evdo-coverage.com/cellular-repeater.html http://hsdpa-coverage.com 2611 S. 
Pacific Coast Highway 101 Suite 102 Cardiff by the Sea, CA 92007 206 984 0880 From foom at fuhm.net Wed Feb 1 04:27:01 2006 From: foom at fuhm.net (James Y Knight) Date: Tue, 31 Jan 2006 22:27:01 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: <724240BE-79DB-4DE5-A26D-F6F2E894FDB1@fuhm.net> On Jan 31, 2006, at 8:16 PM, Tim Peters wrote: > [Thomas Wouters] >> I noticed a few compiler warnings, when I compile Python on my >> amd64 with >> gcc 4.0.3: >> >> Objects/longobject.c: In function 'PyLong_AsDouble': >> Objects/longobject.c:655: warning: 'e' may be used uninitialized >> in this function > > Well, that's pretty bizarre. There's _obviously_ no way to get to a > reference to `e` without going through > > x = _PyLong_AsScaledDouble(vv, &e); > > first. That isn't a useful warning. Look closer, and it's not quite so obvious. Here's the beginning of PyLong_AsDouble: > double > PyLong_AsDouble(PyObject *vv) > { > int e; > double x; > > if (vv == NULL || !PyLong_Check(vv)) { > PyErr_BadInternalCall(); > return -1; > } > x = _PyLong_AsScaledDouble(vv, &e); > if (x == -1.0 && PyErr_Occurred()) > return -1.0; > if (e > INT_MAX / SHIFT) > goto overflow; Here's the beginning of _PyLong_AsScaledDouble: > _PyLong_AsScaledDouble(PyObject *vv, int *exponent) > { > #define NBITS_WANTED 57 > PyLongObject *v; > double x; > const double multiplier = (double)(1L << SHIFT); > int i, sign; > int nbitsneeded; > > if (vv == NULL || !PyLong_Check(vv)) { > PyErr_BadInternalCall(); > return -1; > } Now here's the thing: _PyLong_AsScaledDouble *doesn't* set exponent before returning -1 there, which is where the warning comes from. 
Now, you might protest, it's impossible to go down that code path, because of two reasons: 1) PyLong_AsDouble has an identical "(vv == NULL || !PyLong_Check(vv))" check, so that codepath in _PyLong_AsScaledDouble cannot possibly be gone down. However, PyLong_Check is a macro which expands to a function call to an external function, "PyType_IsSubtype((vv)->ob_type, &PyLong_Type)", so GCC has no idea it cannot return an error the second time. This is the kind of thing C++'s const 2) There's a guard "(x == -1.0 && PyErr_Occurred())" before "e" is used in PyLong_AsDouble, which checks the conditions that _PyLong_AsScaledDouble set. Thus, e cannot possibly be used, even if the previous codepath *was* possible to go down. However, again, PyErr_BadInternalCall() is an external function, so the compiler has no way of knowing that PyErr_BadInternalCall() causes PyErr_Occurred() to return true. So in conclusion, from all the information the compiler has available to it, it is giving a correct diagnostic. James From jeremy at alum.mit.edu Wed Feb 1 05:28:22 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 31 Jan 2006 23:28:22 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1ec620e90601311842l228fe84ayd5cd06d231fd9ff1@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <1ec620e90601311842l228fe84ayd5cd06d231fd9ff1@mail.gmail.com> Message-ID: On 1/31/06, Robert Kim Wireless Internet Advisor wrote: > u guys are way over my head :) > bob > > > -- > Robert Kim > 2611s Highway 101 > suite 102 > San diego CA 92007 > 206 984 0880 > Stop spamming our list. 
Jeremy From ianb at colorstudy.com Wed Feb 1 05:32:20 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 31 Jan 2006 22:32:20 -0600 Subject: [Python-Dev] Extension to ConfigParser In-Reply-To: <43DE7F03.1080202@voidspace.org.uk> References: <43D930E0.6070805@voidspace.org.uk> <43DDE661.4020808@voidspace.org.uk> <43DE51F7.4010007@colorstudy.com> <43DE75B4.7000502@colorstudy.com> <43DE7F03.1080202@voidspace.org.uk> Message-ID: <43E039D4.8050600@colorstudy.com> Sorry, I didn't follow up here like I should have, and I haven't followed the rest of this conversation, so apologies if I am being redundant... Fuzzyman wrote: >>While ConfigParser is okay for simple configuration, it is (IMHO) not a >>very good basis for anyone who wants to build better systems, like >>config files that can be changed programmatically, or error messages >>that point to file and line numbers. Those aren't necessarily features >>we need to expose in the standard library, but it'd be nice if you could >>implement that kind of feature without having to ignore the standard >>library entirely. >> >> > > Can you elaborate on what kinds of programmatic changes you envisage? > I'm just wondering if there are classes of usage not covered by > ConfigObj. Of course you can pretty much do anything to a ConfigObj > instance programmatically, but even so... ConfigObj does fine, my criticism was simply of ConfigParser in this case. Just yesterday I was doing (with ConfigParser): conf.save('app:main', '## Uncomment this next line to enable authentication:\n#filter-with', 'openid') This is clearly lame ;) >>That said, I'm not particularly enthused about a highly featureful >>config file *format* in the standard library, even if I would like a >>much more robust implementation. >> >> > > I don't see how you can easily separate the format from the parser - > unless you just leave raw values. (As I said in the other email, I don't > think I fully understand you.) 
> > If accessing raw values suits your purposes, why not subclass > ConfigParser and do magic in the get* methods ? I guess I haven't really looked closely at the implementation of ConfigParser, so I don't know how serious the subclassing would have to be. But, for example, if you wanted to do nested sections this is not infeasible with the current syntax, you just have to overload the meaning of the section names. E.g., [foo.bar] (a section named "foo.bar") could mean that this is a subsection of "foo". Or, if the parser allows you to see the order of sections, you could use [[bar]] (a section named "[bar]") to imply a subsection, not unlike what you have already, except without the indentation. I think there's lots of other kinds of things you can do with the INI syntax as-is, but providing a different interface to it. If you allow an easy-to-reuse parser, you can even check that syntax at read time. (Or if you keep enough information, check the syntax later and still be able to signal errors with filenames and line numbers) An example of a parser that doesn't imply much of anything about the object being produced is one that I wrote here: http://svn.colorstudy.com/INITools/trunk/initools/iniparser.py On top of that I was able to build some other fancy things without much problem (which ended up being too fancy, but that's a different issue ;) >> From my light reading on ConfigObj, it looks like it satisfies my >>personal goals (though I haven't used it), but maybe has too many >>features, like nested sections. And it seems like maybe the API can be >> > > I personally think nested sections are very useful and would be sad to > not see them included. Grouping additional configuration options as a > sub-section can be *very* handy. Using .'s in names can also do grouping, or section naming conventions. 
-- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From jcarlson at uci.edu Wed Feb 1 05:36:34 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 31 Jan 2006 20:36:34 -0800 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1ec620e90601311842l228fe84ayd5cd06d231fd9ff1@mail.gmail.com> References: <1ec620e90601311842l228fe84ayd5cd06d231fd9ff1@mail.gmail.com> Message-ID: <20060131203324.109D.JCARLSON@uci.edu> Robert Kim Wireless Internet Advisor wrote: > u guys are way over my head :) > bob You seem to be new to the python-dev mailing list. As a heads-up, python-dev is for the development _of_ python. If you are using Python, and want help or want to help others using Python, you should instead join python-list, or the equivalent comp.lang.python newsgroup. Posting as a new user what you just did "u guys are way over my head :)", as well as your earlier post of "anybody here?", is a good and fast way of being placed in everyone's kill file. - Josiah From martin at v.loewis.de Wed Feb 1 08:15:41 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 01 Feb 2006 08:15:41 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: <43E0601D.7090505@v.loewis.de> Tim Peters wrote: >>I noticed a few compiler warnings, when I compile Python on my amd64 with >>gcc 4.0.3: >> >>Objects/longobject.c: In function 'PyLong_AsDouble': >>Objects/longobject.c:655: warning: 'e' may be used uninitialized in this function > > > Well, that's pretty bizarre. There's _obviously_ no way to get to a > reference to `e` without going through > > x = _PyLong_AsScaledDouble(vv, &e); > > first. That isn't a useful warning. It inlines the function to make this determination. 
Now, it's not true that e can be uninitialized then, but there the gcc logic fails: If you take the if (vv == NULL || !PyLong_Check(vv)) { PyErr_BadInternalCall(); return -1; } case in _PyLong_AsScaledDouble, *exponent won't be initialized. Then, in PyLong_AsDouble, with x = _PyLong_AsScaledDouble(vv, &e); if (x == -1.0 && PyErr_Occurred()) return -1.0; it looks like the return would not be taken if PyErr_Occurred returns false. Of course, it won't, but that is difficult to analyse. > I don't know. Is this version of gcc broken in some way relative to > other gcc versions, or newer, or ... ? We certainly don't want to see > warnings under gcc, since it's heavily used, but I'm not clear on why > other versions of gcc aren't producing these warnings (or are they, > and people have been ignoring that?). gcc 4 does inlining in far more cases now. Regards, Martin From martin at v.loewis.de Wed Feb 1 08:20:21 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 01 Feb 2006 08:20:21 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: <43E06135.6010608@v.loewis.de> Guido van Rossum wrote: >>Well, that's pretty bizarre. There's _obviously_ no way to get to a >>reference to `e` without going through >> >> x = _PyLong_AsScaledDouble(vv, &e); >> >>first. That isn't a useful warning. > > > But how can the compiler know that it is an output-only argument? If a variable's address is passed to a function, gcc normally assumes that the function will modify the variable, so you normally don't see "might be used uninitialized" warnings. However, gcc now also inlines the functions called if possible, to find out how the pointer is used inside the function. Changing the order of the functions in the file won't help anymore, either. If you want to suppress inlining, you must put __attribute__((noinline)) before the function. 
Regards, Martin From thomas at xs4all.net Wed Feb 1 11:14:05 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 1 Feb 2006 11:14:05 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: <20060201101405.GT18916@xs4all.nl> On Tue, Jan 31, 2006 at 08:16:21PM -0500, Tim Peters wrote: > Is this version of gcc broken in some way relative to other gcc versions, > or newer, or ... ? We certainly don't want to see warnings under gcc, > since it's heavily used, but I'm not clear on why other versions of gcc > aren't producing these warnings (or are they, and people have been > ignoring that?). Well, I said 4.0.3, and that was wrong. It's actually a pre-release of 4.0.3 (in Debian's 'unstable' distribution.) However, 4.0.2 (the actual release) behaves the same way. The normal make process shows quite a lot of output on systems that use gcc, so I wouldn't be surprised if people did ignore it, for the most part. My main problem with fixing the warnings is that I don't see the difference between, for example, the 'ssize' variable and the 'nchannels' variable in linuxaudio's lad_obuffree/lad_bufsize/lad_obufcount. 'ssize' gets a warning, 'nchannels' doesn't, yet how they are treated is not particularly different. The ssize output parameter gets set inside a switch, is directly followed by a break, and the switch is directly followed by a set of the nchannels output parameter. The only way through the switch is through the set of ssize. I understand the compiler doesn't "see" it this way, but who knows for how long :) I guess we ignore this until we're closer to a 2.5alpha1 ;P -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From sjoerd at acm.org Wed Feb 1 11:34:00 2006 From: sjoerd at acm.org (Sjoerd Mullender) Date: Wed, 01 Feb 2006 11:34:00 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <20060201101405.GT18916@xs4all.nl> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> Message-ID: <43E08E98.8070403@acm.org> Thomas Wouters wrote: > On Tue, Jan 31, 2006 at 08:16:21PM -0500, Tim Peters wrote: > > >>Is this version of gcc broken in some way relative to other gcc versions, >>or newer, or ... ? We certainly don't want to see warnings under gcc, >>since it's heavily used, but I'm not clear on why other versions of gcc >>aren't producing these warnings (or are they, and people have been >>ignoring that?). > > > Well, I said 4.0.3, and that was wrong. It's actually a pre-release of 4.0.3 > (in Debian's 'unstable' distribution.) However, 4.0.2 (the actual release) > behaves the same way. The normal make process shows quite a lot of output on > systems that use gcc, so I wouldn't be surprised if people did ignore it, > for the most part. > > My main problem with fixing the warnings is that I don't see the difference > between, for example, the 'ssize' variable and the 'nchannels' variable in > linuxaudio's lad_obuffree/lad_bufsize/lad_obufcount. 'ssize' gets a warning, > 'nchannels' doesn't, yet how they are treated is not particularly different. > The ssize output parameter gets set inside a switch, is directly followed by > a break, and the switch is directly followed by a set of the nchannels > output parameter. The only way through the switch is through the set of > ssize. I understand the compiler doesn't "see" it this way, but who knows > for how long :) > > I guess we ignore this until we're closer to a 2.5alpha1 ;P > I don't quite understand what's the big deal. The compiler issues a warning. 
We know better (and I agree, we *do* know better in most of these cases), but it's easy to add a "= 0" to the declaration of the variable to shut up the compiler, hopefully with a comment saying as much. That's what I've been doing in my code that generated these warnings. It's clearly a "bug" in the compiler that it isn't smart enough to figure out that variables actually only get used after they've been set. Hence, this is Somebody Else's Problem. -- Sjoerd Mullender From gjc at inescporto.pt Wed Feb 1 13:33:36 2006 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Wed, 01 Feb 2006 12:33:36 +0000 Subject: [Python-Dev] Octal literals In-Reply-To: <011c01c626b4$2d6a0750$6402a8c0@arkdesktop> References: <011c01c626b4$2d6a0750$6402a8c0@arkdesktop> Message-ID: <1138797216.6791.38.camel@localhost.localdomain> On Tue, 2006-01-31 at 17:17 -0500, Andrew Koenig wrote: > > Apart from making 0640 a syntax error (which I think is wrong too), > > could this be solved by *requiring* the argument to be a string? (Or > > some other data type, but that's probably overkill.) > > That solves the problem only in that particular context. > > I would think that if it is deemed undesirable for a leading 0 to imply > octal, then it would be best to decide on a different syntax for octal > literals and use that syntax consistently everywhere. +1, and then issue a warning every time the parser sees a leading-0 octal constant instead of the new syntax, although the old syntax would continue to work for compatibility reasons. > > I am personally partial to allowing an optional radix (in decimal) followed > by the letter r at the beginning of a literal, so 19, 8r23, and 16r13 would > all represent the same value. 
For me, adding the radix to the right instead of left looks nicer: 23r8, 13r16, etc., since a radix is almost like a unit, and units are always to the right. Plus, we already use suffix characters to the right, like 10L. And I seem to recall an old assembler (a z80 assembler, IIRC :P) that used a syntax like 10h and 11b for hex and binary radixes. Hmm.. I'm beginning to think 13r16 or 16r13 look too cryptic to the casual observer; perhaps a suffix letter is more readable, since we don't need arbitrary radix support anyway. /me thinks of some examples: 644o # I _think_ the small 'o' cannot be easily confused with 0 or O, but.. 10h # hex.. hm.. but we already have 0x10 101b # binary Another possibility is to extend the 0x syntax to non-hex, 0xff # hex 0o644 # octal 0b1101 # binary I'm unsure which one I like better. Regards, -- Gustavo J. A. M. Carneiro The universe is always one step beyond logic From scott+python-dev at scottdial.com Wed Feb 1 14:07:00 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Wed, 01 Feb 2006 08:07:00 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <20060201101405.GT18916@xs4all.nl> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> Message-ID: <43E0B274.7070204@scottdial.com> Thomas Wouters wrote: > My main problem with fixing the warnings is that I don't see the difference > between, for example, the 'ssize' variable and the 'nchannels' variable As was pointed out elsewhere, any variable that is passed by reference to another function is ignored for the purposes of these warnings. The fact that the ioctl call with nchannels happens well after potential problem spots doesn't matter. It appears that GCC has eliminated it from the decision process for the purposes of these warnings already. The problem stems from the ambiguity of the returns. 
At compile-time, there is no way for GCC to know that the return value will be negative in the error case, and thus the return may cause us to go down an execution path in which ssize (and nchannels) need to be initialized. This check seems to be very shallow: even if you provide a guarantee that the return value will be well-behaved, GCC has already given up on figuring this out. The rule of thumb here seems to be "if you make a call to a function which provides the condition under which the uninitialized variable is used, then the condition is deemed ambiguous." So, either the GCC people have not noticed this problem, or (more likely) have decided that this is acceptable, but clearly it will cause spurious warnings. Hey, after all, they are just warnings. -- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From mwh at python.net Wed Feb 1 14:51:03 2006 From: mwh at python.net (Michael Hudson) Date: Wed, 01 Feb 2006 13:51:03 +0000 Subject: [Python-Dev] Compiler warnings In-Reply-To: <43E0B274.7070204@scottdial.com> (Scott Dial's message of "Wed, 01 Feb 2006 08:07:00 -0500") References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> <43E0B274.7070204@scottdial.com> Message-ID: <2mpsm7farc.fsf@starship.python.net> Scott Dial writes: > So, either the GCC people have not noticed this problem, or (more > likely) have decided that this is acceptable, but clearly it will cause > spurious warnings. Hey, after all, they are just warnings. Well, indeed, but "no warnings" is a useful policy -- it makes new warnings much easier to spot :) The warnings under discussion seem rather excessive to me. Cheers, mwh -- Ignoring the rules in the FAQ: 1" slice in spleen and prevention of immediate medical care. -- Mark C.
Langston, asr From thomas at xs4all.net Wed Feb 1 15:33:22 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 1 Feb 2006 15:33:22 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <2mpsm7farc.fsf@starship.python.net> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> <43E0B274.7070204@scottdial.com> <2mpsm7farc.fsf@starship.python.net> Message-ID: <20060201143322.GU18916@xs4all.nl> On Wed, Feb 01, 2006 at 01:51:03PM +0000, Michael Hudson wrote: > Scott Dial writes: > > > So, either the GCC people have not noticed this problem, or (more > > likely) have decided that this is acceptable, but clearly it will cause > > spurious warnings. Hey, after all, they are just warnings. > Well, indeed, but "no warnings" is a useful policy -- it makes new > warnings much easier to spot :) > The warnings under discussion seem rather excessive to me. Yes, and more than that; fixing them 'properly' requires more than just initializing them. There is no sane default for some of those variables, so a proper fix would have to check for a sane value after the function returns. That is, if we take the warning seriously. If we don't take it seriously, initializing the variable may suppress a warning in the future: one of the called functions could change, opening a code path that in fact doesn't initialize the output variable. But initializing to a sentinel value, checking the value before use and handling that case sanely isn't always easy, or efficient. Hence my suggestion to let this wait a bit (since they are, at this time, spurious warnings). Fixing the warnings *now* won't fix any bugs, may mask future bugs, and may not be necessary if gcc 4.0.4 grows a better way to suppress these warnings. Or gcc 4.0 may grow more such warnings, in which case we may want to change the 'no warnings' policy or the flags to gcc, or add a 'known spurious warnings' checking thing to the buildbot.
-- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.peters at gmail.com Wed Feb 1 16:15:15 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 1 Feb 2006 10:15:15 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <43E0601D.7090505@v.loewis.de> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <43E0601D.7090505@v.loewis.de> Message-ID: <1f7befae0602010715v454025bas9509c1d19bb16836@mail.gmail.com> [Martin v. Löwis] > It inlines the function to make this determination. Very cool! Is this a new(ish) behavior? > Now, it's not true that e can be uninitialized then, but there > the gcc logic fails: That's fine -- there are any number of ways a compiler can reach a wrong conclusion by making conservative assumptions, and so long as it's actually staring at code I don't mind that at all. What I would mind is griping about some_func(&a) possibly not setting `a` in the _absence_ of staring at `some_func`'s internals. > If you take the > > if (vv == NULL || !PyLong_Check(vv)) { > PyErr_BadInternalCall(); > return -1; > } > > case in _PyLong_AsScaledDouble, *exponent won't be initialized. Certainly, and I don't expect a compiler to realize that this branch is impossible when _PyLong_AsScaledDouble is invoked from the call sites where gcc is complaining. > Then, in PyLong_AsDouble, with > > x = _PyLong_AsScaledDouble(vv, &e); > if (x == -1.0 && PyErr_Occurred()) > return -1.0; > > it looks like the return would not be taken if PyErr_Occurred returns > false. Of course, it won't, but that is difficult to analyse. PyLong_AsDouble already did: if (vv == NULL || !PyLong_Check(vv)) { PyErr_BadInternalCall(); return -1; } before calling _PyLong_AsScaledDouble(), and the latter's `x` is the former's `vv`. That is, the check you showed above from _PyLong_AsScaledDouble() is exactly the same as the check PyLong_AsDouble already made.
To exploit that, gcc would have to realize PyLong_Check() is a "pure enough" function, and I don't expect gcc to be able to figure that out. >> I don't know. Is this version of gcc broken in some way relative to >> other gcc versions, or newer, or ... ? We certainly don't want to see >> warnings under gcc, since it's heavily used, but I'm not clear on why >> other versions of gcc aren't producing these warnings (or are they, >> and people have been ignoring that?). > gcc 4 does inlining in far more cases now. OK then. Thomas, for these _PyLong_AsScaledDouble()-caller cases, I suggest doing whatever obvious thing manages to silence the warning. For example, in PyLong_AsDouble: int e = -1; /* silence gcc warning */ and then add: assert(e >= 0); after the call. From scott+python-dev at scottdial.com Wed Feb 1 16:17:26 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Wed, 01 Feb 2006 10:17:26 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <20060201143322.GU18916@xs4all.nl> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> <43E0B274.7070204@scottdial.com> <2mpsm7farc.fsf@starship.python.net> <20060201143322.GU18916@xs4all.nl> Message-ID: <43E0D106.2010508@scottdial.com> Thomas Wouters wrote: > On Wed, Feb 01, 2006 at 01:51:03PM +0000, Michael Hudson wrote: >> Scott Dial writes: >> >>> So, either the GCC people have not noticed this problem, or (more >>> likely) have decided that this is acceptable, but clearly it will cause >>> spurious warnings. Hey, after all, they are just warnings. > >> Well, indeed, but "no warnings" is a useful policy -- it makes new >> warnings much easier to spot :) > >> The warnings under discussion seem rather excessive to me. > > Yes, and more than that; fixing them 'properly' requires more than just > initializing them. 
There is no sane default for some of those variables, so a > proper fix would have to check for a sane value after the function returns. > That is, if we take the warning seriously. If we don't take it seriously, > initializing the variable may suppress a warning in the future: one of the > called functions could change, opening a code path that in fact doesn't > initialize the output variable. But initializing to a sentinel value, > checking the value before use and handling that case sanely isn't always > easy, or efficient. Hence my suggestion to let this wait a bit (since they > are, at this time, spurious warnings). Fixing the warnings *now* won't fix any > bugs, may mask future bugs, and may not be necessary if gcc 4.0.4 grows a > better way to suppress these warnings. Or gcc 4.0 may grow more such > warnings, in which case we may want to change the 'no warnings' policy or > the flags to gcc, or add a 'known spurious warnings' checking thing to the > buildbot. > Although it is no consolation, there are two types of unused variable warnings: the known-error ("is used uninitialized in this function") and a probable-error ("may be used uninitialized in this function"). It may be reasonable to ignore the probable-error case. I think someone even mentioned that they really should be an "info" and not a "warning". The points in the code clearly should have attention brought to them because there is a real possibility of error, but as you say, there is no way to rid yourself of this type of warning. Also, note that the phrasing "is"/"may be" is a change from 3.x to 4.x. The old warning was "might" always, and as I understand gcc the "might" of 3.x maps directly to the "is" of 4.x -- leaving "may be" an entirely new thing to 4.x. From gcc/tree-ssa.c: The second pass follows PHI nodes to find uses that are potentially uninitialized. In this case we can't necessarily prove that the use is really uninitialized.
This pass is run after most optimizations, so that we thread as many jumps as possible, and delete as much dead code as possible, in order to reduce false positives. We also look again for plain uninitialized variables, since optimization may have changed conditionally uninitialized to unconditionally uninitialized. -- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From rhamph at gmail.com Wed Feb 1 16:32:49 2006 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 1 Feb 2006 08:32:49 -0700 Subject: [Python-Dev] Octal literals In-Reply-To: <1138797216.6791.38.camel@localhost.localdomain> References: <011c01c626b4$2d6a0750$6402a8c0@arkdesktop> <1138797216.6791.38.camel@localhost.localdomain> Message-ID: On 2/1/06, Gustavo J. A. M. Carneiro wrote: > On Tue, 2006-01-31 at 17:17 -0500, Andrew Koenig wrote: > > I am personally partial to allowing an optional radix (in decimal) followed > > by the letter r at the beginning of a literal, so 19, 8r23, and 16r13 would > > all represent the same value. > > For me, adding the radix to the right instead of left looks nicer: > 23r8, 13r16, etc., since a radix is almost like a unit, and units are > always to the right. Plus, we already use suffix characters to the > right, like 10L. And I seem to recall an old assembler (a z80 > assembler, IIRC :P) that used a syntax like 10h and 11b for hex and bin > radix. ffr16 #16rff or 255 Iamadeadparrotr36 # 36rIamadeadparrot or 3120788520272999375597 Suffix syntax for bases higher than 10 is ambiguous with variable names. Prefix syntax is not. -- Adam Olsen, aka Rhamphoryncus From bokr at oz.net Wed Feb 1 16:35:55 2006 From: bokr at oz.net (Bengt Richter) Date: Wed, 01 Feb 2006 15:35:55 GMT Subject: [Python-Dev] Octal literals References: <011c01c626b4$2d6a0750$6402a8c0@arkdesktop> <1138797216.6791.38.camel@localhost.localdomain> Message-ID: <43e0bd69.810340828@news.gmane.org> On Wed, 01 Feb 2006 12:33:36 +0000, "Gustavo J. A. M. Carneiro" wrote: [...] > Hmm..
I'm beginning to think 13r16 or 16r13 look too cryptic to the >casual observer; perhaps a suffix letter is more readable, since we >don't need arbitrary radix support anyway. > >/me thinks of some examples: > > 644o # I _think_ the small 'o' cannot be easily confused with 0 or O, >but.. > 10h # hex.. hm.. but we already have 0x10 > 101b # binary > > Another possibility is to extend the 0x syntax to non-hex, > > 0xff # hex > 0o644 # octal > 0b1101 # binary > > I'm unsure which one I like better. > Sorry if I seem to be picking nits, but IMO there's more than a nit here: The trouble with all of these is that they are all literals for integers, but integers are signed, and there is no way to represent the sign bit (wherever it is for a particular platform) along with the others, without triggering a promotion to positive long. So you get stuff like >>> def i32(i): return int(-(i&0x80000000))+int(i&0x7fffffff) ... >>> MYCONST = i32(0x87654321) >>> MYCONST -2023406815 >>> type(MYCONST) <type 'int'> >>> hex(MYCONST) '-0x789abcdf' Oops ;-/ >>> hex(MYCONST&0xffffffff) '0x87654321L' instead of MYCONST = 16cf87654321 Hm... maybe an explicit ordinary sign _after_ the prefix would be more mnemonic instead of indicating it with the radix-complement (f or 0 for hex). E.g., MYCONST = 16r-87654321 # all bits above the 8 are ones and MYCONST = 16r+87654321 # explicitly positive, all bits above 8 (none for 32 bits) are zeroes MYCONST = 16r87654321 # implicitly positive, ditto or the above in binary MYCONST = 2r-10000111011001010100001100100001 # leading bits are ones (here all are specified for 32-bit int, but # effect would be noticeable for smaller numbers or wider ints) MYCONST = 2r+10000111011001010100001100100001 # leading bits are zeroes (ditto) MYCONST = 2r10000111011001010100001100100001 # ditto This could also be done as alternative 0x syntax, e.g.
using 0h, 0o, and 0b, but I sure don't like that '0o' ;-) BTW, for non-power-of-two radices(?), it should be remembered that the '-' is mnemonic for the symbol for (radix-1), and '+' or no sign is mnemonic for a prefixed 0 (which is 0 in any allowable radix) in order to have this notation have general radix expressivity for free ;-) Regards, Bengt Richter From dw at botanicus.net Wed Feb 1 16:33:09 2006 From: dw at botanicus.net (David Wilson) Date: Wed, 1 Feb 2006 15:33:09 +0000 Subject: [Python-Dev] webmaster@python.org failing sender verification. Message-ID: <20060201153309.GA22646@thailand.botanicus.net> Hi there, Recently, updates from MoinMoin have started getting quarantined due to sender verification failing. On investigating the problem, it seems that an assumption about the webmaster mailbox is incorrect: 220 bag.python.org ESMTP Postfix (Debian/GNU) MAIL FROM: <> 503 Error: send HELO/EHLO first HELO argon.maildefence.co.uk 250 bag.python.org MAIL FROM: <> 250 Ok RCPT TO: webmaster at python.org 553 invalid bounce (address does not send mail) The MoinMoin instance on Python.org is sending mail as "webmaster at python.org". Can somebody take a look? Or at least tell me who to contact. Thanks, David. PS: Please CC me in replies as I am not currently subscribed. -- It's never too late to have a happy childhood. From tim.peters at gmail.com Wed Feb 1 17:29:16 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 1 Feb 2006 11:29:16 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <20060201101405.GT18916@xs4all.nl> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> Message-ID: <1f7befae0602010829k7ec7519di3f39082445cfcfe1@mail.gmail.com> [Thomas Wouters] > Well, I said 4.0.3, and that was wrong. It's actually a pre-release of 4.0.3 > (in Debian's 'unstable' distribution.) However, 4.0.2 (the actual release) > behaves the same way. 
The normal make process shows quite a lot of output on > systems that use gcc, so I wouldn't be surprised if people did ignore it, > for the most part. Does it really? It's completely warning-free on Windows, and that's the intent, and it takes ongoing work to keep it that way. Over at, e.g., http://www.python.org/dev/buildbot/g5%20osx.3%20trunk/builds/46/step-compile/0 I only see one gcc warning, coming from Python/Python-ast.c. I suppose that isn't a complete build, though. From foom at fuhm.net Wed Feb 1 18:40:42 2006 From: foom at fuhm.net (James Y Knight) Date: Wed, 1 Feb 2006 12:40:42 -0500 Subject: [Python-Dev] Octal literals In-Reply-To: <1138797216.6791.38.camel@localhost.localdomain> References: <011c01c626b4$2d6a0750$6402a8c0@arkdesktop> <1138797216.6791.38.camel@localhost.localdomain> Message-ID: <1F407826-5F10-4905-9A82-637E4825D191@fuhm.net> On Feb 1, 2006, at 7:33 AM, Gustavo J. A. M. Carneiro wrote: > Another possibility is to extend the 0x syntax to non-hex, > > 0xff # hex > 0o644 # octal > 0b1101 # binary +1 James From jcarlson at uci.edu Wed Feb 1 18:47:34 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 01 Feb 2006 09:47:34 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: <43e0bd69.810340828@news.gmane.org> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> Message-ID: <20060201085152.10AC.JCARLSON@uci.edu> bokr at oz.net (Bengt Richter) wrote: > On Wed, 01 Feb 2006 12:33:36 +0000, "Gustavo J. A. M. Carneiro" wrote: > [...] > > Hmm.. I'm beginning to think 13r16 or 16r13 look too cryptic to the > >casual observer; perhaps a suffix letter is more readable, since we > >don't need arbitrary radix support anyway. [snip discussion over radix and complements] I hope I'm not the only one who thinks that "simple is better than complex", at least when it comes to numeric constants.
Certainly it would be _convenient_ to express constants in a radix other than decimal, hexadecimal, or octal, but to me, it all looks like noise. Personally, I was on board for the removal of octal literals, if only because I find _seeing_ a leading zero without something else (like the 'x' for hexadecimal) to be difficult, and because I've found little use for them in my work (decimals and hex are usually all I need). Should it change for me? Of course not, but I think that adding different ways to spell integer values will tend to confuse new and seasoned python users. Some will like the flexibility that adding new options offers, but I believe such a change will be a net loss for the understandability of those pieces of code which use it. - Josiah From bjourne at gmail.com Wed Feb 1 19:14:25 2006 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Wed, 1 Feb 2006 18:14:25 +0000 Subject: [Python-Dev] The path module PEP In-Reply-To: <43D9FA39.6060407@ofai.at> References: <740c3aec0601241222s25123daerd85ac2e6a9d0b920@mail.gmail.com> <43D6BA85.8070007@colorstudy.com> <740c3aec0601251237j422c274dx32667261a83df9b1@mail.gmail.com> <79990c6b0601260515w22ae9e2dy22d265acc4bce7c3@mail.gmail.com> <43D8E0E9.9080202@ofai.at> <79990c6b0601270216x2c43449cj534427f7a8f5234c@mail.gmail.com> <43D9FA39.6060407@ofai.at> Message-ID: <740c3aec0602011014v5e284b41t50570c7129221749@mail.gmail.com> I've submitted an updated version of the PEP. The only major change is that instead of the method atime and property getatime() there is now only one method named atime(). Also some information about the string inheritance problem in Open Issues. I still have no idea what to do about it though.
-- mvh Björn From bokr at oz.net Wed Feb 1 19:17:30 2006 From: bokr at oz.net (Bengt Richter) Date: Wed, 01 Feb 2006 18:17:30 GMT Subject: [Python-Dev] Octal literals References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> Message-ID: <43e0f7d9.825300609@news.gmane.org> On Wed, 01 Feb 2006 09:47:34 -0800, Josiah Carlson wrote: > >bokr at oz.net (Bengt Richter) wrote: >> On Wed, 01 Feb 2006 12:33:36 +0000, "Gustavo J. A. M. Carneiro" wrote: >> [...] >> > Hmm.. I'm beginning to think 13r16 or 16r13 look too cryptic to the >> >casual observer; perhaps a suffix letter is more readable, since we >> >don't need arbitrary radix support anyway. > >[snip discussion over radix and complements] > >I hope I'm not the only one who thinks that "simple is better than >complex", at least when it comes to numeric constants. You don't have to use any other radix, any more than you have to use all forms of float literals if you are happy with xx.yy. The others just become available through a consistent methodology. > >Personally, I was on board for the removal of octal literals, if only >because I find _seeing_ a leading zero without something else (like the >'x' for hexadecimal) to be difficult, and because I've found little use >for them in my work (decimals and hex are usually all I need). I agree that 8r641 is more easily disambiguated than 0641 ;-) But how do you represent a negative int in hex? Or have you never encountered the need? The failure of current formats with respect to negative values whose values you want to specify in a bit-specifying format was my main point.
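The round trip Bengt is describing can be sketched in a few lines of Python. The i32() helper is his, from earlier in the thread; h32() is a hypothetical inverse (masking back to the unsigned 32-bit pattern), added here only for illustration and not something proposed on the list:

```python
def i32(i):
    """Bengt's helper: reinterpret the low 32 bits of i as a signed int."""
    return int(-(i & 0x80000000)) + int(i & 0x7fffffff)

def h32(i):
    """Hypothetical inverse: recover the unsigned 32-bit bit pattern."""
    return i & 0xffffffff

# The sign-bit round trip from his example session:
assert i32(0x87654321) == -2023406815
assert h32(-2023406815) == 0x87654321
assert i32(h32(-1)) == -1
```

This is the workaround available today; his point stands that no literal syntax expresses the signed bit pattern directly.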
Regards, Bengt Richter From barry at python.org Wed Feb 1 19:35:14 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 01 Feb 2006 13:35:14 -0500 Subject: [Python-Dev] Octal literals In-Reply-To: <20060201085152.10AC.JCARLSON@uci.edu> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> Message-ID: <1138818914.12020.17.camel@geddy.wooz.org> On Wed, 2006-02-01 at 09:47 -0800, Josiah Carlson wrote: > I hope I'm not the only one who thinks that "simple is better than > complex", at least when it comes to numeric constants. Certainly it > would be _convenient_ to express constants in a radix other than decimal, > hexidecimal, or octal, but to me, it all looks like noise. As a Unix weenie and occasional bit twiddler, I've had needs for octal, hex, and binary literals. +1 for coming up with a common syntax for these. -1 on removing any way to write octal literals. The proposal for something like 0xff, 0o664, and 0b1001001 seems like the right direction, although 'o' for octal literal looks kind of funky. Maybe 'c' for oCtal? (remember it's 'x' for heXadecimal). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060201/8d8dff84/attachment.pgp From gvwilson at cs.utoronto.ca Wed Feb 1 19:55:42 2006 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Wed, 1 Feb 2006 13:55:42 -0500 (EST) Subject: [Python-Dev] syntactic support for sets Message-ID: Hi, I have a student who may be interested in adding syntactic support for sets to Python, so that: x = {1, 2, 3, 4, 5} and: y = {z for z in x if (z % 2)} would be legal. There are of course issues (what's the syntax for a frozen set? 
for the empty set?), but before he even starts, I'd like to know if this would ever be considered for inclusion into the language. Thanks, Greg p.s. please Cc: me as well as the list, since I'm no longer subscribed. From martin at v.loewis.de Wed Feb 1 20:16:04 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 01 Feb 2006 20:16:04 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <43E08E98.8070403@acm.org> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> <43E08E98.8070403@acm.org> Message-ID: <43E108F4.7070507@v.loewis.de> Sjoerd Mullender wrote: > I don't quite understand what's the big deal. Traditionally, people see two problems with these initializations: - the extra initialization may cause a performance loss. - the initialization might hide real bugs later on. For example, if an additional control flow branch is added which fails to initialize the variable, you don't get the warning anymore, not even from compilers which previously did a correct analysis. Whether this is a big deal, I don't know. Regards, Martin From jcarlson at uci.edu Wed Feb 1 20:07:17 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 01 Feb 2006 11:07:17 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: <43e0f7d9.825300609@news.gmane.org> References: <20060201085152.10AC.JCARLSON@uci.edu> <43e0f7d9.825300609@news.gmane.org> Message-ID: <20060201103624.10B2.JCARLSON@uci.edu> bokr at oz.net (Bengt Richter) wrote: > On Wed, 01 Feb 2006 09:47:34 -0800, Josiah Carlson wrote: > >bokr at oz.net (Bengt Richter) wrote: > >> On Wed, 01 Feb 2006 12:33:36 +0000, "Gustavo J. A. M. Carneiro" wrote: > >> [...] > >> > Hmm.. I'm beginning to think 13r16 or 16r13 look too cryptic to the > >> >casual observer; perhaps a suffix letter is more readable, since we > >> >don't need arbitrary radix support anyway. 
> > > >[snip discussion over radix and complements] > > > >I hope I'm not the only one who thinks that "simple is better than > >complex", at least when it comes to numeric constants. Certainly it > >would be _convenient_ to express constants in a radix other than decimal, > >hexadecimal, or octal, but to me, it all looks like noise. > > You don't have to use any other radix, any more than you have to use all forms > of float literals if you are happy with xx.yy. The others just become available > through a consistent methodology. > > >Personally, I was on board for the removal of octal literals, if only > >because I find _seeing_ a leading zero without something else (like the > >'x' for hexadecimal) to be difficult, and because I've found little use > >for them in my work (decimals and hex are usually all I need). > > I agree that 8r641 is more easily disambiguated than 0641 ;-) > > But how do you represent a negative int in hex? Or have you never encountered the need? > The failure of current formats with respect to negative values whose values you > want to specify in a bit-specifying format was my main point. In my experience, I've rarely had the opportunity (or misfortune?) to deal with negative constants, whose exact bit representation I needed to get "just right". For my uses, I find specifying "-0x..." or "-..." to be sufficient. Certainly it may or may not be the case in what you are doing (hence your exposition on signs, radixes, etc.). Would the i32() function you previously defined, as well as a utility h32() function which does the reverse be a reasonable start? Are there any radixes beyond binary, octal, decimal, and hexadecimal that people want to use? Does it make sense to create YYrXXXXX syntax for integer literals for basically 4 representations, all of which can be handled by int('XXXXXX', YY) (ignoring the runtime overhead)? Does the suffix idea for different types (long, decimal, ...)
necessarily suggest that suffixes for radixes for one type (int/long) (1011b, 2000o, ...) are a good idea? I'll expand what I said before; there are many things that would make integer literals more convenient for heavy (or experienced) users of non-decimal or non-decimal-non-positive literals, but it wouldn't necessarily increase the understandability of code which uses them. - Josiah From paul-python at svensson.org Wed Feb 1 19:54:49 2006 From: paul-python at svensson.org (Paul Svensson) Date: Wed, 1 Feb 2006 13:54:49 -0500 (EST) Subject: [Python-Dev] Octal literals In-Reply-To: <1138818914.12020.17.camel@geddy.wooz.org> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> Message-ID: <20060201133954.V49643@familjen.svensson.org> On Wed, 1 Feb 2006, Barry Warsaw wrote: > The proposal for something like 0xff, 0o664, and 0b1001001 seems like > the right direction, although 'o' for octal literal looks kind of funky. > Maybe 'c' for oCtal? (remember it's 'x' for heXadecimal). Shouldn't it be 0t644 then, and 0n1001001 for binary? That would sidestep the issue of 'b' and 'c' being valid hexadecimal digits as well. Regarding negative numbers, I think they're a red herring. If there is any need for a new literal format, it would be to express ~0x0f, not -0x10. 1xf0 has been proposed before, but I think YAGNI.
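As Josiah notes, the two-argument int() constructor already covers every radix the proposed literal syntaxes would, at the cost of a runtime call. A quick sketch of the equivalences discussed in this thread:

```python
# What the proposed literals would mean, written with the existing
# two-argument int() constructor (bases 2 through 36 are accepted):
assert int('13', 16) == int('23', 8) == 19   # Andrew's 16r13 / 8r23 examples
assert int('ff', 16) == 255                  # the 0xff spelling
assert int('644', 8) == 420                  # the 0o644 / 644o proposals
assert int('1101', 2) == 13                  # the 0b1101 / 1101b proposals

# The runtime overhead Josiah mentions: these are ordinary function
# calls evaluated each time, unlike true literals, which the compiler
# folds into constants.
```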
/Paul From martin at v.loewis.de Wed Feb 1 20:21:58 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 01 Feb 2006 20:21:58 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1f7befae0602010715v454025bas9509c1d19bb16836@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <43E0601D.7090505@v.loewis.de> <1f7befae0602010715v454025bas9509c1d19bb16836@mail.gmail.com> Message-ID: <43E10A56.6070602@v.loewis.de> Tim Peters wrote: >>It inlines the function to make this determination. > > > Very cool! Is this a new(ish) behavior? In 3.4: http://gcc.gnu.org/gcc-3.4/changes.html # A new unit-at-a-time compilation scheme for C, Objective-C, C++ and # Java which is enabled via -funit-at-a-time (and implied by -O2). In # this scheme a whole file is parsed first and optimized later. The # following basic inter-procedural optimizations are implemented: # # - ... The actual "might be uninitialized" warning comes from the SSA branch, which was merged in 4.0, as somebody else pointed out. Regards, Martin From rasky at develer.com Wed Feb 1 20:40:58 2006 From: rasky at develer.com (Giovanni Bajo) Date: Wed, 1 Feb 2006 20:40:58 +0100 Subject: [Python-Dev] Compiler warnings References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> Message-ID: <07d601c62767$6cde4aa0$bf03030a@trilan> Tim Peters wrote: > [Thomas Wouters] >> I noticed a few compiler warnings, when I compile Python on my amd64 with >> gcc 4.0.3: >> >> Objects/longobject.c: In function 'PyLong_AsDouble': >> Objects/longobject.c:655: warning: 'e' may be used uninitialized in this >> function > > Well, that's pretty bizarre. There's _obviously_ no way to get to a > reference to `e` without going through > > x = _PyLong_AsScaledDouble(vv, &e); > > first. That isn't a useful warning. This has been discussed many times on the GCC mailing list. 
Ultimately, detecting whether a variable is used uninitialized or not (given full interprocedural and whole-program compilation) is a problem that can be reduced to the halting problem. The only thing that GCC should (and will) do is find a way to be consistent across different releases and optimization levels, and to produce a useful number of warnings, while not issuing too many false positives. -- Giovanni Bajo From barry at python.org Wed Feb 1 20:51:41 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 01 Feb 2006 14:51:41 -0500 Subject: [Python-Dev] Octal literals In-Reply-To: <20060201103624.10B2.JCARLSON@uci.edu> References: <20060201085152.10AC.JCARLSON@uci.edu> <43e0f7d9.825300609@news.gmane.org> <20060201103624.10B2.JCARLSON@uci.edu> Message-ID: <1138823501.12021.51.camel@geddy.wooz.org> On Wed, 2006-02-01 at 11:07 -0800, Josiah Carlson wrote: > In my experience, I've rarely had the opportunity (or misfortune?) to > deal with negative constants, whose exact bit representation I needed to > get "just right". For my uses, I find that specifying "-0x..." or "-..." > to be sufficient. I can't remember a time when signed hex, oct, or binary representation wasn't a major inconvenience, let alone something desirable. Don't get me started about hex(id(object()))! I typically use hex for addresses and bit fields, binary for bit flags and other bit twiddling, and oct for OS/file system interfaces. In none of those cases do you actually need or want signed values. IME. -Barry -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060201/e9b3cc4a/attachment.pgp From brett at python.org Wed Feb 1 20:59:10 2006 From: brett at python.org (Brett Cannon) Date: Wed, 1 Feb 2006 11:59:10 -0800 Subject: [Python-Dev] syntactic support for sets In-Reply-To: References: Message-ID: On 2/1/06, Greg Wilson wrote: > Hi, > > I have a student who may be interested in adding syntactic support for > sets to Python, so that: > > x = {1, 2, 3, 4, 5} > > and: > > y = {z for z in x if (z % 2)} > > would be legal. There are of course issues (what's the syntax for a > frozen set? for the empty set?), but before he even starts, I'd like to > know if this would ever be considered for inclusion into the language. I am -0 on set syntax support. If the set() constructor was expanded to take an arbitrary number of arguments (and thus be more in line with the dict constructor) then the syntax need really starts to go away since the above could be done as ``set(1, 2, 3, 4, 5)``. As for the set comprehension/expression/thing, I don't think that is needed at all when ``set(z for z in x if z % 2)`` will get the job done just as well without adding more syntactic sugar to the language for something that is so easy to do already. > p.s. please Cc: me as well as the list, since I'm no longer subscribed. -Brett From raymond.hettinger at verizon.net Wed Feb 1 20:50:28 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 01 Feb 2006 14:50:28 -0500 Subject: [Python-Dev] syntactic support for sets References: Message-ID: <000d01c62768$c11c0980$b83efea9@RaymondLaptop1> [Greg Wilson] > I have a student who may be interested in adding syntactic support for > sets to Python, so that: > > x = {1, 2, 3, 4, 5} > > and: > > y = {z for z in x if (z % 2)} > > would be legal. There are of course issues (what's the syntax for a > frozen set?
for the empty set?), but before he even starts, I'd like to > know if this would ever be considered for inclusion into the language. Generator expressions make syntactic support irrelevant: x = set(xrange(1,6)) y = set(z for z in x if (z % 2)) y = frozenset(z for z in x if (z % 2)) Accordingly, Guido rejected the braced notation for set comprehensions. See: http://www.python.org/peps/pep-0218.html Raymond From pje at telecommunity.com Wed Feb 1 21:03:22 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 01 Feb 2006 15:03:22 -0500 Subject: [Python-Dev] syntactic support for sets In-Reply-To: Message-ID: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> At 01:55 PM 2/1/2006 -0500, Greg Wilson wrote: >I have a student who may be interested in adding syntactic support for >sets to Python, so that: > > x = {1, 2, 3, 4, 5} > >and: > > y = {z for z in x if (z % 2)} > >would be legal. There are of course issues (what's the syntax for a >frozen set? for the empty set?), Ones that work now: frozenset(z for z in x if (z%2)) set() The only case that looks slightly less than optimal is: set((1, 2, 3, 4, 5)) But I'm not sure that it warrants a special syntax just to get rid of the extra (). From raymond.hettinger at verizon.net Wed Feb 1 21:16:58 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 01 Feb 2006 15:16:58 -0500 Subject: [Python-Dev] syntactic support for sets References: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> Message-ID: <000701c6276c$7504d500$b83efea9@RaymondLaptop1> [Phillip J. Eby] > The only case that looks slightly less than optimal is: > > set((1, 2, 3, 4, 5)) > > But I'm not sure that it warrants a special syntax just to get rid of the > extra (). The PEP records that Tim argued for leaving the extra parentheses. What would you do with {'title'} -- create a four element set consisting of letters or a single element set consisting of a string?
Raymond From thomas at xs4all.net Wed Feb 1 22:15:11 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 1 Feb 2006 22:15:11 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1f7befae0602010829k7ec7519di3f39082445cfcfe1@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> <1f7befae0602010829k7ec7519di3f39082445cfcfe1@mail.gmail.com> Message-ID: <20060201211511.GV18916@xs4all.nl> On Wed, Feb 01, 2006 at 11:29:16AM -0500, Tim Peters wrote: > [Thomas Wouters] > > Well, I said 4.0.3, and that was wrong. It's actually a pre-release of 4.0.3 > > (in Debian's 'unstable' distribution.) However, 4.0.2 (the actual release) > > behaves the same way. The normal make process shows quite a lot of output on > > systems that use gcc, so I wouldn't be surprised if people did ignore it, > > for the most part. > Does it really? It's completely warning-free on Windows, and that's > the intent, and it takes ongoing work to keep it that way. Over at, > e.g., No, it's mostly warning-free, it just outputs a lot of text. By default, the warnings don't stand out much. And if you have a decent computer, it scrolls by pretty fast, too. ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From thomas at xs4all.net Wed Feb 1 22:34:22 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 1 Feb 2006 22:34:22 +0100 Subject: [Python-Dev] Compiler warnings In-Reply-To: <1f7befae0602010715v454025bas9509c1d19bb16836@mail.gmail.com> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <43E0601D.7090505@v.loewis.de> <1f7befae0602010715v454025bas9509c1d19bb16836@mail.gmail.com> Message-ID: <20060201213422.GW18916@xs4all.nl> On Wed, Feb 01, 2006 at 10:15:15AM -0500, Tim Peters wrote: > Thomas, for these _PyLong_AsScaledDouble()-caller cases, I suggest doing > whatever obvious thing manages to silence the warning. For example, in > PyLong_AsDouble: > > int e = -1; /* silence gcc warning */ > > and then add: > > assert(e >= 0); > > after the call. Done, although it was nowhere near obvious to me that -1 would be a sane sentinel value ;) Not that I don't believe you, but it took some actual reading of _PyLong_AsScaledDouble to confirm it. Reading--imagine-that-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From steven.bethard at gmail.com Wed Feb 1 22:58:41 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Wed, 1 Feb 2006 14:58:41 -0700 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <000701c6276c$7504d500$b83efea9@RaymondLaptop1> References: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> <000701c6276c$7504d500$b83efea9@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > [Phillip J. Eby] > > The only case that looks slightly less than optimal is: > > > > set((1, 2, 3, 4, 5)) > > > > But I'm not sure that it warrants a special syntax just to get rid of the > > extra (). > > The PEP records that Tim argued for leaving the extra parentheses. > What would you do with {'title'} -- create a four element set consisting > of letters or a single element set consisting of a string? 
I think the answer to this one is clearly that it is a single element set consisting of a string, just as ['title'] is a single element list consisting of a string. I believe the confusion arises if Brett's proposal for ``set(1, 2, 3, 4, 5)`` is considered. Currently, set('title') is a four element set consisting of letters. But set('title', 'author') would be a two element set consisting of two strings? The problem is in calling the set constructor, not in writing a set literal. That said, I don't think there's really that much of a need for set literals. I use sets almost exclusively to remove duplicates, so I almost always start with empty sets and add things to them. And I'm certainly never going to write ``set([1, 1, 2])`` when I could just write ``set([1, 2])``. STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From tim.peters at gmail.com Wed Feb 1 23:42:36 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 1 Feb 2006 17:42:36 -0500 Subject: [Python-Dev] Compiler warnings In-Reply-To: <20060201213422.GW18916@xs4all.nl> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <43E0601D.7090505@v.loewis.de> <1f7befae0602010715v454025bas9509c1d19bb16836@mail.gmail.com> <20060201213422.GW18916@xs4all.nl> Message-ID: <1f7befae0602011442k4e46950t77a3d6702f02836c@mail.gmail.com> [Thomas] > Done, Thanks! > although it was nowhere near obvious to me that -1 would be a sane > sentinel value ;) Not that I don't believe you, but it took some actual reading of _PyLong_AsScaledDouble to confirm it. Nope, the thing to do was to read the docs for _PyLong_AsScaledDouble, which explicitly promise e >= 0. That's what I did :-) "The docs" are in longobject.h. You can tell which functions I wrote, BTW, because they're the ones with comments in the header file documenting what they do.
It's an ongoing mystery to me why nobody else found that to be a practice worth emulating ;-)/:-( From raymond.hettinger at verizon.net Wed Feb 1 23:49:21 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 01 Feb 2006 17:49:21 -0500 Subject: [Python-Dev] syntactic support for sets References: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> <000701c6276c$7504d500$b83efea9@RaymondLaptop1> Message-ID: <001b01c62781$be6b8620$b83efea9@RaymondLaptop1> [Greg Wilson] > This is a moderately-fertile source of bugs for newcomers: judging from > the number of students who come into my office with code that they think > ought to work, but doesn't, most people believe that: > > set(1, 2, 3) Like many things in Python where people pre-emptively believe one thing or another, the interpreter's corrective feedback is immediate: >>> set(1, 2, 3) Traceback (most recent call last): set(1, 2, 3) TypeError: set expected at most 1 arguments, got 3 There is further feedback in the repr string which serves as a reminder of how to construct a literal: >>> set(xrange(3)) set([0, 1, 2]) Once the students have progressed beyond academic finger drills and have started writing real code, have you observed a shift in emphasis away from hard-coded literals and towards something like s=set(data) where the data is either read-in from outside the script or generated by another part of the program? For academic purposes, I think the genexp form also has value in that it is broadly applicable to more than just sets (i.e. dict comprehensions) and that it doesn't have to grapple with arbitrary choices about whether {1,2,3} would be a set or frozenset.
Raymond From mwh at python.net Wed Feb 1 23:59:21 2006 From: mwh at python.net (Michael Hudson) Date: Wed, 01 Feb 2006 22:59:21 +0000 Subject: [Python-Dev] Compiler warnings In-Reply-To: <20060201211511.GV18916@xs4all.nl> (Thomas Wouters's message of "Wed, 1 Feb 2006 22:15:11 +0100") References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> <1f7befae0602010829k7ec7519di3f39082445cfcfe1@mail.gmail.com> <20060201211511.GV18916@xs4all.nl> Message-ID: <2md5i6fzxy.fsf@starship.python.net> Thomas Wouters writes: > On Wed, Feb 01, 2006 at 11:29:16AM -0500, Tim Peters wrote: >> [Thomas Wouters] >> > Well, I said 4.0.3, and that was wrong. It's actually a pre-release of 4.0.3 >> > (in Debian's 'unstable' distribution.) However, 4.0.2 (the actual release) >> > behaves the same way. The normal make process shows quite a lot of output on >> > systems that use gcc, so I wouldn't be surprised if people did ignore it, >> > for the most part. > >> Does it really? It's completely warning-free on Windows, and that's >> the intent, and it takes ongoing work to keep it that way. Over at, >> e.g., > > No, it's mostly warning-free, it just outputs a lot of text. By default, > the warnings don't stand out much. And if you have a decent computer, it > scrolls by pretty fast, too. ;) "make -s" is a wonderful thing :) Cheers, mwh -- In case you're not a computer person, I should probably point out that "Real Soon Now" is a technical term meaning "sometime before the heat-death of the universe, maybe". 
-- Scott Fahlman From evdo.hsdpa at gmail.com Thu Feb 2 00:03:57 2006 From: evdo.hsdpa at gmail.com (Robert Kim Wireless Internet Advisor) Date: Wed, 1 Feb 2006 15:03:57 -0800 Subject: [Python-Dev] Compiler warnings In-Reply-To: <2md5i6fzxy.fsf@starship.python.net> References: <20060131105920.GQ18916@xs4all.nl> <1f7befae0601311716y1d906ca9qbc0daa5f01514d9@mail.gmail.com> <20060201101405.GT18916@xs4all.nl> <1f7befae0602010829k7ec7519di3f39082445cfcfe1@mail.gmail.com> <20060201211511.GV18916@xs4all.nl> <2md5i6fzxy.fsf@starship.python.net> Message-ID: <1ec620e90602011503h68d339c5s766e9942cc7807e5@mail.gmail.com> Thomas,,,, thanks.. useful string ... bob On 2/1/06, Michael Hudson wrote: > Thomas Wouters writes: > > > On Wed, Feb 01, 2006 at 11:29:16AM -0500, Tim Peters wrote: > >> [Thomas Wouters] > >> > Well, I said 4.0.3, and that was wrong. It's actually a pre-release of 4.0.3 > >> > (in Debian's 'unstable' distribution.) However, 4.0.2 (the actual release) > >> > behaves the same way. The normal make process shows quite a lot of output on > >> > systems that use gcc, so I wouldn't be surprised if people did ignore it, > >> > for the most part. > > > >> Does it really? It's completely warning-free on Windows, and that's > >> the intent, and it takes ongoing work to keep it that way. Over at, > >> e.g., > > > > No, it's mostly warning-free, it just outputs a lot of text. By default, > > the warnings don't stand out much. And if you have a decent computer, it > > scrolls by pretty fast, too. ;) > > "make -s" is a wonderful thing :) > > Cheers, > mwh > > -- > In case you're not a computer person, I should probably point out > that "Real Soon Now" is a technical term meaning "sometime before > the heat-death of the universe, maybe". 
> -- Scott Fahlman > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/evdo.hsdpa%40gmail.com > -- Robert Q Kim, Wireless Internet Advisor http://evdo-coverage.com/cellular-repeater.html http://hsdpa-coverage.com http://evdo-coverage.com/pocket-pc-pda-ppc.html 2611 S. Pacific Coast Highway 101 Suite 102 Cardiff by the Sea, CA 92007 206 984 0880 From dw at botanicus.net Thu Feb 2 01:36:24 2006 From: dw at botanicus.net (David Wilson) Date: Thu, 2 Feb 2006 00:36:24 +0000 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> References: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> Message-ID: <20060202003624.GA83347@thailand.botanicus.net> On Wed, Feb 01, 2006 at 03:03:22PM -0500, Phillip J. Eby wrote: > The only case that looks slightly less than optimal is: > > set((1, 2, 3, 4, 5)) > > But I'm not sure that it warrants a special syntax just to get rid of the > extra (). In any case I don't think it's possible to differentiate between the current calling convention and the 'parenless' one reliably, eg.: S = set([]) There is no way to tell if that is a set containing an empty list created using the parenless syntax, or an empty set, as is created with the current calling convention. -- DISOBEY, v.t. To celebrate with an appropriate ceremony the maturity of a command. 
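[The ambiguity David describes is easy to see with the constructor as it exists today. The sketch below uses only current set() behavior; the proposed "parenless" spelling itself is hypothetical and does not run.]

```python
# Today, set() takes a single iterable argument:
assert set([]) == set()                            # empty list -> empty set
assert set('title') == set(['t', 'i', 'l', 'e'])   # a string iterates by character

# A set *containing* a list cannot exist at all, because lists are
# unhashable -- so under a parenless convention, set([]) could not
# mean "the one-element set holding an empty list" anyway:
try:
    set([[]])        # today's spelling of "set built from [[]]"
    raised = False
except TypeError:    # list is unhashable
    raised = True
assert raised
```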
From eric.nieuwland at xs4all.nl Thu Feb 2 14:45:52 2006 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Thu, 2 Feb 2006 14:45:52 +0100 Subject: [Python-Dev] The path module PEP In-Reply-To: <740c3aec0602011014v5e284b41t50570c7129221749@mail.gmail.com> References: <740c3aec0601241222s25123daerd85ac2e6a9d0b920@mail.gmail.com> <43D6BA85.8070007@colorstudy.com> <740c3aec0601251237j422c274dx32667261a83df9b1@mail.gmail.com> <79990c6b0601260515w22ae9e2dy22d265acc4bce7c3@mail.gmail.com> <43D8E0E9.9080202@ofai.at> <79990c6b0601270216x2c43449cj534427f7a8f5234c@mail.gmail.com> <43D9FA39.6060407@ofai.at> <740c3aec0602011014v5e284b41t50570c7129221749@mail.gmail.com> Message-ID: On 1 feb 2006, at 19:14, Björn Lindqvist wrote: > I've submitted an updated version of the PEP. The only major change is > that instead of the method atime and property getatime() there is now > only one method named atime(). Also some information about the string > inheritance problem in Open Issues. I still have no idea what to do > about it though. The current PEP still contains some redundancy between properties and methods under Specifications: basename() <-> name basename(), stripext() <-> namebase splitpath() <-> parent, name (documented) I would like to suggest to use only properties and use splitall() to obtain a tuple with the complete breakdown of the path. And maybe splitall() could then be renamed to split(). The directory methods mkdir()/makedirs() and rmdir()/removedirs() could be unified. To me it seems they only exist because of Un*x details. my $0.005 --eric From hyeshik at gmail.com Thu Feb 2 18:44:13 2006 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Fri, 3 Feb 2006 02:44:13 +0900 Subject: [Python-Dev] ctypes patch (was: (libffi) Re: Copyright issue) Message-ID: <4f0b69dc0602020944l3bcfe1d2v1bc149ac3f202e91@mail.gmail.com> On 1/30/06, "Martin v. Löwis" wrote: > Hye-Shik Chang wrote: > > I did some work to make ctypes+libffi compacter and liberal.
> > http://openlook.org/svnpublic/ctypes-compactffi/ (svn) > > > > I removed sources/gcc and put sources/libffi copied from gcc 4.0.2. > > And removed all automake-related build processes and integrated > > them into setup.py. There's still aclocal.m4 in sources/libffi. But > > it is just identical to libffi's acinclude.m4 which looks liberal. > > Well done! Would you like to derive a Python patch from that? > Don't worry about MSVC, yet, I will do that once the sources > are in the subversion. > Here goes patches for the integration: [1] http://people.freebsd.org/~perky/ctypesinteg-f1.diff.bz2 [2] http://people.freebsd.org/~perky/ctypesinteg-f2.diff.bz2 I implemented it in two flavors. [1] runs libffi's configure along with Python's and setup.py just builds it. And [2] has no change to Python's configure and setup.py runs libffi configure and builds it. And both patches don't have things for documentations yet. > (Of course, for due process, it would be better if this code gets > integrated into the official ctypes first, and then we incorporate > some named/versioned snapshot into /external, and svn cp it into > python/trunk from there). Thomas and I collaborated on integration into the ctypes repository and testing on various platforms yesterday. My patches for Python are derived from ctypes CVS with a change of only one line. Hye-Shik From scs5mjf at comp.leeds.ac.uk Wed Feb 1 20:09:52 2006 From: scs5mjf at comp.leeds.ac.uk (M J Fleming) Date: Wed, 1 Feb 2006 19:09:52 +0000 Subject: [Python-Dev] Octal literals Message-ID: <20060201190950.GA22164@cslin-gps.csunix.comp.leeds.ac.uk> On Wed, Feb 01, 2006 at 01:35:14PM -0500, Barry Warsaw wrote: > The proposal for something like 0xff, 0o664, and 0b1001001 seems like > the right direction, although 'o' for octal literal looks kind of funky. > Maybe 'c' for oCtal? (remember it's 'x' for heXadecimal). > > -Barry > +1 I definitely agree with the 0c664 octal literal. Seems rather more intuitive.
Matt From gvwilson at cs.utoronto.ca Wed Feb 1 22:44:32 2006 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Wed, 1 Feb 2006 16:44:32 -0500 (EST) Subject: [Python-Dev] syntactic support for sets In-Reply-To: <000d01c62768$c11c0980$b83efea9@RaymondLaptop1> References: <000d01c62768$c11c0980$b83efea9@RaymondLaptop1> Message-ID: > Generator expressions make syntactic support irrelevant: Not when you're teaching the language to undergraduates: I haven't actually done the study yet (though I may this summer), but I'm willing to bet that allowing "math" notation for sets will more than double their use. (Imagine having to write "list(1, 2, 3, 4, 5)"...) > Accordingly,Guido rejected the braced notation for set comprehensions. > See: http://www.python.org/peps/pep-0218.html "...however, the issue could be revisited for Python 3000 (see PEP 3000)." So I'm only 1994 years early ;-) Thanks, Greg From gvwilson at cs.utoronto.ca Wed Feb 1 22:48:23 2006 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Wed, 1 Feb 2006 16:48:23 -0500 (EST) Subject: [Python-Dev] syntactic support for sets In-Reply-To: <000701c6276c$7504d500$b83efea9@RaymondLaptop1> References: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> <000701c6276c$7504d500$b83efea9@RaymondLaptop1> Message-ID: > The PEP records that Tim argued for leaving the extra parentheses. What > would you do with {'title'} -- create a four element set consisting of > letters or a single element set consisting of a string? This is a moderately-fertile source of bugs for newcomers: judging from the number of students who come into my office with code that they think ought to work, but doesn't, most people believe that: set(1, 2, 3) is "right". I believe curly-brace notation would eliminate this problem. 
Thanks, Greg From gvwilson at cs.utoronto.ca Thu Feb 2 02:55:38 2006 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Wed, 1 Feb 2006 20:55:38 -0500 (EST) Subject: [Python-Dev] syntactic support for sets In-Reply-To: <001b01c62781$be6b8620$b83efea9@RaymondLaptop1> References: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> <000701c6276c$7504d500$b83efea9@RaymondLaptop1> <001b01c62781$be6b8620$b83efea9@RaymondLaptop1> Message-ID: > Like many things in Python where people pre-emptively believe one thing > or another, the interpreter's corrective feedback is immediate: Yup, that's the theory; it's a shame practice is different. > Once the students have progressed beyond academic finger drills and have > started writing real code, have you observed a shift in emphasis away > from hard-coded literals and towards something like s=set(data) where > the data is either read-in from outside the script or generated by > another part of the program? The problem is that once people classify something as "hard" or "fragile", they (consciously or unconsciously) avoid it thereafter, which of course means that it doesn't get any easier or more robust, since they're not practicing it. This has been observed in many arenas, not just programming. I agree it's not a compelling reason to add set notation to the language, but I'd rather eliminate the sand traps than require people to learn to recognize and avoid them. Thanks, Greg From theller at python.net Thu Feb 2 19:55:50 2006 From: theller at python.net (Thomas Heller) Date: Thu, 02 Feb 2006 19:55:50 +0100 Subject: [Python-Dev] ctypes patch References: <4f0b69dc0602020944l3bcfe1d2v1bc149ac3f202e91@mail.gmail.com> Message-ID: <7j8degjt.fsf@python.net> Hye-Shik Chang writes: > On 1/30/06, "Martin v. Löwis" wrote: >> Hye-Shik Chang wrote: >> > I did some work to make ctypes+libffi compacter and liberal.
>> > http://openlook.org/svnpublic/ctypes-compactffi/ (svn) >> > >> > I removed sources/gcc and put sources/libffi copied from gcc 4.0.2. >> > And removed all automake-related build processes and integrated >> > them into setup.py. There's still aclocal.m4 in sources/libffi. But >> > it is just identical to libffi's acinclude.m4 which looks liberal. >> >> Well done! Would you like to derive a Python patch from that? >> Don't worry about MSVC, yet, I will do that once the sources >> are in the subversion. >> > > Here goes patches for the integration: > > [1] http://people.freebsd.org/~perky/ctypesinteg-f1.diff.bz2 > [2] http://people.freebsd.org/~perky/ctypesinteg-f2.diff.bz2 > > I implemented it in two flavors. [1] runs libffi's configure along with > Python's and setup.py just builds it. And [2] has no change to > Python's configure and setup.py runs libffi configure and builds it. > And both patches don't have things for documentations yet. My plan is to make separate ctypes releases for 2.3 and 2.4, even after it is integrated into Python 2.5, so it seems [2] would be better - it must be possible to build ctypes without Python. As I said before, docs need still to be written. I think content is more important than markup, so I'm writing in rest, it can be converted to latex later. I expect that writing the docs will show quite some edges that need to be cleaned up - that should certainly be done before the first 2.5 release. Also I want to make a few releases before declaring the 1.0 version. This does not mean that I'm against integrating it right now. >> (Of course, for due process, it would be better if this code gets >> integrated into the official ctypes first, and then we incorporate >> some named/versioned snapshot into /external, and svn cp it into >> python/trunk from there). > > Thomas and I collaborated on integration into the ctypes repository > and testing on various platforms yesterday. 
My patches for Python > are derived from ctypes CVS with a change of only one line. > Hye-Shik has done a great job! Many thanks to him for that. Thomas From ark at acm.org Thu Feb 2 20:09:38 2006 From: ark at acm.org (Andrew Koenig) Date: Thu, 2 Feb 2006 14:09:38 -0500 Subject: [Python-Dev] Octal literals In-Reply-To: <20060201190950.GA22164@cslin-gps.csunix.comp.leeds.ac.uk> Message-ID: <000001c6282c$3a0efa50$6402a8c0@arkdesktop> > I definately agree with the 0c664 octal literal. Seems rather more > intuitive. I still prefer 8r664. From bokr at oz.net Thu Feb 2 20:11:13 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 02 Feb 2006 19:11:13 GMT Subject: [Python-Dev] Octal literals References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> Message-ID: <43e21352.897869648@news.gmane.org> On Wed, 1 Feb 2006 13:54:49 -0500 (EST), Paul Svensson wrote: >On Wed, 1 Feb 2006, Barry Warsaw wrote: > >> The proposal for something like 0xff, 0o664, and 0b1001001 seems like >> the right direction, although 'o' for octal literal looks kind of funky. >> Maybe 'c' for oCtal? (remember it's 'x' for heXadecimal). > >Shouldn't it be 0t644 then, and 0n1001001 for binary ? >That would sidestep the issue of 'b' and 'c' being valid >hexadecimal digits as well. > >Regarding negative numbers, I think they're a red herring. >If there is any need for a new literal format, >it would be to express ~0x0f, not -0x10. >1xf0 has been proposed before, but I think YAGNI. > YMMV re YAGNI, but you have an excellent point re negative numbers vs ~. If you look at examples, the representation digits _are_ actually "~" ;-) I.e., I first proposed 'c' in place of 'r' for 16cf0, where "c" stands for radix _complement_, and 0 and 1 are complements wrt 2, as are hex 0 and f wrt radix 16. 
So the actual notation has digits that are radix-complement, and are evaluated as such to get the integer value. So ~0x0f is represented 16r-f0, which does produce a negative number (but whose integer value BTW is -0x10, not 0x0f). I.e., -16r-f0 == 16r+10, and the sign after the 'r' is a complement-notation indicator, not an algebraic sign. (Perhaps '~' or '^' would be a better indicator, as -16r^f0 == 0x10.) Thank you for making the point that the negative value per se is a red herring. Still, that is where the problem shows up: e.g. when we want to define a hex bit mask as an int and the sign bit happens to be set. IMO it's a wart that if you want to define bit masks as integer data, you have to invoke computation for the sign bit, e.g.,

BIT_0 = 0x1
BIT_1 = 0x02
...
BIT_30 = 0x40000000
BIT_31 = int(-0x80000000)

instead of defining true literals all the way, e.g.,

BIT_0 = 16r1
BIT_1 = 16r2 # or 16r00000002 obviously
...
BIT_30 = 16r+40000000
BIT_31 = 16r-80000000

and if you wanted to define the bit-wise complement masks as literals, you could, though radix-2 is certainly easier to see (introducing '_' as transparent elision)

CBIT_0 = 16r-e # or 16r-fffffffe or 2r-0 or 2r-11111111_11111111_11111111_11111110
CBIT_1 = 16r-d # or 16r-fffffffd or 2r-01 or 2r-11111111_11111111_11111111_11111101
...
CBIT_30 = 16r-bfffffff or 2r-10111111_11111111_11111111_11111111
CBIT_31 = 16r+7fffffff or 2r+01111111_11111111_11111111_11111111

With constant-folding optimization and some kind of inference-guiding for expressions like -sys.maxint-1, perhaps computation vs true literals will become moot. And practically it already is, since a one-time computation is normally insignificant in time or space. But aren't we also targeting platforms where space is at a premium, and being able to define constants as literal data without resorting to workaround pre-processing would be nice?
BTW, base-complement decoding works by generalized analogy to twos complement decoding, by assuming that the most significant digit is a signed coefficient value for base**digitpos in radix-complement form, where the upper half of the range of digits represents negative values as digit-radix, and the rest positive as digit. The rest of the digits are all positive coefficients for base powers. E.g., to decode our simple example[1] represented as a literal in base-complement form (very little tested):

>>> def bclitval(s, digits='0123456789abcdefghijklmnopqrstuvwxyz'):
...     """
...     decode base complement literal of form <B>r<digits>
...     where
...     <B> is in range(2,37) or more if digits supplied
...     <sign> is a mnemonic + for digits[0] and - for digits[-1] or absent
...     <digits> are decoded as base-complement notation after <sign> if
...     present is changed to appropriate digit.
...     The first digit is taken as a signed coefficient with value
...     digit-<B> (negative) if the digit*2>=B and digit (positive) otherwise.
...     The rest of the digits are all positive coefficients for base powers.
...     """
...     B, s = s.split('r', 1)
...     B = int(B)
...     if s[0] =='+': s = digits[0]+s[1:]
...     elif s[0] =='-': s = digits[B-1]+s[1:]
...     ds = digits.index(s[0])
...     if ds*2 >= B: acc = ds-B
...     else: acc = ds
...     for c in s[1:]: acc = acc*B + digits.index(c)
...     return acc
...
>>> bclitval('16r80000004')
-2147483644
>>> bclitval('2r10000000000000000000000000000100')
-2147483644

BTW, because of the decoding method, extended "sign" bits don't force promotion to a long value:

>>> bclitval('16rffffffff80000004')
-2147483644

[1] To reduce all this eye-glazing discussion to a simple example, how do people now use hex notation to define an integer bit-mask constant with bits 31 and 2 set? (assume 32-bit int for target platform, counting bit 0 as LSB and bit 31 as sign).
Regards, Bengt Richter From foom at fuhm.net Thu Feb 2 21:26:24 2006 From: foom at fuhm.net (James Y Knight) Date: Thu, 2 Feb 2006 15:26:24 -0500 Subject: [Python-Dev] Octal literals In-Reply-To: <43e21352.897869648@news.gmane.org> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> Message-ID: On Feb 2, 2006, at 7:11 PM, Bengt Richter wrote: > [1] To reduce all this eye-glazing discussion to a simple example, > how do people now > use hex notation to define an integer bit-mask constant with bits > 31 and 2 set? That's easy: 0x80000004 That was broken in python < 2.4, though, so there you need to do: MASK = 2**32 - 1 0x80000004 & MASK > (assume 32-bit int for target platform, counting bit 0 as LSB and > bit 31 as sign). The 31st bit _isn't_ the sign bit in python and the bit-ness of the target platform doesn't matter. Python's integers are arbitrarily long. I'm not sure why you're trying to pretend as if python was C. James From jjl at pobox.com Thu Feb 2 21:30:00 2006 From: jjl at pobox.com (John J Lee) Date: Thu, 2 Feb 2006 20:30:00 +0000 (GMT Standard Time) Subject: [Python-Dev] syntactic support for sets In-Reply-To: References: <5.1.1.6.0.20060201145956.040ff9c8@mail.telecommunity.com> <000701c6276c$7504d500$b83efea9@RaymondLaptop1> <001b01c62781$be6b8620$b83efea9@RaymondLaptop1> Message-ID: On Wed, 1 Feb 2006, Greg Wilson wrote: >> Like many things in Python where people pre-emptively believe one thing >> or another, the interpreter's corrective feedback is immediate: > > Yup, that's the theory; it's a shame practice is different. So what mistake(s) *do* your students make? As people have pointed out, the mistake you complain about *does* usually result in an immediate traceback: >>> set(1, 2, 3) Traceback (most recent call last): File "<stdin>", line 1, in ?
TypeError: set expected at most 1 arguments, got 3 >>> set(1) Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: iteration over non-sequence >>> Perhaps this? >>> set("argh") set(['a', 'h', 'r', 'g']) >>> [...] > the language, but I'd rather eliminate the sand traps than require people > to learn to recognize and avoid them. I'm sure nobody would disagree with you, but of course the devil is in the detail. John From jjl at pobox.com Thu Feb 2 21:32:34 2006 From: jjl at pobox.com (John J Lee) Date: Thu, 2 Feb 2006 20:32:34 +0000 (GMT Standard Time) Subject: [Python-Dev] syntactic support for sets In-Reply-To: References: <000d01c62768$c11c0980$b83efea9@RaymondLaptop1> Message-ID: On Wed, 1 Feb 2006, Greg Wilson wrote: [...] > (Imagine having to write "list(1, 2, 3, 4, 5)"...) [...] I believe that was actually proposed on this list for Python 3. John From mrovner at propel.com Thu Feb 2 22:28:40 2006 From: mrovner at propel.com (Mike Rovner) Date: Thu, 02 Feb 2006 13:28:40 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: <000001c6282c$3a0efa50$6402a8c0@arkdesktop> References: <20060201190950.GA22164@cslin-gps.csunix.comp.leeds.ac.uk> <000001c6282c$3a0efa50$6402a8c0@arkdesktop> Message-ID: Andrew Koenig wrote: >>I definitely agree with the 0c664 octal literal. Seems rather more >>intuitive. > > > I still prefer 8r664. 664[8] looks better and allows any radix From aleaxit at gmail.com Thu Feb 2 23:26:30 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 2 Feb 2006 14:26:30 -0800 Subject: [Python-Dev] syntactic support for sets In-Reply-To: References: <000d01c62768$c11c0980$b83efea9@RaymondLaptop1> Message-ID: On 2/1/06, Greg Wilson wrote: > > Generator expressions make syntactic support irrelevant: > > Not when you're teaching the language to undergraduates: I haven't > actually done the study yet (though I may this summer), but I'm willing to > bet that allowing "math" notation for sets will more than double their > use.
(Imagine having to write "list(1, 2, 3, 4, 5)"...) Actually, as far as I'm concerned, I'd just love to remove the [ ... ] notation for building lists if good ways could be found to distinguish "a list with this one item" from "a list with the same items as this iterable". list(1, 2, 3) is perfectly easy to explain, more readable, and just as likely to be used, if not more, than cryptic shorthand [1,2,3]. "If you want APL, you know where to find it" (==on IBM's online store, called APL2!-). > > Accordingly,Guido rejected the braced notation for set comprehensions. > > See: http://www.python.org/peps/pep-0218.html > > "...however, the issue could be revisited for Python 3000 (see PEP 3000)." > So I'm only 1994 years early ;-) Don't be such a pessimist, it's ONLY 994 years to go! Alex From bokr at oz.net Thu Feb 2 23:36:03 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 02 Feb 2006 22:36:03 GMT Subject: [Python-Dev] Octal literals References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> Message-ID: <43e27155.921936314@news.gmane.org> On Thu, 2 Feb 2006 15:26:24 -0500, James Y Knight wrote: >On Feb 2, 2006, at 7:11 PM, Bengt Richter wrote: >> [1] To reduce all this eye-glazing discussion to a simple example, >> how do people now >> use hex notation to define an integer bit-mask constant with bits ^^^^^^^ >> 31 and 2 set? | > | >That's easy: | >0x80000004 | >>> 0x80000004 | 2147483652L | ^------------------------' That didn't meet specs ;-) > >That was broken in python < 2.4, though, so there you need to do: I agree it was broken, but >MASK = 2**32 - 1 >0x80000004 & MASK does not solve the problem of doing correctly what it was doing (creating a mask in a signed type int variable, which happened to have the sign bit set). 
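[Editorial note: the two's-complement reinterpretation being discussed here can be sketched in modern Python, where all integers are unbounded, by round-tripping the bit pattern through the struct module. The -2147483644 value is simply what an "int with the sign bit set" would have held on a 32-bit platform.]

```python
import struct

MASK = 0x80000004  # bits 31 and 2 set; 2147483652 as an unsigned value

# Reinterpret the unsigned 32-bit pattern as a signed C int,
# recovering the negative value a 32-bit signed int would hold.
signed = struct.unpack('<i', struct.pack('<I', MASK))[0]
assert signed == -2147483644
```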
So long as there is a fixed-width int different from long, the problem will reappear. >> (assume 32-bit int for target platform, counting bit 0 as LSB and ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> bit 31 as sign). ^^^^^^^^^^^^^^ > >The 31st bit _isn't_ the sign bit in python and the bit-ness of the >target platform doesn't matter. Python's integers are arbitrarily >long. I'm not sure why you're trying to pretend as if python was C. Evidently I haven't made myself clear to you, and your mind reading wrt what I am trying to pretend is definitely flawed (and further speculations along that line are likely to be OT ;-) So long as we have a distinction between int and long, IWT int will be fixed width for any given implementation, and for interfacing with foreign functions it will continue to be useful at times to limit the type of arguments being passed. To do this arms-length C argument type control, it may be important to have constants of int type, knowing what that means on a given platform, and therefore _nice_ to be able to define them directly, understanding full well all the issues, and that there are workarounds ;-) Whatever the fixed width of int, ISTM we'll have predictable type promotion effects such as >>> width=32 >>> -1*2**(width-2)*2 -2147483648 vs >>> -1*2**(width-1) -2147483648L and >>> hex(-sys.maxint-1) '-0x80000000' >>> (-int(hex(-sys.maxint-1)[1:],16)) == (-sys.maxint-1) True >>> (-int(hex(-sys.maxint-1)[1:],16)) , (-sys.maxint-1) (-2147483648L, -2147483648) >>> type(-int(hex(-sys.maxint-1)[1:],16)) == type(-sys.maxint-1) False >>> type(-int(hex(-sys.maxint-1)[1:],16)) , type(-sys.maxint-1) (<type 'long'>, <type 'int'>) [1] Even though BTW you could well define a sign bit position abstractly for any integer value. E.g., the LSB of the arbitrarily repeated sign bits to the left of any integer in a twos complement representation (which can be well defined abstractly too).
Code left as exercise ;-) Bottom line: You haven't shown me an existing way to do "16r80000004" and produce the int ;-) Regards, Bengt Richter From aleaxit at gmail.com Thu Feb 2 23:43:51 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 2 Feb 2006 14:43:51 -0800 Subject: [Python-Dev] any support for a methodcaller HOF? Message-ID: I was recently reviewing a lot of the Python 2.4 code I have written, and I've noticed one thing: thanks to the attrgetter and itemgetter functions in module operator, I've been using (or been tempted to use) far fewer lambdas, particularly but not exclusively in key= arguments to sort and sorted. Most of those "lambda temptations" will be removed by PEP 309 (functional.partial), and most remaining ones are of the form: lambda x: x.amethod(zip, zop) So I was thinking -- wouldn't it be nice to have (possibly in module functional, like partial; possibly in module operator, like itemgetter and attrgetter -- I'm partial to functional;-) a methodcaller entry akin to (...possibly with a better name...): def methodcaller(methodname, *a, **k): def caller(self): getattr(self, methodname)(*a, **k) caller.__name__ = methodname return caller ...? This would allow removal of even more lambdas. I'll be glad to write a PEP, but I first want to check whether the Python-Dev crowd would just blast it out of the waters, in which case I may save writing it... 
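[Editorial note: Alex's sketch works as advertised once the inner call's result is returned (the version quoted above drops it); the same idea later shipped as operator.methodcaller in Python 2.6. A minimal demonstration:]

```python
def methodcaller(methodname, *a, **k):
    def caller(self):
        # 'return' added relative to the sketch above, so the wrapped
        # method's result is propagated back to the caller
        return getattr(self, methodname)(*a, **k)
    caller.__name__ = methodname
    return caller

# e.g. sorting case-insensitively without a lambda:
words = ['banana', 'Apple', 'cherry']
assert sorted(words, key=methodcaller('lower')) == ['Apple', 'banana', 'cherry']
```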
Alex From martin at v.loewis.de Thu Feb 2 23:46:00 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 02 Feb 2006 23:46:00 +0100 Subject: [Python-Dev] Octal literals In-Reply-To: <43e27155.921936314@news.gmane.org> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> Message-ID: <43E28BA8.70402@v.loewis.de> Bengt Richter wrote: >>>[1] To reduce all this eye-glazing discussion to a simple example, >>>how do people now >>>use hex notation to define an integer bit-mask constant with bits > > ^^^^^^^ > >>>31 and 2 set? | >> >> | >>That's easy: | >>0x80000004 | > > >>> 0x80000004 | > 2147483652L | > ^------------------------' > > That didn't meet specs ;-) It sure does: 2147483652L is an integer (a long one); it isn't an int. Regards, Martin From tdelaney at avaya.com Fri Feb 3 00:14:00 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Fri, 3 Feb 2006 10:14:00 +1100 Subject: [Python-Dev] Octal literals Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB8F2@au3010avexu1.global.avaya.com> M J Fleming wrote: > +1 > > I definitely agree with the 0c664 octal literal. Seems rather more > intuitive. And importantly, sounds like "Oc" 664 ;) Tim Delaney From tdelaney at avaya.com Fri Feb 3 00:16:17 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Fri, 3 Feb 2006 10:16:17 +1100 Subject: [Python-Dev] Octal literals Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB8F3@au3010avexu1.global.avaya.com> Andrew Koenig wrote: >> I definitely agree with the 0c664 octal literal. Seems rather more >> intuitive. > > I still prefer 8r664. The more I look at this, the worse it gets. Something beginning with zero (like 0xFF, 0c664) immediately stands out as "unusual". 
Something beginning with any other digit doesn't. This just looks like noise to me. I found the suffix version even worse, but they're blown out of the water anyway by the fact that FFr16 is a valid identifier. Tim Delaney From bokr at oz.net Fri Feb 3 01:08:18 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 03 Feb 2006 00:08:18 GMT Subject: [Python-Dev] Octal literals References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> <43E28BA8.70402@v.loewis.de> Message-ID: <43e29d06.933121688@news.gmane.org> On Thu, 02 Feb 2006 23:46:00 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >Bengt Richter wrote: >>>>[1] To reduce all this eye-glazing discussion to a simple example, >>>>how do people now >>>>use hex notation to define an integer bit-mask constant with bits >> >> ^^^^^^^ >> >>>>31 and 2 set? | >>> >>> | >>>That's easy: | >>>0x80000004 | >> >> >>> 0x80000004 | >> 2147483652L | >> ^------------------------' >> >> That didn't meet specs ;-) > >It sure does: 2147483652L is an integer (a long one); it isn't an >int. Aw, shux, dang. I didn't say what I meant ;-/ Apologies to James & all 'round. s/integer/int/ in the above. Regards, Bengt Richter From bokr at oz.net Fri Feb 3 01:27:19 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 03 Feb 2006 00:27:19 GMT Subject: [Python-Dev] Octal literals References: <2773CAC687FD5F4689F526998C7E4E5F4DB8F3@au3010avexu1.global.avaya.com> Message-ID: <43e29f78.933747858@news.gmane.org> On Fri, 3 Feb 2006 10:16:17 +1100, "Delaney, Timothy (Tim)" wrote: >Andrew Koenig wrote: > >>> I definitely agree with the 0c664 octal literal. Seems rather more >>> intuitive. >> >> I still prefer 8r664. >The more I look at this, the worse it gets. 
Something beginning with >zero (like 0xFF, 0c664) immediately stands out as "unusual". Something >beginning with any other digit doesn't. This just looks like noise to >me. > >I found the suffix version even worse, but they're blown out of the >water anyway by the fact that FFr16 is a valid identifier. > Are you sure you aren't just used to the x in 0xff? I.e., if the leading 0 were just an alias for 16, we could use 8x664 instead of 8r664. BTW Ada uses radix prefix, but with # separating the prefix, so we can't use that. How about apostrophe as separator? 8'664 # or the suffix version could work also, although you'd have to back out of some names: 664'8 bee'16 Regards, Bengt Richter From foom at fuhm.net Fri Feb 3 02:39:01 2006 From: foom at fuhm.net (James Y Knight) Date: Thu, 2 Feb 2006 20:39:01 -0500 Subject: [Python-Dev] Octal literals In-Reply-To: <43e27155.921936314@news.gmane.org> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> Message-ID: <7D3E3773-CDB4-4A81-ADFB-CBCC339F3DEE@fuhm.net> On Feb 2, 2006, at 10:36 PM, Bengt Richter wrote: > So long as we have a distinction between int and long, IWT int will > be fixed width > for any given implementation, and for interfacing with foreign > functions it will > continue to be useful at times to limit the type of arguments being > passed. We _don't_ have a distinction in any meaningful way, anymore. ints and longs are almost always treated exactly the same, other than the "L" suffix. I expect that suffix will soon go away as well. If there is code that _doesn't_ treat them the same, there is the bug. We don't need strange new syntax to work around buggy code. 
Note that 10**14/10**13 is also a long, yet any interface that did not accept that as an argument but did accept "10" is simply buggy. Same goes for code that says it takes a 32-bit bitfield argument but won't accept 0x80000000. James From bokr at oz.net Fri Feb 3 09:05:25 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 03 Feb 2006 08:05:25 GMT Subject: [Python-Dev] Octal literals References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> <7D3E3773-CDB4-4A81-ADFB-CBCC339F3DEE@fuhm.net> Message-ID: <43e2ffef.958442787@news.gmane.org> On Thu, 2 Feb 2006 20:39:01 -0500, James Y Knight wrote: >On Feb 2, 2006, at 10:36 PM, Bengt Richter wrote: >> So long as we have a distinction between int and long, IWT int will >> be fixed width >> for any given implementation, and for interfacing with foreign >> functions it will >> continue to be useful at times to limit the type of arguments being >> passed. > >We _don't_ have a distinction in any meaningful way, anymore. ints Which will disappear, "int" or "long"? Or both in favor of "integer"? What un-"meaningful" distinction(s) are you hedging your statement about? ;-) >and longs are almost always treated exactly the same, other than the >"L" suffix. I expect that suffix will soon go away as well. If there >is code that _doesn't_ treat them the same, there is the bug. We If you are looking at them in C code receiving them as args in a call, "treat them the same" would have to mean provide code to coerce long->int or reject it with an exception, IWT. This could be a performance issue that one might like to control by calling strictly with int args, or even an implementation restriction due to lack of space on some microprocessor for unnecessary general coercion code. 
>don't need strange new syntax to work around buggy code. It's not a matter of "buggy" if you are trying to optimize. (I am aware of premature optimization issues, and IMO "strange" is in the eye of the beholder.) What syntax would you suggest? I am not married to any particular syntax, just looking for expressive control over what my programs will do ;-) > >Note that 10**14/10**13 is also a long, yet any interface that did >not accept that as an argument but did accept "10" is simply buggy. def foo(i): assert isinstance(i, int); ... # when this becomes illegal, yes. >Same goes for code that says it takes a 32-bit bitfield argument but >won't accept 0x80000000. If the bitfield is signed, it can't, unless you are glossing over an assumed coercion rule. >>> int(0x80000000) 2147483648L >>> int(-0x80000000) -2147483648 BTW, I am usually on the pure-abstraction-view side of discussions ;-) Noticing-kindling-is-wet-and-about-out-of-matches-ly, Regards, Bengt Richter From stefan.rank at ofai.at Fri Feb 3 09:38:06 2006 From: stefan.rank at ofai.at (Stefan Rank) Date: Fri, 03 Feb 2006 09:38:06 +0100 Subject: [Python-Dev] Octal literals In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB8F3@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB8F3@au3010avexu1.global.avaya.com> Message-ID: <43E3166E.3050303@ofai.at> on 03.02.2006 00:16 Delaney, Timothy (Tim) said the following: > Andrew Koenig wrote: >>> I definitely agree with the 0c664 octal literal. Seems rather more >>> intuitive. >> I still prefer 8r664. > The more I look at this, the worse it gets. Something beginning with > zero (like 0xFF, 0c664) immediately stands out as "unusual". Something > beginning with any other digit doesn't. 
Let me throw something into the arena :-) I know there should only be one way to do it, but what about requiring a leading 0 for any 'special' number format, and then allow:: 0x1AFFE and:: 016r1AFFE 02r010001000101001 08r1234567 and maybe have 0b be a synonym of 02r, and some other nice character (o/c) for octals. For backwards compatibility you could even allow classic octal literals, though I think it would be better to have a Syntax Error for any literal starting with 0 but missing a radix code. cheers From mwh at python.net Fri Feb 3 10:36:30 2006 From: mwh at python.net (Michael Hudson) Date: Fri, 03 Feb 2006 09:36:30 +0000 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: (Alex Martelli's message of "Thu, 2 Feb 2006 14:43:51 -0800") References: Message-ID: <2m64nwg4wx.fsf@starship.python.net> Alex Martelli writes: > I was recently reviewing a lot of the Python 2.4 code I have written, > and I've noticed one thing: thanks to the attrgetter and itemgetter > functions in module operator, I've been using (or been tempted to use) > far fewer lambdas, particularly but not exclusively in key= arguments > to sort and sorted. Interesting. Something I'd noticed was that *until* the key= argument to sort appeared, I was hardly using any lambdas at all (most of the places I had used them were rendered obsolete by list comprehensions). > Most of those "lambda temptations" will be > removed by PEP 309 (functional.partial), and most remaining ones are > of the form: > lambda x: x.amethod(zip, zop) > > So I was thinking -- wouldn't it be nice to have (possibly in module > functional, like partial; possibly in module operator, like itemgetter > and attrgetter -- I'm partial to functional;-) a methodcaller entry > akin to (...possibly with a better name...): > > def methodcaller(methodname, *a, **k): > def caller(self): > getattr(self, methodname)(*a, **k) > caller.__name__ = methodname > return caller > > ...? 
This would allow removal of even more lambdas. > > I'll be glad to write a PEP, but I first want to check whether the > Python-Dev crowd would just blast it out of the waters, in which case > I may save writing it... Hmm. >>> funcTakingCallback(lambda x:x.method(zip, zop)) >>> funcTakingCallback(methodcaller("method", zip, zop)) I'm not sure which of these is clearer really. Are lambdas so bad? (FWIW, I haven't internalized itemgetter/attrgetter yet and still tend to use lambdas instead of those too). A class I wrote (and lost) ages ago was a "placeholder" class, so if 'X' was an instance of this class, "X + 1" was roughly equivalent to "lambda x:x+1" and "X.method(zip, zop)" was roughly equivalent to your "methodcaller("method", zip, zop)". I threw it away when listcomps got implemented. Not sure why I mention it now, something about your post made me think of it... Cheers, mwh -- If you give someone Fortran, he has Fortran. If you give someone Lisp, he has any language he pleases. -- Guy L. Steele Jr, quoted by David Rush in comp.lang.scheme.scsh From ncoghlan at gmail.com Fri Feb 3 11:07:12 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 03 Feb 2006 20:07:12 +1000 Subject: [Python-Dev] Octal literals In-Reply-To: <43e29f78.933747858@news.gmane.org> References: <2773CAC687FD5F4689F526998C7E4E5F4DB8F3@au3010avexu1.global.avaya.com> <43e29f78.933747858@news.gmane.org> Message-ID: <43E32B50.1010302@gmail.com> Bengt Richter wrote: > On Fri, 3 Feb 2006 10:16:17 +1100, "Delaney, Timothy (Tim)" wrote: > >> Andrew Koenig wrote: >> >>>> I definitely agree with the 0c664 octal literal. Seems rather more >>>> intuitive. >>> I still prefer 8r664. >> The more I look at this, the worse it gets. Something beginning with >> zero (like 0xFF, 0c664) immediately stands out as "unusual". Something >> beginning with any other digit doesn't. This just looks like noise to >> me. 
>> >> I found the suffix version even worse, but they're blown out of the >> water anyway by the fact that FFr16 is a valid identifier. >> > Are you sure you aren't just used to the x in 0xff? I.e., if the leading > 0 were just an alias for 16, we could use 8x664 instead of 8r664. No, I'm with Tim - it's definitely the distinctive shape of the '0' that helps the non-standard base stand out. '0c' creates a similar shape, also helping it to stand out. More on distinctive shapes below, though. That said, I'm still trying to figure out exactly what problem is being solved here. Thinking out loud. . . The full syntax for writing integers in any base is: int("LITERAL", RADIX) int("LITERAL", base=RADIX) 5 prefix chars, 3 or 8 in the middle (counting the space, and depending on whether the keyword is used or not), one on the end, and one or two to specify the radix. That's quite verbose, so its unsurprising that many would like something nicer in the toolkit when they need to write multiple numeric literals in a base other than ten. This can typically happen when writing Unix system admin scripts, bitbashing to control a piece of hardware or some other low-level task. The genuine use cases we have for integer literals are: - decimal (normal numbers) - hex (compact bitmasks) - octal (unix file permissions) - binary (explicit bitmasks for those that don't speak fluent hex) Currently, there is no syntax for binary literals, and the syntax for octal literals is both magical (where else in integer mathematics does a leading zero matter?) and somewhat error prone (int and eval will give different answers for a numeric literal with a leading zero - int ignores the leading zero, eval treats it as signifying that the value is in octal. The charming result is that the following statement fails: assert int('0123') == 0123). 
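[Editorial note: the failing assert is easy to check with a modern interpreter, using the 0o spelling for the octal value, since bare leading-zero literals were later removed by PEP 3127:]

```python
# int() defaults to base 10 and simply ignores a leading zero:
assert int('0123') == 123
# whereas the literal rules treat the same digits as octal
# (0123 in the old notation, 0o123 in the modern one):
assert 0o123 == 83
# hence the assert quoted above fails -- the two sides disagree:
assert int('0123') != 0o123
```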
Looking at existing precedent in the language, a prefix is currently used when the parsing of the subsequent literal may be affected (that is, the elements that make up the literal may be interpreted differently depending on the prefix). This is the case for hex and octal literals, and also for raw and unicode strings. Suffixes are currently used when the literal as a whole is affected, but the meaning of the individual elements remains the same. This is the case for both long integer and imaginary number literals. A suffix also makes sense for decimal float literals, as the individual elements would still be interpreted as base 10 digits. So, since we want to affect the parsing process, this means we want a prefix. The convention of using '0x' to denote hex extends far beyond Python, and doesn't seem to provoke much in the way of objection. This suggests options like '0o' or '0c' for octal literals. Given that '0x' matches the '%x' in string formatting, the least magical option would be '0o' (to match the existing '%o' output format). While '0c' is cute and quite suggestive, it creates a significant potential for confusion , as it most emphatically does *not* align with the meaning of the '%c' format specifier. I'd be +0 on changing the octal literal prefix from '0' to '0o', and also +0 on adding an '0b' prefix and '%b' format specifier for binary numbers. Whether anyone will actually care enough to implement a patch to change the syntax for any of these is an entirely different question ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From Ben.Young at risk.sungard.com Fri Feb 3 11:15:54 2006 From: Ben.Young at risk.sungard.com (Ben.Young at risk.sungard.com) Date: Fri, 3 Feb 2006 10:15:54 +0000 Subject: [Python-Dev] any support for a methodcaller HOF? 
In-Reply-To: <2m64nwg4wx.fsf@starship.python.net> Message-ID: Michael Hudson wrote on 03/02/2006 09:36:30: > > Hmm. > > >>> funcTakingCallback(lambda x:x.method(zip, zop)) > >>> funcTakingCallback(methodcaller("method", zip, zop)) > > I'm not sure which of these is clearer really. Are lambdas so bad? > (FWIW, I haven't internalized itemgetter/attrgetter yet and still tend > to use lambdas instead of those too). > > A class I wrote (and lost) ages ago was a "placeholder" class, so if > 'X' was an instance of this class, "X + 1" was roughly equivalent to > "lambda x:x+1" and "X.method(zip, zop)" was roughly equivalent to your > "methodcaller("method", zip, zop)". I threw it away when listcomps > got implemented. Not sure why I mention it now, something about your > post made me think of it... > The C++ library Boost makes use of this method, but has a number of "placeholder" variables _1, _2, _3 ... _9 which can be combined to form expressions. e.g. _1 + _2 is the same as lambda x,y: x+y so maybe there could be a lambda module that exposes placeholders like this. Python's ones will be better than the C++ ones because we would be able to delay function calls as above with a much nicer syntax than the C++ versions. E.g. _1.method(_2+_3) ! Cheers, Ben > Cheers, > mwh > > > -- > If you give someone Fortran, he has Fortran. > If you give someone Lisp, he has any language he pleases. > -- Guy L. 
Steele Jr, quoted by David Rush in comp.lang.scheme.scsh > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python- > dev/python%40theyoungfamily.co.uk > From bob at redivi.com Fri Feb 3 11:40:54 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 3 Feb 2006 02:40:54 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: <43E32B50.1010302@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB8F3@au3010avexu1.global.avaya.com> <43e29f78.933747858@news.gmane.org> <43E32B50.1010302@gmail.com> Message-ID: <4D291098-3729-458A-8C4C-F97DD3ED5410@redivi.com> On Feb 3, 2006, at 2:07 AM, Nick Coghlan wrote: > Bengt Richter wrote: >> On Fri, 3 Feb 2006 10:16:17 +1100, "Delaney, Timothy (Tim)" >> wrote: >> >>> Andrew Koenig wrote: >>> >>>>> I definitely agree with the 0c664 octal literal. Seems rather more >>>>> intuitive. >>>> I still prefer 8r664. >>> The more I look at this, the worse it gets. Something beginning with >>> zero (like 0xFF, 0c664) immediately stands out as "unusual". >>> Something >>> beginning with any other digit doesn't. This just looks like >>> noise to >>> me. >>> >>> I found the suffix version even worse, but they're blown out of the >>> water anyway by the fact that FFr16 is a valid identifier. >>> >> Are you sure you aren't just used to the x in 0xff? I.e., if the >> leading >> 0 were just an alias for 16, we could use 8x664 instead of 8r664. > > Currently, there is no syntax for binary literals, and the syntax > for octal > literals is both magical (where else in integer mathematics does a > leading > zero matter?) and somewhat error prone (int and eval will give > different > answers for a numeric literal with a leading zero - int ignores the > leading > zero, eval treats it as signifying that the value is in octal. The > charming
The > charming > result is that the following statement fails: assert int('0123') == > 0123). That's just a misunderstanding on your part. The default radix is 10, not DWIM. 0 signifies that behavior:: assert int('0123', 0) == 0123 assert int('0x123', 0) == 0x123 -bob From ncoghlan at gmail.com Fri Feb 3 11:44:47 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 03 Feb 2006 20:44:47 +1000 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <2m64nwg4wx.fsf@starship.python.net> References: <2m64nwg4wx.fsf@starship.python.net> Message-ID: <43E3341F.4050001@gmail.com> Michael Hudson wrote: > Alex Martelli writes: >> I'll be glad to write a PEP, but I first want to check whether the >> Python-Dev crowd would just blast it out of the waters, in which case >> I may save writing it... > > Hmm. > >>>> funcTakingCallback(lamda x:x.method(zip, zop)) >>>> funcTakingCallback(methodcaller("method", zip, zop)) > > I'm not sure which of these is clearer really. Are lambdas so bad? > (FWIW, I haven't internalized itemgetter/attrgetter yet and still tend > to use lambdas instead those too). I've been convinced for a while that the proliferation of features like operator.itemgetter and attrgetter (and some uses of functional.partial) demonstrate that the ability to defer a single expression (as lambda currently allows) is a very useful feature to have in the language. Unfortunately that utility gets overshadowed by the ugliness of the syntax, the mathematical baggage associated with the current keyword, and the fact that lambda gets pitched as an "anonymous function limited to a single expression", rather than as "the ability to defer an expression for later evaluation" (the former sounds like a limitation that should be fixed, the latter sounds like the deliberate design choice that it is). 
At the moment it looks like the baby is going to get thrown out with the bathwater in Py3k, but I'd love to be able to simply write the following instead of some byzantine mixture of function calls to get the same effect: funcTakingCallback(x.method(zip, zop) def (x)) Consider these comparisons: itemgetter(1) <=> (x[1] def (x)) attrgetter('foo') <=> (x.foo def (x)) partial(y, arg) <=> (y(arg) def) So rather than yet another workaround for lambda being ugly, I'd rather see a PEP that proposed "Let's make the syntax for deferring an expression not be ugly anymore, now that we have generator expressions and conditionals as an example of how to do it right". Guido was rather unenthused the last time this topic came up, though, so maybe it isn't worth the effort. . . (although he did eventually change his mind on PEP 308, so I haven't entirely given up hope yet). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From abo at minkirri.apana.org.au Fri Feb 3 12:12:05 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri, 03 Feb 2006 11:12:05 +0000 Subject: [Python-Dev] Octal literals In-Reply-To: <20060201190950.GA22164@cslin-gps.csunix.comp.leeds.ac.uk> References: <20060201190950.GA22164@cslin-gps.csunix.comp.leeds.ac.uk> Message-ID: <1138965125.7232.12.camel@warna.dub.corp.google.com> On Wed, 2006-02-01 at 19:09 +0000, M J Fleming wrote: > On Wed, Feb 01, 2006 at 01:35:14PM -0500, Barry Warsaw wrote: > > The proposal for something like 0xff, 0o664, and 0b1001001 seems like > > the right direction, although 'o' for octal literal looks kind of funky. > > Maybe 'c' for oCtal? (remember it's 'x' for heXadecimal). > > > > -Barry > > > > +1 +1 too. It seems like a "least changes" way to fix the IMHO strange 0123 != 123 behaviour. Any sort of arbitrary base syntax is overkill; decimal, hexadecimal, octal, and binary cover 99.9% of cases. 
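[Editorial note: until a binary literal exists, the spelling goes through int() with an explicit radix; the 0b prefix asked for here did eventually arrive with PEP 3127 in Python 2.6. The CONTROL name below is purely illustrative:]

```python
# a hypothetical hardware control byte, written out bit by bit
CONTROL = int('10110101', 2)
assert CONTROL == 0xB5 == 181
```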
The 0.1% of other cases are very special, and can use int("LITERAL",base=RADIX). For me, binary is far more useful than octal, so I'd be happy to let octal languish as legacy support, but I definitely want "0b10110101". -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From ncoghlan at gmail.com Fri Feb 3 12:15:22 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 03 Feb 2006 21:15:22 +1000 Subject: [Python-Dev] Octal literals In-Reply-To: <4D291098-3729-458A-8C4C-F97DD3ED5410@redivi.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB8F3@au3010avexu1.global.avaya.com> <43e29f78.933747858@news.gmane.org> <43E32B50.1010302@gmail.com> <4D291098-3729-458A-8C4C-F97DD3ED5410@redivi.com> Message-ID: <43E33B4A.3050205@gmail.com> Bob Ippolito wrote: > > On Feb 3, 2006, at 2:07 AM, Nick Coghlan wrote: >> Currently, there is no syntax for binary literals, and the syntax for >> octal >> literals is both magical (where else in integer mathematics does a >> leading >> zero matter?) and somewhat error prone (int and eval will give different >> answers for a numeric literal with a leading zero - int ignores the >> leading >> zero, eval treats it as signifying that the value is in octal. The >> charming >> result is that the following statement fails: assert int('0123') == >> 0123). > > That's just a misunderstanding on your part. The default radix is 10, > not DWIM. 0 signifies that behavior:: > > assert int('0123', 0) == 0123 > assert int('0x123', 0) == 0x123 How does that make the situation any better? The fact remains that a leading zero on an integer string may be significant, depending on the exact method used to convert the string to a number. The fact that int() can be made to behave like eval() doesn't change the fact that the default behaviours are different, and in a fashion that allows errors to pass silently. 
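[Editorial note: the difference in defaults is small to write down but easy to trip over; a modern interpreter now rejects the ambiguous form outright rather than silently reading it as octal, as the 2.x literal parser did:]

```python
# default: base 10, leading zeros ignored (high-school arithmetic)
assert int('012') == 12
# base=0: follow the literal rules instead; the bare leading zero
# is now an error rather than a silent octal reinterpretation
try:
    int('012', 0)
    silently_accepted = True
except ValueError:
    silently_accepted = False
assert not silently_accepted
```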
You've highlighted a nice way to turn this into a real bug, though - use the DWIM feature of int() to accept numbers in either decimal or hex, and wait until someone relying on the mathematics they learned in high school enters a decimal number with a leading zero (leading zeros don't matter, right?). I think it's a bad thing that Python defaults to handling numbers differently from high school mathematics. One of the virtues of '0x' and '0o' is that the resulting strings aren't actually legal numbers, leading to people wondering what the prefixes mean. The danger of the leading 0 denoting octal is that programmers without a background in C (or one of its successors that use the same convention) may *think* they know what it means, only to discover they're wrong the hard way (when their program doesn't work right). Do I think this *really* matters? Nope - I think most bugs due to this will be pretty shallow. That's why I was only +0 on doing anything about it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From abo at minkirri.apana.org.au Fri Feb 3 13:04:52 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri, 03 Feb 2006 12:04:52 +0000 Subject: [Python-Dev] syntactic support for sets In-Reply-To: References: Message-ID: <1138968292.7232.48.camel@warna.dub.corp.google.com> On Wed, 2006-02-01 at 13:55 -0500, Greg Wilson wrote: > Hi, > > I have a student who may be interested in adding syntactic support for > sets to Python, so that: > > x = {1, 2, 3, 4, 5} > > and: > > y = {z for z in x if (z % 2)} Personally I'd like this. currently the "set(...)" syntax makes sets feel tacked on compared to tuples, lists, dicts, and strings which have nice built in syntax support. Many people don't realise they are there because of this. Before set() the standard way to do them was to use dicts with None Values... 
to me the "{1,2,3}" syntax would have been a logical extension of the "a set is a dict with no values, only keys" mindset. I don't know why it wasn't done this way in the first place, though I missed the arguments where it was rejected. As for frozenset vs set, I would be inclined to make them normal mutable sets. This is in line with the "dict without values" idea. Frozensets are to sets what tuples are to lists. It would be nice if there was another type of bracket that could be used for frozenset... something like ':1,2,3:'... yuk... I dunno. Alternatively you could do the same thing we do with strings; add a prefix char for different variants; {1,2,3} is a set, f{1,2,3} is a frozen set... For Python 3000 you could extend this approach to lists and dicts; [1,2,3] is a list, f[1,2,3] is a "frozen list" or tuple, {1:'a',2:'b'} is a dict, f{1:'a',2:'b'} is a "frozen dict" which can be used as a key in other dicts... etc. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From fredrik at pythonware.com Fri Feb 3 13:10:33 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 3 Feb 2006 13:10:33 +0100 Subject: [Python-Dev] syntactic support for sets References: <1138968292.7232.48.camel@warna.dub.corp.google.com> Message-ID: Donovan Baarda wrote: > For Python 3000 you could extend this approach to lists and dicts; > [1,2,3] is a list, f[1,2,3] is a "frozen list" or tuple, {1:'a',2:'b'} > is a dict, f{1:'a',2:'b'} is a "frozen dict" which can be used as a key > in other dicts... etc. Traceback (most recent call last): File "pythondev.py", line 219, in monitor HyperGeneralizationViolationError: please let your brain cool down before proceeding From aleaxit at gmail.com Fri Feb 3 15:35:02 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 3 Feb 2006 06:35:02 -0800 Subject: [Python-Dev] any support for a methodcaller HOF?
In-Reply-To: <2m64nwg4wx.fsf@starship.python.net> References: <2m64nwg4wx.fsf@starship.python.net> Message-ID: <228F8C5C-81D1-47C3-B2AE-42585FFE1B21@gmail.com> On Feb 3, 2006, at 1:36 AM, Michael Hudson wrote: > Alex Martelli writes: > >> I was recently reviewing a lot of the Python 2.4 code I have written, >> and I've noticed one thing: thanks to the attrgetter and itemgetter >> functions in module operator, I've been using (or been tempted to >> use) >> far fewer lambdas, particularly but not exclusively in key= arguments >> to sort and sorted. > > Interesting. Something I'd noticed was that *until* the key= argument > to sort appeared, I was hardly using any lambdas at all (most of the > places I had used them were rendered obsolete by list comprehensions). Mine too, but many new places appeared, especially in itertools. > A class I wrote (and lost) ages ago was a "placeholder" class, so if > 'X' was an instance of this class, "X + 1" was roughly equivalent to > "lambda x:x+1" and "X.method(zip, zop)" was roughly equivalent to your > "methodcaller("method", zip, zop)". I threw it away when listcomps > got implemented. Not sure why I mention it now, something about your > post made me think of it... Such a placeholder would certainly offer better syntax and more power than methodcaller (and itemgetter and attrgetter, too). A lovely idea! Alex From rasky at develer.com Fri Feb 3 15:47:05 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 3 Feb 2006 15:47:05 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? 
References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> Message-ID: <02c401c628d0$b3236df0$bf03030a@trilan> Nick Coghlan wrote: > Consider these comparisons: > > itemgetter(1) <=> (x[1] def (x)) > attrgetter('foo') <=> (x.foo def (x)) > partial(y, arg) <=> (y(arg) def) > > So rather than yet another workaround for lambda being ugly, I'd rather see > a PEP that proposed "Let's make the syntax for deferring an expression not > be ugly anymore, now that we have generator expressions and conditionals as > an example of how to do it right". +1000. Instead of continuing to add arcane functions which return objects which (when called) do things that aren't obvious without knowing the function beforehand, a generic syntax should be added for deferred execution. I too use itemgetter and friends but the "correct" way of doing a deferred "x[1]" *should* let you write "x[1]" in the code. This is my main opposition to partial/itemgetter/attrgetter/methodcaller: they allow deferred execution using a syntax which is not equivalent to that of immediate execution. Unless we propose to deprecate "x[1]" in favor of "itemgetter(1)(x)"... -- Giovanni Bajo From aleaxit at gmail.com Fri Feb 3 16:00:26 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 3 Feb 2006 07:00:26 -0800 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <02c401c628d0$b3236df0$bf03030a@trilan> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <02c401c628d0$b3236df0$bf03030a@trilan> Message-ID: <3ED185A8-2597-4268-9C31-111177F1BCBA@gmail.com> On Feb 3, 2006, at 6:47 AM, Giovanni Bajo wrote: ... > use itemgetter and friends but the "correct" way of doing a > deferred "x[1]" > *should* let you write "x[1]" in the code. This is my main > opposition to > partial/itemgetter/attrgetter/methodcaller: they allow deferred > execution > using a syntax which is not equivalent to that of immediate execution. I understand your worry re the syntax issue.
So what about Michael Hudson's "placeholder class" idea, where X[1] returns the callable that will do x[1] when called, etc? Looks elegant to me... Alex From bokr at oz.net Fri Feb 3 16:16:23 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 03 Feb 2006 15:16:23 GMT Subject: [Python-Dev] any support for a methodcaller HOF? References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> Message-ID: <43e36411.984076957@news.gmane.org> On Fri, 03 Feb 2006 20:44:47 +1000, Nick Coghlan wrote: >Michael Hudson wrote: >> Alex Martelli writes: >>> I'll be glad to write a PEP, but I first want to check whether the >>> Python-Dev crowd would just blast it out of the waters, in which case >>> I may save writing it... >> >> Hmm. >> >>>>> funcTakingCallback(lambda x:x.method(zip, zop)) >>>>> funcTakingCallback(methodcaller("method", zip, zop)) >> >> I'm not sure which of these is clearer really. Are lambdas so bad? >> (FWIW, I haven't internalized itemgetter/attrgetter yet and still tend >> to use lambdas instead of those too). If you are familiar with lambda, it's clearer, because the expression evaluation is just deferred, and you can see that zip and zop are going to be accessed at lambda call time. methodcaller could potentially hide an alternate binding of zip and zop like lambda x, zip=zip, zop=zop:x.method(zip,zop) So you have to know what methodcaller does rather than just reading the expression. And if you want to customize with def-time (lambda eval time) bindings as well as call arg bindings, you can't easily AFAICS. BTW, re def-time bindings, the default arg abuse is a hack, so I would like to see a syntax that would permit default-arg-like def-time function-local bindings without affecting the call signature. E.g., if def foo(*args, **keywords, ***bindings): ... would use bindings as a dict at def-time to create local namespace bindings like **keywords, but not affecting the call signature.
This would allow a nicer version of above-mentioned lambda x, zip=zip, zop=zop:x.method(zip,zop) as lambda x, ***dict(zip=zip, zop=zop):x.method(zip,zop) or lambda x, ***{'zip':zip, 'zop':zop}:x.method(zip,zop) This could also be used to do currying without the typical cost of wrapped nested calling. > >I've been convinced for a while that the proliferation of features like >operator.itemgetter and attrgetter (and some uses of functional.partial) >demonstrate that the ability to defer a single expression (as lambda currently >allows) is a very useful feature to have in the language. Unfortunately that note that "deferring" is a particular case of controlling evaluation time. The other direction in time goes towards reader macros & such. >utility gets overshadowed by the ugliness of the syntax, the mathematical >baggage associated with the current keyword, and the fact that lambda gets >pitched as an "anonymous function limited to a single expression", rather than >as "the ability to defer an expression for later evaluation" (the former >sounds like a limitation that should be fixed, the latter sounds like the >deliberate design choice that it is). > >At the moment it looks like the baby is going to get thrown out with the >bathwater in Py3k, but I'd love to be able to simply write the following >instead of some byzantine mixture of function calls to get the same effect: > > funcTakingCallback(x.method(zip, zop) def (x)) > >Consider these comparisons: > This looks a lot like the "anonymous def" expression in a postfix form ;-) > itemgetter(1) <=> (x[1] def (x)) <=> def(x):x[1] > attrgetter('foo') <=> (x.foo def (x)) <=> def(x):x.foo > partial(y, arg) <=> (y(arg) def) <=> def(***{'arg':arg}):y() # ?? 
(not sure about semantics of partial) > >So rather than yet another workaround for lambda being ugly, I'd rather see a >PEP that proposed "Let's make the syntax for deferring an expression not be >ugly anymore, now that we have generator expressions and conditionals as an >example of how to do it right". I guess you can guess my vote is for anonymous def ;-) > >Guido was rather unenthused the last time this topic came up, though, so maybe >it isn't worth the effort. . . (although he did eventually change his mind on >PEP 308, so I haven't entirely given up hope yet). Likewise ;-) Regards, Bengt Richter From abo at minkirri.apana.org.au Fri Feb 3 17:09:34 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri, 03 Feb 2006 16:09:34 +0000 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <1138968292.7232.48.camel@warna.dub.corp.google.com> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> Message-ID: <1138982974.7232.100.camel@warna.dub.corp.google.com> On Fri, 2006-02-03 at 12:04 +0000, Donovan Baarda wrote: > On Wed, 2006-02-01 at 13:55 -0500, Greg Wilson wrote: [...] > Personally I'd like this. currently the "set(...)" syntax makes sets > feel tacked on compared to tuples, lists, dicts, and strings which have > nice built in syntax support. Many people don't realise they are there > because of this. [...] > Frozensets are to sets what tuples are to lists. It would be nice if > there was another type of bracket that could be used for frozenset... > something like ':1,2,3:'... yuk... I dunno. One possible bracket option for frozenset would be "<1,2,3>" which I initially rejected because of the possible syntactic clash with the < and > operators... however, there may be a way this could work... dunno. The other thing that keeps nagging me is set, frozenset, tuple, and list all overlap in functionality to fairly significant degrees. Sometimes it feels like just implementation or application differences... 
could a list that is never modified be optimised under the hood as a tuple? Could the immutability constraint of tuples be just acquired by a list when it is used as a key? Could a set simply be a list with unique values? etc. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From exarkun at divmod.com Fri Feb 3 17:31:44 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Fri, 3 Feb 2006 11:31:44 -0500 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <3ED185A8-2597-4268-9C31-111177F1BCBA@gmail.com> Message-ID: <20060203163144.2697.2017601097.divmod.quotient.5992@ohm> On Fri, 3 Feb 2006 07:00:26 -0800, Alex Martelli wrote: > >On Feb 3, 2006, at 6:47 AM, Giovanni Bajo wrote: > ... >> use itemgetter and friends but the "correct" way of doing a >> defferred "x[1]" >> *should* let you write "x[1]" in the code. This is my main >> opposition to >> partial/itemgetter/attrgetter/methodcaller: they allow deferred >> execution >> using a syntax which is not equivalent to that of immediate execution. > >I understand your worry re the syntax issue. So what about Michael >Hudson's "placeholder class" idea, where X[1] returns the callable >that will do x[1] when called, etc? Looks elegant to me... > FWIW, Jean-Paul From rasky at develer.com Fri Feb 3 17:32:30 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 3 Feb 2006 17:32:30 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <02c401c628d0$b3236df0$bf03030a@trilan> <3ED185A8-2597-4268-9C31-111177F1BCBA@gmail.com> Message-ID: <046a01c628df$6d4da160$bf03030a@trilan> Alex Martelli wrote: >> use itemgetter and friends but the "correct" way of doing a >> defferred "x[1]" >> *should* let you write "x[1]" in the code. This is my main >> opposition to >> partial/itemgetter/attrgetter/methodcaller: they allow deferred >> execution >> using a syntax which is not equivalent to that of immediate execution. 
> > I understand your worry re the syntax issue. So what about Michael > Hudson's "placeholder class" idea, where X[1] returns the callable > that will do x[1] when called, etc? Looks elegant to me... Depends on what the final API looks like. "deferred(x)[1]" isn't that bad, but "def x: x[1]" still looks clearer as the 'def' keyword immediately makes clear you're DEFining a DEFerred function :) Of course we can paint our bikeshed of whatever color we like, but I'm happy enough if we agree with the general idea of keeping the same syntax in both deferred and immediate execution. There is also an issue with deferred execution without arguments. By grepping my code it turned out that many lambda instances are in calls to assertRaises() (unittest), where I strictly prefer the syntax: self.assertRaises(ValueError, lambda: int("ABK", 16)) to the allowed: self.assertRaises(ValueError, int, "ABK", 16) With the inline def proposal we'd get something along the lines of: self.assertRaises(ValueError, def(): int("ABK", 16)) self.assertRaises(ValueError, (int("ABK", 16) def)) # it's not lisp, really, I swear while I'm not sure how this would get with the placeholder class. -- Giovanni Bajo From hyeshik at gmail.com Fri Feb 3 17:41:25 2006 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Sat, 4 Feb 2006 01:41:25 +0900 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <20060203163144.2697.2017601097.divmod.quotient.5992@ohm> References: <3ED185A8-2597-4268-9C31-111177F1BCBA@gmail.com> <20060203163144.2697.2017601097.divmod.quotient.5992@ohm> Message-ID: <4f0b69dc0602030841u28d13d94td9141259c2e1dbdf@mail.gmail.com> On 2/4/06, Jean-Paul Calderone wrote: > On Fri, 3 Feb 2006 07:00:26 -0800, Alex Martelli wrote: > > > >I understand your worry re the syntax issue. So what about Michael > >Hudson's "placeholder class" idea, where X[1] returns the callable > >that will do x[1] when called, etc? Looks elegant to me...
> > > > FWIW, > > > > Yet another implementation, http://mail.python.org/pipermail/python-announce-list/2004-January/002801.html Hye-Shik From jcarlson at uci.edu Fri Feb 3 18:00:56 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 03 Feb 2006 09:00:56 -0800 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <1138968292.7232.48.camel@warna.dub.corp.google.com> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> Message-ID: <20060203085105.10DE.JCARLSON@uci.edu> Donovan Baarda wrote: > > On Wed, 2006-02-01 at 13:55 -0500, Greg Wilson wrote: > > Hi, > > > > I have a student who may be interested in adding syntactic support for > > sets to Python, so that: > > > > x = {1, 2, 3, 4, 5} > > > > and: > > > > y = {z for z in x if (z % 2)} > > Personally I'd like this. currently the "set(...)" syntax makes sets > feel tacked on compared to tuples, lists, dicts, and strings which have > nice built in syntax support. Many people don't realise they are there > because of this. Sets are tacked on. That's why you need to use 'import sets' to get to them, in a similar fashion that you need to use 'import array' to get access to C-like arrays. People don't realize that sets are there because they tend to not read the "what's new in Python X.Y", and also fail to read through the "global module index" every once and a while. I personally object to making syntax for sets for the same reasons I object to making arrays, heapqs, Queues, deques, or any of the other data structure-defining modules in the standard library into syntax. 
- Josiah From abo at minkirri.apana.org.au Fri Feb 3 18:04:24 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Fri, 03 Feb 2006 17:04:24 +0000 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <20060203085105.10DE.JCARLSON@uci.edu> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> <20060203085105.10DE.JCARLSON@uci.edu> Message-ID: <1138986264.7232.105.camel@warna.dub.corp.google.com> On Fri, 2006-02-03 at 09:00 -0800, Josiah Carlson wrote: [...] > Sets are tacked on. That's why you need to use 'import sets' to get to > them, in a similar fashion that you need to use 'import array' to get > access to C-like arrays. No you don't; $ python Python 2.4.1 (#2, Mar 30 2005, 21:51:10) [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> v=set((1,2,3)) >>> f=frozenset(v) >>> set and frozenset are now builtin. > I personally object to making syntax for sets for the same reasons I > object to making arrays, heapqs, Queues, deques, or any of the other > data structure-defining modules in the standard library into syntax. Nuff was a fairy... though I guess it depends on where you draw the line; should [1,2,3] be list(1,2,3)? -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From mwh at python.net Fri Feb 3 18:13:08 2006 From: mwh at python.net (Michael Hudson) Date: Fri, 03 Feb 2006 17:13:08 +0000 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <20060203163144.2697.2017601097.divmod.quotient.5992@ohm> (Jean-Paul Calderone's message of "Fri, 3 Feb 2006 11:31:44 -0500") References: <20060203163144.2697.2017601097.divmod.quotient.5992@ohm> Message-ID: <2m1wykfjrv.fsf@starship.python.net> Jean-Paul Calderone writes: > On Fri, 3 Feb 2006 07:00:26 -0800, Alex Martelli wrote: >> >>On Feb 3, 2006, at 6:47 AM, Giovanni Bajo wrote: >> ... 
>>> use itemgetter and friends but the "correct" way of doing a >>> deferred "x[1]" >>> *should* let you write "x[1]" in the code. This is my main >>> opposition to >>> partial/itemgetter/attrgetter/methodcaller: they allow deferred >>> execution >>> using a syntax which is not equivalent to that of immediate execution. >> >>I understand your worry re the syntax issue. So what about Michael >>Hudson's "placeholder class" idea, where X[1] returns the callable >>that will do x[1] when called, etc? Looks elegant to me... I'd just like to point out here that I only mentioned this class; I didn't suggest it for anything :) > FWIW, > > > Yow. My implementation was somewhere in between those for length, I think (and pre-dated new style classes, which probably changes things). Cheers, mwh -- I'm sorry, was my bias showing again? :-) -- William Tanksley, 13 May 2000 From g.brandl at gmx.net Fri Feb 3 18:31:00 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 03 Feb 2006 18:31:00 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <228F8C5C-81D1-47C3-B2AE-42585FFE1B21@gmail.com> References: <2m64nwg4wx.fsf@starship.python.net> <228F8C5C-81D1-47C3-B2AE-42585FFE1B21@gmail.com> Message-ID: Alex Martelli wrote: >> A class I wrote (and lost) ages ago was a "placeholder" class, so if >> 'X' was an instance of this class, "X + 1" was roughly equivalent to >> "lambda x:x+1" and "X.method(zip, zop)" was roughly equivalent to your >> "methodcaller("method", zip, zop)". I threw it away when listcomps >> got implemented. Not sure why I mention it now, something about your >> post made me think of it... > > Such a placeholder would certainly offer better syntax and more power > than methodcaller (and itemgetter and attrgetter, too). A lovely idea! Yep. And it would make Python stand out from the crowd once again ;) The question is: is it "serious" and deterministic enough to be builtin?
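For illustration only (Michael's original class is lost, so this is a from-scratch sketch, not his code), a minimal placeholder covering the X + 1, X[1], and X.attr cases from the thread could look like this; the chained X.method(zip, zop) form would need a real expression tree rather than these one-shot lambdas:

```python
class Placeholder:
    """Each operation on the placeholder yields a one-argument callable."""

    def __add__(self, other):
        return lambda x: x + other          # X + 1  ~ lambda x: x + 1

    def __getitem__(self, key):
        return lambda x: x[key]             # X[1]   ~ itemgetter(1)

    def __getattr__(self, name):
        return lambda x: getattr(x, name)   # X.foo  ~ attrgetter('foo')

X = Placeholder()

assert (X + 1)(41) == 42
assert (X[1])(["a", "b", "c"]) == "b"
assert (X.imag)(3 + 4j) == 4.0
```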
Georg From bokr at oz.net Fri Feb 3 18:51:31 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 03 Feb 2006 17:51:31 GMT Subject: [Python-Dev] any support for a methodcaller HOF? References: Message-ID: <43e39048.995396013@news.gmane.org> On Thu, 2 Feb 2006 14:43:51 -0800, Alex Martelli wrote: >I was recently reviewing a lot of the Python 2.4 code I have written, >and I've noticed one thing: thanks to the attrgetter and itemgetter >functions in module operator, I've been using (or been tempted to use) >far fewer lambdas, particularly but not exclusively in key= arguments >to sort and sorted. Most of those "lambda temptations" will be >removed by PEP 309 (functional.partial), and most remaining ones are >of the form: > lambda x: x.amethod(zip, zop) > >So I was thinking -- wouldn't it be nice to have (possibly in module >functional, like partial; possibly in module operator, like itemgetter >and attrgetter -- I'm partial to functional;-) a methodcaller entry >akin to (...possibly with a better name...): > >def methodcaller(methodname, *a, **k): > def caller(self): > getattr(self, methodname)(*a, **k) > caller.__name__ = methodname > return caller > >...? This would allow removal of even more lambdas. > Yes, but what semantics do you really want? The above, as I'm sure you know, is not a direct replacement for the lambda: >>> import dis >>> foo = lambda x: x.amethod(zip, zop) >>> >>> def methodcaller(methodname, *a, **k): ... def caller(self): ... getattr(self, methodname)(*a, **k) ... caller.__name__ = methodname ... return caller ... >>> bar = methodcaller('amethod', zip, zop) Traceback (most recent call last): File "<stdin>", line 1, in ?
NameError: name 'zop' is not defined >>> zop = 'must exist at methodcaller call time' >>> bar = methodcaller('amethod', zip, zop) >>> dis.dis(foo) 1 0 LOAD_FAST 0 (x) 3 LOAD_ATTR 1 (amethod) 6 LOAD_GLOBAL 2 (zip) 9 LOAD_GLOBAL 3 (zop) 12 CALL_FUNCTION 2 15 RETURN_VALUE >>> dis.dis(bar) 3 0 LOAD_GLOBAL 0 (getattr) 3 LOAD_FAST 0 (self) 6 LOAD_DEREF 2 (methodname) 9 CALL_FUNCTION 2 12 LOAD_DEREF 0 (a) 15 LOAD_DEREF 1 (k) 18 CALL_FUNCTION_VAR_KW 0 21 POP_TOP 22 LOAD_CONST 0 (None) 25 RETURN_VALUE >I'll be glad to write a PEP, but I first want to check whether the >Python-Dev crowd would just blast it out of the waters, in which case >I may save writing it... -0 ;-) Regards, Bengt Richter From tismer at stackless.com Fri Feb 3 19:10:42 2006 From: tismer at stackless.com (Christian Tismer) Date: Fri, 03 Feb 2006 19:10:42 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <43e36411.984076957@news.gmane.org> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> Message-ID: <43E39CA2.2030104@stackless.com> Bengt Richter wrote: ... > BTW, re def-time bindings, the default arg abuse is a hack, so I would like to > see a syntax that would permit default-arg-like def-time function-local bindings without > affecting the call signature. E.g., if def foo(*args, **keywords, ***bindings): ... > would use bindings as a dict at def-time to create local namespace bindings like **keywords, > but not affecting the call signature. This would allow a nicer version of above-mentioned > lambda x, zip=zip, zop=zop:x.method(zip,zop) > as > lambda x, ***dict(zip=zip, zop=zop):x.method(zip,zop) > or > lambda x, ***{'zip':zip, 'zop':zop}:x.method(zip,zop) > This could also be used to do currying without the typical cost of wrapped nested calling. Just in case that you might be not aware of it (like I was): lambda does support local scope, like here: >>> def locallambda(x, y): ... func = lambda: x+y ... 
return func ... >>> f=locallambda(2, 3) >>> f() 5 >>> ciao - chris -- Christian Tismer :^) tismerysoft GmbH : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9A : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 802 86 56 mobile +49 173 24 18 776 fax +49 30 80 90 57 05 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From martin at v.loewis.de Fri Feb 3 19:56:20 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Fri, 03 Feb 2006 19:56:20 +0100 Subject: [Python-Dev] Octal literals In-Reply-To: <43e2ffef.958442787@news.gmane.org> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> <7D3E3773-CDB4-4A81-ADFB-CBCC339F3DEE@fuhm.net> <43e2ffef.958442787@news.gmane.org> Message-ID: <43E3A754.2090207@v.loewis.de> Bengt Richter wrote: > If you are looking at them in C code receiving them as args in a call, > "treat them the same" would have to mean provide code to coerce long->int > or reject it with an exception, IWT.
>>Same goes for code that says it takes a 32-bit bitfield argument but >>won't accept 0x80000000. > > If the bitfield is signed, it can't, unless you are glossing over > an assumed coercion rule. Just have a look at the 'k' specifier in PyArg_ParseTuple. Regards, Martin From martin at v.loewis.de Fri Feb 3 20:02:16 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 03 Feb 2006 20:02:16 +0100 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <1138968292.7232.48.camel@warna.dub.corp.google.com> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> Message-ID: <43E3A8B8.2040500@v.loewis.de> Donovan Baarda wrote: > Before set() the standard way to do them was to use dicts with None > Values... to me the "{1,2,3}" syntax would have been a logical extension > of the "a set is a dict with no values, only keys" mindset. I don't know > why it wasn't done this way in the first place, though I missed the > arguments where it was rejected. There might be many reasons; one obvious reason is that you can't spell the empty set that way. > Frozensets are to sets what tuples are to lists. It would be nice if > there was another type of bracket that could be used for frozenset... > something like ':1,2,3:'... yuk... I dunno. Readability counts. Regards, Martin From fumanchu at amor.org Fri Feb 3 19:52:01 2006 From: fumanchu at amor.org (Robert Brewer) Date: Fri, 3 Feb 2006 10:52:01 -0800 Subject: [Python-Dev] any support for a methodcaller HOF? Message-ID: <6949EC6CD39F97498A57E0FA55295B21015D82B8@ex9.hostedexchange.local> Giovanni Bajo wrote: > Alex Martelli wrote: > > I understand your worry re the syntax issue. So what about Michael > > Hudson's "placeholder class" idea, where X[1] returns the callable > > that will do x[1] when called, etc? Looks elegant to me... > > Depends on how the final API looks like. 
"deffered(x)[1]" > isn't that bad, but "def x: x[1]" still looks clearer as > the 'def' keyword immediatly makes clear you're DEFining > a DEFerred function :) Of course we can paint our > bikeshed of whatever color we like, but I'm happy enough if > we agree with the general idea of keeping the same syntax > in both deferred and immediate execution. I don't agree with that "general idea" at all. Sorry. ;) I think the semantic emphasis should not be on "execution", but rather on "expression". The word "execution" to me implies "statements", and although some functions somewhere are called behind the scenes to evaluate any expression, the lambda (and its potential successors) differ from "def" by not allowing statements. They may be used to "defer execution" but to me, their value lies in being static expressions--object instances which are portable and introspectable. This is where LINQ [1] is taking off: expressions are declared with "var" (in C#). I used Expression() in Dejavu [2] for the same reasons (before LINQ came along ;), and am using it to build SQL from Python lambdas. I had to use lambda because that's Python's only builtin support for expressions-as-objects at the moment, but I'd like to see Python grow a syntax like: e = expr(x: x + 1) ...where expr() does early binding like dejavu.logic does. [Looking back over my logic module, I'm noticing it requires boolean return values, but it would not be difficult to extend to return abitrary values--even easier if it were rewritten as builtin functionality. Guess I need to write myself another ticket. ;)] Robert Brewer System Architect Amor Ministries fumanchu at amor.org [1] http://msdn.microsoft.com/netframework/future/linq/ [2] http://projects.amor.org/dejavu/browser/trunk/logic.py From rasky at develer.com Fri Feb 3 20:05:42 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 3 Feb 2006 20:05:42 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? 
References: <6949EC6CD39F97498A57E0FA55295B21015D82B8@ex9.hostedexchange.local> Message-ID: <009e01c628f4$d4705260$bf03030a@trilan> Robert Brewer wrote: > The word "execution" to me implies "statements", and > although some functions somewhere are called behind the scenes to > evaluate any expression, the lambda (and its potential successors) > differ from "def" by not allowing statements. They may be used to "defer > execution" but to me, their value lies in being static > expressions--object instances which are portable and introspectable. > This is where LINQ [1] is taking off: expressions are declared with > "var" (in C#). I used Expression() in Dejavu [2] for the same reasons > (before LINQ came along ;), and am using it to build SQL from Python > lambdas. I had to use lambda because that's Python's only builtin > support for expressions-as-objects at the moment, but I'd like to see > Python grow a syntax like: > > e = expr(x: x + 1) I see what you mean, but in a way you're still agreeing with me :) Your expression-as-objects proposal is very clever, but to me (and as far as this thread is concerned) it still allows to write a "decorated" piece of code (expression), pass it around, and execute (evaluate) it later. This is what I (and others) mainly use lambda for, and your expr() thing would still serve me well. Instead, itemgetter() and friends are going to a different direction (the expression which is later evaluated is not clearly expressed in familiar Python terms), and that's what I find inconvenient. 
-- Giovanni Bajo From jcarlson at uci.edu Fri Feb 3 20:56:55 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 03 Feb 2006 11:56:55 -0800 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <1138986264.7232.105.camel@warna.dub.corp.google.com> References: <20060203085105.10DE.JCARLSON@uci.edu> <1138986264.7232.105.camel@warna.dub.corp.google.com> Message-ID: <20060203113244.10E4.JCARLSON@uci.edu> Donovan Baarda wrote: > > On Fri, 2006-02-03 at 09:00 -0800, Josiah Carlson wrote: > [...] > > Sets are tacked on. That's why you need to use 'import sets' to get to > > them, in a similar fashion that you need to use 'import array' to get > > access to C-like arrays. > > No you don't; > > $ python > Python 2.4.1 (#2, Mar 30 2005, 21:51:10) > [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> v=set((1,2,3)) > >>> f=frozenset(v) > >>> > > set and frozenset are now builtin. Indeed they are. My apologies for being incorrect, I'm still using 2.3 for all of my commercial work. > > I personally object to making syntax for sets for the same reasons I > > object to making arrays, heapqs, Queues, deques, or any of the other > > data structure-defining modules in the standard library into syntax. > > Nuff was a fairy... though I guess it depends on where you draw the > line; should [1,2,3] be list(1,2,3)? Who is "Nuff"? Along the lines of "not every x line function should be a builtin", "not every builtin should have syntax". I think that sets have particular uses, but I don't believe those uses are sufficiently varied enough to warrant the creation of a syntax. I suggest that people take a walk through their code. How often do you use other sequence and/or mapping types? How many lists, tuples and dicts are there? How many sets? Ok, now how many set literals? 
Syntax for sets is only really useful for the equivalent of a set literal, and with minimal syntax for a set literal being some sort of start and ending character pair, the only thing gained is a 3 key reduction in the amount of typing necessary, and a possible compiler optimization to call the set creation code instead of the local, global, then builtin namespaces. Essentially, I'm saying that "set(...)" isn't significantly worse than "{...}" (or some other pair) for set creation. One can say the same thing about list(), tuple(), and dict(), but I think that their millions of uses far overwhelms the minimal uses (and usage) of set(), and puts them in a completely different class. - Josiah From bokr at oz.net Fri Feb 3 21:58:06 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 03 Feb 2006 20:58:06 GMT Subject: [Python-Dev] any support for a methodcaller HOF? References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E39CA2.2030104@stackless.com> Message-ID: <43e3b756.1005393519@news.gmane.org> On Fri, 03 Feb 2006 19:10:42 +0100, Christian Tismer wrote: >Bengt Richter wrote: > >... > >> BTW, re def-time bindings, the default arg abuse is a hack, so I would like to >> see a syntax that would permit default-arg-like def-time function-local bindings without >> affecting the call signature. E.g., if def foo(*args, **keywords, ***bindings): ... >> would use bindings as a dict at def-time to create local namespace bindings like **keywords, >> but not affecting the call signature. This would allow a nicer version of above-mentioned >> lambda x, zip=zip, zop=zop:x.method(zip,zop) >> as >> lambda x, ***dict(zip=zip, zop=zop):x.method(zip,zop) >> or >> lambda x, ***{'zip':zip, 'zop':zop}:x.method(zip,zop) >> This could also be used to do currying without the typical cost of wrapped nested calling. 
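The "default arg abuse" Bengt refers to binds values at def time; the classic loop-of-closures case shows why people reach for it (a sketch of the existing idiom, not of the proposed *** syntax):

```python
# Late binding: every closure sees the final value of i.
late = [lambda: i * 2 for i in range(3)]
print([f() for f in late])    # [4, 4, 4]

# The default-argument hack binds i at def time instead.
early = [lambda i=i: i * 2 for i in range(3)]
print([f() for f in early])   # [0, 2, 4]
```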
> >Just in case that you might be not aware of it (like I was): >lambda does support local scope, like here: > > >>> def locallambda(x, y): >... func = lambda: x+y >... return func >... > >>> f=locallambda(2, 3) > >>> f() >5 Yes, thanks, I really did know that ;-/ Just got thinking along another line. So lambda x, zip=zip, zop=zop:x.method(zip,zop) and lambda x, ***{'zip':zip, 'zop':zop}:x.method(zip,zop) would better have been (lambda zip,zop:lambda x:x.method(zip,zop))(zip, zop) Regards, Bengt Richter From bokr at oz.net Sat Feb 4 00:08:39 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 03 Feb 2006 23:08:39 GMT Subject: [Python-Dev] Octal literals References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> <7D3E3773-CDB4-4A81-ADFB-CBCC339F3DEE@fuhm.net> <43e2ffef.958442787@news.gmane.org> <43E3A754.2090207@v.loewis.de> Message-ID: <43e3d27c.1012343743@news.gmane.org> On Fri, 03 Feb 2006 19:56:20 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >Bengt Richter wrote: >> If you are looking at them in C code receiving them as args in a call, >> "treat them the same" would have to mean provide code to coerce long->int >> or reject it with an exception, IWT. > >The typical way of processing incoming ints in C is through >PyArg_ParseTuple, which already has the code to coerce long->int >(which in turn may raise an exception for a range violation). > >So for typical C code, 0x80000004 is a perfect bit mask in Python 2.4. Ok, I'll take your word that 'k' coercion takes no significant time for longs vs ints. I thought there might be a case in a hot loop where it could make a difference. I confess not having done a C extension since I wrote one to access RDTSC quite some time ago. 
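In today's terms, the masks under discussion look like this (in Python 2.4 a literal such as 0x80000004 quietly becomes a long because its high bit is set; the mask arithmetic is the same either way):

```python
MASK = 0x80000004                    # high bit set: a long in 2.4, an int today
print(MASK == (0x80000000 | 0x4))    # True: the same mask built from pieces
print((MASK & 0x80000000) != 0)      # True: the "sign" bit is part of the mask
print((MASK & 0xFFFFFFFF) == MASK)   # True: the value fits in 32 bits
```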
> >> It's not a matter of "buggy" if you are trying to optimize. >> (I am aware of premature optimization issues, and IMO "strange" >> is in the eye of the beholder. What syntax would you suggest? > >The question is: what is the problem you are trying to solve? >If it is "bit masks", then consider the problem solved already. Well, I was visualizing having a homogeneous bunch of bit mask definitions all as int type if they could fit. I can't express them all in hex as literals without some processing. That got me started ;-) Not that some one-time processing at module import time is a big deal. Just that it struck me as a wart not to be able to do it without processing, even if constant folding is on the way. > >>>Same goes for code that says it takes a 32-bit bitfield argument but >>>won't accept 0x80000000. >> >> If the bitfield is signed, it can't, unless you are glossing over >> an assumed coercion rule. > >Just have a look at the 'k' specifier in PyArg_ParseTuple. Ok, well that's the provision for the coercion then. BTW, is long mandatory for all implementations? Is there a doc that defines minimum features for a conforming Python implementation? E.g., IIRC Scheme has a list naming what's optional and not. Regards, Bengt Richter From brett at python.org Sat Feb 4 00:29:17 2006 From: brett at python.org (Brett Cannon) Date: Fri, 3 Feb 2006 15:29:17 -0800 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <6949EC6CD39F97498A57E0FA55295B21015D82B8@ex9.hostedexchange.local> References: <6949EC6CD39F97498A57E0FA55295B21015D82B8@ex9.hostedexchange.local> Message-ID: On 2/3/06, Robert Brewer wrote: > Giovanni Bajo wrote: > > Alex Martelli wrote: > > > I understand your worry re the syntax issue. So what about Michael > > > Hudson's "placeholder class" idea, where X[1] returns the callable > > > that will do x[1] when called, etc? Looks elegant to me... > > > > Depends on how the final API looks like. 
"deffered(x)[1]" > > isn't that bad, but "def x: x[1]" still looks clearer as > > the 'def' keyword immediatly makes clear you're DEFining > > a DEFerred function :) Of course we can paint our > > bikeshed of whatever color we like, but I'm happy enough if > > we agree with the general idea of keeping the same syntax > > in both deferred and immediate execution. > > I don't agree with that "general idea" at all. Sorry. ;) I think the > semantic emphasis should not be on "execution", but rather on > "expression". The word "execution" to me implies "statements", and > although some functions somewhere are called behind the scenes to > evaluate any expression, the lambda (and its potential successors) > differ from "def" by not allowing statements. They may be used to "defer > execution" but to me, their value lies in being static > expressions--object instances which are portable and introspectable. > > This is where LINQ [1] is taking off: expressions are declared with > "var" (in C#). I used Expression() in Dejavu [2] for the same reasons > (before LINQ came along ;), and am using it to build SQL from Python > lambdas. I had to use lambda because that's Python's only builtin > support for expressions-as-objects at the moment, but I'd like to see > Python grow a syntax like: > > e = expr(x: x + 1) > > ...where expr() does early binding like dejavu.logic does. [Looking back > over my logic module, I'm noticing it requires boolean return values, > but it would not be difficult to extend to return abitrary values--even > easier if it were rewritten as builtin functionality. Guess I need to > write myself another ticket. ;)] > Well, maybe what we really want is lambda but under a different name. ``expr x: x + 1`` seems fine to me and it doesn't have the issue of portraying Python has having a crippled lambda expression. 
I do think that a general solution can be found that can allow us to do away with itemgetter, attrgetter, and Alex's methodcaller (something like Michael's Placeholder class). The problem is when we want deferred arguments to a function call. We have functional.partial, but it can't do something like ``lambda x: func(1, 2, x, 4, 5)`` unless everything is turned into keyword arguments but that doesn't work when something only takes positional arguments. This does not seem to have a good solution outside of lambda in terms of non-function definition. But then again small functions can be defined for those situations. So I think that functional.partial along with some deferred object implementation should deal with most uses of lambda and then allow us to use custom functions to handle all the other cases of when we would want lambda. -Brett From ncoghlan at gmail.com Sat Feb 4 03:18:11 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 04 Feb 2006 12:18:11 +1000 Subject: [Python-Dev] any support for a methodcaller HOF? 
In-Reply-To: <43e36411.984076957@news.gmane.org> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> Message-ID: <43E40EE3.9070900@gmail.com> Bengt Richter wrote: > On Fri, 03 Feb 2006 20:44:47 +1000, Nick Coghlan wrote: >> funcTakingCallback(x.method(zip, zop) def (x)) >> >> Consider these comparisons: >> > This looks a lot like the "anonymous def" expression in a postfix form ;-) If you think about the way a for-loop statement maps to the looping portion of a listcomp or genexp, or the way an if statement maps to a conditional expression, you might notice that this is *not* a coincidence :) def g(_seq): for x in _seq: yield x*x g = g(seq) => g = (x*x for x in seq) l = [] for x in seq: l.append(x*x) => l = [x*x for x in seq] if cond: val = x else: val = y => val = x if cond else y In all three of the recent cases where a particular usage of a statement has been converted to an expression, the variable portion of the innermost part of the first suite is pulled up and placed to the left of the normal statement keyword. A bracketing syntax is used when the expression creates a new object. All I'm suggesting is that a similarly inspired syntax is worth considering when it comes to deferred expressions: def f(x): return x*x => f = (x*x def (x)) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From eric.nieuwland at xs4all.nl Sat Feb 4 03:48:18 2006 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Sat, 4 Feb 2006 03:48:18 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? 
In-Reply-To: <43E40EE3.9070900@gmail.com> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> Message-ID: <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> On 4 feb 2006, at 3:18, Nick Coghlan wrote: > All I'm suggesting is that a similarly inspired syntax is worth > considering when it comes to deferred expressions: > > def f(x): > return x*x > > => f = (x*x def (x)) It's not the same, as x remains free whereas in g = [x*x for x in seq] x is bound. Yours is f = lambda x: x*x and it will die by Guido's hand... --eric From ncoghlan at gmail.com Sat Feb 4 04:11:21 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 04 Feb 2006 13:11:21 +1000 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> Message-ID: <43E41B59.6080005@gmail.com> Eric Nieuwland wrote: > On 4 feb 2006, at 3:18, Nick Coghlan wrote: >> All I'm suggesting is that a similarly inspired syntax is worth >> considering when it comes to deferred expressions: >> >> def f(x): >> return x*x >> >> => f = (x*x def (x)) > > It's not the same, as x remains free whereas in g = [x*x for x in seq] x > is bound. That's like saying "it's not the same because '(x*x def (x)' creates a function while '(x*x for x in seq)' creates a generator-iterator". Well, naturally - if the expression didn't do something different, what would be the point in having it? The parallel I'm trying to draw is at the syntactic level, not the semantic. I'm quite aware that the semantics will be very different ;) > Yours is > > f = lambda x: x*x > > and it will die by Guido's hand... In the short term, probably. 
I'm hoping that the progressive accumulation of workarounds like itemgetter, attrgetter and partial (and Alex's suggestion of 'methodcaller') and the increasing use of function arguments for things like sorting and the itertools module will eventually convince Guido that deferring expressions is a feature that needs to be *fixed* rather than discarded entirely. But until the BDFL is willing to at least entertain the notion of fixing deferred expressions rather than getting rid of them, there isn't much point in writing a PEP or a patch to tweak the parser (with the AST in place, this is purely a change to the parser front-end - the AST and code generation back end don't need to be touched). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From gvwilson at cs.utoronto.ca Fri Feb 3 01:26:14 2006 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Thu, 2 Feb 2006 19:26:14 -0500 (EST) Subject: [Python-Dev] syntactic support for sets In-Reply-To: References: <000d01c62768$c11c0980$b83efea9@RaymondLaptop1> Message-ID: > > > Raymond: > > > Accordingly, Guido rejected the braced notation for set comprehensions. > > > See: http://www.python.org/peps/pep-0218.html > > Greg: > > "...however, the issue could be revisited for Python 3000 (see PEP 3000)." > > So I'm only 1994 years early ;-) > Alex: > Don't be such a pessimist, it's ONLY 994 years to go! Greg: I was allowing for likely schedule slippage... ;-) G From ncoghlan at iinet.net.au Sat Feb 4 07:01:43 2006 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat, 04 Feb 2006 16:01:43 +1000 Subject: [Python-Dev] Path PEP and the division operator Message-ID: <43E44347.3060007@iinet.net.au> I was tinkering with something today, and wondered whether it would cause fewer objections if the PEP used the floor division operator (//) to combine path fragments, instead of the true division operator? 
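For illustration, the floor-division idea amounts to something like the following toy sketch (not the PEP's actual API; posixpath fixes the separator so the example is platform-independent):

```python
import posixpath

class Path(str):
    # Toy sketch: p // 'fragment' joins path fragments.
    def __floordiv__(self, other):
        return Path(posixpath.join(self, other))

p = Path("usr") // "local" // "bin"
print(p)  # usr/local/bin
```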
The parallel to directory separators is still there, but the syntax isn't tied quite so strongly to the Unix path separator. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From eric.nieuwland at xs4all.nl Sat Feb 4 09:05:37 2006 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Sat, 4 Feb 2006 09:05:37 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <43E41B59.6080005@gmail.com> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> Message-ID: <7a725d1194b99280428568c857a5d577@xs4all.nl> Nick Coghlan wrote: > That's like saying "it's not the same because '(x*x def (x)' creates a > function while '(x*x for x in seq)' creates a generator-iterator". > Well, > naturally - if the expression didn't do something different, what > would be the > point in having it? ;-) Naturally. I just wanted to point out it's a beast of another kind, so like syntax may not be a good idea. > The parallel I'm trying to draw is at the syntactic level, not the > semantic. > I'm quite aware that the semantics will be very different ;) > >> Yours is >> >> f = lambda x: x*x >> >> and it will die by Guido hand... > > In the short term, probably. I'm hoping that the progressive > accumulation of > workarounds like itemgetter, attrgetter and partial (and Alex's > suggestion of > 'methodcaller') and the increasing use of function arguments for > things like > sorting and the itertools module will eventually convince Guido that > deferring > expressions is a feature that needs to be *fixed* rather than > discarded entirely. Then how about nameless function/method definition: def (x): ... usual body ... produces an unnamed method object and def spam(x): .... 
is just spam = def (x): ... while our beloved eggs(lambda x: x*x) would become eggs(def(x): return x*x) --eric From martin at v.loewis.de Sat Feb 4 11:11:08 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 04 Feb 2006 11:11:08 +0100 Subject: [Python-Dev] Octal literals In-Reply-To: <43e3d27c.1012343743@news.gmane.org> References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> <7D3E3773-CDB4-4A81-ADFB-CBCC339F3DEE@fuhm.net> <43e2ffef.958442787@news.gmane.org> <43E3A754.2090207@v.loewis.de> <43e3d27c.1012343743@news.gmane.org> Message-ID: <43E47DBC.9000703@v.loewis.de> Bengt Richter wrote: >>The typical way of processing incoming ints in C is through >>PyArg_ParseTuple, which already has the code to coerce long->int >>(which in turn may raise an exception for a range violation). >> >>So for typical C code, 0x80000004 is a perfect bit mask in Python 2.4. > > Ok, I'll take your word that 'k' coercion takes no significant time for longs vs ints. I didn't say that 'k' takes no significant time for longs vs ints. In fact, I did not make any performance claims. I don't know what the relative performance is. > Well, I was visualizing having a homogeneous bunch of bit mask > definitions all as int type if they could fit. I can't express > them all in hex as literals without some processing. That got me started ;-) I still can't see *why* you want to do that. Just write them as hex literals the way you expect it to work, and it typically will work just fine. Some of these literals are longs, some are ints, but there is no need to worry about this. It will all work just fine. > BTW, is long mandatory for all implementations? 
Is there a doc that > defines minimum features for a conforming Python implementation? The Python language reference is typically considered as a specification of what Python is. There is no "minimal Python" specification: you have to do all of it. Regards, Martin From ncoghlan at gmail.com Sat Feb 4 13:41:48 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 04 Feb 2006 22:41:48 +1000 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <7a725d1194b99280428568c857a5d577@xs4all.nl> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> <7a725d1194b99280428568c857a5d577@xs4all.nl> Message-ID: <43E4A10C.7020703@gmail.com> Eric Nieuwland wrote: > Then how about nameless function/method definition: > def (x): > ... usual body ... Hell no. If I want to write a real function, I already have perfectly good syntax for that in the form of a def statement. I want to *increase* the conceptual (and pedagogical) difference between deferred expressions and real functions, not reduce it. There's a reason I try to use the term 'deferred expression' for lambda rather than 'anonymous function'. Even if lambdas are *implemented* as normal function objects, they're a conceptually different beast as far as I'm concerned - a function is typically about factoring out a piece of common code to be used in multiple places, while a lambda is about defining *here* and *now* an operation that is to be carried out *elsewhere* and possibly *later* (e.g., sorting and predicate arguments are defined at the call site but executed in the function body, callbacks are defined when registered but executed when the relevant event occurs). > produces an unnamed method object > and > def spam(x): > .... > is just > spam = def (x): > ... 
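The register-now-run-later pattern Nick describes, in miniature (hypothetical names, purely illustrative):

```python
callbacks = []

def register(event, callback):
    callbacks.append((event, callback))

log = []
# Defined *here* (at registration time), executed *later* (when the event fires).
register("save", lambda doc: log.append("saving " + doc))

for event, cb in callbacks:
    cb("notes.txt")
print(log)  # ['saving notes.txt']
```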
Except that it wouldn't be - the name used in a def statement has special status that a normal variable name does not (e.g. the function knows about its real name, but nothing about the aliases given to it by assignment statements). > while our beloved > eggs(lambda x: x*x) > would become > eggs(def(x): return x*x) I personally believe this fascination with "we want to be able to include a suite inside an expression" has been a major contributor to Guido's irritation with the whole concept of anonymous functions. That may just be me projecting my own feelings though - every time I try to start a discussion about getting a clean deferred expression syntax, at least one part of the thread will veer off onto the topic of embedded suites. IMO, if what you want to do is complex enough that you can't write it using a single expression, then giving it a name and a docstring would probably make the code more comprehensible anyway. Generator expressions allow a generator to be embedded only if it is simple enough to be written using a single expression in the body of the loop. Lambda does the same thing for functions, but for some reason people seem to love the flexibility provided by genexps, while many think the exact same restriction in lambda is a problem that needs "fixing". Maybe once PEP 308 has been implemented, some of that griping will go away, as it will then be possible to cleanly embed conditional logic inside an expression (and hence inside a lambda). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Sat Feb 4 13:55:23 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 04 Feb 2006 13:55:23 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? 
In-Reply-To: <43E4A10C.7020703@gmail.com> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> <7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> Message-ID: <43E4A43B.20903@v.loewis.de> Nick Coghlan wrote: > Hell no. If I want to write a real function, I already have perfectly good > syntax for that in the form of a def statement. I want to *increase* the > conceptual (and pedagogical) difference between deferred expressions and real > functions, not reduce it. There's a reason I try to use the term 'deferred > expression' for lambda rather than 'anonymous function'. Even if lambdas are > *implemented* as normal function objects, they're a conceptually different > beast as far as I'm concerned - a function is typically about factoring out a > piece of common code to be used in multiple places, while a lambda is about > defining *here* and *now* an operation that is to be carried out *elsewhere* > and possibly *later* (e.g., sorting and predicate arguments are defined at the > call site but executed in the function body, callbacks are defined when > registered but executed when the relevant event occurs). Hmm. A function also defines *here* and *now* an operation to be carried out *elsewhere* and *later*. > Generator expressions allow a generator to be embedded only if it is simple > enough to be written using a single expression in the body of the loop. Lambda > does the same thing for functions, but for some reason people seem to love the > flexibility provided by genexps, while many think the exact same restriction > in lambda is a problem that needs "fixing". Maybe once PEP 308 has been > implemented, some of that griping will go away, as it will then be possible to > cleanly embed conditional logic inside an expression (and hence inside a lambda). 
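Once PEP 308 conditional expressions are available, simple branching really does fit inside a lambda:

```python
# PEP 308 syntax: <a> if <cond> else <b> -- legal inside a lambda.
sign = lambda x: "negative" if x < 0 else "non-negative"
print(sign(-3))  # negative
print(sign(2))   # non-negative
```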
I believe that usage of a keyword with the name of a Greek letter also contributes to people considering something broken. Regards, Martin From ncoghlan at gmail.com Sat Feb 4 15:01:47 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 05 Feb 2006 00:01:47 +1000 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <43E4A43B.20903@v.loewis.de> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> <7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> <43E4A43B.20903@v.loewis.de> Message-ID: <43E4B3CB.7080005@gmail.com> Martin v. Löwis wrote: > Hmm. A function also defines *here* and *now* an operation to be carried > out *elsewhere* and *later*. Agreed, but when I use a lambda, I almost always have a *specific* elsewhere in mind (such as a sorting operation or a callback registration). With named functions, that isn't usually the case - I'll either be returning the function from a factory function or decorator (allowing the caller to do whatever they want with it), or I'll be storing the function in a module or class namespace where any code that needs to use it can retrieve it later. Local utility functions occupy a middle ground - their usage is localised to one function or class definition, but they aren't necessarily defined just for one particular use. Using them more than once is a clear sign that they're worth naming, and the occasional need to name a complex single-use function seems a worthwhile trade-off when compared to trying to permit that complexity to be embedded inside an expression. >> Generator expressions allow a generator to be embedded only if it is simple >> enough to be written using a single expression in the body of the loop. 
Lambda >> does the same thing for functions, but for some reason people seem to love the >> flexibility provided by genexps, while many think the exact same restriction >> in lambda is a problem that needs "fixing". Maybe once PEP 308 has been >> implemented, some of that griping will go away, as it will then be possible to >> cleanly embed conditional logic inside an expression (and hence inside a lambda). > > I believe that usage of a keyword with the name of a Greek letter also > contributes to people considering something broken. Aye, I agree there are serious problems with the current syntax. All I'm trying to say above is that I don't believe the functionality itself is broken. At last count, Guido's stated preference was to ditch the functionality entirely for Py3k, so unless he says something to indicate he's changed his mind, we'll simply need to continue with proposing functions like methodcaller() as workarounds for its absence... Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From guido at python.org Sat Feb 4 17:05:59 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 4 Feb 2006 08:05:59 -0800 Subject: [Python-Dev] Path PEP and the division operator In-Reply-To: <43E44347.3060007@iinet.net.au> References: <43E44347.3060007@iinet.net.au> Message-ID: I won't even look at the PEP as long as it uses / or // (or any other operator) for concatenation. On 2/3/06, Nick Coghlan wrote: > I was tinkering with something today, and wondered whether it would cause > fewer objections if the PEP used the floor division operator (//) to combine > path fragments, instead of the true division operator? > > The parallel to directory separators is still there, but the syntax isn't tied > quite so strongly to the Unix path separator. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > http://www.boredomandlaziness.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bjourne at gmail.com Sat Feb 4 17:16:46 2006 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Sat, 4 Feb 2006 17:16:46 +0100 Subject: [Python-Dev] Path PEP and the division operator In-Reply-To: References: <43E44347.3060007@iinet.net.au> Message-ID: <740c3aec0602040816w34981344n271b237d6b6c9fd5@mail.gmail.com> On 2/4/06, Guido van Rossum wrote: > I won't even look at the PEP as long as it uses / or // (or any other > operator) for concatenation. That's good, because it doesn't. :) http://www.python.org/peps/pep-0355.html -- mvh Björn From ncoghlan at gmail.com Sat Feb 4 17:28:43 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 05 Feb 2006 02:28:43 +1000 Subject: [Python-Dev] Path PEP and the division operator In-Reply-To: <740c3aec0602040816w34981344n271b237d6b6c9fd5@mail.gmail.com> References: <43E44347.3060007@iinet.net.au> <740c3aec0602040816w34981344n271b237d6b6c9fd5@mail.gmail.com> Message-ID: <43E4D63B.4000900@gmail.com> BJörn Lindqvist wrote: > On 2/4/06, Guido van Rossum wrote: >> I won't even look at the PEP as long as it uses / or // (or any other >> operator) for concatenation. > > That's good, because it doesn't. :) http://www.python.org/peps/pep-0355.html My mistake - that's been significantly updated since I last read it. I should have known better, though, as I think I was one of the people advocating use of the constructor instead of an operator. . . Cheers, Nick. 
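The constructor-based combination mentioned here can be sketched as follows (illustrative only -- not PEP 355's exact API):

```python
import posixpath

class Path(str):
    def __new__(cls, *parts):
        # Combine all fragments at construction time instead of via an operator.
        return str.__new__(cls, posixpath.join(*parts))

print(Path("usr", "local", "bin"))  # usr/local/bin
```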
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From duncan.booth at suttoncourtenay.org.uk Sat Feb 4 18:26:12 2006 From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth) Date: Sat, 4 Feb 2006 11:26:12 -0600 Subject: [Python-Dev] Path PEP and the division operator References: <43E44347.3060007@iinet.net.au> <740c3aec0602040816w34981344n271b237d6b6c9fd5@mail.gmail.com> Message-ID: BJörn Lindqvist wrote in news:740c3aec0602040816w34981344n271b237d6b6c9fd5 at mail.gmail.com: > On 2/4/06, Guido van Rossum wrote: >> I won't even look at the PEP as long as it uses / or // (or any other >> operator) for concatenation. > > That's good, because it doesn't. :) > http://www.python.org/peps/pep-0355.html > No, but it does say that / may be reintroduced 'if the BDFL so desires'. I hope that doesn't mean the BDFL may be overruled. :^) I'm not convinced by the rationale given why atime,ctime,mtime and size are methods rather than properties but I do find this PEP much more agreeable than the last time I looked at it. From eric.nieuwland at xs4all.nl Sat Feb 4 20:11:31 2006 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Sat, 4 Feb 2006 20:11:31 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <43E4A43B.20903@v.loewis.de> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> <7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> <43E4A43B.20903@v.loewis.de> Message-ID: Martin v. Löwis wrote: > I believe that usage of a keyword with the name of a Greek letter also > contributes to people considering something broken. QOTW! 
;-) --eric From eric.nieuwland at xs4all.nl Sat Feb 4 20:17:15 2006 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Sat, 4 Feb 2006 20:17:15 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: <43E4B3CB.7080005@gmail.com> References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> <7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> <43E4A43B.20903@v.loewis.de> <43E4B3CB.7080005@gmail.com> Message-ID: <8307255af7b268cf2018a0cfb1fe1d17@xs4all.nl> Nick Coghlan wrote: >> I believe that usage of a keyword with the name of a Greek letter also >> contributes to people considering something broken. > > Aye, I agree there are serious problems with the current syntax. All > I'm > trying to say above is that I don't believe the functionality itself > is broken. Lambda is not broken, it's restricted to a single calculation and therefore of limited use. Although I wasn't too serious (should have added more signs of that), an anonymous 'def' would allow one to use the full power of method definition. > At last count, Guido's stated preference was to ditch the functionality > entirely for Py3k, so unless he says something to indicate he's > changed his > mind, we'll simply need to continue with proposing functions like > methodcaller() as workarounds for its absence... Yep, we'll just have to learn to live without it. :-( / ;-) --eric From rasky at develer.com Sat Feb 4 20:35:43 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sat, 4 Feb 2006 20:35:43 +0100 Subject: [Python-Dev] Path PEP: some comments Message-ID: <030001c629c2$30345c90$bf03030a@trilan> Hello, my comments on the Path PEP: - Many methods contain the word 'path' in them. I suppose this is to help transition from the old library to the new library. 
But in the context of a new Python user, I don't think that Path.abspath() is optimal. Path.abs() looks better. Maybe it's not so fundamental to have exactly the same names of the old library, especially when thinking of future? If I rearrange my code to use Path, I can as well rename methods to something more sound at the same time. - Why having a basename() and a .namebase property? Again for backward compatibility? I guess we can live with the property only. - The operations that return list of files have confusing names. Something more orthogonal could be: list, listdirs, listfiles / walk, walkdirs, walkfiles. Where, I guess, the first triplet does not recurse into subdirs while the second does. glob() could be dropped (as someone else proposed). - ctime() is documented to be unportable: it has different semantics on UNIX and Windows. I believe the class should abstract from these details. One solution is to rip it off and forget about it. Another is to provide two different functions which have a fixed semantic (and possibly available only on a subset of the operating systems / file systems). - remove() and unlink() are duplicates, I'd drop one (unlink() has a more arcane name). - mkdir+makedirs and rmdir+removedirs are confusing and could use some example. I believe it's enough to have a single makedir() (which is recursive by default) and a single remove() (again recursive by default, and could work with both files and directories). rmtree() should go for the same reason (duplicated). - Whatever function we come out with for removing trees, it should have a force=True flag to mimic "rm -rf". That is, it should try to remove read-only files as well. I saw so many times people writing their own rmtree_I_mean_it() wrapper which uses the onerror callback to change the permissions. That's so unpythonic for such a common task. - copy/copy2/copyfile mean the same to me. copy2() is really a bad name though, I'd use copy(stats=True).
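[The force=True behaviour described in the tree-removal point above is usually built on shutil.rmtree's onerror hook; a minimal sketch of what such a flag might do — remove_tree is an illustrative name, not part of the PEP:]

```python
import os
import shutil
import stat

def remove_tree(path, force=False):
    """Remove a directory tree.  With force=True, make read-only
    entries writable and retry, mimicking 'rm -rf'."""
    def _retry_writable(func, target, exc_info):
        os.chmod(target, stat.S_IWRITE)  # drop the read-only bit
        func(target)                     # retry the failed operation
    if force:
        shutil.rmtree(path, onerror=_retry_writable)
    else:
        shutil.rmtree(path)
```

This is essentially the rmtree_I_mean_it() wrapper the post complains about having to rewrite everywhere.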
- My own feeling on the controversial split() vs splitpath() is that split() is always wrong for paths so I don't see nothing fundamentally wrong in overwriting it. I don't expect to find existing code (using strings for path) calling split() on a path. split("/") might be common though, and in fact my proposal is to overwrite the zero-argument split() giving it the meaning of split("/"). - I'm missing read(), write(), readlines() and bytes() from the original Path class. When I have a Path() that points to a file, it's pretty common to read from it. Those functions were handy because they were saving much obvious code: for L in Path("foo.txt").readlines(): print L, ===> f = open(Path("foo.txt"), "rU") try: for L in f: print L finally: f.close() - Since we're at it, we could also move part of "fileinput" into Path. For instance, why not have a replacelines() method: import fileinput for L in fileinput.FileInput("foo.txt", inplace=True, backup=True): print "(modified) " + L, ====> for L in Path("foo.txt").replacelines(backup=True): print "(modified) " + L, Thanks for working on this! -- Giovanni Bajo From pje at telecommunity.com Sat Feb 4 22:08:42 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 04 Feb 2006 16:08:42 -0500 Subject: [Python-Dev] Path PEP: some comments In-Reply-To: <030001c629c2$30345c90$bf03030a@trilan> Message-ID: <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> At 08:35 PM 2/4/2006 +0100, Giovanni Bajo wrote: >- ctime() is documented to be unportable: it has different semantics on UNIX >and Windows. I believe the class should abstract from these details. Note that this is the opposite of normal Python policy: Python does not attempt to create cross-platform abstractions, but instead chooses to expose platform differences. The Path class shouldn't abstract this any more than the original *path modules do. > One >solution is to rip it off and forget about it. 
Another is to provide two >different functions which have a fixed semantic (and possibly available only >a subset of the operating systems / file systems). Keep in mind that to properly replace os.path, each of the various *path modules will need their own Path variant to support foreign path manipulation. For example, one can use posixpath.join() right now on Windows to manipulate Posix paths, and ntpath.join() to do the reverse on Unix. So there is already going to have to be a Path class for each os anyway - and they will all need to be simultaneously usable. Note that this is a big difference from the Path implementation currently in circulation, which is limited to processing the native OS's paths. The PEP also currently doesn't address this point at all; it should probably mention that each of the posixpath, ntpath, macpath, etc. modules will each need to include a Path implementation. Whether this should be made available as os.Path or os.path.Path is the only open question; the latter of course would be automatic by simply adding a Path implementation to each of the *path modules. From Scott.Daniels at Acm.Org Sat Feb 4 22:42:21 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 04 Feb 2006 13:42:21 -0800 Subject: [Python-Dev] Path PEP -- a couple of typos. Message-ID: Here are a couple of simple-minded fixes for the PEP. Near the bottom of "Replacing older functions with the Path class": > fname = Path("Python2.4.tar.gz") > base, ext = fname.namebase, fname.extx Surely this should be: base, ext = fname.namebase, fname.ext > lib_dir = "/lib" > libs = glob.glob(os.path.join(lib_dir, "*s.o")) > ==> > lib_dir = Path("/lib") > libs = lib_dir.files("*.so") Probably that should be: ... libs = glob.glob(os.path.join(lib_dir, "*.so")) ... 
--Scott David Daniels Scott.Daniels at Acm.Org From rasky at develer.com Sun Feb 5 00:18:08 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 5 Feb 2006 00:18:08 +0100 Subject: [Python-Dev] Path PEP: some comments References: <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> Message-ID: <028701c629e1$42c11b90$5cbc2997@bagio> Phillip J. Eby wrote: >> - ctime() is documented to be unportable: it has different semantics >> on UNIX and Windows. I believe the class should abstract from these >> details. > > Note that this is the opposite of normal Python policy: Python does > not attempt to create cross-platform abstractions, but instead > chooses to expose platform differences. The Path class > shouldn't abstract this > any more than the original *path modules do. I don't follow. One thing is to provide an interface which totally abstracts from low-level details. Another is to provide a function which holds different results depending on the operating system. I'm fine to have different functions available for different purposes on different platforms, I'm not fine with having a single function which does different things. Do you have any other example? Giovanni Bajo From ncoghlan at gmail.com Sun Feb 5 02:26:19 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 05 Feb 2006 11:26:19 +1000 Subject: [Python-Dev] Path PEP and the division operator In-Reply-To: References: <43E44347.3060007@iinet.net.au> <740c3aec0602040816w34981344n271b237d6b6c9fd5@mail.gmail.com> Message-ID: <43E5543B.1080907@gmail.com> Duncan Booth wrote: > I'm not convinced by the rationale given why atime,ctime,mtime and size are > methods rather than properties but I do find this PEP much more agreeable > than the last time I looked at it. A better rationale for doing it is that all of them may raise IOException. It's rude for properties to do that, so it's better to make them methods instead. 
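[The method-vs-property guideline Nick states above is easy to illustrate with a toy class — ToyPath is illustrative only, not the PEP's API: values derived from the string alone become properties, while anything that has to touch the filesystem, and may therefore raise IOError/OSError, stays a method.]

```python
import os

class ToyPath(str):
    """Illustrative only: string-derived values are properties,
    filesystem queries are methods."""

    @property
    def ext(self):
        # Computed from the string alone, so it can never raise IOError.
        return os.path.splitext(self)[1]

    def size(self):
        # Queries the filesystem, so it may raise OSError/IOError.
        return os.stat(self).st_size
```

ToyPath("setup.py").ext always succeeds, whether or not the file exists; .size() on a missing file raises the usual OSError.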
That was a general guideline that came up the first time adding Path was proposed - if the functionality involved querying or manipulating the actual filesystem (and therefore potentially raising IOError), then it should be a method. If the operation related solely to the string representation, then it could be a property. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From tjreedy at udel.edu Sun Feb 5 05:38:29 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 4 Feb 2006 23:38:29 -0500 Subject: [Python-Dev] any support for a methodcaller HOF? References: <2m64nwg4wx.fsf@starship.python.net><43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org><43E40EE3.9070900@gmail.com><90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl><43E41B59.6080005@gmail.com><7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> Message-ID: "Nick Coghlan" wrote in message news:43E4A10C.7020703 at gmail.com... > Hell no. If I want to write a real function, I already have perfectly > good > syntax for that in the form of a def statement. I want to *increase* the > conceptual (and pedagogical) difference between deferred expressions and > real > functions, not reduce it. Mathematically, a function is a function. Expressions and statements are two syntaxes for composing functions to create/define new functions. A few languages use just one or the other. Python intentionally uses both. But I think making an even bigger deal of surface syntax is exactly the wrong movement, especially pedagogically. Terry Jan Reedy From tjreedy at udel.edu Sun Feb 5 06:06:28 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 5 Feb 2006 00:06:28 -0500 Subject: [Python-Dev] Path PEP: some comments References: <030001c629c2$30345c90$bf03030a@trilan> <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> Message-ID: "Phillip J. 
Eby" wrote in message news:5.1.1.6.0.20060204160108.0410c758 at mail.telecommunity.com... > Note that this is the opposite of normal Python policy: Python does not > attempt to create cross-platform abstractions, but instead chooses to > expose platform differences. I had the opposite impression about Python -- that it generally masks such differences. Overall, I see it as a cross-platform abstraction. The requirement that ints be at least 32 bits masked the difference between 16-bit int and 32-bit int platforms, in a way that C did/does not. I am pretty sure that Tim Peters has said that he would welcome better uniformity in binary float computations, but that he won't do the work needed. The decimal package attempts to completely mask the underlying platform. Cross-platform guis, whether written in Python or just accessible from Python, also mask differences. The os module has names like sep and pathsep precisely so people can more easily write platform independent code. And so on. From ncoghlan at gmail.com Sun Feb 5 08:09:14 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 05 Feb 2006 17:09:14 +1000 Subject: [Python-Dev] any support for a methodcaller HOF? In-Reply-To: References: <2m64nwg4wx.fsf@starship.python.net><43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org><43E40EE3.9070900@gmail.com><90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl><43E41B59.6080005@gmail.com><7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> Message-ID: <43E5A49A.5050907@gmail.com> Terry Reedy wrote: > "Nick Coghlan" wrote in message > news:43E4A10C.7020703 at gmail.com... >> Hell no. If I want to write a real function, I already have perfectly >> good >> syntax for that in the form of a def statement. I want to *increase* the >> conceptual (and pedagogical) difference between deferred expressions and >> real >> functions, not reduce it. > > Mathematically, a function is a function. 
Expressions and statements are > two syntaxes for composing functions to create/define new functions. A few > languages use just one or the other. Python intentionally uses both. But > I think making an even bigger deal of surface syntax is exactly the wrong > movement, especially pedagogically. I guess I misstated myself slightly - I've previously advocated re-using the 'def' keyword, so there are obviously parallels I want to emphasize. I guess my point is that expressions are appropriate sometimes, functions are appropriate other times, and it *is* possible to give reasonably simple guidelines as to which one is most appropriate when (one consumer->deferred expression, multiple consumers->named function). I see it as similar to the choice of whether to use a generator function or generator expression in a given situation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From duncan.booth at suttoncourtenay.org.uk Sun Feb 5 11:10:08 2006 From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth) Date: Sun, 5 Feb 2006 04:10:08 -0600 Subject: [Python-Dev] Path PEP and the division operator References: <43E5543B.1080907@gmail.com> Message-ID: Nick Coghlan wrote in news:43E5543B.1080907 at gmail.com: > Duncan Booth wrote: >> I'm not convinced by the rationale given why atime,ctime,mtime and >> size are methods rather than properties but I do find this PEP much >> more agreeable than the last time I looked at it. > > A better rationale for doing it is that all of them may raise > IOException. It's rude for properties to do that, so it's better to > make them methods instead. Yes, that rationale sounds good to me. 
> > That was a general guideline that came up the first time adding Path > was proposed - if the functionality involved querying or manipulating > the actual filesystem (and therefore potentially raising IOError), > then it should be a method. If the operation related solely to the > string representation, then it could be a property. Perhaps Bjorn could add that to the PEP? From martin at v.loewis.de Sun Feb 5 13:57:41 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 05 Feb 2006 13:57:41 +0100 Subject: [Python-Dev] Path PEP: some comments In-Reply-To: <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> References: <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> Message-ID: <43E5F645.9050105@v.loewis.de> Phillip J. Eby wrote: >>- ctime() is documented to be unportable: it has different semantics on UNIX >>and Windows. I believe the class should abstract from these details. > > > Note that this is the opposite of normal Python policy: Python does not > attempt to create cross-platform abstractions, but instead chooses to > expose platform differences. The Path class shouldn't abstract this any > more than the original *path modules do. I think this is partially due to a misunderstanding, both by Microsoft, and in Python. There is a long-time myth that ctime denotes "creation time", as this is really in-line with mtime and atime. I think the path module should provide these under a different name: creation_time and status_change_time. Either of these might be absent. ctime should be provided to report whatever ctime used to report in the past (i.e. creation_time on Windows, status_change_time on Unix). 
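[Martin's proposal above can be prototyped on top of os.stat: expose the two meanings under unambiguous names, with each one absent (here, an error) where the platform does not supply it. The function names are hypothetical, not an agreed API, and this sketch conflates "platform" with "filesystem" for brevity:]

```python
import os
import sys

def status_change_time(path):
    """Time of last inode status change (the Unix meaning of st_ctime).
    Hypothetical name; absent on Windows."""
    if sys.platform == 'win32':
        raise NotImplementedError('no status-change time on Windows')
    return os.stat(path).st_ctime

def creation_time(path):
    """File creation time (the Windows meaning of st_ctime).
    Hypothetical name; not portably available on Unix."""
    if sys.platform != 'win32':
        raise NotImplementedError('no portable creation time on Unix')
    return os.stat(path).st_ctime
```

A backward-compatible ctime() would then just dispatch to whichever of the two the platform historically reported.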
Regards, Martin From martin at v.loewis.de Sun Feb 5 14:03:28 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 05 Feb 2006 14:03:28 +0100 Subject: [Python-Dev] Path PEP: some comments In-Reply-To: References: <030001c629c2$30345c90$bf03030a@trilan> <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> Message-ID: <43E5F7A0.7010704@v.loewis.de> Terry Reedy wrote: >>Note that this is the opposite of normal Python policy: Python does not >>attempt to create cross-platform abstractions, but instead chooses to >>expose platform differences. > > > I had the opposite impression about Python -- that it generally masks such > differences. I think it is both ways. For counter-examples, consider GUIs: Python does *not* attempt to provide a cross-platform GUI library (Tk tries that, but that is a different story). It also exposes os.lstat on systems that provide it, but doesn't try to emulate it on systems which don't. Likewise, there is a module linuxaudiodev which is only useful on some systems, and winsound, which is only useful on others. So first of all, Python exposes the platform API as-is, and doesn't try to "correct" things that it thinks the system got "wrong", or forgot to implement. On top of that, you have layers which try to mask differences, e.g. the os module or the subprocess module. Regards, Martin From martin at v.loewis.de Sun Feb 5 14:09:22 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 05 Feb 2006 14:09:22 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? 
In-Reply-To: <43E5A49A.5050907@gmail.com> References: <2m64nwg4wx.fsf@starship.python.net><43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org><43E40EE3.9070900@gmail.com><90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl><43E41B59.6080005@gmail.com><7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> <43E5A49A.5050907@gmail.com> Message-ID: <43E5F902.4070208@v.loewis.de> Nick Coghlan wrote: > I guess my point is that expressions are appropriate sometimes, functions are > appropriate other times, and it *is* possible to give reasonably simple > guidelines as to which one is most appropriate when (one consumer->deferred > expression, multiple consumers->named function). I don't think this guideline is really valuable. If you transfer this to variables, you would get "one reader -> inline expression, multiple readers -> named variable". This is clearly wrong: it is established practice to use local variables even if there is only one access to the variable, if creating the variable improves readability of the code (e.g. if the expression is very complex). For functions, the same should hold: if it improves readability, make it a local function. Regards, Martin From martin at v.loewis.de Sun Feb 5 14:12:10 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 05 Feb 2006 14:12:10 +0100 Subject: [Python-Dev] ctypes patch (was: (libffi) Re: Copyright issue) In-Reply-To: <4f0b69dc0602020944l3bcfe1d2v1bc149ac3f202e91@mail.gmail.com> References: <4f0b69dc0602020944l3bcfe1d2v1bc149ac3f202e91@mail.gmail.com> Message-ID: <43E5F9AA.4080409@v.loewis.de> Hye-Shik Chang wrote: > Thomas and I collaborated on integration into the ctypes repository > and testing on various platforms yesterday. My patches for Python > are derived from ctypes CVS with a change of only one line. Not sure whether you think you need further approval: if you are ready to check this into the Python trunk, just go ahead. 
As I said, I would prefer if what is checked in is a literal copy of the ctypes CVS (as far as reasonable). Regards, Martin From rasky at develer.com Sun Feb 5 15:17:42 2006 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 5 Feb 2006 15:17:42 +0100 (CET) Subject: [Python-Dev] Path PEP: some comments In-Reply-To: <43E5F645.9050105@v.loewis.de> References: <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> <43E5F645.9050105@v.loewis.de> Message-ID: <1197.62.94.48.167.1139149062.squirrel@www.develer.com> On Sun, February 5, 2006 13:57, "Martin v. Löwis" wrote: > I think the path module should provide these under a different name: > creation_time and status_change_time. Either of these might be absent. +1. This is exactly what I proposed, in fact. > ctime should be provided to report whatever ctime used to report in > the past (i.e. creation_time on Windows, status_change_time on Unix). As I stated in my mail, I don't agree that there needs to be such a strict compatibility between methods in the new Path class and functions in the old os.path (or other) modules. Some consistency will ease the transition of course, but there is absolutely no need to provide a 1:1 mapping. Old code will continue to work, and new code might adapt to a new (possibly) better API. Given the confusion with 'ctime', I don't think that providing it in the new Path class would be a good move. It's better to force people to explicitly name what they're asking for (either creation_time or status_change_time). In other words, if there are mistakes in the old API, this is the time to fix them. Why should we carry them over to a new API?
Giovanni Bajo From bokr at oz.net Sun Feb 5 16:57:54 2006 From: bokr at oz.net (Bengt Richter) Date: Sun, 05 Feb 2006 15:57:54 GMT Subject: [Python-Dev] Octal literals References: <1138797216.6791.38.camel@localhost.localdomain> <43e0bd69.810340828@news.gmane.org> <20060201085152.10AC.JCARLSON@uci.edu> <1138818914.12020.17.camel@geddy.wooz.org> <20060201133954.V49643@familjen.svensson.org> <43e21352.897869648@news.gmane.org> <43e27155.921936314@news.gmane.org> <7D3E3773-CDB4-4A81-ADFB-CBCC339F3DEE@fuhm.net> <43e2ffef.958442787@news.gmane.org> <43E3A754.2090207@v.loewis.de> <43e3d27c.1012343743@news.gmane.org> <43E47DBC.9000703@v.loewis.de> Message-ID: <43e60b15.1685113@news.gmane.org> On Sat, 04 Feb 2006 11:11:08 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >Bengt Richter wrote: >>>The typical way of processing incoming ints in C is through >>>PyArg_ParseTuple, which already has the code to coerce long->int >>>(which in turn may raise an exception for a range violation). >>> >>>So for typical C code, 0x80000004 is a perfect bit mask in Python 2.4. >> >> Ok, I'll take your word that 'k' coercion takes no significant time for longs vs ints. > >I didn't say that 'k' takes no significant time for longs vs ints. In >fact, I did not make any performance claims. I don't know what the >relative performance is. Sorry, I apologize for putting words in your mouth. > >> Well, I was visualizing having a homogeneous bunch of bit mask >> definitions all as int type if they could fit. I can't express >> them all in hex as literals without some processing. That got me started ;-) > >I still can't see *why* you want to do that. Just write them as >hex literals the way you expect it to work, and it typically will >work just fine. Some of these literals are longs, some are ints, >but there is no need to worry about this. It will all work just >fine. Perhaps it's mostly aesthetics. 
Imagine that I was a tile-setter and my supplier had an order form where I could order square glazed tiles in various colors with dimensions in multiples of 4cm, and I said that I was very happy with the product, except why does the supplier have to send stretchable plastic tiles whenever I order the 32cm size, when I know they can be made like the others? (Granted that the plastic works just fine for most uses ;-). I have to admit the price for supplies is unbeatable, and that the necessary kit for converting 32cm plastic to ceramic was also supplied, but still, if one can order ceramic at all, why not the full range? Especially since if one orders the 32cm size in another dialect one can get it without having to use the conversion kit, e.g., >>> -2147483648 -2147483648 but >>> -0x80000000 -2147483648L >>> int(-0x80000000) -2147483648 ;-) That minus seems to bind differently in different literal dialects, e.g. to make the point clearer, compare with above: >>> -2147483648 -2147483648 >>> -(2147483648) -2147483648L > >> BTW, is long mandatory for all implementations? Is there a doc that >> defines minimum features for a conforming Python implementation? > >The Python language reference is typically considered as a specification >of what Python is. There is no "minimal Python" specification: you have >to do all of it. Good to know, thanks. Sorry to go OT. 
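[The "conversion kit" Bengt alludes to — the wrap-around that the 'k' format's mask coercion applies, reinterpreted as a signed 32-bit value — can be sketched in pure Python. to_int32 is an illustrative name, not an API from the thread:]

```python
def to_int32(value):
    """Reinterpret an integer's low 32 bits as a signed two's-complement
    value: mask to 32 bits, then wrap values with the high bit set."""
    value &= 0xFFFFFFFF            # keep only the low 32 bits
    if value >= 0x80000000:        # high bit set -> negative
        value -= 0x100000000
    return value
```

Under this wrap, the literal 0x80000000 and the expression -0x80000000 both land on -2147483648, which is the behaviour Bengt wanted from the literal syntax directly.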
If someone wants to add something about supersetting and pypy's facilitation of same, I guess that belongs in another thread ;-) Regards, Bengt Richter From martin at v.loewis.de Sun Feb 5 18:04:06 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 05 Feb 2006 18:04:06 +0100 Subject: [Python-Dev] Path PEP: some comments In-Reply-To: <1197.62.94.48.167.1139149062.squirrel@www.develer.com> References: <5.1.1.6.0.20060204160108.0410c758@mail.telecommunity.com> <43E5F645.9050105@v.loewis.de> <1197.62.94.48.167.1139149062.squirrel@www.develer.com> Message-ID: <43E63006.40709@v.loewis.de> Giovanni Bajo wrote: >>ctime should be provided to report whatever ctime used to report in >>the past (i.e. creation_time on Windows, status_change_time on Unix). > > > In other words, if there are mistakes in the old API, this is the time to > fix them. Why should we carry them over to a new API? I'm not talking about all API in general, I'm talking about ctime specifically. People will ask "where is ctime?", and it better be where they think it should be. I don't see a point in confusing users. (Plus, there might be systems that associate yet a different meaning with ctime - it is just our guess that it means "status change time"). Regards, Martin From aleaxit at gmail.com Sun Feb 5 18:31:48 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Sun, 5 Feb 2006 09:31:48 -0800 Subject: [Python-Dev] math.areclose ...? Message-ID: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> When teaching some programming to total newbies, a common frustration is how to explain why a==b is False when a and b are floats computed by different routes which ``should'' give the same results (if arithmetic had infinite precision). 
Decimals can help, but another approach I've found useful is embodied in Numeric.allclose(a,b) -- which returns True if all items of the arrays are ``close'' (equal to within certain absolute and relative tolerances): >>> (1.0/3.0)==(0.1/0.3) False >>> Numeric.allclose(1.0/3.0, 0.1/0.3) 1 But pulling in the whole of Numeric just to have that one handy function is often overkill. So I was wondering if module math (and perhaps by symmetry module cmath, too) shouldn't grow a function 'areclose' (calling it just 'close' seems likely to engender confusion, since 'close' is more often used as a verb than as an adjective; maybe some other name would work better, e.g. 'almost_equal') taking two float arguments and optional tolerances and using roughly the same specs as Numeric, e.g.: def areclose(x,y,rtol=1.e-5,atol=1.e-8): return abs(x-y) <= atol + rtol*abs(y) What do y'all think...? References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> Message-ID: <20060205091611.10F7.JCARLSON@uci.edu> bokr at oz.net (Bengt Richter) wrote: > Martin v. Lowis wrote: > >Bengt Richter wrote: > >>>The typical way of processing incoming ints in C is through > >>>PyArg_ParseTuple, which already has the code to coerce long->int > >>>(which in turn may raise an exception for a range violation). > >>> > >>>So for typical C code, 0x80000004 is a perfect bit mask in Python 2.4. > >> > >> Ok, I'll take your word that 'k' coercion takes no significant time for longs vs ints. > > > >I didn't say that 'k' takes no significant time for longs vs ints. In > >fact, I did not make any performance claims. I don't know what the > >relative performance is. > > Sorry, I apologize for putting words in your mouth. In regards to the aesthetics and/or inconsistencies of: >>> -0x80000000 -2147483648L >>> -2147483648 -2147483648 >>> -(2147483648) -2147483648L 1. If your Python code distinguishes between ints and longs, it has a bug. 2.
If your C extension to Python isn't using the 'k' format specifier as Martin is telling you to, then your C extension has a bug. 3. If you are concerned about *potential* performance degradation due to a use of 'k' rather than 'i' or 'I', then you've forgotten the fact that Python function calling is orders of magnitude slower than the minimal bit twiddling that PyInt_AsUnsignedLongMask() or PyLong_AsUnsignedLongMask() has to do. Please, just use 'k' and let the list get past this. - Josiah From guido at python.org Sun Feb 5 18:43:28 2006 From: guido at python.org (Guido van Rossum) Date: Sun, 5 Feb 2006 09:43:28 -0800 Subject: [Python-Dev] Let's just *keep* lambda Message-ID: After so many attempts to come up with an alternative for lambda, perhaps we should admit defeat. I've not had the time to follow the most recent rounds, but I propose that we keep lambda, so as to stop wasting everybody's talent and time on an impossible quest. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Sun Feb 5 19:01:24 2006 From: aahz at pythoncraft.com (Aahz) Date: Sun, 5 Feb 2006 10:01:24 -0800 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> Message-ID: <20060205180124.GA20842@panix.com> On Sun, Feb 05, 2006, Alex Martelli wrote: > > But pulling in the whole of Numeric just to have that one handy > function is often overkill. So I was wondering if module math (and > perhaps by symmetry module cmath, too) shouldn't grow a function > 'areclose' (calling it just 'close' seems likely to engender > confusion, since 'close' is more often used as a verb than as an > adjective; maybe some other name would work better, e.g.
> 'almost_equal') taking two float arguments and optional tolerances > and using roughly the same specs as Numeric, e.g.: > > def areclose(x,y,rtol=1.e-5,atol=1.e-8): > return abs(x-y) http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From g.brandl at gmx.net Sun Feb 5 19:02:34 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 05 Feb 2006 19:02:34 +0100 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> Message-ID: Alex Martelli wrote: > When teaching some programming to total newbies, a common frustration > is how to explain why a==b is False when a and b are floats computed > by different routes which ``should'' give the same results (if > arithmetic had infinite precision). Decimals can help, but another > approach I've found useful is embodied in Numeric.allclose(a,b) -- > which returns True if all items of the arrays are ``close'' (equal to > within certain absolute and relative tolerances): > > >>> (1.0/3.0)==(0.1/0.3) > False > >>> Numeric.allclose(1.0/3.0, 0.1/0.3) > 1 > > But pulling in the whole of Numeric just to have that one handy > function is often overkill. So I was wondering if module math (and > perhaps by symmetry module cmath, too) shouldn't grow a function > 'areclose' (calling it just 'close' seems likely to engender > confusion, since 'close' is more often used as a verb than as an > adjective; maybe some other name would work better, e.g. > 'almost_equal') taking two float arguments and optional tolerances > and using roughly the same specs as Numeric, e.g.: > > def areclose(x,y,rtol=1.e-5,atol=1.e-8): > return abs(x-y) > What do y'all think...? atol sounds suspicious to me, but otherwise fine. 
Georg From tjreedy at udel.edu Sun Feb 5 19:11:29 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 5 Feb 2006 13:11:29 -0500 Subject: [Python-Dev] any support for a methodcaller HOF? References: <2m64nwg4wx.fsf@starship.python.net><43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org><43E40EE3.9070900@gmail.com><90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl><43E41B59.6080005@gmail.com><7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> <43E5A49A.5050907@gmail.com> Message-ID: "Nick Coghlan" wrote in message news:43E5A49A.5050907 at gmail.com... > I guess I misstated myself slightly - I've previously advocated re-using > the > 'def' keyword, so there are obviously parallels I want to emphasize. If 3.0 comes with a conversion program, then I would like to see 'lambda' replaced with either 'def' or another abbreviation like 'edef' (expression def) or 'func'. From bokr at oz.net Sun Feb 5 19:45:52 2006 From: bokr at oz.net (Bengt Richter) Date: Sun, 05 Feb 2006 18:45:52 GMT Subject: [Python-Dev] any support for a methodcaller HOF? References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> <7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> <43E4A43B.20903@v.loewis.de> <43E4B3CB.7080005@gmail.com> <8307255af7b268cf2018a0cfb1fe1d17@xs4all.nl> Message-ID: <43e627aa.9002845@news.gmane.org> On Sat, 4 Feb 2006 20:17:15 +0100, Eric Nieuwland wrote: >Nick Coghlan wrote: >>> I believe that usage of a keyword with the name of a Greek letter also >>> contributes to people considering something broken. >> >> Aye, I agree there are serious problems with the current syntax. All >> I'm >> trying to say above is that I don't believe the functionality itself >> is broken. > >Lambda is not broken, it's restricted to single calculation and >therefore of limited use. 
It's not even that restricted, if you want to be perverse, e.g., >>> (lambda w:eval(compile("""if 1: # indented looks nicer ;-) ... if len(w)<=3: adj ='short' ... elif len(w)<=5: adj ='medium length' ... else: adj = 'long' ... print 'Hi, %s! I would say you have a %s name ;-)'%(w,adj) ... """,'','exec')))('Monty') Hi, Monty! I would say you have a medium length name ;-) lazy copy/pasting and changing the arg: >>> (lambda w:eval(compile("""if 1: # indented looks nicer ;-) ... if len(w)<=3: adj ='short' ... elif len(w)<=5: adj ='medium length' ... else: adj = 'long' ... print 'Hi, %s! I would say you have a %s name ;-)'%(w,adj) ... """,'','exec')))('Ada') Hi, Ada! I would say you have a short name ;-) My point is that ISTM preventing easy inclusion of suites in lambda/anonymous_def is more of a morality/taste/catechistic issue than a technical one. It seems like an attempt to control coding style by disincentivizing the disapproved. That may be ok in the big picture, I'm not sure, but IMO transparency of motivations is best. >Although I wasn't too serious (should have added more signs of that), an >anonymous 'def' would allow one to use the full power of method definition. > It's already allowed, just not in a way that generates efficient code (although the above can be improved upon, let's not go there ;-) >> At last count, Guido's stated preference was to ditch the functionality >> entirely for Py3k, so unless he says something to indicate he's >> changed his >> mind, we'll simply need to continue with proposing functions like >> methodcaller() as workarounds for its absence... >Yep, we'll just have to learn to live without it. :-( / ;-) If it's needed, I believe a way will be found to have it ;-) I do think the current lambda serves a valuable purpose, so I hope some way is found to preserve the functionality, whatever problems anyone may have with its current simple syntax. Psst, Nick, how about (x*y for x,y in ()) ?
# "()" as mnemonic for call args ;-) Regards, Bengt Richter From python at rcn.com Sun Feb 5 19:48:51 2006 From: python at rcn.com (Raymond Hettinger) Date: Sun, 5 Feb 2006 13:48:51 -0500 Subject: [Python-Dev] math.areclose ...? References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> <20060205180124.GA20842@panix.com> Message-ID: <001601c62a84$cef91d80$b83efea9@RaymondLaptop1> >>So I was wondering if module math (and >> perhaps by symmetry module cmath, too) shouldn't grow a function >> 'areclose' (calling it just 'close' seems likely to engender >> confusion, since 'close' is more often used as a verb than as an >> adjective; maybe some other name would work better, e.g. >> 'almost_equal') taking two float arguments and optional tolerances >> and using roughly the same specs as Numeric, e.g.: >> >> def areclose(x,y,rtol=1.e-5,atol=1.e-8): >> return abs(x-y) This proposal is one of several that have recently surfaced that aim to help newbies skip learning basic lessons. I think the efforts are noble but misguided. * If someone doesn't get why set(1,2,3) raises an exception, it is a good opportunity to teach a broadly applicable skill: def Set(*args): return set(args) * If someone doesn't get why sum([0.1]*10)!=1.0, then we have a good opportunity to teach the basics of floating point. Otherwise, we're going to get people writing accounting apps using floats instead of ints or Decimals. * If someone doesn't get how to empty a list using a[:]=[], it is a good time to go through the basics of slicing which are a foundation for understanding many parts of the language. A language suitable for beginners should be easy to learn, but it should not leave them permanently crippled. All of the above are sets of training wheels that don't come off. To misquote Einstein: The language should be as simple as possible, but no simpler. 
Raymond From Scott.Daniels at Acm.Org Sun Feb 5 19:53:30 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 05 Feb 2006 10:53:30 -0800 Subject: [Python-Dev] math.areclose ...? In-Reply-To: References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> Message-ID: Georg Brandl wrote: > Alex Martelli wrote: >>So I was wondering if module math (and perhaps by symmetry module cmath, >> too) shouldn't grow a function 'areclose' ...maybe ... 'almost_equal') >> def areclose(x, y, rtol=1.e-5, atol=1.e-8): >> return abs(x-y) < atol + rtol*abs(y) > atol sounds suspicious to me, but otherwise fine. "almost_equal", "closeto", or some variant of "near" (no nasty verb to worry about) would do for me. atol / rtol would be better as either abs_tol / rel_tol or even absolute_tolerance / relative_tolerance. As to the equation itself, wouldn't a symmetric version be somewhat better? def nearby(x, y, rel_tol=1.e-5, abs_tol=1.e-8): return abs(x - y) < abs_tol + rel_tol * (abs(x) + abs(y)) This avoids areclose(0, 1e-8) != areclose(1e-8, 0), for example. --Scott David Daniels Scott.Daniels at Acm.Org From gherron at islandtraining.com Sun Feb 5 19:35:23 2006 From: gherron at islandtraining.com (Gary Herron) Date: Sun, 05 Feb 2006 10:35:23 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <43E6456B.3040707@islandtraining.com> Guido van Rossum wrote: >After so many attempts to come up with an alternative for lambda, >perhaps we should admit defeat. I've not had the time to follow the >most recent rounds, but I propose that we keep lambda, so as to stop >wasting everybody's talent and time on an impossible quest. > >-- >--Guido van Rossum (home page: http://www.python.org/~guido/) >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/gherron%40islandtraining.com > > Hear hear!
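Scott's symmetry observation is easy to check directly. Both definitions below follow the thread's discussion (the asymmetric comparison in areclose is the reconstructed Numeric-style test, not verbatim text from the archive):

```python
def areclose(x, y, rtol=1.e-5, atol=1.e-8):
    # Asymmetric: only y's magnitude feeds the relative term.
    return abs(x - y) < atol + rtol * abs(y)

def nearby(x, y, rel_tol=1.e-5, abs_tol=1.e-8):
    # Symmetric: both magnitudes contribute, so argument order is irrelevant.
    return abs(x - y) < abs_tol + rel_tol * (abs(x) + abs(y))

print(areclose(0, 1e-8), areclose(1e-8, 0))  # True False
print(nearby(0, 1e-8), nearby(1e-8, 0))      # True True
```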
+1 Gary Herron From tjreedy at udel.edu Sun Feb 5 20:01:58 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 5 Feb 2006 14:01:58 -0500 Subject: [Python-Dev] Let's just *keep* lambda References: Message-ID: "Guido van Rossum" wrote in message news:ca471dc20602050943q5bad4d1ehadd9d3b653d8b4fb at mail.gmail.com... > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. To me, there are two separate issues: the keyword and the syntax. I also have not been impressed by any of the numerous alternative syntaxes proposed over several years and just this morning was thinking something similar to the above. But will you consider changing the keyword from the charged and overladen 'lambda' to something else? (See other post today.) I think this would cut at least half the fuss. I base this on the following observation: generator expressions are to generator statement definitions much like function expressions are to function statement definitions. Both work when the payload yielded or returned is computed in a single expression. But I personally have not seen any complaints about the 'limitations of generator expressions' nor proposals to duplicate the generality of statement definitions by stuffing compound statement bodies within expressions. But if we had called them generator lambdas, I suspect we would have. Terry Jan Reedy From fredrik at pythonware.com Sun Feb 5 20:02:39 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 5 Feb 2006 20:02:39 +0100 Subject: [Python-Dev] any support for a methodcaller HOF? 
References: <2m64nwg4wx.fsf@starship.python.net><43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org><43E40EE3.9070900@gmail.com><90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl><43E41B59.6080005@gmail.com><7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com><43E5A49A.5050907@gmail.com> Message-ID: Terry Reedy wrote: > If 3.0 comes with a conversion program, then I would like to see 'lambda' > replaced with either 'def' or another abbreviation like 'edef' (expression > def) or 'func'. making the implied return statement visible might also be a good idea, e.g. lambda x, y: return x + y or even def (x, y): return x + y From bob at redivi.com Sun Feb 5 20:16:19 2006 From: bob at redivi.com (Bob Ippolito) Date: Sun, 5 Feb 2006 11:16:19 -0800 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <001601c62a84$cef91d80$b83efea9@RaymondLaptop1> References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> <20060205180124.GA20842@panix.com> <001601c62a84$cef91d80$b83efea9@RaymondLaptop1> Message-ID: <7406CE97-9CE0-4E0B-ADB4-4FD6212CA59A@redivi.com> On Feb 5, 2006, at 10:48 AM, Raymond Hettinger wrote: >>> So I was wondering if module math (and >>> perhaps by symmetry module cmath, too) shouldn't grow a function >>> 'areclose' (calling it just 'close' seems likely to engender >>> confusion, since 'close' is more often used as a verb than as an >>> adjective; maybe some other name would work better, e.g. >>> 'almost_equal') taking two float arguments and optional tolerances >>> and using roughly the same specs as Numeric, e.g.: >>> >>> def areclose(x,y,rtol=1.e-5,atol=1.e-8): >>> return abs(x-y) < atol + rtol*abs(y)
It is easier to learn about > the hazards of floating point equality testing than to think > through the > implications of tolerance testing (such as loss of transitivity) and > learning > how to set the right tolerance values for a given application (ones > that > give the right results across the entire domain of expected inputs). > > The areclose() function can be a dangerous crutch that temporarily > glosses over the issue. Without some numerical sophistication, it > would not > be hard create programs that look correct and pass a few test but, > in fact, > contain nasty bugs (non-termination, incorrect acceptance/ > rejection, etc). For those of us that already know what we're doing with floating point, areclose would be very convenient to have. Especially for unit testing. I could definitely throw away a bunch of ugly code that uses less correct arbitrary tolerance guesses if it were around. -bob From python at rcn.com Sun Feb 5 20:31:42 2006 From: python at rcn.com (Raymond Hettinger) Date: Sun, 5 Feb 2006 14:31:42 -0500 Subject: [Python-Dev] math.areclose ...? References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> <20060205180124.GA20842@panix.com> <001601c62a84$cef91d80$b83efea9@RaymondLaptop1> <7406CE97-9CE0-4E0B-ADB4-4FD6212CA59A@redivi.com> Message-ID: <006001c62a8a$cb558230$b83efea9@RaymondLaptop1> [Bob Ipppolito] > For those of us that already know what we're doing with floating > point, areclose would be very convenient to have. Do you agree that the original proposed use (helping newbs ignore floating point realities) is misguided and error-prone? Just curious, for your needs, do you want both absolute and relative checks combined into the same function? > Especially for > unit testing. I could definitely throw away a bunch of ugly code > that uses less correct arbitrary tolerance guesses if it were around. The unittest module already has assertAlmostEqual(). Does that method meet your needs or does it need to be improved in some way? 
Raymond From bob at redivi.com Sun Feb 5 20:46:25 2006 From: bob at redivi.com (Bob Ippolito) Date: Sun, 5 Feb 2006 11:46:25 -0800 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <006001c62a8a$cb558230$b83efea9@RaymondLaptop1> References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> <20060205180124.GA20842@panix.com> <001601c62a84$cef91d80$b83efea9@RaymondLaptop1> <7406CE97-9CE0-4E0B-ADB4-4FD6212CA59A@redivi.com> <006001c62a8a$cb558230$b83efea9@RaymondLaptop1> Message-ID: On Feb 5, 2006, at 11:31 AM, Raymond Hettinger wrote: > [Bob Ippolito] >> For those of us that already know what we're doing with floating >> point, areclose would be very convenient to have. > > Do you agree that the original proposed use (helping newbs ignore > floating > point realities) is misguided and error-prone? Maybe it's a bit misguided, but it's less error-prone than more naive comparisons. It could delay the necessity for a newer programmer to learn all about floating point, but maybe most of those users don't really need to learn it. Whether the function is there or not, this is really a documentation issue. If the function is there then maybe it could highly suggest reading some "floating point in Python" guide that would describe the scenario, then lists common pitfalls with patterns that avoid those problems. > Just curious, for your needs, do you want both absolute and > relative checks combined into the same function? Having both makes it less likely that you'll need to tweak the constants, except of course if you're working with very small numbers such that the absolute tolerance is too big. Of course, if you only want one or the other in a given case, you can always pass in 0 manually. For my needs, the proposed function and default tolerances would be better than the sloppy stuff that usually ends up in my tests. >> Especially for unit testing.
I could definitely throw away a >> bunch of ugly code that uses less correct arbitrary tolerance >> guesses if it were around. > > The unittest module already has assertAlmostEqual(). Does that > method meet your needs or does it need to be improved in some way? I generally write tests that don't run directly under the unittest framework, such as doctests or assert-based functions for nose or py.test. The unittest module does not expose assertAlmostEqual as a function so it's of little use for me. -bob From bokr at oz.net Sun Feb 5 21:28:30 2006 From: bokr at oz.net (Bengt Richter) Date: Sun, 05 Feb 2006 20:28:30 GMT Subject: [Python-Dev] Octal literals References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> Message-ID: <43e657a1.21281501@news.gmane.org> On Sun, 05 Feb 2006 09:38:35 -0800, Josiah Carlson wrote: > >bokr at oz.net (Bengt Richter) wrote: >> Martin v. Löwis wrote: >> >Bengt Richter wrote: >> >>>The typical way of processing incoming ints in C is through >> >>>PyArg_ParseTuple, which already has the code to coerce long->int >> >>>(which in turn may raise an exception for a range violation). >> >>> >> >>>So for typical C code, 0x80000004 is a perfect bit mask in Python 2.4. >> >> >> >> Ok, I'll take your word that 'k' coercion takes no significant time for longs vs ints. >> > >> >I didn't say that 'k' takes no significant time for longs vs ints. In >> >fact, I did not make any performance claims. I don't know what the >> >relative performance is. >> >> Sorry, I apologize for putting words in your mouth. >In regards to the aesthetics and/or inconsistencies of: > >>> -0x80000000 > -2147483648L > >>> -2147483648 > -2147483648 > >>> -(2147483648) > -2147483648L > >1. If your Python code distinguishes between ints and longs, it has a bug.
Are you just lecturing me personally (in which case off list would be more appropriate), or do you include the authors of the 17 files I count under /Lib that have isinstance(obj, int) in them? Or would you like to rephrase that with suitable qualifications? ;-) > >2. If your C extension to Python isn't using the 'k' format specifier as >Martin is telling you to, then your C extension has a bug. I respect Martin's expert knowledge and manner of communication. He said, "Just have a look at the 'k' specifier in PyArg_ParseTuple." Regards, Bengt Richter From fdrake at acm.org Sun Feb 5 21:38:06 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 5 Feb 2006 15:38:06 -0500 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <200602051538.06415.fdrake@acm.org> On Sunday 05 February 2006 12:43, Guido van Rossum wrote: > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. +1 -Fred -- Fred L. Drake, Jr. From bokr at oz.net Sun Feb 5 21:42:19 2006 From: bokr at oz.net (Bengt Richter) Date: Sun, 05 Feb 2006 20:42:19 GMT Subject: [Python-Dev] math.areclose ...? References: <030FFEE3-8134-4B39-84BE-FDB11EAFC692@gmail.com> <20060205180124.GA20842@panix.com> <001601c62a84$cef91d80$b83efea9@RaymondLaptop1> Message-ID: <43e66278.24056261@news.gmane.org> On Sun, 5 Feb 2006 13:48:51 -0500, "Raymond Hettinger" wrote: [...] > [...] >A language suitable for beginners should be easy to learn, but it should not >leave them permanently crippled. All of the above are sets of training >wheels >that don't come off. To misquote Einstein: The language should be as >simple >as possible, but no simpler.
> ++1 QOTW Regards, Bengt Richter From p.f.moore at gmail.com Sun Feb 5 21:43:57 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 Feb 2006 20:43:57 +0000 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <79990c6b0602051243y3c1d7197i584070a7ef806de8@mail.gmail.com> On 2/5/06, Guido van Rossum wrote: > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. +1 The recently suggested keyword change, from lambda to expr (as in '''expr x, y: x+y''') looks like an improvement to me, but I suspect opening up the possibility of a keyword change would simply restart all the discussions... (Nevertheless, I'd be +1 on lambda being renamed to expr, if it was an option). Paul. From python at rcn.com Sun Feb 5 21:49:14 2006 From: python at rcn.com (Raymond Hettinger) Date: Sun, 5 Feb 2006 15:49:14 -0500 Subject: [Python-Dev] Let's just *keep* lambda References: Message-ID: <004e01c62a95$a03e9ef0$6701a8c0@RaymondLaptop1> > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. 
+1 -- trying to cover all the use cases is a fool's errand Raymond From allison at shasta.stanford.edu Sun Feb 5 22:02:37 2006 From: allison at shasta.stanford.edu (Dennis Allison) Date: Sun, 5 Feb 2006 13:02:37 -0800 (PST) Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <79990c6b0602051243y3c1d7197i584070a7ef806de8@mail.gmail.com> Message-ID: +1 on retaining lambda -1 on any name change On Sun, 5 Feb 2006, Paul Moore wrote: > On 2/5/06, Guido van Rossum wrote: > > After so many attempts to come up with an alternative for lambda, > > perhaps we should admit defeat. I've not had the time to follow the > > most recent rounds, but I propose that we keep lambda, so as to stop > > wasting everybody's talent and time on an impossible quest. > > +1 > > The recently suggested keyword change, from lambda to expr (as in > '''expr x, y: x+y''') looks like an improvement to me, but I suspect > opening up the possibility of a keyword change would simply restart > all the discussions... (Nevertheless, I'd be +1 on lambda being > renamed to expr, if it was an option). > > Paul. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40shasta.stanford.edu > -- From crutcher at gmail.com Sun Feb 5 22:09:50 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 5 Feb 2006 13:09:50 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: +1 On 2/5/06, Guido van Rossum wrote: > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest.
> > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/crutcher%40gmail.com > -- Crutcher Dunnavant monket.samedi-studios.com From tim.peters at gmail.com Sun Feb 5 23:16:38 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 5 Feb 2006 17:16:38 -0500 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <1f7befae0602051416m3d568ac7s627e3d10b84fb21a@mail.gmail.com> [Guido] > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. Huh! Was someone bad-mouthing lambda again? We should keep it, but rename it to honor a different Greek letter. xi is a good one, easier to type, and would lay solid groundwork for future flamewars between xi enthusiasts and Roman numeral fans :-) From crutcher at gmail.com Sun Feb 5 23:26:54 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 5 Feb 2006 14:26:54 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <1f7befae0602051416m3d568ac7s627e3d10b84fb21a@mail.gmail.com> References: <1f7befae0602051416m3d568ac7s627e3d10b84fb21a@mail.gmail.com> Message-ID: Which reminds me, we need to support roman numeral constants. A silly implementation follows.

class RomanNumeralDict(dict):
    def __getitem__(self, key):
        if not self.has_key(key) and self.isRN(key):
            return self.decodeRN(key)
        return dict.__getitem__(self, key)
    def isRN(self, key):
        for c in key:
            if c not in 'MmCcXxIiDdVvLl':
                return False
        return True
    def decodeRN(self, key):
        val = 0
        # ... do stuff ...
        return val

On 2/5/06, Tim Peters wrote: > [Guido] > > After so many attempts to come up with an alternative for lambda, > > perhaps we should admit defeat. I've not had the time to follow the > > most recent rounds, but I propose that we keep lambda, so as to stop > > wasting everybody's talent and time on an impossible quest. > > Huh! Was someone bad-mouthing lambda again? We should keep it, but > rename it to honor a different Greek letter. xi is a good one, easier > to type, and would lay solid groundwork for future flamewars between > xi enthusiasts and Roman numeral fans :-) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/crutcher%40gmail.com > -- Crutcher Dunnavant monket.samedi-studios.com From tim.peters at gmail.com Sun Feb 5 23:35:44 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 5 Feb 2006 17:35:44 -0500 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <1f7befae0602051416m3d568ac7s627e3d10b84fb21a@mail.gmail.com> Message-ID: <1f7befae0602051435l3d20a2b9s798d51a364c212a6@mail.gmail.com> [Crutcher Dunnavant] > Which reminds me, we need to support roman numeral constants. One of my more-normal relatives reminded me that this is Super Bowl XL Sunday, so your demand is more topical than it would ordinarily be. Alas, there's already a PEP on this, and it was already rejected.
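For anyone who wants the joke to actually run, the elided decodeRN body can be filled in with the usual subtractive-notation scan. This helper is a hypothetical stand-alone sketch, not part of anything proposed in the thread:

```python
ROMAN = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}

def decode_rn(numeral):
    # Subtractive notation: a smaller value written before a larger
    # one (the 'I' in 'IV') is subtracted rather than added.
    vals = [ROMAN[c] for c in numeral.upper()]
    return sum(-v if i + 1 < len(vals) and v < vals[i + 1] else v
               for i, v in enumerate(vals))

# Topical examples: Super Bowl XL and PEP CCCXIII.
print(decode_rn('XL'), decode_rn('CCCXIII'), decode_rn('MMVI'))  # 40 313 2006
```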
See PEP CCCXIII: http://www.python.org/peps/pep-0313.html From crutcher at gmail.com Sun Feb 5 23:53:20 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 5 Feb 2006 14:53:20 -0800 Subject: [Python-Dev] [PATCH] Fix dictionary subclass semantics when used as global dictionaries In-Reply-To: <43D6AFC0.5080602@v.loewis.de> References: <001201c61756$cc2d8780$5916c797@oemcomputer> <43C6E25F.2040606@v.loewis.de> <43D6AFC0.5080602@v.loewis.de> Message-ID: I've significantly re-worked the patch to permit globals to be arbitrary mappings. The regression tests continue to all pass. http://sourceforge.net/tracker/index.php?func=detail&aid=1402289&group_id=5470&atid=305470 On 1/24/06, "Martin v. Löwis" wrote: > Crutcher Dunnavant wrote: > > Okay, but is there any reason not to include this in 2.5? There > > doesn't seem to be any noticeable performance impact, and it does add > > consistency (and opens some really, really cool options up). > > I see no reason, except perhaps the lack of volunteers to actually > patch the repository (along with the accompanying work). > > Regards, > Martin > > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From python at rcn.com Mon Feb 6 00:01:16 2006 From: python at rcn.com (Raymond Hettinger) Date: Sun, 5 Feb 2006 18:01:16 -0500 Subject: [Python-Dev] [PATCH] Fix dictionary subclass semantics when used as global dictionaries References: <001201c61756$cc2d8780$5916c797@oemcomputer> <43C6E25F.2040606@v.loewis.de> <43D6AFC0.5080602@v.loewis.de> Message-ID: <002001c62aa8$12439890$6701a8c0@RaymondLaptop1> You don't have to keep writing notes to python-dev on this patch. It is assigned to me and when I get a chance to go through it in detail, it has a good likelihood of going in (if no issues arise). Raymond ----- Original Message ----- From: "Crutcher Dunnavant" To: "Martin v.
Löwis" Cc: ; "Aahz" ; Sent: Sunday, February 05, 2006 5:53 PM Subject: Re: [Python-Dev] [PATCH] Fix dictionary subclass semantics when used as global dictionaries I've significantly re-worked the patch to permit globals to be arbitrary mappings. The regression tests continue to all pass. http://sourceforge.net/tracker/index.php?func=detail&aid=1402289&group_id=5470&atid=305470 On 1/24/06, "Martin v. Löwis" wrote: > Crutcher Dunnavant wrote: > > Okay, but is there any reason not to include this in 2.5? There > > doesn't seem to be any noticeable performance impact, and it does add > > consistency (and opens some really, really cool options up). > > I see no reason, except perhaps the lack of volunteers to actually > patch the repository (along with the accompanying work). > > Regards, > Martin > > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From bokr at oz.net Mon Feb 6 00:28:55 2006 From: bokr at oz.net (Bengt Richter) Date: Sun, 05 Feb 2006 23:28:55 GMT Subject: [Python-Dev] any support for a methodcaller HOF? References: <2m64nwg4wx.fsf@starship.python.net> <43E3341F.4050001@gmail.com> <43e36411.984076957@news.gmane.org> <43E40EE3.9070900@gmail.com> <90fc517f86c4e1c3dbc5114236c54f54@xs4all.nl> <43E41B59.6080005@gmail.com> <7a725d1194b99280428568c857a5d577@xs4all.nl> <43E4A10C.7020703@gmail.com> <43E4A43B.20903@v.loewis.de> <43E4B3CB.7080005@gmail.com> <8307255af7b268cf2018a0cfb1fe1d17@xs4all.nl> <43e627aa.9002845@news.gmane.org> Message-ID: <43e685c4.33092394@news.gmane.org> On Sun, 05 Feb 2006 18:45:52 GMT, bokr at oz.net (Bengt Richter) wrote: [...] >Psst, Nick, how about > (x*y for x,y in ()) ? # "()" as mnemonic for call args D'oh, sorry, that should have been illegal syntax, e.g., (x*y for x,y in *) ? # "*" as mnemonic for call *args so (x*y for x,y in *)(3,5) # => 15 or (x*y for x,y in *)(*[3,5]) # => 15 etc. Hm, along that line why not (x*y for x,y in **) ?
# "**" as mnemonic for call **kwargs so (x*y for x,y in **)(x=3, y=5) # => 15 or maybe even (x*y+z for (x,y),z in *,**)(3, 5, z=200) # => 215 Though I see this is moot, since Guido decided to "keep lambda," (+1 on that, although this is kind of growing on me, no doubt from partial ih-factor ;-) Regards, Bengt Richter From eric.nieuwland at xs4all.nl Mon Feb 6 02:19:52 2006 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Mon, 6 Feb 2006 02:19:52 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <474fbdc277b15f628e215b954dd6543e@xs4all.nl> On 5 feb 2006, at 18:43, Guido van Rossum wrote: > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) +1 And let's add "Wise" t the BDFL's title: WBDFL. ;-) --eric From guido at python.org Mon Feb 6 03:08:58 2006 From: guido at python.org (Guido van Rossum) Date: Sun, 5 Feb 2006 18:08:58 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: <43e657a1.21281501@news.gmane.org> References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> Message-ID: On 2/5/06, Bengt Richter wrote: > On Sun, 05 Feb 2006 09:38:35 -0800, Josiah Carlson wrote: > >1. If your Python code distinguishes between ints and longs, it has a > >bug. > Are you just lecturing me personally (in which case off list would be more appropriate), > or do you include the authors of the 17 files I count under /Lib that have > isinstance(, int) in them? Josiah is correct, and those modules all have bugs. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Mon Feb 6 03:47:13 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 05 Feb 2006 18:47:13 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: <43e657a1.21281501@news.gmane.org> References: <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> Message-ID: <20060205180405.10FD.JCARLSON@uci.edu> bokr at oz.net (Bengt Richter) wrote: > Are you just lecturing me personally (in which case off list would be more appropriate), > or do you include the authors of the 17 files I count under /Lib that have > isinstance(obj, int) in them? > Or would you like to rephrase that with suitable qualifications? ;-) I did not mean to sound like I was lecturing you personally.
/c From steven.bethard at gmail.com Mon Feb 6 05:44:15 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Sun, 5 Feb 2006 21:44:15 -0700 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: Guido van Rossum wrote: > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. Personally, I'd rather see a callable-from-expression syntax (the ``lambda`` expression) that looks more like our callable-from-statements syntax (the ``def`` statement), e.g. Nick Coghlan's def-from syntax:: (def f(a) + o(b) - o(c) from (a, b, c)) Something like this is more consistent with how list creation is turned into list comprehensions, how generator functions are turned into generator expressions and how if/else statements are turned into conditional expressions. That said, I firmly believe that syntax decisions *must* be left to the BDFL. The decorator syntax and with-statement syntax debates clearly showed this. So if after looking at all the syntax alternatives_, you feel that the current lambda syntax is the best we can do, I'm willing to accept that decision. .. _alternatives: http://wiki.python.org/moin/AlternateLambdaSyntax STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From bokr at oz.net Mon Feb 6 06:33:57 2006 From: bokr at oz.net (Bengt Richter) Date: Mon, 06 Feb 2006 05:33:57 GMT Subject: [Python-Dev] Octal literals References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> Message-ID: <43e6bfbc.47932262@news.gmane.org> On Sun, 5 Feb 2006 18:08:58 -0800, Guido van Rossum wrote: >On 2/5/06, Bengt Richter wrote: >> On Sun, 05 Feb 2006 09:38:35 -0800, Josiah Carlson wrote: >> >1. 
If your Python code distinguishes between ints and longs, it has a >> >bug. >> Are you just lecturing me personally (in which case off list would be more appropriate), >> or do you include the authors of the 17 files I count under /Lib that have >> isinstance(, int) in them? > >Josiah is correct, and those modules all have bugs. > It seems I stand incontestably corrected. Sorry, both ways ;-/ Perhaps I missed a py3k assumption in this thread (where I see in the PEP that "Remove distinction between int and long types" is core item number one)? I googled, but could not find that isinstance(,int) was slated for deprecation, so I assumed that Josiah's absolute statement "1. ..." (above) could not be absolutely true, at least in the "has" (present) tense that he used. Is PEP 237 phase C to be implemented sooner than py3k, making isinstance(, int) a transparently distinction-hiding alias for isinstance(, integer), or outright illegal? IOW, will isinstance(, int) be _guaranteed_ to be a bug, thus requiring code change? If so, when? Regards, Bengt Richter From bokr at oz.net Mon Feb 6 06:55:05 2006 From: bokr at oz.net (Bengt Richter) Date: Mon, 06 Feb 2006 05:55:05 GMT Subject: [Python-Dev] Octal literals References: <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> <20060205180405.10FD.JCARLSON@uci.edu> Message-ID: <43e6e09a.56346872@news.gmane.org> On Sun, 05 Feb 2006 18:47:13 -0800, Josiah Carlson wrote: > >bokr at oz.net (Bengt Richter) wrote: >> Are you just lecturing me personally (in which case off list would be more appropriate), >> or do you include the authors of the 17 files I count under /Lib that have >> isinstance(, int) in them? >> Or would you like to rephrase that with suitable qualifications? ;-) > >I did not mean to sound like I was lecturing you personally. 
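[Editor's note: Josiah's guess — that these isinstance checks replaced older `type(...) is int` tests once int could be subclassed — matters because the two behave differently for subclasses; a small sketch using bool, an int subclass:]

```python
x = True  # bool is a subclass of int

# The old-style identity test rejects subclasses...
assert type(x) is not int

# ...while isinstance accepts them, which is usually what duck-typed
# code wants once int subclassing is allowed.
assert isinstance(x, int)
```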
> >Without taking a peek at the source, I would guess that the various uses >of isinstance(, int) are bugs, possibly replacing previous >uses of type() is int, shortly after int subclassing was >allowed. But that's just a guess. > Thank you. I didn't look either, but I did notice that most (but not all) of them were under /Lib/test/. Maybe it's excusable for test code ;-) Regards, Bengt Richter From thomas at xs4all.net Mon Feb 6 09:05:01 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 6 Feb 2006 09:05:01 +0100 Subject: [Python-Dev] Octal literals In-Reply-To: <43e6bfbc.47932262@news.gmane.org> References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> <43e6bfbc.47932262@news.gmane.org> Message-ID: <20060206080501.GA10226@xs4all.nl> On Mon, Feb 06, 2006 at 05:33:57AM +0000, Bengt Richter wrote: > Perhaps I missed a py3k assumption in this thread (where I see in the PEP > that "Remove distinction between int and long types" is core item number > one)? http://www.python.org/peps/pep-0237.html -- an ongoing process, not a Py3K-eventual one. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From abo at minkirri.apana.org.au Mon Feb 6 15:09:18 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 06 Feb 2006 14:09:18 +0000 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <20060203113244.10E4.JCARLSON@uci.edu> References: <20060203085105.10DE.JCARLSON@uci.edu> <1138986264.7232.105.camel@warna.dub.corp.google.com> <20060203113244.10E4.JCARLSON@uci.edu> Message-ID: <1139234958.24831.42.camel@warna.dub.corp.google.com> On Fri, 2006-02-03 at 11:56 -0800, Josiah Carlson wrote: > Donovan Baarda wrote: [...] > > Nuff was a fairy... though I guess it depends on where you draw the > > line; should [1,2,3] be list(1,2,3)? > > Who is "Nuff"? fairynuff... 
:-) > Along the lines of "not every x line function should be a builtin", "not > every builtin should have syntax". I think that sets have particular > uses, but I don't believe those uses are sufficiently varied enough to > warrant the creation of a syntax. I suggest that people take a walk > through their code. How often do you use other sequence and/or mapping > types? How many lists, tuples and dicts are there? How many sets? Ok, > now how many set literals? The absence of sets in early Python, the requirement to "import sets" when they first appeared, and the lack of a set syntax now all mean that people tend to avoid using sets and resort to lists, tuples, and "dicts of None" instead, even though they really want a set. Anywhere you see "if value in sequence:", they probably mean sequence is a set, and this code would run much faster if it really was, and might even avoid potential bugs because it would prevent duplicates... -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From abo at minkirri.apana.org.au Mon Feb 6 15:11:20 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 06 Feb 2006 14:11:20 +0000 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <43E3A8B8.2040500@v.loewis.de> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> <43E3A8B8.2040500@v.loewis.de> Message-ID: <1139235080.24831.45.camel@warna.dub.corp.google.com> On Fri, 2006-02-03 at 20:02 +0100, "Martin v. L?wis" wrote: > Donovan Baarda wrote: > > Before set() the standard way to do them was to use dicts with None > > Values... to me the "{1,2,3}" syntax would have been a logical extension > > of the "a set is a dict with no values, only keys" mindset. I don't know > > why it wasn't done this way in the first place, though I missed the > > arguments where it was rejected. > > There might be many reasons; one obvious reason is that you can't spell > the empty set that way. Hmm... 
how about "{,}", which is the same trick tuples use for the empty tuple? -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From bokr at oz.net Mon Feb 6 15:27:12 2006 From: bokr at oz.net (Bengt Richter) Date: Mon, 06 Feb 2006 14:27:12 GMT Subject: [Python-Dev] Octal literals References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> <43e6bfbc.47932262@news.gmane.org> <20060206080501.GA10226@xs4all.nl> Message-ID: <43e75b8e.87822602@news.gmane.org> On Mon, 6 Feb 2006 09:05:01 +0100, Thomas Wouters wrote: >On Mon, Feb 06, 2006 at 05:33:57AM +0000, Bengt Richter wrote: > >> Perhaps I missed a py3k assumption in this thread (where I see in the PEP >> that "Remove distinction between int and long types" is core item number >> one)? > >http://www.python.org/peps/pep-0237.html -- an ungoing process, not a >Py3K-eventual one. > Thanks, I noticed. Hence my question following what you quote: """ Is PEP 237 phase C to be implemented sooner than py3k, making isinstance(, int) a transparently distinction-hiding alias for isinstance(, integer), or outright illegal? IOW, will isinstance(, int) be _guaranteed_ to be a bug, thus requiring code change? If so, when? """ Sorry that my paragraph-packing habit tends to bury things. I'll have to work on that ;-/ Regards, Bengt Richter From ronaldoussoren at mac.com Mon Feb 6 15:36:06 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 06 Feb 2006 15:36:06 +0100 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <1139235080.24831.45.camel@warna.dub.corp.google.com> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> <43E3A8B8.2040500@v.loewis.de> <1139235080.24831.45.camel@warna.dub.corp.google.com> Message-ID: <12651352.1139236566517.JavaMail.ronaldoussoren@mac.com> On Monday, February 06, 2006, at 03:12PM, Donovan Baarda wrote: >On Fri, 2006-02-03 at 20:02 +0100, "Martin v. 
Löwis" wrote: >> Donovan Baarda wrote: >> > Before set() the standard way to do them was to use dicts with None >> > Values... to me the "{1,2,3}" syntax would have been a logical extension >> > of the "a set is a dict with no values, only keys" mindset. I don't know >> > why it wasn't done this way in the first place, though I missed the >> > arguments where it was rejected. >> >> There might be many reasons; one obvious reason is that you can't spell >> the empty set that way. > >Hmm... how about "{,}", which is the same trick tuples use for the empty >tuple? Isn't () the empty tuple? I guess you're confusing this with a single element tuple: (1,) instead of (1) (well actually it is "1,") BTW. I don't like your proposal for spelling the empty set as {,} because that is entirely non-obvious. If {1,2,3} were a valid way to spell a set literal, I'd expect {} for the empty set. Ronald > >-- >Donovan Baarda >http://minkirri.apana.org.au/~abo/ > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com > > From abo at minkirri.apana.org.au Mon Feb 6 15:42:31 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 06 Feb 2006 14:42:31 +0000 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <12651352.1139236566517.JavaMail.ronaldoussoren@mac.com> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> <43E3A8B8.2040500@v.loewis.de> <1139235080.24831.45.camel@warna.dub.corp.google.com> <12651352.1139236566517.JavaMail.ronaldoussoren@mac.com> Message-ID: <1139236951.24831.54.camel@warna.dub.corp.google.com> On Mon, 2006-02-06 at 15:36 +0100, Ronald Oussoren wrote: > On Monday, February 06, 2006, at 03:12PM, Donovan Baarda wrote: > > >On Fri, 2006-02-03 at 20:02 +0100, "Martin v. 
Löwis" wrote: > >> Donovan Baarda wrote: > >> > Before set() the standard way to do them was to use dicts with None > >> > Values... to me the "{1,2,3}" syntax would have been a logical extension > >> > of the "a set is a dict with no values, only keys" mindset. I don't know > >> > why it wasn't done this way in the first place, though I missed the > >> > arguments where it was rejected. > >> > >> There might be many reasons; one obvious reason is that you can't spell > >> the empty set that way. > > > >Hmm... how about "{,}", which is the same trick tuples use for the empty > >tuple? > > Isn't () the empty tuple? I guess you're confusing this with a single element tuple: (1,) instead of (1) (well actually it is "1,") Yeah, sorry.. nasty brainfart... > BTW. I don't like your proposal for spelling the empty set as {,} because that is entirely non-obvious. If {1,2,3} were a valid way to spell a set literal, I'd expect {} for the empty set. yeah... the problem is differentiating the empty set from an empty dict. The only alternative that occurred to me was the not-so-nice and not-backwards-compatible "{:}" for an empty dict and "{}" for an empty set. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From guido at python.org Mon Feb 6 18:34:42 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 6 Feb 2006 09:34:42 -0800 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <1139236951.24831.54.camel@warna.dub.corp.google.com> References: <1138968292.7232.48.camel@warna.dub.corp.google.com> <43E3A8B8.2040500@v.loewis.de> <1139235080.24831.45.camel@warna.dub.corp.google.com> <12651352.1139236566517.JavaMail.ronaldoussoren@mac.com> <1139236951.24831.54.camel@warna.dub.corp.google.com> Message-ID: On 2/6/06, Donovan Baarda wrote: > yeah... the problem is differentiating the empty set from an empty dict. > The only alternative that occurred to me was the not-so-nice and > not-backwards-compatible "{:}" for an empty dict and "{}" for an empty > set. 
How about spelling the empty set as ``set()''? Wouldn't that solve the ambiguity and the backwards compatibility nicely? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 6 18:44:32 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 6 Feb 2006 09:44:32 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: <43e75b8e.87822602@news.gmane.org> References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> <43e6bfbc.47932262@news.gmane.org> <20060206080501.GA10226@xs4all.nl> <43e75b8e.87822602@news.gmane.org> Message-ID: On 2/6/06, Bengt Richter wrote: > Is PEP 237 phase C to be implemented sooner than py3k, > making isinstance(, int) a transparently distinction-hiding alias for > isinstance(, integer), or outright illegal? IOW, will isinstance(, int) > be _guaranteed_ to be a bug, thus requiring code change? If so, when? Probably not before Python 3.0. Until then, int and long will be distinct types for backwards compatibility reasons. But we want as much code as possible to treat longs the same as ints, hence the party line that (barring attenuating circumstances :-) isinstance(x, int) is a bug if the code doesn't also have a similar case for long. If you find standard library code (in Python *or* C!) that treats int preferentially, please submit a patch or bug. What we should do in 3.0 is not entirely clear to me. It would be nice if there was only a single type (named 'int', of course) with two run-time representations, one similar to the current int and one similar to the current long. But that's not so easy, and somewhat contrary to the philosophy that differences in (C-level) representation are best distinguished by looking at the type of an object. The next most likely solution is to make long a subclass of int, or perhaps to make int an abstract base class with two subclasses, short and long. 
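[Editor's note: the single-type outcome Guido sketches for 3.0 is easy to demonstrate with a later Python, where one int type transparently covers both machine-word and arbitrary-precision values:]

```python
small = 2 ** 30   # fits in a machine word
big = 2 ** 100    # needs arbitrary precision

# One type, two internal representations -- invisible at the Python level.
assert type(small) is type(big) is int
assert big + 1 - 1 == 2 ** 100
```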
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Mon Feb 6 19:39:38 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 06 Feb 2006 10:39:38 -0800 Subject: [Python-Dev] syntactic support for sets In-Reply-To: <1139234958.24831.42.camel@warna.dub.corp.google.com> References: <20060203113244.10E4.JCARLSON@uci.edu> <1139234958.24831.42.camel@warna.dub.corp.google.com> Message-ID: <20060206091019.1100.JCARLSON@uci.edu> Donovan Baarda wrote: > > On Fri, 2006-02-03 at 11:56 -0800, Josiah Carlson wrote: > > Along the lines of "not every x line function should be a builtin", "not > > every builtin should have syntax". I think that sets have particular > > uses, but I don't believe those uses are sufficiently varied enough to > > warrant the creation of a syntax. I suggest that people take a walk > > through their code. How often do you use other sequence and/or mapping > > types? How many lists, tuples and dicts are there? How many sets? Ok, > > now how many set literals? > > The absence of sets in early Python, the requirement to "import sets" > when they first appeared, and the lack of a set syntax now all mean that > people tend to avoid using sets and resort to lists, tuples, and "dicts > of None" instead, even though they really want a set. Anywhere you see > "if value in sequence:", they probably mean sequence is a set, and this > code would run much faster if it really was, and might even avoid > potential bugs because it would prevent duplicates... Maybe they mean set, maybe they don't. 'if obj in seq' is used for various reasons. A quick check of the Python standard library shows that some of the uses of 'if obj in tuple_literal' could certainly be converted into sets, but that ignores the performance impact of using sets instead of short tuples (where short, if I remember correctly, is a length of 3, check the python-dev archives), as well as the module-level constant creation that occurs with tuples. 
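[Editor's note: both sides of this exchange concern the same membership test; a tuple or a set gives the same answer, the difference is a linear scan versus a hash lookup. A quick sketch — the container contents are made up for illustration:]

```python
methods_tuple = ('GET', 'POST', 'PUT', 'HEAD')   # scanned linearly
methods_set = set(methods_tuple)                 # hashed lookup

# Same semantics either way; the set also rules out duplicates.
assert 'POST' in methods_tuple and 'POST' in methods_set
assert 'TRACE' not in methods_tuple and 'TRACE' not in methods_set
```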
There was probably a good reason why such a thing hasn't happened with lists and dicts (according to my Python 2.4 installation), and why it may not happen with sets. A nontrivial number of other 'if obj in seq' instances actually need dictionaries, the test is for some sort of data handler or headers with a particular name. - Josiah From aleaxit at gmail.com Mon Feb 6 19:37:42 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 6 Feb 2006 10:37:42 -0800 Subject: [Python-Dev] syntactic support for sets In-Reply-To: References: <1138968292.7232.48.camel@warna.dub.corp.google.com> <43E3A8B8.2040500@v.loewis.de> <1139235080.24831.45.camel@warna.dub.corp.google.com> <12651352.1139236566517.JavaMail.ronaldoussoren@mac.com> <1139236951.24831.54.camel@warna.dub.corp.google.com> Message-ID: On 2/6/06, Guido van Rossum wrote: > On 2/6/06, Donovan Baarda wrote: > > yeah... the problem is differentiating the empty set from an empty dict. > > The only alternative that occured to me was the not-so-nice and > > not-backwards-compatible "{:}" for an empty dict and "{}" for an empty > > set. > > How about spelling the empty set as ``set()''? Wouldn't that solve the > ambiguity and the backwards compatibility nicely? And of course, thanks to the time machine, it has always worked that way: hesperos:~$ python2.4 Python 2.4.1 (#1, Apr 21 2005, 11:14:17) [GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> set() set([]) >>> just like dict(), tuple(), list(), str(), int(), float(), bool(), complex() -- each type, called without args, returns an instance F of that type such that "bool(F) is False" holds (meaning len(F)==0 for container types, F==0 for number types). 
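[Editor's note: Alex's closing observation — every built-in type called without arguments yields a false value — doubles as the answer to the empty-set spelling problem: `{}` stays the empty dict, `set()` is the unambiguous empty set. A sketch:]

```python
# {} is (and stays) an empty dict; set() spells the empty set.
assert {} == dict() and isinstance({}, dict)
assert set() == set([]) and len(set()) == 0

# Each no-argument constructor yields an instance F with bool(F) is False.
for factory in (dict, tuple, list, str, int, float, bool, complex, set):
    assert bool(factory()) is False
```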
Alex From aleaxit at gmail.com Mon Feb 6 20:00:10 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 6 Feb 2006 11:00:10 -0800 Subject: [Python-Dev] Octal literals In-Reply-To: References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> <43e6bfbc.47932262@news.gmane.org> <20060206080501.GA10226@xs4all.nl> <43e75b8e.87822602@news.gmane.org> Message-ID: On 2/6/06, Guido van Rossum wrote: ... > What we should do in 3.0 is not entirely clear to me. It would be nice > if there was only a single type (named 'int', of course) with two > run-time representations, one similar to the current int and one > similar to the current long. But that's not so easy, and somewhat > contrary to the philosophy that differences in (C-level) > representation are best distinguished by looking at the type of an > object. The next most likely solution is to make long a subclass of > int, or perhaps to make int an abstract base class with two > subclasses, short and long. Essentially, you need to decide: does type(x) mostly refer to the protocol that x respects ("interface" plus semantics and pragmatics), or to the underlying implementation? If the latter, as your observation about "the philosophy" suggests, then it would NOT be nice if int was an exception wrt other types. If int is to be a concrete type, then I'd MUCH rather it didn't get subclassed, for all sorts of both practical and principled reasons. So, to me, the best solution would be the abstract base class with concrete implementation subclasses. Besides being usable for isinstance checks, like basestring, it should also work as a factory when called, returning an instance of the appropriate concrete subclass. AND it would let me have (part of) what I was pining for a while ago -- an abstract base class that type gmpy.mpz can subclass to assert "I _am_ an integer type!", so lists will accept mpz instances as indices, etc etc. 
Now consider how nice it would be, on occasion, to be able to operate on an integer that's guaranteed to be 8, 16, 32, or 64 bits, to ensure the desired shifting/masking behavior for certain kinds of low-level programming; and also on one that's unsigned, in each of these sizes. Python could have a module offering signed8, unsigned16, and so forth (all combinations of size and signedness supported by the underlying C compiler), all subclassing the abstract int, and guarantee much happiness to people who are, for example, writing a Python prototype of code that's going to become C or assembly... Similarly, it would help a slightly different kind of prototyping a lot if another Python module could offer 32-bit, 64-bit, 80-bit and 128-bit floating point types (if supported by the underlying C compiler) -- all subclassing an ABSTRACT 'float'; the concrete implementation that one gets by calling float or using a float literal would also subclass it... and so would the decimal type (why not? it's floating point -- 'float' doesn't mean 'BINARY fp';-). And I'd be happy, because gmpy.mpf could also subclass the abstract float! 
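[Editor's note: the fixed-width behaviour wanted from the hypothetical signed8/unsigned16 types can be approximated today by masking after each operation; the helper below is an illustration of mine, not a proposed API:]

```python
MASK16 = 0xFFFF  # 16-bit wrap-around

def unsigned16(x):
    """Reduce x to the unsigned 16-bit value C code would see."""
    return x & MASK16

assert unsigned16(0xFFFF + 1) == 0        # overflow wraps to zero
assert unsigned16(-1) == 0xFFFF           # two's-complement view of -1
assert unsigned16(0x1234 << 8) == 0x3400  # shifts stay in range
```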
And then finally we could have an abstract superclass 'number', whose subclasses are the abstract int and the abstract float (dunno 'bout complex, I'd be happy either way), and Python's typesystem would finally start being nice and cleanly organized instead of grand-prairie-level flat ...!-) Alex From rhamph at gmail.com Mon Feb 6 20:39:52 2006 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 6 Feb 2006 12:39:52 -0700 Subject: [Python-Dev] Octal literals In-Reply-To: References: <43E47DBC.9000703@v.loewis.de> <43e60b15.1685113@news.gmane.org> <20060205091611.10F7.JCARLSON@uci.edu> <43e657a1.21281501@news.gmane.org> <43e6bfbc.47932262@news.gmane.org> <20060206080501.GA10226@xs4all.nl> <43e75b8e.87822602@news.gmane.org> Message-ID: On 2/6/06, Alex Martelli wrote: > Now consider how nice it would be, on occasion, to be able to operate > on an integer that's guaranteed to be 8, 16, 32, or 64 bits, to > ensure the desired shifting/masking behavior for certain kinds of > low-level programming; and also on one that's unsigned, in each of > these sizes. Python could have a module offering signed8, unsigned16, > and so forth (all combinations of size and signedness supported by the > underlying C compiler), all subclassing the abstract int, and > guarantee much happiness to people who are, for example, writing a > Python prototype of code that's going to become C or assembly... I dearly hope such types do NOT subclass abstract int. The reason is that although they can represent an integral value they do not behave like one. Approximately half of all possible float values are integral, but would you want it to subclass abstract int when possible? Of course not, the behavior is vastly different, and any function doing more than just comparing to it would have to convert it to the true int type before using it. I see little point for more than one integer type. long behaves properly like an integer in all cases I can think of, with the long exception of performance. 
And given that python tends to be orders of magnitude slower than C code there is little desire to trade off functionality for performance. That we have two integer types is more of a historical artifact than a conscious decision. We may not be willing to trade off functionality for performance, but once we've already made the tradeoff we're reluctant to go back. So it seems the challenge is this: can anybody patch long to have performance sufficiently close to int for small numbers? -- Adam Olsen, aka Rhamphoryncus From smiles at worksmail.net Mon Feb 6 21:12:03 2006 From: smiles at worksmail.net (Chris or Leslie Smith) Date: Mon, 6 Feb 2006 14:12:03 -0600 Subject: [Python-Dev] math.areclose ...? References: Message-ID: <001301c62b59$cdfbe900$152c4fca@csmith> || || def areclose(x,y,rtol=1.e-5,atol=1.e-8): || return abs(x-y) <= atol + rtol*abs(y) References: <001301c62b59$cdfbe900$152c4fca@csmith> Message-ID: <20060206202031.GA26735@panix.com> On Mon, Feb 06, 2006, Chris or Leslie Smith wrote: >Aahz: >>Alex: >>> > || def areclose(x,y,rtol=1.e-5,atol=1.e-8): > || return abs(x-y) <= atol + rtol*abs(y) | > | Looks interesting. I don't quite understand what atol/rtol are, > | though. > > Does it help to spell it like this? > > def areclose(x, y, relative_err = 1.e-5, absolute_err=1.e-8): > diff = abs(x - y) > ave = (abs(x) + abs(y))/2 > return diff < absolute_err or diff/ave < relative_err > > Also, separating the two terms with 'or' rather than '+' makes the > two error terms mean more what they are named. The '+' mixes the two > effects and even though the result is basically the same, it makes it > difficult to explain when the test will be true. Yes, that's a big help. I was a bit concerned that this would have no utility for numbers with large magnitude. Alex, given your focus on Python readability, I'm a bit surprised you didn't write this to start with! -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. 
A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From janssen at parc.com Mon Feb 6 21:55:50 2006 From: janssen at parc.com (Bill Janssen) Date: Mon, 6 Feb 2006 12:55:50 PST Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: Your message of "Sun, 05 Feb 2006 09:43:28 PST." Message-ID: <06Feb6.125557pst."58633"@synergy1.parc.xerox.com> > After so many attempts to come up with an alternative for lambda, > perhaps we should admit defeat. I've not had the time to follow the > most recent rounds, but I propose that we keep lambda, so as to stop > wasting everybody's talent and time on an impossible quest. +1. This would remove my strongest objection to the current Python 3000 PEP. Now, let's improve lambda... :-). Bill From aleaxit at gmail.com Mon Feb 6 22:03:26 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 6 Feb 2006 13:03:26 -0800 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <20060206202031.GA26735@panix.com> References: <001301c62b59$cdfbe900$152c4fca@csmith> <20060206202031.GA26735@panix.com> Message-ID: On 2/6/06, Aahz wrote: ... > > def areclose(x, y, relative_err = 1.e-5, absolute_err=1.e-8): > > diff = abs(x - y) > > ave = (abs(x) + abs(y))/2 > > return diff < absolute_err or diff/ave < relative_err > > > > Also, separating the two terms with 'or' rather than '+' makes the > > two error terms mean more what they are named. The '+' mixes the two > > effects and even though the result is basically the same, it makes it > > difficult to explain when the test will be true. > > Yes, that's a big help. I was a bit concerned that this would have no > utility for numbers with large magnitude. Alex, given your focus on > Python readability, I'm a bit surprised you didn't write this to start > with! As I said, I was just copying the definition in Numeric, which is well-tried by long use. 
Besides, this "clear expression" could present problems, such as possible overflows or divisions by zero when ave is 0 or very small; much as I care about readability, I care about correctness even more. Once it comes to readability, I prefer Numeric's choice to call the two terms "tolerances", rather than (as here) "errors"; maybe that depends on my roots being in engineering, where an error means a mistake (like it does in real life), while tolerance's a good and useful thing to have (ditto), rather than some scientific discipline where terms carry different nuances. Alex From thomas at thomas-lotze.de Mon Feb 6 22:32:10 2006 From: thomas at thomas-lotze.de (Thomas Lotze) Date: Mon, 06 Feb 2006 22:32:10 +0100 Subject: [Python-Dev] Let's just *keep* lambda References: Message-ID: Steven Bethard wrote: > Guido van Rossum wrote: >> After so many attempts to come up with an alternative for lambda, >> perhaps we should admit defeat. I've not had the time to follow the most >> recent rounds, but I propose that we keep lambda, so as to stop wasting >> everybody's talent and time on an impossible quest. +1 for keeping the functionality, especially given list and generator expressions being "compound lambda expressions" in a sense. Removing anonymous functions would break a nice symmetry there. > .. _alternatives: http://wiki.python.org/moin/AlternateLambdaSyntax Of those, I like the "for" syntax without parens around the arguments best: (x*y + z for x, y, z). Parentheses around the whole expression should be optional in the same cases that allow for omitting parentheses around generator expressions. It fits perfectly with the way generator expression syntax relates to generator function definitions, and re-using the "for" keyword keeps the zoo of reserved words small. Just my 2 cents and all that... 
-- Thomas From raymond.hettinger at verizon.net Mon Feb 6 22:37:22 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 06 Feb 2006 16:37:22 -0500 Subject: [Python-Dev] math.areclose ...? References: <001301c62b59$cdfbe900$152c4fca@csmith> Message-ID: <002701c62b65$844a9750$7f00a8c0@RaymondLaptop1> [Chris Smith] > Does it help to spell it like this? > > def areclose(x, y, relative_err = 1.e-5, absolute_err=1.e-8): > diff = abs(x - y) > ave = (abs(x) + abs(y))/2 > return diff < absolute_err or diff/ave < relative_err There is a certain beauty and clarity to this presentation; however, it is problematic numerically: * the division by either absolute_err and relative_err can overflow or trigger a ZeroDivisionError * the 'or' part of the expression can introduce an unnecessary discontinuity in the first derivative. The original Numeric definition is likely to be better for people who know what they're doing; however, I still question whether it is an appropriate remedy for the beginner issue of why 1.1 + 1.1 + 1.1 doesn't equal 3.3. Raymond From imbaczek at gmail.com Mon Feb 6 22:48:31 2006 From: imbaczek at gmail.com (=?ISO-8859-2?Q?Marek_=22Baczek=22_Baczy=F1ski?=) Date: Mon, 6 Feb 2006 22:48:31 +0100 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <002701c62b65$844a9750$7f00a8c0@RaymondLaptop1> References: <001301c62b59$cdfbe900$152c4fca@csmith> <002701c62b65$844a9750$7f00a8c0@RaymondLaptop1> Message-ID: <5f3d2c310602061348mc866626v@mail.gmail.com> 2006/2/6, Raymond Hettinger : > The original Numeric definition is likely to be better for people who know > what they're doing; however, I still question whether it is an appropriate > remedy for the beginner issue > of why 1.1 + 1.1 + 1.1 doesn't equal 3.3. Beginners won't know about math.areclose anyway (and if they will, they won't use it, thinking "why bother?"), and having a standard, well-behaved and *correct* version of a useful function can't hurt. 
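[Editor's note: for reference, a runnable sketch of the Numeric-style test the thread keeps returning to — absolute and relative tolerance combined with '+', no division, so the 0/0 and overflow hazards Alex and Raymond mention don't arise. The argument names follow the thread; the exact Numeric signature is assumed from the discussion:]

```python
def areclose(x, y, rtol=1.e-5, atol=1.e-8):
    # Numeric allclose-style test: one absolute term plus one term
    # proportional to the magnitude of y; no division, no discontinuity.
    return abs(x - y) <= atol + rtol * abs(y)

# The beginner puzzle from the thread: binary floats accumulate error...
assert 1.1 + 1.1 + 1.1 != 3.3
# ...but the values are "close" in the tolerance sense.
assert areclose(1.1 + 1.1 + 1.1, 3.3)
assert not areclose(1.0, 1.1)
```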
-- { Marek Baczyński :: UIN 57114871 :: GG 161671 :: JID imbaczek at jabber.gda.pl } { http://www.vlo.ids.gda.pl/ | imbaczek at poczta fm | http://www.promode.org } .. .. .. .. ... ... ...... evolve or face extinction ...... ... ... .. .. .. .. From rrr at ronadam.com Tue Feb 7 00:51:29 2006 From: rrr at ronadam.com (Ron Adam) Date: Mon, 06 Feb 2006 17:51:29 -0600 Subject: [Python-Dev] math.areclose ...? In-Reply-To: References: <001301c62b59$cdfbe900$152c4fca@csmith> <20060206202031.GA26735@panix.com> Message-ID: <43E7E101.10908@ronadam.com> Alex Martelli wrote: > On 2/6/06, Aahz wrote: > ... > >>>def areclose(x, y, relative_err = 1.e-5, absolute_err=1.e-8): >>> diff = abs(x - y) >>> ave = (abs(x) + abs(y))/2 >>> return diff < absolute_err or diff/ave < relative_err >>> >>>Also, separating the two terms with 'or' rather than '+' makes the >>>two error terms mean more what they are named. The '+' mixes the two >>>effects and even though the result is basically the same, it makes it >>>difficult to explain when the test will be true. >> >>Yes, that's a big help. I was a bit concerned that this would have no >>utility for numbers with large magnitude. Alex, given your focus on >>Python readability, I'm a bit surprised you didn't write this to start >>with! > > > As I said, I was just copying the definition in Numeric, which is > well-tried by long use. Besides, this "clear expression" could > present problems, such as possible overflows or divisions by zero when > ave is 0 or very small; much as I care about readability, I care about > correctness even more. It looks like the definition from Numeric measures relative error while the above measures relative deviation. I'm not sure which one would be desirable or if they are interchangeable. I was looking up relative error to try and understand the above at the following site. 
http://mathforum.org/library/drmath/view/65797.html As far as beginner vs advanced users are concerned, I think that is a matter of documentation, especially when intermediate users are concerned, which I believe are the majority. Possibly something like the following would be suitable... ? """ The absolute error is the absolute value of the difference between the accepted value and the measurement. Absolute error = abs( Observed - Accepted value ) The relative error is the percentage of absolute error relative to the accepted value. Relative error = (Absolute error / Accepted value) x 100% """ def isclose(observed, accepted, abs_err, rel_err): """Determine if the accuracy of an observed value is close to an accepted value""" diff = abs(observed - accepted) if diff < abs_err: return True try: return 100 * diff / accepted < rel_err except ZeroDivisionError: pass return False Cheers, Ron Adam From kbk at shore.net Tue Feb 7 03:45:28 2006 From: kbk at shore.net (Kurt B. Kaiser) Date: Mon, 6 Feb 2006 21:45:28 -0500 (EST) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200602070245.k172jSGE025255@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 391 open ( +0) / 3038 closed (+10) / 3429 total (+10) Bugs : 915 open ( +9) / 5540 closed (+21) / 6455 total (+30) RFE : 209 open ( +2) / 197 closed ( +0) / 406 total ( +2) New / Reopened Patches ______________________ difflib exceeding recursion limit (2006-01-24) CLOSED http://python.org/sf/1413711 opened by Gustavo Niemeyer Patch for bug #1380970 (2006-01-25) http://python.org/sf/1414934 opened by Collin Winter Clairify docs on reference stealing (2006-01-26) http://python.org/sf/1415507 opened by Collin Winter optparse enable_interspersed_args disable_interspersed_args (2006-01-26) http://python.org/sf/1415508 opened by Rocky Bernstein Configure patch for Mac OS X 10.3 (2006-01-27) http://python.org/sf/1416559 opened by Ronald Oussoren have SimpleHTTPServer return 
last-modified headers (2006-01-28) http://python.org/sf/1417555 opened by Aaron Swartz Fix "be be" documentation typo in lang ref (2006-02-01) CLOSED http://python.org/sf/1421726 opened by Wummel Changes to nis module to support multiple NIS domains (2006-02-02) CLOSED http://python.org/sf/1422385 opened by Ben Bell Patches Closed ______________ difflib exceeding recursion limit (2006-01-24) http://python.org/sf/1413711 closed by niemeyer fix bsddb test associate problems w/bsddb 4.1 (2006-01-16) http://python.org/sf/1407992 closed by greg Patch f. bug 495682 cannot handle http_proxy with user:pass@ (2005-11-05) http://python.org/sf/1349118 closed by loewis bsddb3 build problems on FreeBSD (2.4 + 2.5) (2005-02-22) http://python.org/sf/1146231 closed by greg Add support for db 4.3 (2004-11-23) http://python.org/sf/1071911 closed by nnorwitz zipfile: use correct system type on unixy systems (2006-01-23) http://python.org/sf/1412872 closed by loewis Fill out the functional module (2006-01-22) http://python.org/sf/1412451 closed by rhettinger Fix "be be" documentation typo in lang ref (2006-02-01) http://python.org/sf/1421726 closed by effbot Changes to nis module to support multiple NIS domains (2006-02-02) http://python.org/sf/1422385 closed by loewis anonymous mmap (2006-01-16) http://python.org/sf/1407135 closed by nnorwitz New / Reopened Bugs ___________________ Popenhangs with latest Cygwin update (2006-01-23) CLOSED http://python.org/sf/1413378 opened by Eric McRae Popened file object close hangs in latest Cygwin update (2006-01-23) http://python.org/sf/1413379 opened by Eric McRae zipfile: inserting some filenames produces corrupt .zips (2006-01-24) http://python.org/sf/1413790 opened by Grant Olson email.Utils.py: UnicodeError in RFC2322 header (2006-01-25) http://python.org/sf/1414018 opened by A. 
Sagawa Can only install 1 of each version of Python on Windows (2006-01-25) CLOSED http://python.org/sf/1414612 opened by Max M Rasmussen Underspecified behaviour of string.split/rsplit (2006-01-25) http://python.org/sf/1414673 opened by Collin Winter inconsistency in help(set) (2006-01-25) http://python.org/sf/1414697 opened by Gregory Petrosyan Typo in online documentation - 6.8.3.6 Replacing popen2.* (2006-01-26) CLOSED http://python.org/sf/1415455 opened by Phil Wright Inconsistency between StringIO and cStringIO (2006-01-27) http://python.org/sf/1416477 opened by Michael Kerrin Problem with SOAPpy on 64-bit systems (2006-01-27) CLOSED http://python.org/sf/1416544 opened by Gustavo J. A. M. Carneiro SimpleHTTPServer doesn't return last-modified headers (2006-01-28) http://python.org/sf/1417554 opened by Aaron Swartz EditorWindow demo causes attr-error (2006-01-29) http://python.org/sf/1417598 opened by snowman float/atof have become locale aware (2006-01-29) http://python.org/sf/1417699 opened by Bernhard Herzog PyRun_SimpleString won't parse \\x (2006-01-30) CLOSED http://python.org/sf/1418374 opened by gnupun PyImport_AppendInittab stores pointer to parameter (2006-01-31) http://python.org/sf/1419652 opened by coder_5 class dictionary shortcircuits __getattr__ (2006-01-31) http://python.org/sf/1419989 opened by Shaun Cutts IMPORT PROBLEM: Local submodule shadows global module (2006-02-01) http://python.org/sf/1421513 opened by Jens Engel [win32] stderr atty encoding not set (2006-02-01) http://python.org/sf/1421664 opened by Snaury http response dictionary incomplete (2006-02-01) http://python.org/sf/1421696 opened by Jim Jewett CVS (not SVN) mentioned in Python FAQ (2006-02-01) CLOSED http://python.org/sf/1421811 opened by Gregory Petrosyan 2.4.1 mentioned in Python FAQ as most stable version (2006-02-01) CLOSED http://python.org/sf/1421814 opened by Gregory Petrosyan Inconsistency in Programming FAQ (2006-02-01) http://python.org/sf/1421839 opened by 
Gregory Petrosyan email.MIME*.as_string removes duplicate spaces (2006-02-02) http://python.org/sf/1422094 opened by hads Unicode IOError: execfile(u'\u043a\u043a\u043a/x.py') (2006-02-02) http://python.org/sf/1422398 opened by Robert Kiendl PEP 4 additions (2006-02-02) http://python.org/sf/1423073 opened by Jim Jewett mmap module leaks file descriptors on UNIX (2006-02-02) CLOSED http://python.org/sf/1423153 opened by Fazal Majid Email tests fail (2006-02-04) CLOSED http://python.org/sf/1423972 opened by Martin v. Löwis Assert failure in signal handling (2006-02-04) CLOSED http://python.org/sf/1424017 opened by doom The mmap module does unnecessary dup() (2006-02-04) CLOSED http://python.org/sf/1424041 opened by Keith Dart The email package needs an "application" type (2006-02-04) http://python.org/sf/1424065 opened by Keith Dart urllib.FancyURLopener.redirect_internal looses data on POST! (2006-02-04) http://python.org/sf/1424148 opened by Robert Kiendl urllib: HTTPS over (Squid) Proxy fails (2006-02-04) http://python.org/sf/1424152 opened by Robert Kiendl patch for etree cdata and attr quoting (2006-02-04) http://python.org/sf/1424171 opened by Chris McDonough os.remove OSError: [Errno 13] Permission denied (2006-02-06) http://python.org/sf/1425127 opened by cheops msvccompiler.py modified to work with .NET 2005 on win64 (2006-02-06) http://python.org/sf/1425482 opened by beaudrym Bugs Closed ___________ Popenhangs with latest Cygwin update (2006-01-23) http://python.org/sf/1413378 deleted by sferic __self - Watcom compiler reserved word (2006-01-23) http://python.org/sf/1412837 closed by nnorwitz bsddb: segfault on db.associate call with Txn and large data (2006-01-23) http://python.org/sf/1413192 closed by nnorwitz Closing dbenv first bsddb doesn't release locks & segfau (2003-08-13) http://python.org/sf/788526 closed by nnorwitz cannot handle http_proxy with user:pass@ (2001-12-21) http://python.org/sf/495682 closed by loewis BSD DB test failures for BSD DB
4.1 (2005-10-19) http://python.org/sf/1332873 closed by nnorwitz 2.[345]: --with-wctype-functions 4 test failures (2004-01-10) http://python.org/sf/874534 closed by nnorwitz posixmodule uses utimes, which is broken in glibc-2.3.2 (2003-08-10) http://python.org/sf/786194 closed by nnorwitz Error: ... ossaudiodev.c, line 48: Missing type specifier (2005-05-05) http://python.org/sf/1196154 closed by nnorwitz Can only install 1 of each version of Python on Windows (2006-01-25) http://python.org/sf/1414612 closed by loewis Typo in online documentation - 6.8.3.6 Replacing popen2.* (2006-01-26) http://python.org/sf/1415455 closed by nnorwitz Problem with SOAPpy on 64-bit systems (2006-01-27) http://python.org/sf/1416544 closed by loewis PyRun_SimpleString won't parse \\x (2006-01-30) http://python.org/sf/1418374 deleted by effbot Registry key CurrentVersion not set (2003-10-22) http://python.org/sf/827963 closed by loewis CVS (not SVN) mentioned in Python FAQ (2006-02-01) http://python.org/sf/1421811 closed by loewis 2.4.1 mentioned in Python FAQ as most stable version (2006-02-01) http://python.org/sf/1421814 closed by loewis urllib2 doesn't do HTTP-EQUIV & Refresh (2002-10-21) http://python.org/sf/626543 closed by jjlee urllib2 dont respect debuglevel in httplib (2005-02-27) http://python.org/sf/1152723 closed by abbatini TimedRotatingFileHandler midnight rollover time increases (01/04/06) http://python.org/sf/1396622 closed by sf-robot mmap module leaks file descriptors on UNIX (2006-02-02) http://python.org/sf/1423153 closed by nnorwitz Email tests fail (2006-02-04) http://python.org/sf/1423972 closed by bwarsaw Assert failure in signal handling (2006-02-04) http://python.org/sf/1424017 closed by nnorwitz The mmap module does unnecessary dup() (2006-02-04) http://python.org/sf/1424041 closed by nnorwitz r41552 broke test_file on OS X (2005-12-04) http://python.org/sf/1373161 closed by nnorwitz New / Reopened RFE __________________ lib-deprecated (2006-02-02) 
http://python.org/sf/1423082 opened by Jim Jewett Support for MSVC 7 and MSVC8 in msvccompiler (2006-02-06) http://python.org/sf/1425256 opened by dlm

From edgimar at lycos.com Sun Feb 5 22:26:48 2006 From: edgimar at lycos.com (Mark Edgington) Date: Sun, 05 Feb 2006 22:26:48 +0100 Subject: [Python-Dev] threadsafe patch for asynchat Message-ID: <43E66D98.3080008@lycos.com>

Does anyone have any comments about applying the following patch to asynchat? It should not affect the behavior of the module in any way for those who do not want to use the feature provided by the patch. The point of the patch is to make it easy to use asynchat in a multithreaded application. Maybe I am missing something, and the patch really doesn't make it threadsafe? Any comments would be appreciated. Also, if it looks good to everyone, feel free to use it.

-------BEGIN PATCH----------
--- asynchat.py	Fri Oct 15 03:03:16 2004
+++ asynchat.py.new	Sun Feb 05 22:05:42 2006
@@ -59,10 +59,11 @@
     ac_in_buffer_size = 4096
     ac_out_buffer_size = 4096

-    def __init__ (self, conn=None):
+    def __init__ (self, conn=None, running_in_thread=False):
         self.ac_in_buffer = ''
         self.ac_out_buffer = ''
         self.producer_fifo = fifo()
+        self.running_in_thread = running_in_thread
         asyncore.dispatcher.__init__ (self, conn)

     def collect_incoming_data(self, data):
@@ -157,7 +158,9 @@
     def push (self, data):
         self.producer_fifo.push (simple_producer (data))
-        self.initiate_send()
+        # only initiate a send if not running in a threaded environment, since
+        # initiate_send() is not threadsafe.
+        if not self.running_in_thread:
+            self.initiate_send()

     def push_with_producer (self, producer):
         self.producer_fifo.push (producer)
-------END PATCH----------

-Mark

From xavier.morel at masklinn.net Sun Feb 5 19:43:04 2006 From: xavier.morel at masklinn.net (Morel Xavier) Date: Sun, 05 Feb 2006 19:43:04 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <43E64738.6030600@masklinn.net>

Guido van Rossum wrote:
> After so many attempts to come up with an alternative for lambda,
> perhaps we should admit defeat. I've not had the time to follow the
> most recent rounds, but I propose that we keep lambda, so as to stop
> wasting everybody's talent and time on an impossible quest.

The inline anonymous `def` isn't as ugly/problematic as the block (block anonymous def) version, and could probably work better than lambda, I think (a bit more verbose, but at least it doesn't feel like a castrated function definition, is more coherent with the existing function definition syntax, and accepts more than a single statement... well, that last part probably isn't a pro argument...). Couldn't it be enabled (as an inline construct only) to replace the current lambda?

From brett at python.org Tue Feb 7 03:56:12 2006 From: brett at python.org (Brett Cannon) Date: Mon, 6 Feb 2006 18:56:12 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID:

On 2/5/06, Guido van Rossum wrote:
> After so many attempts to come up with an alternative for lambda,
> perhaps we should admit defeat. I've not had the time to follow the
> most recent rounds, but I propose that we keep lambda, so as to stop
> wasting everybody's talent and time on an impossible quest.

I have been thinking about this, and I have to say I am a little disappointed (-0 disappointed, not -1 disappointed). I honestly bought the argument for removing lambda.
And I think that a deferred object would help with one of lambda's biggest uses and would make its loss totally reasonable. But I know that everyone and their email client is against me on this one, so I am not going to really try to tear into this. But I do think that lambda needs a renaming. Speaking as someone who still forgets that Python's lambda is not the same as those found in functional languages, I would much rather have it named 'expr' or 'expression' or something that is more in line with its abilities than with a name taken for CS historical reasons. This ain't your father's lambda and thus shouldn't be named so. Then again, Guido did say he "should", not that he "did" admit defeat. =)

-Brett

From martin at v.loewis.de Tue Feb 7 07:29:49 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 07 Feb 2006 07:29:49 +0100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <43E66D98.3080008@lycos.com> References: <43E66D98.3080008@lycos.com> Message-ID: <43E83E5D.3010000@v.loewis.de>

Mark Edgington wrote:
> Does anyone have any comments about applying the following patch to
> asynchat?

That patch looks wrong. What does it mean to "run in a thread"? All code runs in a thread, all the time: sometimes, that thread is the main thread. Furthermore, I can't see any presumed thread-unsafety in asynchat. Sure, there are a lot of member variables in asynchat which aren't specifically protected against mutual access from different threads. So you shouldn't be accessing the same async_chat object from multiple threads. I cannot see why creating and using an async_chat object in a thread that is not the main thread could cause any problems. I also cannot see how this patch could have significant effect on async_chat's behaviour when used in multiple threads.
Regards,
Martin

From martin at v.loewis.de Tue Feb 7 07:36:12 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 07 Feb 2006 07:36:12 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <43E83FDC.9020402@v.loewis.de>

Brett Cannon wrote:
> But I know that everyone and their email client is against me on this
> one, so I am not going to really try to tear into this. But I do
> think that lambda needs a renaming. Speaking as someone who still
> forgets that Python's lambda is not the same as those found in
> functional languages

Can you elaborate on that point? I feel that Python's lambda is exactly the same as the one in Lisp. Sure, the Lisp lambda supports multiple sequential expressions (the "progn" feature), but I understand that this is just "an extension" (although one that has been around several decades).

Of course, Python's expressions are much more limited than Lisp's (where you really can have macros and special forms as the "expression" in a lambda), but the lambda construct itself seems to be the very same one.

Regards,
Martin

From radeex at gmail.com Tue Feb 7 08:19:35 2006 From: radeex at gmail.com (Christopher Armstrong) Date: Tue, 7 Feb 2006 18:19:35 +1100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43E83FDC.9020402@v.loewis.de> References: <43E83FDC.9020402@v.loewis.de> Message-ID: <60ed19d40602062319v4bf6c4e2x49a657176d7955d9@mail.gmail.com>

On 2/7/06, "Martin v. Löwis" wrote:
> Brett Cannon wrote:
> > But I know that everyone and their email client is against me on this
> > one, so I am not going to really try to tear into this. But I do
> > think that lambda needs a renaming. Speaking as someone who still
> > forgets that Python's lambda is not the same as those found in
> > functional languages
>
> Can you elaborate on that point? I feel that Python's lambda is exactly
> the same as the one in Lisp.
> Sure, the Lisp lambda supports multiple
> sequential expressions (the "progn" feature), but I understand that
> this is just "an extension" (although one that has been around several
> decades).
>
> Of course, Python's expressions are much more limited than Lisp's (where
> you really can have macros and special forms as the "expression"
> in a lambda), but the lambda construct itself seems to be the very
> same one.

If we phrase it somewhat differently, we can see that lambdas are different in Python and Lisp, in a very practical way. First: Everything in Lisp is an expression. There's no statement, in Lisp, that isn't also an expression. Lambdas in Lisp can contain arbitrary expressions; therefore you can put any language construct inside a lambda. In Python, you cannot put any language construct inside a lambda. Python's and Lisp's lambdas are effectively totally different.

+1 on keeping Lambda, +1 on making it more useful.

--
Twisted   | Christopher Armstrong: International Man of Twistery
Radix     |   -- http://radix.twistedmatrix.com
          | Release Manager, Twisted Project
  \\\V/// |   -- http://twistedmatrix.com
   |o O|  |
w----v----w-+

From seojiwon at gmail.com Tue Feb 7 09:34:39 2006 From: seojiwon at gmail.com (Jiwon Seo) Date: Tue, 7 Feb 2006 00:34:39 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <60ed19d40602062319v4bf6c4e2x49a657176d7955d9@mail.gmail.com> References: <43E83FDC.9020402@v.loewis.de> <60ed19d40602062319v4bf6c4e2x49a657176d7955d9@mail.gmail.com> Message-ID:

On 2/6/06, Christopher Armstrong wrote:
> On 2/7/06, "Martin v. Löwis" wrote:
> > Brett Cannon wrote:
> > > But I know that everyone and their email client is against me on this
> > > one, so I am not going to really try to tear into this. But I do
> > > think that lambda needs a renaming. Speaking as someone who still
> > > forgets that Python's lambda is not the same as those found in
> > > functional languages
> >
> > Can you elaborate on that point?
> > I feel that Python's lambda is exactly
> > the same as the one in Lisp. Sure, the Lisp lambda supports multiple
> > sequential expressions (the "progn" feature), but I understand that
> > this is just "an extension" (although one that has been around several
> > decades).
> >
> > Of course, Python's expressions are much more limited than Lisp's (where
> > you really can have macros and special forms as the "expression"
> > in a lambda), but the lambda construct itself seems to be the very
> > same one.
>
> If we phrase it somewhat differently, we can see that lambdas are
> different in Python and Lisp, in a very practical way. First:
> Everything in Lisp is an expression. There's no statement, in Lisp,
> that isn't also an expression. Lambdas in Lisp can contain arbitrary
> expressions; therefore you can put any language construct inside a
> lambda. In Python, you cannot put any language construct inside a
> lambda. Python's and Lisp's lambdas are effectively totally different.
>
> +1 on keeping Lambda, +1 on making it more useful.

After lambda being made more useful, can I hope that I will be able to use lambda with multiple statements? :) Lambdas in Lisp and Python are different, but from a usability perspective they don't need to differ too much.

-Jiwon

From thomas at thomas-lotze.de Tue Feb 7 09:52:02 2006 From: thomas at thomas-lotze.de (Thomas Lotze) Date: Tue, 07 Feb 2006 09:52:02 +0100 Subject: [Python-Dev] Let's just *keep* lambda References: <43E83FDC.9020402@v.loewis.de> <60ed19d40602062319v4bf6c4e2x49a657176d7955d9@mail.gmail.com> Message-ID:

Jiwon Seo wrote:
> After lambda being made more useful, can I hope that I will be able to use
> lambda with multiple statements? :) Lambdas in Lisp and Python are
> different, but from a usability perspective they don't need to differ too
> much.

I don't think it helps usability much if anonymous functions are allowed multiple statements.
IMO greater amounts of code deserve a named function for readability's sake, and the distinction between expressions and suites feels like a good criterion for what is a greater amount of code. In any case, it's the same limit as found in list and generator expressions or the proposed conditional expression. -- Thomas From p.f.moore at gmail.com Tue Feb 7 10:56:31 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 7 Feb 2006 09:56:31 +0000 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <79990c6b0602070156s72dab4dga54c9d3b2887df2c@mail.gmail.com> On 2/7/06, Brett Cannon wrote: > On 2/5/06, Guido van Rossum wrote: > > After so many attempts to come up with an alternative for lambda, > > perhaps we should admit defeat. I've not had the time to follow the > > most recent rounds, but I propose that we keep lambda, so as to stop > > wasting everybody's talent and time on an impossible quest. > > I have been thinking about this, and I have to say I am a little > disappointed (-0 disappointed, not -1 disappointed). I honestly > bought the argument for removing lambda. And I think that a deferred > object would help with one of lambda's biggest uses and made its loss > totally reasonable. I'm not 100% sure what you mean here, but as far as my understanding goes, current lambda *is* a "deferred object" (or at least a "deferred expression", which may not be quite what you mean...) > But I know that everyone and their email client is against me on this > one, so I am not going to really try to tear into this. But I do > think that lambda needs a renaming. I agree with this. The *name* "lambda" is a wart, even if the deferred expression feature isn't. My preference is to simply replace the keyword lambda with a keyword "expr" (or if that's not acceptable because there's too much prior use of expr as a variable name, then maybe "expression" - but that's starting to get a bit long). 
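Paul's reading of lambda as a "deferred expression" can be illustrated with a short sketch (an editor's illustration, not from the thread; the names are made up):

```python
# A lambda defers evaluation of a single expression; apart from the
# missing name, it behaves like a one-line def.
scale = 3
as_lambda = lambda x: x * scale

def as_def(x):
    return x * scale

# Typical use: a throwaway key function whose expression is evaluated
# later, once per element, during the sort.
pairs = [(2, 'b'), (1, 'a')]
pairs.sort(key=lambda p: p[0])
```

Both spellings produce ordinary function objects; the one thing the lambda form cannot contain is a statement.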
> Speaking as someone who still
> forgets that Python's lambda is not the same as those found in
> functional languages,

Well, only in the sense that Python's *expressions* are not the same as those found in functional languages (ie, Python has statements which are not expressions). But I see your point - and I strongly object to going the other way and extending lambda/expr to allow statements or suites.

> I would much rather have it named 'expr' or
> 'expression' or something that is more in line with its abilities than
> with a name taken for CS historical reasons. This ain't your father's
> lambda and thus shouldn't be named so.

Agreed. But if "expr" isn't acceptable, I don't like the other common suggestion of reusing "def". It's not a definition, nor is it "like an anonymous function" (the lack of support for statements/suites being the key difference).

> Then again, Guido did say he "should", not that he "did" admit defeat. =)

OTOH, he was trying to stop the endless discussion... :-)

Paul.

From crutcher at gmail.com Tue Feb 7 11:46:45 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Tue, 7 Feb 2006 02:46:45 -0800 Subject: [Python-Dev] Any interest in tail call optimization as a decorator? Message-ID:

Maybe someone has already brought this up, but my searching hasn't revealed it. Is there any interest in something like this for the functional module?

#!/usr/bin/env python2.4
# This program shows off a python decorator which implements
# tail call optimization. It does this by throwing an exception
# if it is its own grandparent, and catching such exceptions
# to recall the stack.
import sys

class TailRecurseException:
    def __init__(self, args, kwargs):
        self.args = args
        self.kwargs = kwargs

def tail_call_optimized(g):
    def func(*args, **kwargs):
        try:
            raise ZeroDivisionError
        except ZeroDivisionError:
            f = sys.exc_info()[2].tb_frame
        if f.f_back and f.f_back.f_back \
           and f.f_back.f_back.f_code == f.f_code:
            raise TailRecurseException(args, kwargs)
        else:
            while 1:
                try:
                    return g(*args, **kwargs)
                except TailRecurseException, e:
                    args = e.args
                    kwargs = e.kwargs
    func.__doc__ = g.__doc__
    return func

@tail_call_optimized
def factorial(n, acc=1):
    "calculate a factorial"
    if n == 0:
        return acc
    return factorial(n-1, n*acc)

print factorial(10000)
# prints a big, big number,
# but doesn't hit the recursion limit.

@tail_call_optimized
def fib(i, current = 0, next = 1):
    if i == 0:
        return current
    else:
        return fib(i - 1, next, current + next)

print fib(10000)
# also prints a big number,
# but doesn't hit the recursion limit.

--
Crutcher Dunnavant
littlelanguages.com
monket.samedi-studios.com

From arigo at tunes.org Tue Feb 7 11:42:24 2006 From: arigo at tunes.org (Armin Rigo) Date: Tue, 7 Feb 2006 11:42:24 +0100 Subject: [Python-Dev] cProfile module Message-ID: <20060207104224.GA6204@code0.codespeak.net>

Hi all,

As promised two months ago, I eventually finished the integration of the 'lsprof' profiler. It's now in an internal '_lsprof' module that is exposed via a 'cProfile' module with the same interface as 'profile', producing compatible dump stats that can be inspected with 'pstats'. See previous discussion here:

* http://mail.python.org/pipermail/python-dev/2005-November/058212.html

The code is currently in the following repository, from where I'll merge it into CPython if nobody objects:

* http://codespeak.net/svn/user/arigo/hack/misc/lsprof/Doc
* http://codespeak.net/svn/user/arigo/hack/misc/lsprof/Lib
* http://codespeak.net/svn/user/arigo/hack/misc/lsprof/Modules

with tests and docs, including new tests and doc refinements for profile itself.
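For readers unfamiliar with the module being merged, a minimal usage sketch of the cProfile/pstats interface Armin describes (the profiled function and file name here are made up; syntax is modern Python):

```python
import cProfile
import pstats

def work():
    return sum(i * i for i in range(1000))

# cProfile exposes the same interface as the old 'profile' module:
# profile a single call, then dump pstats-compatible statistics.
prof = cProfile.Profile()
result = prof.runcall(work)
prof.dump_stats('work.prof')

# The dump can be inspected with the existing pstats module.
stats = pstats.Stats('work.prof')
stats.sort_stats('time')
```

The point of the compatible dump format is exactly this last step: existing pstats-based tooling keeps working unchanged.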
The docs mark hotshot as "reserved for specialized usage". They probably need a bit of bad-English-hunting... And yes, I do promise to maintain this code in the future.

A bientot,

Armin

From murman at gmail.com Tue Feb 7 16:47:46 2006 From: murman at gmail.com (Michael Urman) Date: Tue, 7 Feb 2006 09:47:46 -0600 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID:

On 2/6/06, Brett Cannon wrote:
> And I think that a deferred object would help with one of
> lambda's biggest uses and made its loss totally reasonable.

The ambiguity inherent in a deferred object makes a general one impractical. Both map(Deferred().attribute, seq) and map(Deferred().method(arg), seq) look the same - how does the object know that in the first case it should return the attribute of the first element of seq when called, but in the second it should wait for the next call, when it will call method(arg) on the first element of seq? Since there's also no way to spell "lambda y: foo(x, y, z)" on a simple deferred object, it's strictly less powerful. If the current Python lambda's functionality is desired, there is no better pythonic way to spell it. There are plenty of new syntactic options that help highlight its expression nature, but are they worth the change?

Michael
--
Michael Urman  http://www.tortall.net/mu/blog/

From martin at v.loewis.de Tue Feb 7 20:11:06 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 07 Feb 2006 20:11:06 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <43E83FDC.9020402@v.loewis.de> <60ed19d40602062319v4bf6c4e2x49a657176d7955d9@mail.gmail.com> Message-ID: <43E8F0CA.2030809@v.loewis.de>

Jiwon Seo wrote:
> After lambda being made more useful, can I hope that I will be able to
> use lambda with multiple statements? :) Lambdas in Lisp and Python are
> different, but in the usability perspective they don't need to differ
> too much.
To my knowledge, nobody proposed to make it "more useful", or to allow statements in the body of a lambda expression (neither single nor multiple).

Regards,
Martin

From oliphant.travis at ieee.org Tue Feb 7 20:52:21 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Tue, 07 Feb 2006 12:52:21 -0700 Subject: [Python-Dev] Help with Unicode arrays in NumPy Message-ID:

This is a design question, which is why I'm posting here. Recently the NumPy developers have become more aware of the difference between UCS2 and UCS4 builds of Python. NumPy arrays can be of Unicode type. In other words a NumPy array can be made up of fixed-data-length unicode strings. Currently that means that they are "unicode" strings of basic size UCS2 or UCS4 depending on the platform. It is this duality that has some people concerned. For all other data-types, NumPy allows the user to explicitly request a bit-width for the data-type. So, we are thinking of introducing another data-type to NumPy to differentiate between UCS2 and UCS4 unicode strings. (This also means a unicode scalar object, i.e. string of each of these, exactly one of which will inherit from the Python type.) Before embarking on this journey, however, we are seeking advice from individuals wiser in the ways of Unicode on this list. Perhaps all we need to do is be more careful on input and output of Unicode data-types so that transfer of unicode can be handled correctly on each platform. Any thoughts?

-Travis Oliphant

From martin at v.loewis.de Tue Feb 7 21:06:28 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 07 Feb 2006 21:06:28 +0100 Subject: [Python-Dev] Help with Unicode arrays in NumPy In-Reply-To: References: Message-ID: <43E8FDC4.1010607@v.loewis.de>

Travis E. Oliphant wrote:
> Currently that means that they are "unicode" strings of basic size UCS2
> or UCS4 depending on the platform. It is this duality that has some
> people concerned.
> For all other data-types, NumPy allows the user to
> explicitly request a bit-width for the data-type.

Why is that a desirable property? Also: Why does NumPy have support for Unicode arrays in the first place?

> Before embarking on this journey, however, we are seeking advice from
> individuals wiser in the ways of Unicode on this list.

My initial reaction is: use whatever Python uses in "NumPy Unicode". Upon closer inspection, it is not all that clear what operations are supported on a Unicode array, and how these operations relate to the Python Unicode type.

In any case, I think NumPy should have only a single "Unicode array" type (please do explain why having zero of them is insufficient).

If the purpose of the type is to interoperate with a Python unicode object, it should use the same width (as this will allow for memcpy).

If the purpose is to support arbitrary Unicode characters, it should use 4 bytes (as two bytes are insufficient to represent arbitrary Unicode characters).

If the purpose is something else, please explain what the purpose is.

Regards,
Martin

From oliphant.travis at ieee.org Tue Feb 7 21:23:13 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Tue, 07 Feb 2006 13:23:13 -0700 Subject: [Python-Dev] Help with Unicode arrays in NumPy In-Reply-To: <43E8FDC4.1010607@v.loewis.de> References: <43E8FDC4.1010607@v.loewis.de> Message-ID:

Martin v. Löwis wrote:
> Travis E. Oliphant wrote:
>>Currently that means that they are "unicode" strings of basic size UCS2
>>or UCS4 depending on the platform. It is this duality that has some
>>people concerned. For all other data-types, NumPy allows the user to
>>explicitly request a bit-width for the data-type.
>
> Why is that a desirable property? Also: Why does NumPy have support for
> Unicode arrays in the first place?

Numpy supports arrays of arbitrary fixed-length "records". It is much more than numeric-only data now. One of the fields that a record can contain is a string.
If strings are supported, it makes sense to support unicode strings as well. This allows NumPy to memory-map arbitrary data-files on disk. Perhaps you should explain why you think NumPy "shouldn't support Unicode".

> My initial reaction is: use whatever Python uses in "NumPy Unicode".
> Upon closer inspection, it is not all that clear what operations
> are supported on a Unicode array, and how these operations relate
> to the Python Unicode type.

That is currently what is done. The current unicode data-type is exactly what Python uses. The chararray subclass gives unicode and string arrays all the methods of unicode and strings (operating on an element-by-element basis). When you extract an element from the unicode data-type you get a Python unicode object (every NumPy data-type has a corresponding "type-object" that determines what is returned when an element is extracted). All of these types are in a hierarchy of data-types which inherit from the basic Python types when available.

> In any case, I think NumPy should have only a single "Unicode array"
> type (please do explain why having zero of them is insufficient).

Please explain why having zero of them is *sufficient*.

> If the purpose of the type is to interoperate with a Python
> unicode object, it should use the same width (as this will
> allow for memcpy).
>
> If the purpose is to support arbitrary Unicode characters, it should
> use 4 bytes (as two bytes are insufficient to represent arbitrary
> Unicode characters).

And Python does not support arbitrary Unicode characters on narrow builds? Then how is \U0010FFFF represented?

> If the purpose is something else, please explain what the purpose
> is.

The purpose is to represent bytes as they might exist in a file or data-stream according to the user's specification. The purpose is whatever the user wants them for. It's the same purpose as having an unsigned 64-bit data-type --- because users may need it to represent data as it exists in a file.
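Travis's fixed-length record idea can be sketched without NumPy, using only the standard library's struct module (an editor's illustration in modern Python, not NumPy's actual API; the field layout is invented):

```python
import struct

# One fixed-length record: an 8-byte name field plus a 64-bit float.
# Because every record has the same size, a file of such records can be
# indexed (or memory-mapped) by simple offset arithmetic -- the property
# Travis describes for NumPy record arrays.
record = struct.Struct('<8sd')

def pack(name, value):
    # Truncate/pad the string field to its fixed width, NUL-padded.
    raw_name = name.encode('utf-8')[:8]
    return record.pack(raw_name.ljust(8, b'\0'), value)

def unpack(raw):
    raw_name, value = record.unpack(raw)
    return raw_name.rstrip(b'\0').decode('utf-8'), value
```

Every packed record is exactly `record.size` bytes (16 here), which is what makes offset-based access to the i-th record possible.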
From theller at python.net Tue Feb 7 21:52:07 2006 From: theller at python.net (Thomas Heller) Date: Tue, 07 Feb 2006 21:52:07 +0100 Subject: [Python-Dev] ctypes patch (was: (libffi) Re: Copyright issue) In-Reply-To: <43E5F9AA.4080409@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?= =?iso-8859-1?Q?wis's?= message of "Sun, 05 Feb 2006 14:12:10 +0100") References: <4f0b69dc0602020944l3bcfe1d2v1bc149ac3f202e91@mail.gmail.com> <43E5F9AA.4080409@v.loewis.de> Message-ID: > Hye-Shik Chang writes: >>> > I did some work to make ctypes+libffi compacter and liberal. >>> > http://openlook.org/svnpublic/ctypes-compactffi/ (svn) >>> > >> Here goes patches for the integration: >> >> [1] http://people.freebsd.org/~perky/ctypesinteg-f1.diff.bz2 >> [2] http://people.freebsd.org/~perky/ctypesinteg-f2.diff.bz2 >> >> I implemented it in two flavors. [1] runs libffi's configure along with >> Python's and setup.py just builds it. And [2] has no change to >> Python's configure and setup.py runs libffi configure and builds it. >> And both patches don't have things for documentations yet. [Thomas Heller] > My plan is to make separate ctypes releases for 2.3 and 2.4, even after > it is integrated into Python 2.5, so it seems [2] would be better - it > must be possible to build ctypes without Python. > > As I said before, docs need still to be written. I think content is > more important than markup, so I'm writing in rest, it can be converted > to latex later. I expect that writing the docs will show quite some > edges that need to be cleaned up - that should certainly be done before > the first 2.5 release. > > Also I want to make a few releases before declaring the 1.0 version. > This does not mean that I'm against integrating it right now. "Martin v. L?wis" writes: > Not sure whether you think you need further approval: if you are ready > to check this into the Python trunk, just go ahead. 
As I said, I would > prefer if what is checked in is a literal copy of the ctypes CVS (as > far as reasonable). I was not looking for further approval; I wanted to explain why I prefer the patch [2] that Hye-Shik posted above. I'll do at least one separate ctypes release before checking this into the Python trunk. Thanks, Thomas From martin at v.loewis.de Tue Feb 7 21:53:16 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 07 Feb 2006 21:53:16 +0100 Subject: [Python-Dev] Help with Unicode arrays in NumPy In-Reply-To: References: <43E8FDC4.1010607@v.loewis.de> Message-ID: <43E908BC.7020901@v.loewis.de> Travis E. Oliphant wrote: > Numpy supports arrays of arbitrary fixed-length "records". It is > much more than numeric-only data now. One of the fields that a > record can contain is a string. If strings are supported, it makes > sense to support unicode strings as well. Hmm. How do you support strings in fixed-length records? Strings are variable-sized, after all. One common application is that you have a C struct in some API which has a fixed-size array for string data (either with a length field, or null-terminated); in this case, it is moderately useful to model such a struct in Python. However, transferring this to Unicode is pointless - there aren't any similar Unicode structs that need support. > This allows NumPy to memory-map arbitrary data-files on disk. Ok, so this is the "C struct" case. Then why do you need Unicode support there? Which common file format has embedded fixed-size Unicode data? > Perhaps you should explain why you think NumPy "shouldn't support > Unicode" I think I said "Unicode arrays", not Unicode. Unicode arrays are a pointless data type, IMO. Unicode always comes in strings (i.e. variable sized, either null-terminated or with an introducing length). On disk/on the wire Unicode comes as UTF-8 more often than not.
Using UCS-2/UCS-4 as an on-disk representation is also questionable practice (although admittedly Microsoft uses that a lot). > That is currently what is done. The current unicode data-type is > exactly what Python uses. Then I wonder how this goes along with the use case "allow to map arbitrary files". > The chararray subclass gives to unicode and string arrays all the > methods of unicode and strings (operating on an element-by-element > basis). For strings, I can see use cases (although I wonder how you deal with data formats that also support variable-sized strings, as most data formats supporting strings do). > Please explain why having zero of them is *sufficient*. Because I (still) cannot imagine any specific application that might need such a feature (IOW: YAGNI). >> If the purpose is to support arbitrary Unicode characters, it >> should use 4 bytes (as two bytes are insufficient to represent >> arbitrary Unicode characters). > > > And Python does not support arbitrary Unicode characters on narrow > builds? Then how is \U0010FFFF represented? It's represented using UTF-16. Try this for yourself:

py> len(u"\U0010FFFF")
2
py> u"\U0010FFFF"[0]
u'\udbff'
py> u"\U0010FFFF"[1]
u'\udfff'

This has all kinds of non-obvious implications. > The purpose is to represent bytes as they might exist in a file or > data-stream according to the user's specification. See, and this is precisely the statement that I challenge. Sure, they "might" exist - but I'd rather expect that they don't. If they exist, "Unicode" might come as variable-sized UTF-8, UTF-16, or UTF-32. In either case, NumPy should already support that by mapping a string object onto the encoded bytes, to which you then can apply .decode() should you need to process the actual Unicode data. > The purpose is > whatever the user wants them for. It's the same purpose as having an > unsigned 64-bit data-type --- because users may need it to represent > data as it exists in a file. No.
I would expect you have 64-bit longs because users *do* need them, and because there wouldn't be an easy work-around if users didn't have them. For Unicode, it's different: users don't directly need them (at least not many users), and if they do, there is an easy work-around for their absence. Say I want to process NTFS run lists. In NTFS run lists, there are 24-bit integers, 40-bit integers, and 4-bit integers (i.e. nibbles). Can I represent them all in NumPy? Can I have NumPy transparently map a sequence of run list records (which are variable-sized) as an array of run list records? Regards, Martin From brett at python.org Tue Feb 7 22:43:15 2006 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2006 13:43:15 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <60ed19d40602062319v4bf6c4e2x49a657176d7955d9@mail.gmail.com> References: <43E83FDC.9020402@v.loewis.de> <60ed19d40602062319v4bf6c4e2x49a657176d7955d9@mail.gmail.com> Message-ID: On 2/6/06, Christopher Armstrong wrote: > On 2/7/06, "Martin v. Löwis" wrote: > > Brett Cannon wrote: > > > But I know that everyone and their email client is against me on this > > > one, so I am not going to really try to tear into this. But I do > > > think that lambda needs a renaming. Speaking as someone who still > > > forgets that Python's lambda is not the same as those found in > > > functional languages > > > > Can you elaborate on that point? I feel that Python's lambda is exactly > > the same as the one in Lisp. Sure, the Lisp lambda supports multiple > > sequential expressions (the "progn" feature), but I understand that > > this is just "an extension" (although one that has been around several > > decades). > > > > Of course, Python's expressions are much more limited than Lisp's (where > > you really can have macros and special forms in as the "expression" > > in a lambda), but the lambda construct itself seems to be the very > > same one.
> > If we phrase it somewhat differently, we can see that lambdas are > different in Python and Lisp, in a very practical way. First: > Everything in Lisp is an expression. There's no statement, in Lisp, > that isn't also an expression. Lambdas in Lisp can contain arbitrary > expressions; therefore you can put any language construct inside a > lambda. In Python, you cannot put any language construct inside a > lambda. Python's and Lisp's lambdas are effectively totally different. > Chris is exactly right in what I meant. Lisp-like languages do not have the statement/expression dichotomy. For instance, function definitions are syntactic sugar for defining a lambda expression that is bound to a name. This only works in Python if the function body is a single expression, which is not the entire language. For Lisp, though, that can be anything allowed in the language, so the abilities are different. -Brett From brett at python.org Tue Feb 7 22:52:51 2006 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2006 13:52:51 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <79990c6b0602070156s72dab4dga54c9d3b2887df2c@mail.gmail.com> References: <79990c6b0602070156s72dab4dga54c9d3b2887df2c@mail.gmail.com> Message-ID: On 2/7/06, Paul Moore wrote: > On 2/7/06, Brett Cannon wrote: > > On 2/5/06, Guido van Rossum wrote: > > > After so many attempts to come up with an alternative for lambda, > > > perhaps we should admit defeat. I've not had the time to follow the > > > most recent rounds, but I propose that we keep lambda, so as to stop > > > wasting everybody's talent and time on an impossible quest. > > > > I have been thinking about this, and I have to say I am a little > > disappointed (-0 disappointed, not -1 disappointed). I honestly > > bought the argument for removing lambda. And I think that a deferred > > object would help with one of lambda's biggest uses and made its loss > > totally reasonable.
> > I'm not 100% sure what you mean here, but as far as my understanding > goes, current lambda *is* a "deferred object" (or at least a "deferred > expression", which may not be quite what you mean...) > Yes, lambda is deferred. What I mean is using lambda for things like ``lambda x: x.attr`` and such; specifically for deferred execution, and not for stuff like ``lambda x: func(1, 2, x, 3, 4)`` stuff. > > But I know that everyone and their email client is against me on this > > one, so I am not going to really try to tear into this. But I do > > think that lambda needs a renaming. > > I agree with this. The *name* "lambda" is a wart, even if the deferred > expression feature isn't. My preference is to simply replace the > keyword lambda with a keyword "expr" (or if that's not acceptable > because there's too much prior use of expr as a variable name, then > maybe "expression" - but that's starting to get a bit long). > > > Speaking as someone who still > > forgets that Python's lambda is not the same as those found in > > functional languages, > > Well, only in the sense that Python's *expressions* are not the same > as those found in functional languages (ie, Python has statements > which are not expressions). But I see your point - and I strongly > object to going the other way and extending lambda/expr to allow > statements or suites. > > > I would much rather have it named 'expr' or > > 'expression' or something that is more in line with its abilities than > > with a name taken for CS historical reasons. This ain't your father's > > lambda and thus shouldn't be named so. > > Agreed. But if "expr" isn't acceptable, I don't like the other common > suggestion of reusing "def". It's not a definition, nor is it "like an > anonymous function" (the lack of support for statements/suites being > the key difference). > Yeah, reusing def is taking back into the functional world too much.
It makes our current use of def seem more like syntactic sugar for assigning a lambda to a name for function definition and that is not what is happening here. > > Then again, Guido did say he "should", not that he "did" admit defeat. =) > > OTOH, he was trying to stop the endless discussion... :-) > =) Well, it should when Python 3 comes out, so there is some extra incentive for that to happen sooner rather than later. -Brett From brett at python.org Tue Feb 7 23:05:32 2006 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2006 14:05:32 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: On 2/7/06, Michael Urman wrote: > On 2/6/06, Brett Cannon wrote: > > And I think that a deferred object would help with one of > > lambda's biggest uses and made its loss totally reasonable. > > The ambiguity inherent from the perspective of a deferred object makes > a general one impractical. Both map(Deferred().attribute, seq) and > map(Deferred().method(arg), seq) look the same - how does the object > know that in the first case it should return the attribute of the first > element of seq when called, but in the second it should wait for the > next call when it will call method(arg) on the first element of seq? > Magic. =) Honestly, I don't know, but I bet there is some evil, black magic way to pull it off. Otherwise, worst case, Deferred takes an argument that flags that it has a method being called on it for it to defer against and not to treat it as an attribute access only. And that is within reason in terms of interface requirement for the object, in my opinion. > Since there's also no way to spell "lambda y: foo(x, y, z)" on a > simple deferred object, it's strictly less powerful. If the current > Python lambda's functionality is desired, there is no better pythonic > way to spell it. There are plenty of new syntactic options that help > highlight its expression nature, but are they worth the change?
I never claimed that a deferred object would replace all uses of lambda, just that it would make it reasonable. For the above suggestion I would go to a named function. Or, if the argument was on the end or everything named, use functional.partial(). -Brett From martin at v.loewis.de Tue Feb 7 23:55:47 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 07 Feb 2006 23:55:47 +0100 Subject: [Python-Dev] Linking with msvcrt Message-ID: <43E92573.6090300@v.loewis.de> I just came up with an idea how to resolve the VC versioning problems for good: Python should link with msvcrt.dll (which is part of the operating system), not with the CRT that the compiler provides. To do that, we would need to compile and link with the SDK header files and import libraries, not with the ones that visual studio provides. For that to work, everyone building Python or Python extensions (*) would have to install the Platform SDK (which is available for free, but contains quite a number of bits). Would that be acceptable? Disclaimer: I haven't tried yet whether this would actually work. Regards, Martin (*) For Python extensions, it should be possible to use mingw instead, and configure it for linking against msvcrt. From guido at python.org Wed Feb 8 00:05:38 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 7 Feb 2006 15:05:38 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <43E83E5D.3010000@v.loewis.de> References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> Message-ID: IMO asynchat and asyncore are braindead. They should really be removed from the standard library. The code is 10 years old and represents at least 10-year-old thinking about how to do this. The amount of hackery in Zope related to asyncore was outrageous -- basically most of asyncore's guts were replaced with more advanced Zope code, but the API was maintained for compatibility reasons. A nightmare. --Guido On 2/6/06, "Martin v.
Löwis" wrote: > Mark Edgington wrote: > > Does anyone have any comments about applying the following patch to > > asynchat? > That patch looks wrong. What does it mean to "run in a thread"? > All code runs in a thread, all the time: sometimes, that thread > is the main thread. > > Furthermore, I can't see any presumed thread-unsafety in asynchat. > > Sure, there are a lot of member variables in asynchat which aren't > specifically protected against mutual access from different threads. > So you shouldn't be accessing the same async_chat object from multiple > threads. I cannot see why creating and using > an async_chat object in a thread that is not the main thread > could cause any problems. I also cannot see how this patch could > have a significant effect on async_chat's behaviour when used in > multiple threads. > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Wed Feb 8 00:46:18 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 8 Feb 2006 00:46:18 +0100 Subject: [Python-Dev] threadsafe patch for asynchat References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> Message-ID: Guido van Rossum wrote: > IMO asynchat and asyncore are braindead. They should really be removed > from the standard library. The code is 10 years old and represents at > least 10-year-old thinking about how to do this. strange. I'd say it works perfectly fine for what it was designed for (after all, sockets haven't changed much in 10 years either). what other reactive socket framework is there that would fit well into the standard library ? is twisted really simple enough ?
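For context on what a "reactive socket framework" means here: the heart of asyncore is a select()-based dispatch loop that multiplexes many sockets in one thread. A stripped-down sketch of that pattern (illustrative only; real asyncore instead wraps each socket in a dispatcher object with handle_read()/handle_accept() callbacks):

```python
import select
import socket
import threading

def echo_loop(server, iterations=50):
    """Toy reactor: multiplex a listening socket and its clients with
    select(), echoing whatever the clients send back to them."""
    sockets = [server]
    for _ in range(iterations):
        readable, _w, _x = select.select(sockets, [], [], 0.1)
        for s in readable:
            if s is server:
                conn, _addr = s.accept()      # new client connection
                sockets.append(conn)
            else:
                data = s.recv(4096)
                if data:
                    s.sendall(data)           # the "echo" event handler
                else:
                    sockets.remove(s)         # peer closed the connection
                    s.close()

# Exercise it once over loopback.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(5)
threading.Thread(target=echo_loop, args=(server,), daemon=True).start()

client = socket.create_connection(server.getsockname())
client.sendall(b"hello")
reply = client.recv(4096)
client.close()
assert reply == b"hello"
```

The thread here only exists to drive the demo; the point of the pattern is precisely that the loop itself services all connections without one thread per client.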
From fredrik at pythonware.com Wed Feb 8 00:56:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 8 Feb 2006 00:56:32 +0100 Subject: [Python-Dev] release plan for 2.5 ? Message-ID: a while ago, I wrote > > Hopefully something can get hammered out so that at least the Python > > 3 docs can premiere having been developed on by the whole community. > > why wait for Python 3 ? > > what's the current release plan for Python 2.5, btw? I cannot find a > relevant PEP, and the "what's new" says "late 2005": > > http://www.python.org/dev/doc/devel/whatsnew/contents.html but I don't think that anyone followed up on this. what's the current status ? From fumanchu at amor.org Wed Feb 8 01:01:38 2006 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 7 Feb 2006 16:01:38 -0800 Subject: [Python-Dev] threadsafe patch for asynchat Message-ID: <6949EC6CD39F97498A57E0FA55295B210171975F@ex9.hostedexchange.local> Guido van Rossum wrote: > IMO asynchat and asyncore are braindead. The should really be removed > from the standard library. The code is 10 years old and represents at > least 10-year-old thinking about how to do this. The amount of hackery > in Zope related to asyncore was outrageous -- basically most of > asyncore's guts were replaced with more advanced Zope code, but the > API was maintained for compatibility reasons. A nightmare. Perhaps, but please keep in mind that the smtpd module uses both, currently, and would have to be rewritten if either is "removed". Robert Brewer System Architect Amor Ministries fumanchu at amor.org From aleaxit at gmail.com Wed Feb 8 01:28:09 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 7 Feb 2006 16:28:09 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> Message-ID: On 2/7/06, Fredrik Lundh wrote: ... > what other reactive socket framework is there that would fit well into > the standard library ? 
is twisted really simple enough ? Twisted is wonderful, powerful, rich, and very large. Perhaps a small subset could be carefully extracted that (given suitable volunteers to maintain it in the future) might fit in the standard library, but [a] that extraction is not going to be a simple or fast job, and [b] I suspect that the minimum sensible subset would still be much larger (and richer / more powerful) than asyncore. Alex From jcarlson at uci.edu Wed Feb 8 01:57:15 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 07 Feb 2006 16:57:15 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E83E5D.3010000@v.loewis.de> Message-ID: <20060207153758.1116.JCARLSON@uci.edu> Guido van Rossum wrote: > IMO asynchat and asyncore are braindead. They should really be removed > from the standard library. The code is 10 years old and represents at > least 10-year-old thinking about how to do this. The amount of hackery > in Zope related to asyncore was outrageous -- basically most of > asyncore's guts were replaced with more advanced Zope code, but the > API was maintained for compatibility reasons. A nightmare. I'm going to go ahead and disagree with Guido on this one. Before removing asyncore (and asynchat) from the standard library, I believe that there would necessarily need to be a viable replacement already in place. The SocketServer module and its derivatives are wholly unscalable for server-oriented applications once you get past a few dozen threads (where properly designed asyncore derivatives will do quite well all the way to your platform file handle limit). Every once in a while I hear about people pushing for Twisted to be included with Python, but at 2 megs for the base bz2 package, it seems a little...hefty. I'm not aware of any other community-accepted package for asynchronous socket clients and servers, but I'm always looking.
Now, don't get me wrong, writing servers and clients using asyncore or asynchat can be a beast, but it does get one into the callback/reactor method of programming, which seems to have invaded other parts of Python and 3rd party libraries (xml.sax, tk, Twisted, wxPython, ...). Back to the topic that Guido was really complaining about: Zope + asyncore. I don't doubt that getting Zope to play nicely with asyncore was difficult, but it begs the questions: what would have been done if asyncore didn't exist, and why wasn't that done instead of trying to play nicely with asyncore? - Josiah From barry at python.org Wed Feb 8 02:19:33 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 07 Feb 2006 20:19:33 -0500 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <6949EC6CD39F97498A57E0FA55295B210171975F@ex9.hostedexchange.local> References: <6949EC6CD39F97498A57E0FA55295B210171975F@ex9.hostedexchange.local> Message-ID: <1139361573.19969.18.camel@geddy.wooz.org> On Tue, 2006-02-07 at 16:01 -0800, Robert Brewer wrote: > Perhaps, but please keep in mind that the smtpd module uses both, currently, and would have to be rewritten if either is "removed". Would that really be a huge loss? -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060207/df75ae91/attachment.pgp From radeex at gmail.com Wed Feb 8 02:59:00 2006 From: radeex at gmail.com (Christopher Armstrong) Date: Wed, 8 Feb 2006 12:59:00 +1100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> Message-ID: <60ed19d40602071759v5144a163h28cd39a2853279f4@mail.gmail.com> On 2/8/06, Alex Martelli wrote: > On 2/7/06, Fredrik Lundh wrote: > ... 
> > what other reactive socket framework is there that would fit well into > > the standard library ? is twisted really simple enough ? > > Twisted is wonderful, powerful, rich, and very large. Perhaps a small > subset could be carefully extracted that (given suitable volunteers to > maintain it in the future) might fit in the standard library, but [a] > that extraction is not going to be a simple or fast job, and [b] I > suspect that the minimum sensible subset would still be much larger > (and richer / more powerful) than asyncore. The subject of putting (parts of) Twisted into the standard library comes up once every 6 months or so, at least on our mailing list. For all that I think asyncore is worthless, I'm still against copying Twisted into the stdlib. Or at least I'm not willing to maintain the necessary fork, and I fear the nightmares about versioning that can easily occur when you've got both standard library and third party versions of a project. But, for the record, to the people who argue not to put Twisted into the stdlib because of its size: The parts of it that would actually be applicable (i.e. those that obsolete async* in the stdlib) are only a few kilobytes of code. At a quick run of "wc", the parts that support event loops, accurate timed calls, SSL, Unix sockets, TCP, UDP, arbitrary file descriptors, processes, and threads sum up to about 5300 lines of code. asynchat and asyncore are about 1200. -- Twisted | Christopher Armstrong: International Man of Twistery Radix | -- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// | -- http://twistedmatrix.com |o O| | w----v----w-+ From nnorwitz at gmail.com Wed Feb 8 04:03:11 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 7 Feb 2006 19:03:11 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: On 2/7/06, Fredrik Lundh wrote: > > > > what's the current release plan for Python 2.5, btw?
I cannot find a > > relevant PEP, and the "what's new" says "late 2005": > > > but I don't think that anyone followed up on this. what's the current > status ? Guido and I had a brief discussion about this. IIRC, he was thinking alpha around March and release around summer. I think this is aggressive with all the things still to do. We really need to get the ssize_t branch integrated. There are a bunch of PEPs that have been accepted (or close), but not implemented. I think these include (please correct me, so we can get a good list): http://www.python.org/peps/ SA 308 Conditional Expressions SA 328 Imports: Multi-Line and Absolute/Relative SA 342 Coroutines via Enhanced Generators S 343 The "with" Statement S 353 Using ssize_t as the index type This one should be marked as final I believe: SA 341 Unifying try-except and try-finally n From jeremy at alum.mit.edu Wed Feb 8 04:26:02 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 7 Feb 2006 22:26:02 -0500 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: It looks like we need a Python 2.5 Release Schedule PEP. Jeremy On 2/7/06, Neal Norwitz wrote: > On 2/7/06, Fredrik Lundh wrote: > > > > > > what's the current release plan for Python 2.5, btw? I cannot find a > > > relevant PEP, and the "what's new" says "late 2005": > > > > > but I don't think that anyone followed up on this. what's the current > > status ? > > Guido and I had a brief discussion about this. IIRC, he was thinking > alpha around March and release around summer. I think this is > aggressive with all the things still to do. We really need to get the > ssize_t branch integrated. > > There are a bunch of PEPs that have been accepted (or close), but not > implemented. 
I think these include (please correct me, so we can get > a good list): > > http://www.python.org/peps/ > > SA 308 Conditional Expressions > SA 328 Imports: Multi-Line and Absolute/Relative > SA 342 Coroutines via Enhanced Generators > S 343 The "with" Statement > S 353 Using ssize_t as the index type > > This one should be marked as final I believe: > > SA 341 Unifying try-except and try-finally > > n > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From nnorwitz at gmail.com Wed Feb 8 05:49:32 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 7 Feb 2006 20:49:32 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <60ed19d40602071759v5144a163h28cd39a2853279f4@mail.gmail.com> References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> <60ed19d40602071759v5144a163h28cd39a2853279f4@mail.gmail.com> Message-ID: On 2/7/06, Christopher Armstrong wrote: > > > Twisted is wonderful, powerful, rich, and very large. Perhaps a small > > subset could be carefully extracted > > The subject of putting (parts of) Twisted into the standard library > comes up once every 6 months or so, at least on our mailing list. For > all that I think asyncore is worthless, I'm still against copying > Twisted into the stdlib. Or at least I'm not willing to maintain the > necessary fork, and I fear the nightmares about versioning that can > easily occur when you've got both standard library and third party > versions of a project. I wouldn't be enthusiastic about putting all of Twisted in the stdlib either. Twisted is on a different release schedule than Python. However, isn't there a relatively small core subset like Alex mentioned that isn't changing much? 
Could we split up those components and have those live in the core, but the vast majority of Twisted live outside as it does now? n From janssen at parc.com Wed Feb 8 05:53:46 2006 From: janssen at parc.com (Bill Janssen) Date: Tue, 7 Feb 2006 20:53:46 PST Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: Your message of "Tue, 07 Feb 2006 15:46:18 PST." Message-ID: <06Feb7.205350pst."58633"@synergy1.parc.xerox.com> > what other reactive socket framework is there that would fit well into > the standard library ? is twisted really simple enough ? I've been very happy with Medusa, which is asyncore-based. Perhaps the right idea is to fix the various problems of asyncore. We might lift the similar code from the kernel of ILU, for example, which carefully addresses the various issues around this style of action loop. Bill From tim.peters at gmail.com Wed Feb 8 06:15:41 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 8 Feb 2006 00:15:41 -0500 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <20060207153758.1116.JCARLSON@uci.edu> References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> Message-ID: <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> [Josiah Carlson] > ... > Back to the topic that Guido was really complaining about: Zope + > asyncore. I don't doubt that getting Zope to play nicely with asyncore > was difficult, It's more that mixing asyncore with threads is a bloody nightmare, and ZEO and Zope both do that. Zope (but not ZEO) goes on to mix threads with asynchat too. In addition, ZEO makes life much harder than should be necessary by running in two different modes and auto-switching between them, depending on whether "the app" is or is not running an asyncore mainloop itself. In order to _detect_ when "the app" fires up an asyncore mainloop, ZEO monkey-patches asyncore's loop() function and physically replaces it with its own loop() function. It goes downhill from there. 
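The monkey-patch Tim describes works because callers reach loop() through the module object at call time, so rebinding the module attribute redirects every subsequent call. A generic, self-contained illustration of that mechanism (using a stand-in module, not ZEO's actual code):

```python
import types

# Stand-in for a module whose public function we want to intercept.
mod = types.ModuleType("fake_asyncore")

def loop(timeout=30.0):
    return "plain loop"

mod.loop = loop

calls = []

def patched_loop(timeout=30.0, _original=mod.loop):
    # Capture the original as a default argument, record the call,
    # then delegate -- the ZEO-style "wrap and replace" pattern.
    calls.append(timeout)
    return _original(timeout)

mod.loop = patched_loop  # the monkey-patch: rebind the module global

assert mod.loop(5.0) == "plain loop"
assert calls == [5.0]
```

The fragility Tim points out follows directly: any code that grabbed a reference to the original function before the patch, or that patches the same attribute itself, silently bypasses or clobbers the replacement.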
Guido's memories are partly out of date now: ZEO used to replace a lot more of asyncore than it does now, because of bugs in the asyncore distributed with older Python versions. The _needs_ for that went away little by little over the years, but the code in ZEO stuck around much longer. ZEO's ThreadedAsync/LoopCallback.py is much smaller now (ZODB 3.6) than Guido remembers. For a brief while, I even ripped out ZEO's monkey-patching of Python's asyncore loop(), but it turned out that newer code in Zope3 (but not Zope2) relied on, in turn, poking values into ZEO's module globals to cause ZEO's loop() replacement to shut down (that's the kind of "expedient" joy you get when mixing asyncore with threads). Every piece of it remains "underdocumented" and, IMO, highly obscure. > but it begs the questions: what would have been done if asyncore didn't exist, Who knows? What would python-dev be like if you didn't exist :-)? > and why wasn't that done instead of trying to play nicely with asyncore? Bugs and "missing features" in asyncore. For ZEO's purposes, if I had designed it, I expect it would have used threads (without asyncore). However, bits of code still sitting around suggest that it was at least the _intent_ at one time that ZEO be able to run without threads at all. That's certainly not possible now. If you look at asyncore's revision history, you'll note that Jeremy and Guido made many changes when they worked at Zope Corp. Those largely reflect the history of moving ZEO's asyncore monkey-patches into the Python core. BTW, if you don't use ZEO, I believe it's possible to run Zope3 without asyncore (you can use Twisted in Zope3 instead). From brett at python.org Wed Feb 8 06:39:19 2006 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2006 21:39:19 -0800 Subject: [Python-Dev] release plan for 2.5 ? 
In-Reply-To: References: Message-ID: On 2/7/06, Neal Norwitz wrote: > On 2/7/06, Fredrik Lundh wrote: > > > > > > what's the current release plan for Python 2.5, btw? I cannot find a > > > relevant PEP, and the "what's new" says "late 2005": > > > > > but I don't think that anyone followed up on this. what's the current > > status ? > > Guido and I had a brief discussion about this. IIRC, he was thinking > alpha around March and release around summer. I think this is > aggressive with all the things still to do. We really need to get the > ssize_t branch integrated. > > There are a bunch of PEPs that have been accepted (or close), but not > implemented. I think these include (please correct me, so we can get > a good list): > > http://www.python.org/peps/ > > SA 308 Conditional Expressions > SA 328 Imports: Multi-Line and Absolute/Relative > SA 342 Coroutines via Enhanced Generators > S 343 The "with" Statement > S 353 Using ssize_t as the index type > > This one should be marked as final I believe: > > SA 341 Unifying try-except and try-finally > Supposedly Guido is close on pronouncing on PEP 352 (Required Superclass for Exceptions), or so he said last time that thread came about. -Brett From nnorwitz at gmail.com Wed Feb 8 07:35:31 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 7 Feb 2006 22:35:31 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: On 2/7/06, Jeremy Hylton wrote: > It looks like we need a Python 2.5 Release Schedule PEP. Very draft: http://www.python.org/peps/pep-0356.html Needs lots of work and release managers. Anthony, Martin, Fred, Sean are all mentioned with TBDs and question marks. n From smiles at worksmail.net Wed Feb 8 06:57:18 2006 From: smiles at worksmail.net (Smith) Date: Tue, 7 Feb 2006 23:57:18 -0600 Subject: [Python-Dev] math.areclose ...? 
References: <001301c62b59$cdfbe900$152c4fca@csmith> <002701c62b65$844a9750$7f00a8c0@RaymondLaptop1> Message-ID: <006c01c62c7d$d33926b0$1f2c4fca@csmith> Raymond Hettinger wrote:
| [Chris Smith]
|| Does it help to spell it like this?
||
|| def areclose(x, y, relative_err = 1.e-5, absolute_err=1.e-8):
||     diff = abs(x - y)
||     ave = (abs(x) + abs(y))/2
||     return diff < absolute_err or diff/ave < relative_err
|
| There is a certain beauty and clarity to this presentation; however,
| it is problematic numerically:
|
| * the division by either absolute_err and relative_err can overflow or
| trigger a ZeroDivisionError

I'm not dividing by either of these values so that shouldn't be a problem. As long as absolute_err is not 0 then the first test would catch the possibility that x==y==ave==0. (see below) As for the overflow, does your version of python overflow? Mine (2.4) just returns 1.#INF which still computes as a number:

###
>>> 1.79769313486e+308+1.79769313486e+308
1.#INF
>>> inf=_
>>> inf>1
True
>>> inf<1
False
>>> 2./inf
0.0
>>> inf/inf
-1.#IND
###

There is a problem with dividing by 'ave' if the x and y are at the floating point limits, but the symmetric behaving form (presented by Scott Daniels) will have the same problem. The following format for close() has the same semantic meaning but avoids the overflow possibility and avoids extra work for the case when abs_tol=0 and x==y:

###
def close(x, y, abs_tol=1.e-8, rel_tol=1.e-5):
    '''Return True if |x-y| < abs_tol or |x-y|/ave(|x|,|y|) < rel_tol.
    The average is not computed directly so as to avoid overflow for
    numbers close to the floating point upper limit.'''
    if x==y: return True
    diff = abs(x - y)
    if diff < abs_tol: return True
    f = rel_tol/2.
    if diff < f*abs(x) + f*abs(y): return True
    return False
###

|
| * the 'or' part of the expression can introduce an unnecessary
| discontinuity in the first derivative.
|
If a value other than boolean were being returned, I could see the desire for continuity in derivative.
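The close() quoted above can be exercised standalone; restated here (with the same defaults) against the 1.1*3 vs 3.3 example from the thread:

```python
def close(x, y, abs_tol=1.e-8, rel_tol=1.e-5):
    # Restatement of the close() proposed above, so this snippet runs
    # on its own.
    if x == y:
        return True
    diff = abs(x - y)
    if diff < abs_tol:
        return True
    f = rel_tol / 2.
    return diff < f * abs(x) + f * abs(y)

# The beginner surprise: exact comparison fails, the tolerance test passes.
print(1.1 * 3 == 3.3)       # False on IEEE-754 doubles
print(close(1.1 * 3, 3.3))  # True (caught by the absolute-error test)
```

Here the difference is about 4.4e-16, so the abs_tol branch fires long before the relative test is consulted.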
Since the original form presents a boolean result, however, I'm having a hard time thinking of how the continuity issue comes to play.
| The original Numeric definition is likely to be better for people who
| know what they're doing; however, I still question whether it is an
| appropriate remedy for the beginner issue
| of why 1.1 + 1.1 + 1.1 doesn't equal 3.3.
|
I'm in total agreement. Being able to see that math.areclose(1.1*3,3.3) is True but 1.1*3==3.3 is False is not going to make them feel much better. They are going to have to face the floating point issue. As for the experienced user, perhaps such a function would be helpful. Maybe it would be better to require that the tolerances be given rather than defaulting so as to make clear which test is being used if only one test was going to be used:

close(x,y,rel_tol=1e-5)
close(x,y,abs_tol=1e-8)

/c From smiles at worksmail.net Wed Feb 8 07:18:54 2006 From: smiles at worksmail.net (Smith) Date: Wed, 8 Feb 2006 00:18:54 -0600 Subject: [Python-Dev] small floating point number problem Message-ID: <006d01c62c7d$d7bab730$1f2c4fca@csmith> I just ran into a curious behavior with small floating points, trying to find the limits of them on my machine (XP). Does anyone know why the '0.0' is showing up for one case below but not for the other? According to my tests, the smallest representable float on my machine is much smaller than 1e-308: it is 2.470328229206234e-325 but I can only create it as a product of two numbers, not directly. Here is an attempt to create the much larger 1e-308:

>>> a=1e-308
>>> a
0.0
>>> a==0
True <-- it really is 0; this is not a repr issue
>>> b=.1*1e-307
>>> b
9.9999999999999991e-309
>>> a==b
False <-- they really are different
>>>

Also, I see that there is some graininess in the numbers at the low end, but I'm guessing that there is some issue with floating points that I would need to read up on again. The above dilemma is a little more troublesome.
>>> m=2.470328229206234e-017
>>> s=1e-307
>>> m*s
4.9406564584124654e-324  #2x too large
>>> 2*m*s
4.9406564584124654e-324
>>> 3*m*s==4*m*s
True
>>>

/c From martin at v.loewis.de Wed Feb 8 08:05:51 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2006 08:05:51 +0100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> Message-ID: <43E9984F.9020502@v.loewis.de> Tim Peters wrote:
> Bugs and "missing features" in asyncore. For ZEO's purposes, if I had
> designed it, I expect it would have used threads (without asyncore).
> However, bits of code still sitting around suggest that it was at
> least the _intent_ at one time that ZEO be able to run without threads
> at all. That's certainly not possible now.

What is the reason that people want to use threads when they can have poll/select-style message processing? Why does Zope require threads? IOW, why would anybody *want* a "threadsafe patch for asynchat"? Regards, Martin From stephen at xemacs.org Wed Feb 8 08:19:30 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 08 Feb 2006 16:19:30 +0900 Subject: [Python-Dev] Help with Unicode arrays in NumPy In-Reply-To: (Travis E. Oliphant's message of "Tue, 07 Feb 2006 13:23:13 -0700") References: <43E8FDC4.1010607@v.loewis.de> Message-ID: <87r76epbb1.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Travis" == Travis E Oliphant writes:

Travis> Numpy supports arrays of arbitrary fixed-length "records".
Travis> It is much more than numeric-only data now. One of the
Travis> fields that a record can contain is a string. If strings
Travis> are supported, it makes sense to support unicode strings
Travis> as well.

That is not obvious.
A string is really an array of bytes, which for historical reasons in some places (primarily the U.S. of A.) can be used to represent text. Unicode, on the other hand, is intended to represent text streams robustly and does so in a universal but flexible way ... but all of the different Unicode transformation formats are considered to represent the *identical* text stream. Some applications may specify a transformation format, others will not. In any case, internally Python is only going to support *one*; all the others must be read in through codecs anyway. See below. Travis> This allows NumPy to memory-map arbitrary data-files on Travis> disk. In the case where a transformation format *is* specified, I don't see why you can't use a byte array field (ie, ordinary "string") of appropriate size for this purpose, and read it through a codec when it needs to be treated as text. This is going to be necessary in essentially all of the cases I encounter, because the files are UTF-8 and sane internal representations are either UTF-16 or UTF-32. In particular, Python's internal representation is 16 or 32 bits wide. Travis> Perhaps you should explain why you think NumPy "shouldn't Travis> support Unicode" Because it can't, not in the way you would like to, if I understand you correctly. Python chooses *one* of the many standard representations for internal use, and because of the way the standard is specified, it doesn't matter which one! And none of the others can be represented directly, all must be decoded for internal use and encoded when written back to external media. So any memory mapping application is inherently nonportable, even across Python implementations. Travis> And Python does not support arbitrary Unicode characters Travis> on narrow builds? Then how is \U0010FFFF represented? In a way incompatible with the concept of character array. Now what do you do? 
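The narrow-build representation alluded to here can be made concrete. This sketch (runnable on any modern Python, where the narrow/wide distinction is gone) derives the surrogate pair a 16-bit build would have stored for \U0010FFFF; the arithmetic comes straight from UTF-16:

```python
import struct

# How \U0010FFFF decomposes into the surrogate pair a narrow (16-bit)
# build would have stored; the arithmetic comes straight from UTF-16.
ch = '\U0010FFFF'
v = ord(ch) - 0x10000
hi = 0xD800 + (v >> 10)    # high (lead) surrogate
lo = 0xDC00 + (v & 0x3FF)  # low (trail) surrogate
print(hex(hi), hex(lo))    # 0xdbff 0xdfff

# Cross-check against the UTF-16 codec.
units = struct.unpack('>2H', ch.encode('utf-16-be'))
assert units == (hi, lo)
```

On a narrow build, len() of this one character was 2, which is exactly the "incompatible with the concept of character array" problem: indexing by code unit no longer indexes by character.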
The point is that Unicode is intentionally designed in such a way that a plethora of representations is possible, but all are easily and reliably interconverted. Implementations are then free to choose an appropriate internal representation, knowing that conversion from external representations is "cheap" and standardized. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From steve at holdenweb.com Wed Feb 8 08:33:28 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 08 Feb 2006 02:33:28 -0500 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <43E9984F.9020502@v.loewis.de> References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> <43E9984F.9020502@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Tim Peters wrote: > >>Bugs and "missing features" in asyncore. For ZEO's purposes, if I had >>designed it, I expect it would have used threads (without asyncore). >>However, bits of code still sitting around suggest that it was at >>least the _intent_ at one time that ZEO be able to run without threads >>at all. That's certainly not possible now. > > > What is the reason that people want to use threads when they can have > poll/select-style message processing? Why does Zope require threads? > IOW, why would anybody *want* a "threadsafe patch for asynchat"? > In case the processing of events needed to block? If I'm processing web requests in an async* dispatch loop and a request needs the results of a (probably lengthy) database query in order to generate its output, how do I give the dispatcher control again to process the next asynchronous network event? The usual answer is "process the request in a thread". 
That way the dispatcher can spring to life for each event as quickly as needed. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From fredrik at pythonware.com Wed Feb 8 08:44:25 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 8 Feb 2006 08:44:25 +0100 Subject: [Python-Dev] threadsafe patch for asynchat References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com><43E9984F.9020502@v.loewis.de> Message-ID: Steve Holden wrote: > > What is the reason that people want to use threads when they can have > > poll/select-style message processing? Why does Zope require threads? > > IOW, why would anybody *want* a "threadsafe patch for asynchat"? > > > In case the processing of events needed to block? If I'm processing web > requests in an async* dispatch loop and a request needs the results of a > (probably lengthy) database query in order to generate its output, how > do I give the dispatcher control again to process the next asynchronous > network event? > > The usual answer is "process the request in a thread". That way the > dispatcher can spring to life for each event as quickly as needed. but why do such threads have to talk to asyncore directly ? From jcarlson at uci.edu Wed Feb 8 08:57:00 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 07 Feb 2006 23:57:00 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: Message-ID: <20060207235610.1121.JCARLSON@uci.edu> "Fredrik Lundh" wrote: > > Steve Holden wrote: > > > > What is the reason that people want to use threads when they can have > > > poll/select-style message processing? Why does Zope require threads? > > > IOW, why would anybody *want* a "threadsafe patch for asynchat"? > > > > > In case the processing of events needed to block? 
If I'm processing web > > requests in an async* dispatch loop and a request needs the results of a > > (probably lengthy) database query in order to generate its output, how > > do I give the dispatcher control again to process the next asynchronous > > network event? > > > > The usual answer is "process the request in a thread". That way the > > dispatcher can spring to life for each event as quickly as needed. > > but why do such threads have to talk to asyncore directly ? Indeed. I seem to remember a discussion a few months ago about "easy" thread programming, which invariably directed people off to use the simplest abstractions necessary: Queues. - Josiah From jcarlson at uci.edu Wed Feb 8 09:07:12 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 08 Feb 2006 00:07:12 -0800 Subject: [Python-Dev] small floating point number problem In-Reply-To: <006d01c62c7d$d7bab730$1f2c4fca@csmith> References: <006d01c62c7d$d7bab730$1f2c4fca@csmith> Message-ID: <20060207235820.1124.JCARLSON@uci.edu> "Smith" wrote: > > I just ran into a curious behavior with small floating points, trying > to find the limits of them on my machine (XP). Does anyone know why the > '0.0' is showing up for one case below but not for the other? According > to my tests, the smallest representable float on my machine is much > smaller than 1e-308: it is There are all sorts of ugly bits when working with all binary fp numbers (for the small ones, look for a reference on 'denormals'). I'm sure that Raymond has more than a few things to say about them (and fp in general), but I will speed up the discussion by saying that you should read the IEEE 754 standard for floating point, or alternatively ask on comp.lang.python where more users would get more out of the answers that you will recieve there. One thing to remember is that decimal is not the native representation of binary floating point, so 1e-100 differs from 1e-101 significantly in various bit positions. 
You can use struct.pack('d', flt) to see this, or you can try any one of the dozens of IEEE 754 javascript calculators out there. - Josiah From raymond.hettinger at verizon.net Wed Feb 8 09:08:25 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 08 Feb 2006 03:08:25 -0500 Subject: [Python-Dev] small floating point number problem References: <006d01c62c7d$d7bab730$1f2c4fca@csmith> Message-ID: <001c01c62c86$d694beb0$b83efea9@RaymondLaptop1> [Smith] >I just ran into a curious behavior with small floating points, trying to >find the limits of them on my machine (XP). Does anyone know why the '0.0' >is showing up for one case below but not for the other? According to my >tests, the smallest representable float on my machine is much smaller than >1e-308: it is > > 2.470328229206234e-325 > > but I can only create it as a product of two numbers, not directly. Here > is an attempt to create the much larger 1e-308: > >>>> a=1e-308 >>>> a > 0.0 The clue is in that the two differ by 17 orders of magnitude (325-308), which is about 52 bits. The interpreter builds 1e-308 by using the underlying C library string-to-float function and it isn't constructing numbers outside the normal range for floats. When you enter a value outside that range, the function underflows it to zero. In contrast, your computed floats (such as .1*1e-307) return a denormal result (where the significand is stored with fewer bits than normal because the exponent is already at its outer limit). That denormal result is not zero and the C library float-to-string conversion successfully generates a decimal string representation. The asymmetric handling of denormals by the atof() and ftoa() functions is why you see a difference.
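The struct.pack() suggestion makes the denormal visible directly. A sketch, assuming the IEEE-754 double layout (1 sign bit, 11 exponent bits, 52 significand bits); the helper names are invented:

```python
import struct

def double_bits(f):
    # Raw 64-bit pattern of an IEEE-754 double.
    return struct.unpack('>Q', struct.pack('>d', f))[0]

def biased_exponent(f):
    return (double_bits(f) >> 52) & 0x7FF

normal = 1e-300            # comfortably inside the normal range
denormal = 1e-300 * 1e-10  # ~1e-310, below DBL_MIN (about 2.2e-308)

print(biased_exponent(normal))    # nonzero: an ordinary normalized double
print(biased_exponent(denormal))  # 0: a denormal; only the significand carries value
print(denormal != 0.0)            # True: tiny, but not zero
```

A biased exponent of zero with a nonzero significand is exactly the "significand stored with fewer bits than normal" case described above.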
A consequence of that asymmetry is the breakdown of the expected eval(repr(f))==f invariant:

>>> f = .1*1e-307
>>> eval(repr(f)) == f
False

Raymond From thomas at xs4all.net Wed Feb 8 11:00:06 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 8 Feb 2006 11:00:06 +0100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <06Feb7.205350pst."58633"@synergy1.parc.xerox.com> References: <06Feb7.205350pst."58633"@synergy1.parc.xerox.com> Message-ID: <20060208100006.GD10226@xs4all.nl> On Tue, Feb 07, 2006 at 08:53:46PM -0800, Bill Janssen wrote: > Perhaps the right idea is to fix the various problems of asyncore. The problem with making asyncore more useful is that you end up with (a cut down version of) Twisted, although not one that would be able to integrate with Twisted. asyncore/asynchat and Twisted are really not that different, and anything you do to enhance the former will make it look more like the latter. I'd personally rather fork parts of Twisted, in spite of the maintenance issues, than re-invent Twisted, fix all the issues Twisted already solves and face the same kind of maintenance issues. It would be perfect if the twisted-light in the stdlib would integrate with the 'real' Twisted, so that users can 'upgrade' their programs just by installing Twisted and using the extra features. Not that I think we should stop at the event core and the TCP/SSL parts of Twisted; imaplib, poplib, httplib, xmlrpclib, for instance, could all do with Twisted-inspired alternatives (or even replacements, if the synchronous API was kept the same.) The synchronous versions are fine for simple scripts (or complex scripts that don't mind long blocking operations.) If we start exporting a really useful asynchronous framework, I would expect asynchronous counterparts to the useful higher-level networking modules, too. But that doesn't have to come right away ;) Anything beyond simple bugfixes on asyncore/asynchat seems like a terrible waste of effort, to me.
And I hardly ever use Twisted. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From fuzzyman at voidspace.org.uk Wed Feb 8 11:01:40 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Wed, 08 Feb 2006 10:01:40 +0000 Subject: [Python-Dev] Old Style Classes Going in Py3K Message-ID: <43E9C184.5090608@voidspace.org.uk> Hello all, I understand that old style classes are slated to disappear in Python 3000. Does this mean that the following will be a syntax error :

class Something:
    pass

*or* that instead it will automatically inherit from object ? The latter would break a few orders of magnitude less code of course... All the best, Michael Foord http://www.voidspace.org.uk/python/index.shtml From raveendra-babu.m at hp.com Wed Feb 8 10:41:35 2006 From: raveendra-babu.m at hp.com (M, Raveendra Babu (STSD)) Date: Wed, 8 Feb 2006 15:11:35 +0530 Subject: [Python-Dev] Make error on solaris 9 x86 - error: parse error before "upad128_t" Message-ID: Hi, I am trying to build python-2.3.5 on solaris 9 - X86.

1) First I unpacked Python-2.3.5.tgz using: tar -zxvf Python-2.3.5.tgz. No errors at this stage.
2) Then ran ./configure. No errors at this stage.
3) Then ran /usr/ccs/bin/make, which gives the following error:

gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I.
-I./Include -DPy_BUILD_CORE -o Python/pythonrun.o Python/pythonrun.c
In file included from /usr/include/sys/reg.h:13,
                 from /usr/include/sys/regset.h:24,
                 from /usr/include/sys/ucontext.h:21,
                 from /usr/include/sys/signal.h:240,
                 from /usr/include/signal.h:27,
                 from Python/pythonrun.c:17:
/usr/include/ia32/sys/reg.h:300: error: parse error before "upad128_t"
/usr/include/ia32/sys/reg.h:302: error: parse error before '}' token
/usr/include/ia32/sys/reg.h:309: error: field `kfpu_fx' has incomplete type
/usr/include/ia32/sys/reg.h:314: confused by earlier errors, bailing out
*** Error code 1
make: Fatal error: Command failed for target `Python/pythonrun.o'

Can you please reply with a fix for this problem. Regards -Raveendrababu From fredrik at pythonware.com Wed Feb 8 11:15:42 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 8 Feb 2006 11:15:42 +0100 Subject: [Python-Dev] Make error on solaris 9 x86 - error: parse error before "upad128_t" References: Message-ID: M, Raveendra Babu (STSD) wrote:
> 3) Then ran /usr/ccs/bin/make, which gives the following error:
>
> gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes
> -I. -I./Include -DPy_BUILD_CORE -o Python/pythonrun.o
> Python/pythonrun.c
> In file included from /usr/include/sys/reg.h:13,
> from /usr/include/sys/regset.h:24,
> from /usr/include/sys/ucontext.h:21,
> from /usr/include/sys/signal.h:240,
> from /usr/include/signal.h:27,
> from Python/pythonrun.c:17:
> /usr/include/ia32/sys/reg.h:300: error: parse error before "upad128_t"
> /usr/include/ia32/sys/reg.h:302: error: parse error before '}' token
> /usr/include/ia32/sys/reg.h:309: error: field `kfpu_fx' has incomplete type
> /usr/include/ia32/sys/reg.h:314: confused by earlier errors, bailing out
> *** Error code 1
> make: Fatal error: Command failed for target `Python/pythonrun.o'
>
> Can you please reply with a fix for this problem.

a quick google search indicates that this is a compiler problem.
random FAQ entry: The problem is that the Solaris headers changed across updates of Solaris 9 and you are using a GCC from before the change on an updated system. (i.e. a GCC built for Solaris 9 <= 12/03 on Solaris 9 >= 4/04). You can either rebuild GCC for your version of the system (it works, even using a GCC built for the previous version), or fix your headers: http://groups.yahoo.com/group/solarisx86/message/6617 From patrick at collison.ie Wed Feb 8 11:13:25 2006 From: patrick at collison.ie (Patrick Collison) Date: Wed, 8 Feb 2006 10:13:25 +0000 Subject: [Python-Dev] Let's just *keep* lambda Message-ID: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> >> After so many attempts to come up with an alternative for lambda, >> perhaps we should admit defeat. I've not had the time to follow the >> most recent rounds, but I propose that we keep lambda, so as to stop >> wasting everybody's talent and time on an impossible quest. > > I agree with this. The *name* "lambda" is a wart, even if the deferred > expression feature isn't. My preference is to simply replace the > keyword lambda with a keyword "expr" (or if that's not acceptable > because there's too much prior use of expr as a variable name, then > maybe "expression" - but that's starting to get a bit long). Sorry, I'm a little late to this discussion. How about `procedure', or just `proc'? And to think that people thought that keeping "lambda", but changing the name, would avoid all the heated discussion... :-) -Patrick From theller at python.net Wed Feb 8 12:05:43 2006 From: theller at python.net (Thomas Heller) Date: Wed, 08 Feb 2006 12:05:43 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43E92573.6090300@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?= =?iso-8859-1?Q?wis's?= message of "Tue, 07 Feb 2006 23:55:47 +0100") References: <43E92573.6090300@v.loewis.de> Message-ID: "Martin v. 
L?wis" writes: > I just came up with an idea how to resolve the VC versioning > problems for good: Python should link with mscvrt.dll (which > is part of the operating system), not with the CRT that the > compiler provides. > > To do that, we would need to compile and link with the SDK > header files and import libraries, not with the ones that > visual studio provides. > > For that to work, everyone building Python or Python extensions (*) > would have to install the Platform SDK (which is available > for free, but contains quite a number of bits). Would that be > acceptable? > > Disclaimer: I haven't tried yet whether this would actually > work. > > Regards, > Martin > > (*) For Python extensions, it should be possible to use mingw > instead, and configure it for linking against msvcrt. I think this would remove a lot of headaches. Downloading and installing the Platform SDK should not be an issue, imo. The only problem that I see is this: I'm not sure the platform SDK include files (.H and .IDL) are really compatible with VC7.1. I remember that we (in our company, building C++ software) had to 'Unregister the PSDK Directories with Visual Studio' (available from the start menu) before building the stuff, otherwise there were compiler errors. Thomas From steve at holdenweb.com Wed Feb 8 13:25:35 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 08 Feb 2006 07:25:35 -0500 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <20060207235610.1121.JCARLSON@uci.edu> References: <20060207235610.1121.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > "Fredrik Lundh" wrote: > >>Steve Holden wrote: >> >> >>>>What is the reason that people want to use threads when they can have >>>>poll/select-style message processing? Why does Zope require threads? >>>>IOW, why would anybody *want* a "threadsafe patch for asynchat"? >>>> >>> >>>In case the processing of events needed to block?
If I'm processing web >>>requests in an async* dispatch loop and a request needs the results of a >>>(probably lengthy) database query in order to generate its output, how >>>do I give the dispatcher control again to process the next asynchronous >>>network event? >>> >>>The usual answer is "process the request in a thread". That way the >>>dispatcher can spring to life for each event as quickly as needed. >> >>but why do such threads have to talk to asyncore directly ? > Good question. > > Indeed. I seem to remember a discussion a few months ago about "easy" > thread programming, which invariably directed people off to use the > simplest abstractions necessary: Queues. > Maybe people are finding Python too easy and they just want to complicate their code to the point where it contains interesting bugs? I dunno .... regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From abo at minkirri.apana.org.au Wed Feb 8 14:23:26 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Wed, 08 Feb 2006 13:23:26 +0000 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> <43E9984F.9020502@v.loewis.de> Message-ID: <1139405006.21021.40.camel@warna.dub.corp.google.com> On Wed, 2006-02-08 at 02:33 -0500, Steve Holden wrote: > Martin v. L?wis wrote: > > Tim Peters wrote: [...] > > What is the reason that people want to use threads when they can have > > poll/select-style message processing? Why does Zope require threads? > > IOW, why would anybody *want* a "threadsafe patch for asynchat"? > > > In case the processing of events needed to block? 
If I'm processing web > requests in an async* dispatch loop and a request needs the results of a > (probably lengthy) database query in order to generate its output, how > do I give the dispatcher control again to process the next asynchronous > network event? > > The usual answer is "process the request in a thread". That way the > dispatcher can spring to life for each event as quickly as needed. I believe that Twisted does pretty much this with its "deferred" stuff. It shoves slow stuff off for processing in a separate thread that re-syncs with the event loop when it's finished. In the case of Zope/ZEO I'm not entirely sure but I think what happened was medusa (asyncore/asynchat based stuff Zope2 was based on) didn't have this deferred handler support. When they found some of the stuff Zope was doing took a long time, they came up with an initially simpler but IMHO uglier solution of running multiple async loops in separate threads and using a front-end dispatcher to distribute connections to them. This way it wasn't too bad if an async loop stalled, because the other loops in other threads could continue to process stuff. If ZEO is still using this approach I think switching to a twisted style approach would be a good idea. However, I suspect this would be a very painful refactor...
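The "process the request in a thread" recipe the thread keeps returning to reduces to a small sketch. This is illustrative only (the names are invented), and a real dispatcher would poll the queue from its event loop with get_nowait() rather than block on it:

```python
import queue
import threading
import time

finished = queue.Queue()

def slow_query(request):
    # Stands in for the lengthy database query.
    time.sleep(0.01)
    finished.put((request, "rows for %s" % request))

# The dispatcher hands blocking work to a worker thread...
worker = threading.Thread(target=slow_query, args=("req-1",))
worker.start()

# ...keeps servicing other events, and later drains completed work.
request, rows = finished.get(timeout=5)
worker.join()
print(request, rows)  # req-1 rows for req-1
```

The point made by Fredrik and Josiah is visible here: the worker never touches the dispatcher's internals; the only shared object is the Queue.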
-- Donovan Baarda http://minkirri.apana.org.au/~abo/ From andrew-pythondev at puzzling.org Wed Feb 8 14:57:04 2006 From: andrew-pythondev at puzzling.org (Andrew Bennetts) Date: Thu, 9 Feb 2006 00:57:04 +1100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <1139405006.21021.40.camel@warna.dub.corp.google.com> References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> <43E9984F.9020502@v.loewis.de> <1139405006.21021.40.camel@warna.dub.corp.google.com> Message-ID: <20060208135704.GN547@home.puzzling.org> Donovan Baarda wrote: > On Wed, 2006-02-08 at 02:33 -0500, Steve Holden wrote: > > Martin v. L?wis wrote: > > > Tim Peters wrote: > [...] > > > What is the reason that people want to use threads when they can have > > > poll/select-style message processing? Why does Zope require threads? > > > IOW, why would anybody *want* a "threadsafe patch for asynchat"? > > > > > In case the processing of events needed to block? If I'm processing web > > requests in an async* dispatch loop and a request needs the results of a > > (probably lengthy) database query in order to generate its output, how > > do I give the dispatcher control again to process the next asynchronous > > network event? > > > > The usual answer is "process the request in a thread". That way the > > dispatcher can spring to life for each event as quickly as needed. > > I believe that Twisted does pretty much this with its "deferred" stuff. > It shoves slow stuff off for processing in a separate thread that > re-syncs with the event loop when it's finished. Argh! No. Threading is completely orthogonal to Deferreds. Deferreds are just an abstraction for managing callbacks for an asynchronous operation. They don't magically invoke threads, or otherwise turn synchronous code into asynchronous code for you. This seems to be a depressingly common misconception. I wish I knew how to stop it.
They're much simpler than people seem to think. They're an object a function returns to say "I don't have a result for you yet, but if you attach callbacks to this I'll run those when I do." We do this because it's much nicer than having to pass callbacks into functions, particularly when you want to deal with chains of callbacks and error handling. There is a single utility function in Twisted called "deferToThread" that will run a function in a threadpool, and arrange for a Deferred to be fired with the result (in the event loop thread, of course). This is just one of many possible uses for Deferreds, and not an especially common one. I'm happy to provide pointers to several Twisted docs if anyone is at all unclear on this. While they are very useful, I don't think they're an essential part of a minimal Twisted replacement for asyncore/asynchat -- in fact, they'd work just fine with asyncore/asynchat, because they do so little. -Andrew. From arigo at tunes.org Wed Feb 8 15:20:34 2006 From: arigo at tunes.org (Armin Rigo) Date: Wed, 8 Feb 2006 15:20:34 +0100 Subject: [Python-Dev] _length_cue() Message-ID: <20060208142034.GA1292@code0.codespeak.net> Hi all, Last September, the __len__ method of iterators was removed -- see discussion at: http://mail.python.org/pipermail/python-dev/2005-September/056879.html It was replaced by an optional undocumented method called _length_cue(), which would be used to guess the number of remaining items in an iterator, for performance reasons. I'm worried about the name. There are now exactly two names that behave like a special method without having the double-underscores around it. The first name is 'next', which is kind of fine because it's for iterator classes only and it's documented. But now, consider: the CPython implementation can unexpectedly invoke a method on a user-defined iterator class, even though this method's name is not '__*__' and not documented as special! That's new and that's bad.
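As it happens, the name did eventually get its underscores: the method shipped as __length_hint__ and was later formalized by PEP 424, with operator.length_hint() as the public accessor (Python 3.4+). A minimal sketch of the protocol as it ended up:

```python
from operator import length_hint

class CountDown:
    """Iterator that advertises how many items remain."""
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return self
    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n
    def __length_hint__(self):
        # A guess at the remaining length; list() uses it to presize.
        return self.n

it = CountDown(3)
print(length_hint(it))  # 3
print(list(it))         # [2, 1, 0]
```

The hint is allowed to be wrong: consumers treat it purely as a sizing optimization, which is why it stayed a "cue" rather than a contract.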
IMHO for safety reasons we need to stick double-underscores around this name too, e.g. __length_cue__(). It's new in 2.5 and not documented anyway so this change won't break anything. Do you agree with that? BTW the reason I'm looking at this is that I'm considering adding another undocumented internal-use-only method, maybe __getitem_cue__(), that would try to guess what the nth item to be returned will be. This would allow the repr of some iterators to display more helpful information when playing around with them at the prompt, e.g.: >>> enumerate([3.1, 3.14, 3.141, 3.1415, 3.14159, 3.141596]) A bientot, Armin From rasky at develer.com Wed Feb 8 15:24:33 2006 From: rasky at develer.com (Giovanni Bajo) Date: Wed, 8 Feb 2006 15:24:33 +0100 Subject: [Python-Dev] Linking with mscvrt References: <43E92573.6090300@v.loewis.de> Message-ID: <031401c62cbb$61810630$bf03030a@trilan> Martin v. L?wis wrote: > I just came up with an idea how to resolve the VC versioning > problems for good: Python should link with mscvrt.dll (which > is part of the operating system), not with the CRT that the > compiler provides. Can you elaborate exactly on which versioning problems you think of? > For that to work, everyone building Python or Python extensions (*) > would have to install the Platform SDK (which is available > for free, but contains quite a number of bits). Would that be > acceptable? It would complicate the build process and make Python lag behind CRT development (including bugfixes and whatnot) that Microsoft does. You could as well ask to always stick with GCC 2.95 to solve ABI problems, but I don't think it's the correct long time solution. I expect more and more Windows libraries (binary version) to be shipped with dependencies on MSVCR71.DLL. Anyway, it's just a feeling, since I still don't understand which problems you are trying to solve in the first place. 
-- Giovanni Bajo From dialtone at divmod.com Wed Feb 8 15:14:42 2006 From: dialtone at divmod.com (Valentino Volonghi aka Dialtone) Date: Wed, 8 Feb 2006 15:14:42 +0100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <1139405006.21021.40.camel@warna.dub.corp.google.com> References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> <43E9984F.9020502@v.loewis.de> <1139405006.21021.40.camel@warna.dub.corp.google.com> Message-ID: <20060208141442.GA322@divmod.com> On Wed, Feb 08, 2006 at 01:23:26PM +0000, Donovan Baarda wrote: > I believe that Twisted does pretty much this with it's "deferred" stuff. > It shoves slow stuff off for processing in a separate thread that > re-syncs with the event loop when it's finished. Deferreds are only an elaborate way to deal with a bunch of callbacks. It's Twisted itself that provides a way to run something in a separate thread and then fire a deferred (from the main thread) when the child thread finishes (reactor.callInThread() to call stuff in a different thread, reactor.callFromThread() to call reactor APIs from a different thread) Deferreds are just a bit more than:

    class Deferred(object):
        def __init__(self):
            self.callbacks = []

        def addCallback(self, callback):
            self.callbacks.append(callback)

        def callback(self, value):
            for callback in self.callbacks:
                value = callback(value)

This is mostly what a deferred is (without error handling, extra argument passing, 'nested' deferreds handling and blabla, the core concept however is there). As you see there is no extra magic in deferreds (or weird dependency on Twisted, they are pure python and could be used everywhere, you can implement them in any language that supports first class functions). > In the case of Zope/ZEO I'm not entirely sure but I think what happened > was medusa (asyncore/asynchat based stuff Zope2 was based on) didn't > have this deferred handler support.
> When they found some of the stuff
Here I think you meant that medusa didn't handle computation in separate threads instead. -- Valentino Volonghi aka Dialtone Now Running MacOSX 10.4 Blog: http://vvolonghi.blogspot.com New Pet: http://www.stiq.it -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 186 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/31084053/attachment.pgp From g.brandl at gmx.net Wed Feb 8 15:39:28 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 08 Feb 2006 15:39:28 +0100 Subject: [Python-Dev] Old Style Classes Going in Py3K In-Reply-To: <43E9C184.5090608@voidspace.org.uk> References: <43E9C184.5090608@voidspace.org.uk> Message-ID: Fuzzyman wrote: > Hello all, > > I understand that old style classes are slated to disappear in Python 3000. > > Does this mean that the following will be a syntax error : > > class Something: > pass > > *or* that instead it will automatically inherit from object ? Of course, I would say. There's no reason to forbid this in Py3k. regards, Georg From g.brandl at gmx.net Wed Feb 8 15:42:39 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 08 Feb 2006 15:42:39 +0100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> <60ed19d40602071759v5144a163h28cd39a2853279f4@mail.gmail.com> Message-ID: Neal Norwitz wrote: > On 2/7/06, Christopher Armstrong wrote: >> >> > Twisted is wonderful, powerful, rich, and very large. Perhaps a small >> > subset could be carefully extracted >> >> The subject of putting (parts of) Twisted into the standard library >> comes up once every 6 months or so, at least on our mailing list. For >> all that I think asyncore is worthless, I'm still against copying >> Twisted into the stdlib.
Or at least I'm not willing to maintain the >> necessary fork, and I fear the nightmares about versioning that can >> easily occur when you've got both standard library and third party >> versions of a project. > > I wouldn't be enthusiastic about putting all of Twisted in the stdlib > either. Twisted is on a different release schedule than Python. > However, isn't there a relatively small core subset like Alex > mentioned that isn't changing much? Could we split up those > components and have those live in the core, but the vast majority of > Twisted live outside as it does now? +1. This would be very useful for simple networking applications. Georg From aahz at pythoncraft.com Wed Feb 8 16:42:29 2006 From: aahz at pythoncraft.com (Aahz) Date: Wed, 8 Feb 2006 07:42:29 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <20060208100006.GD10226@xs4all.nl> References: <06Feb7.205350pst."58633"@synergy1.parc.xerox.com> <20060208100006.GD10226@xs4all.nl> Message-ID: <20060208154229.GB8217@panix.com> On Wed, Feb 08, 2006, Thomas Wouters wrote: > > Anything beyond simple bugfixes on asyncore/asynchat seems like a terrible > waste of effort, to me. And I hardly ever use Twisted. +1 -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From aahz at pythoncraft.com Wed Feb 8 16:48:39 2006 From: aahz at pythoncraft.com (Aahz) Date: Wed, 8 Feb 2006 07:48:39 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> Message-ID: <20060208154839.GD8217@panix.com> On Wed, Feb 08, 2006, Patrick Collison wrote: > > How about `procedure', or just `proc'? -1 lambdas are *expected* to return a result -- procedures are functions with side-effects that don't return a result. 
-- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From aahz at pythoncraft.com Wed Feb 8 16:50:54 2006 From: aahz at pythoncraft.com (Aahz) Date: Wed, 8 Feb 2006 07:50:54 -0800 Subject: [Python-Dev] _length_cue() In-Reply-To: <20060208142034.GA1292@code0.codespeak.net> References: <20060208142034.GA1292@code0.codespeak.net> Message-ID: <20060208155054.GE8217@panix.com> On Wed, Feb 08, 2006, Armin Rigo wrote: > > IMHO for safety reasons we need to stick double-underscores around this > name too, e.g. __length_cue__(). It's new in 2.5 and not documented > anyway so this change won't break anything. Do you agree with that? +1 -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From bokr at oz.net Wed Feb 8 16:54:38 2006 From: bokr at oz.net (Bengt Richter) Date: Wed, 08 Feb 2006 15:54:38 GMT Subject: [Python-Dev] small floating point number problem References: <006d01c62c7d$d7bab730$1f2c4fca@csmith> <001c01c62c86$d694beb0$b83efea9@RaymondLaptop1> Message-ID: <43ea0d7e.264446924@news.gmane.org> On Wed, 08 Feb 2006 03:08:25 -0500, "Raymond Hettinger" wrote: >[Smith] >>I just ran into a curious behavior with small floating points, trying to >>find the limits of them on my machine (XP). Does anyone know why the '0.0' >>is showing up for one case below but not for the other? According to my >>tests, the smallest representable float on my machine is much smaller than >>1e-308: it is >> >> 2.470328229206234e-325 >> >> but I can only create it as a product of two numbers, not directly. Here >> is an attempt to create the much larger 1e-308: >> >>>>> a=1e-308 >>>>> a >> 0.0 > >The clue is in that the two differ by 17 orders of magnitude (325-308) which >is about 52 bits. 
> >The interpreter builds 1e-308 by using the underlying C library >string-to-float function and it isn't constructing numbers outside the >normal range for floats. When you enter a value outside that range, the >function underflows it to zero. > >In contrast, your computed floats (such as 1*1e-307) return a denormal >result (where the significand is stored with fewer bits than normal because >the exponent is already at its outer limit). That denormal result is not >zero and the C library float-to-string conversion successfully generates a >decimal string representation. > >The asymmetric handling of denormals by the atof() and ftoa() functions is >why you see a difference. A consequence of that asymmetry is the breakdown >of the expected eval(repr(f))==f invariant:

> >>> f = .1*1e-307
> >>> eval(repr(f)) == f
> False

BTW, for the OP, chasing minimum float values is probably best done with powers of 2

>>> math.ldexp(1, -1074)
4.9406564584124654e-324
>>> math.ldexp(1, -1075)
0.0
>>> .5**1074
4.9406564584124654e-324
>>> .5**1075
0.0
>>> math.frexp(.5**1074)
(0.5, -1073)
>>> math.frexp(.5**1075)
(0.0, 0)

Regards, Bengt Richter From gjc at inescporto.pt Wed Feb 8 16:47:04 2006 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Wed, 08 Feb 2006 15:47:04 +0000 Subject: [Python-Dev] Python modules should link to libpython Message-ID: <1139413624.10037.36.camel@localhost>

gjc:/usr/lib/python2.4/lib-dynload$ ldd itertools.so
        libpthread.so.0 => /lib/libpthread.so.0 (0x00002aaaaabcc000)
        libc.so.6 => /lib/libc.so.6 (0x00002aaaaace2000)
        /lib/ld-linux-x86-64.so.2 (0x0000555555554000)
gjc:/usr/lib/python2.4/lib-dynload$

It seems that Python C extension modules are not linking explicitly to libpython. Yet, they explicitly reference symbols defined in libpython. When libpython is loaded in a global scope all is fine. However, when libpython is dlopen()ed with the RTLD_LOCAL flag, python C extensions always get undefined symbols.
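As a quick illustration of the symbol dependence (a sketch, not the nautilus-python setup itself): ctypes can resolve interpreter symbols such as Py_GetVersion because, in a normal interpreter process, they are visible in the global symbol namespace -- exactly the visibility that an RTLD_LOCAL dlopen() of libpython withholds from extension modules loaded afterwards.

```python
import ctypes

# Py_GetVersion lives in the Python library; extension modules rely on
# symbols like this being resolvable at load time. ctypes.pythonapi finds
# it because the running interpreter's symbols are globally visible here.
ctypes.pythonapi.Py_GetVersion.restype = ctypes.c_char_p
version = ctypes.pythonapi.Py_GetVersion()
print(version.split()[0])  # e.g. b'3.x.y'
```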
This problem happened recently with the nautilus-python package, which installs an extension for the Nautilus file manager that allows extensions in Python. For performance reasons, it now opens extensions with the RTLD_LOCAL flag, thus breaking python extensions. Any thoughts? Should I go ahead and open a bug report (maybe with patch), or is this controversial? -- Gustavo J. A. M. Carneiro The universe is always one step beyond logic. From Scott.Daniels at Acm.Org Wed Feb 8 17:11:55 2006 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed, 08 Feb 2006 08:11:55 -0800 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <006c01c62c7d$d33926b0$1f2c4fca@csmith> References: <001301c62b59$cdfbe900$152c4fca@csmith> <002701c62b65$844a9750$7f00a8c0@RaymondLaptop1> <006c01c62c7d$d33926b0$1f2c4fca@csmith> Message-ID: Smith wrote: > ... There is a problem with dividing by 'ave' if the x and y are at > the floating point limits, but the symmetric behaving form (presented > by Scott Daniels) will have the same problem. Upon reflection, 'max' is probably better than averaging, and avoiding divide is also a reasonably good idea. Note that relative_tol < 1.0 (typically) so underflow, rather than overflow, is the issue:

    def nearby(x, y, relative_tol=1.e-5, absolute_tol=1.e-8):
        difference = abs(x - y)
        return (difference <= absolute_tol or
                difference <= max(abs(x), abs(y)) * relative_tol)

I use <=, since "zero-tolerance" should pass equal values. --Scott David Daniels scott.Daniels at Acm.Org From ark at acm.org Wed Feb 8 17:54:14 2006 From: ark at acm.org (Andrew Koenig) Date: Wed, 8 Feb 2006 11:54:14 -0500 Subject: [Python-Dev] _length_cue() In-Reply-To: <20060208142034.GA1292@code0.codespeak.net> Message-ID: <008701c62cd0$4e1b4eb0$6402a8c0@arkdesktop> > I'm worried about the name. There are now exactly two names that behave > like a special method without having the double-underscores around it.
> The first name is 'next', which is kind of fine because it's for > iterator classes only and it's documented. But now, consider: the > CPython implementation can unexpectedly invoke a method on a > user-defined iterator class, even though this method's name is not > '__*__' and not documented as special! That's new and that's bad. Might I suggest that at least you consider using "hint" instead of "cue"? I'm pretty sure that "hint" has been in use for some time, and always to mean a value that can't be assumed to be correct but that improves performance if it is. For example, algorithms that insert values in balanced trees sometimes take hint arguments that suggest where the algorithm should start searching for the insertion point. From pedro.werneck at terra.com.br Wed Feb 8 17:48:11 2006 From: pedro.werneck at terra.com.br (Pedro Werneck) Date: Wed, 8 Feb 2006 14:48:11 -0200 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> <60ed19d40602071759v5144a163h28cd39a2853279f4@mail.gmail.com> Message-ID: <20060208144811.4f94a997.pedro.werneck@terra.com.br> On Wed, 08 Feb 2006 15:42:39 +0100 Georg Brandl wrote: > Neal Norwitz wrote: > > On 2/7/06, Christopher Armstrong wrote: > >> > >> Twisted is wonderful, powerful, rich, and very large. Perhaps a > >small > > subset could be carefully extracted > >> > >The subject of putting (parts of) Twisted into the standard library > >comes up once every 6 months or so, at least on our mailing list. > >For all that I think asyncore is worthless, I'm still against > >copying Twisted into the stdlib. Or at least I'm not willing to > >maintain the necessary fork, and I fear the nightmares about > >versioning that can easily occur when you've got both standard > >library and third party versions of a project. > > > > I wouldn't be enthusiastic about putting all of Twisted in the > > stdlib either. 
Twisted is on a different release schedule than > > Python. However, isn't there a relatively small core subset like > > Alex mentioned that isn't changing much? Could we split up those > > components and have those live in the core, but the vast majority of > > Twisted live outside as it does now? > > +1. This would be very useful for simple networking applications. I have a simple library I wrote some time ago to make asynchronous TCP servers (honeypots), and I wrote it exactly for the reasons being discussed on this thread: the other developers were not very familiar with Python (they were planning to use Perl on the project) and a bit confused with asyncore. Twisted was the obvious answer, but I could not convince them to put it in the project because of the size and the work needed to put it in all machines they were planning to use. I used this library several times the last two years. The last two weeks I've been using it with another project, but yesterday (a coincidence ?) I decided to reduce all of it to a single module. It is roughly based on Twisted, the interface is similar, some parts are a copy of Twisted code (select code, LineProtocol is a copy of twisted's LineReceiver) but only 16k in size, everything is covered by unittests. It's intended for servers, but client support can be added with some effort too. Maybe it fits the needs of what is being discussed on this thread. It's available here: http://www.pythonbrasil.com.br/moin.cgi/HoneyPython?action=AttachFile&do=get&target=asiel.tar.bz2 -- Pedro Werneck From guido at python.org Wed Feb 8 18:59:07 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 8 Feb 2006 09:59:07 -0800 Subject: [Python-Dev] _length_cue() In-Reply-To: <008701c62cd0$4e1b4eb0$6402a8c0@arkdesktop> References: <20060208142034.GA1292@code0.codespeak.net> <008701c62cd0$4e1b4eb0$6402a8c0@arkdesktop> Message-ID: +1 for __length_hint__. Raymond? On 2/8/06, Andrew Koenig wrote: > > I'm worried about the name. 
There are now exactly two names that behave > > like a special method without having the double-underscores around it. > > The first name is 'next', which is kind of fine because it's for > > iterator classes only and it's documented. But now, consider: the > > CPython implementation can unexpectedly invoke a method on a > > user-defined iterator class, even though this method's name is not > > '__*__' and not documented as special! That's new and that's bad. > > Might I suggest that at least you consider using "hint" instead of "cue"? > I'm pretty sure that "hint" has been in use for some time, and always to > mean a value that can't be assumed to be correct but that improves > performance if it is. > > For example, algorithms that insert values in balanced trees sometimes take > hint arguments that suggest where the algorithm should start searching for > the insertion point. > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 8 19:07:01 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 8 Feb 2006 10:07:01 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> Message-ID: On 2/8/06, Patrick Collison wrote: > And to think that people thought that keeping "lambda", but changing > the name, would avoid all the heated discussion... :-) Note that I'm not participating in any attempts to "improve" lambda. Just about the only improvement I'd like to see is to add parentheses around the arguments, so you'd write lambda(x, y): x**y instead of lambda x, y: x**y. 
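In today's syntax the argument list is bare; the change sketched above is purely about adding the parentheses, and the parenthesised form is hypothetical, not valid Python:

```python
# Current, valid spelling: a bare argument list after the keyword.
power = lambda x, y: x ** y
print(power(2, 10))  # 1024

# Suggested (hypothetical) spelling -- NOT valid Python today:
#   power = lambda(x, y): x ** y
```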
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Wed Feb 8 19:16:16 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 08 Feb 2006 13:16:16 -0500 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> Message-ID: <5.1.1.6.0.20060208130827.0414dd70@mail.telecommunity.com> At 10:07 AM 2/8/2006 -0800, Guido van Rossum wrote: >On 2/8/06, Patrick Collison wrote: > > And to think that people thought that keeping "lambda", but changing > > the name, would avoid all the heated discussion... :-) > >Note that I'm not participating in any attempts to "improve" lambda. > >Just about the only improvement I'd like to see is to add parentheses >around the arguments, so you'd write lambda(x, y): x**y instead of >lambda x, y: x**y. lambda(x,y) looks like a function call until you hit the ':'; we don't usually have keywords that work that way. How about (lambda x,y: x**y)? It seems like all the recently added constructs (conditionals, yield expressions, generator expressions) take on this rather lisp-y look. :) Or, if you wanted to eliminate the "lambda" keyword, then "(from x,y return x**y)" could be a "function expression", and it looks even more like most of the recently-added expression constructs. Well, actually, I guess to mirror the style of conditionals and genexps more closely, it would have to be something like "(return x**y from x,y)" or "(x**y from x,y)". Ugh. Never mind, let's just leave it the way it is today. 
:) From martin at v.loewis.de Wed Feb 8 19:21:51 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2006 19:21:51 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: References: <43E92573.6090300@v.loewis.de> Message-ID: <43EA36BF.7090008@v.loewis.de> Thomas Heller wrote: > I'm not sure the platform SDK include files (.H and .IDL) are really > compatible with VC7.1. I remember that we (on our company, building C++ > software) had to 'Unregister the PSDK Directories with Visual Studio' > (available from the start menu) before building the stuff, otherwise > there were compiler errors. This needs some testing, sure. However, I'm fairly confident that Microsoft has fixed/is going to fix whatever issues arise - they want the platform SDK to be usable with a "recent" compiler (not necessarily the latest one). There was a recent update to the platform SDK (which now comes with both Itanium and AMD64 compilers), so I'm (still) optimistic. Regards, Martin From ldlandis at gmail.com Wed Feb 8 18:56:07 2006 From: ldlandis at gmail.com (LD 'Gus' Landis) Date: Wed, 8 Feb 2006 11:56:07 -0600 Subject: [Python-Dev] _length_cue() In-Reply-To: <008701c62cd0$4e1b4eb0$6402a8c0@arkdesktop> References: <20060208142034.GA1292@code0.codespeak.net> <008701c62cd0$4e1b4eb0$6402a8c0@arkdesktop> Message-ID: +1 on 'hint' vs 'cue'... also infers 'not definitive' (sort of like having a hint of how much longer the "honey do" list is... the honey do list is never 'exhaustive', only exhausting! ;-) On 2/8/06, Andrew Koenig wrote: > > I'm worried about the name. There are now exactly two names that behave > > like a special method without having the double-underscores around it. > > The first name is 'next', which is kind of fine because it's for > > iterator classes only and it's documented. 
But now, consider: the > > CPython implementation can unexpectedly invoke a method on a > > user-defined iterator class, even though this method's name is not > > '__*__' and not documented as special! That's new and that's bad. > > Might I suggest that at least you consider using "hint" instead of "cue"? > I'm pretty sure that "hint" has been in use for some time, and always to > mean a value that can't be assumed to be correct but that improves > performance if it is. > > For example, algorithms that insert values in balanced trees sometimes take > hint arguments that suggest where the algorithm should start searching for > the insertion point. > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ldlandis%40gmail.com > -- LD Landis - N0YRQ - from the St Paul side of Minneapolis From fumanchu at amor.org Wed Feb 8 19:24:35 2006 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 8 Feb 2006 10:24:35 -0800 Subject: [Python-Dev] threadsafe patch for asynchat Message-ID: <6949EC6CD39F97498A57E0FA55295B210179086B@ex9.hostedexchange.local> Barry Warsaw wrote: > On Tue, 2006-02-07 at 16:01 -0800, Robert Brewer wrote: > > > Perhaps, but please keep in mind that the smtpd module uses > > both, currently, and would have to be rewritten if either is > > "removed". > > Would that really be a huge loss? It'd be a huge loss for the random fellow who needs to write an email fixup proxy between a broken client and Exim in a couple of hours. ;) But I can't speak for how often this need comes up among users. 
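For the curious, such a proxy boils down to one message-rewriting hook (in the 2006-era smtpd module, a PureProxy subclass overriding process_message). The rewrite itself can be sketched standalone; the broken-client quirk shown here is hypothetical:

```python
def fixup_message(data):
    # Hypothetical fix-up: the broken client emits bare LF line endings,
    # but the downstream MTA (Exim, say) wants CRLF throughout.
    return data.replace('\r\n', '\n').replace('\n', '\r\n')

raw = "Subject: hello\nFrom: someone@example.com\r\n\nbody text\n"
print(repr(fixup_message(raw)))
```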
Robert Brewer System Architect Amor Ministries fumanchu at amor.org From ronaldoussoren at mac.com Wed Feb 8 19:33:31 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed, 8 Feb 2006 19:33:31 +0100 Subject: [Python-Dev] Python modules should link to libpython In-Reply-To: <1139413624.10037.36.camel@localhost> References: <1139413624.10037.36.camel@localhost> Message-ID: <9454E545-2A28-440C-8161-E3C430B2CC10@mac.com> On 8-feb-2006, at 16:47, Gustavo J. A. M. Carneiro wrote: > gjc:/usr/lib/python2.4/lib-dynload$ ldd itertools.so > libpthread.so.0 => /lib/libpthread.so.0 (0x00002aaaaabcc000) > libc.so.6 => /lib/libc.so.6 (0x00002aaaaace2000) > /lib/ld-linux-x86-64.so.2 (0x0000555555554000) > gjc:/usr/lib/python2.4/lib-dynload$ > > It seems that Python C extension modules are not linking explicitly to > libpython. Yet, they explicitly reference symbols defined in > libpython. > When libpython is loaded in a global scope all is fine. However, when > libpython is dlopen()ed with the RTLD_LOCAL flag, python C extensions > always get undefined symbols. > > This problem happened recently with the nautilus-python package, > which > installs an extension for the Nautilus file manager that allows > extensions in Python. For performance reasons, it now opens > extensions > with RTLD_LOCAL flag, thus breaking python extensions. > > Any thoughts? Should I go ahead and open a bug report (maybe with > patch), or is this controversial? I don't know about Linux, but on OSX we don't link with libpython (or Python.framework) on purpose: this allows you to share extensions between several builds of the same version of Python. If you do link with libpython, an extension that was compiled by a python installed at a different location will result in having two copies of libpython in memory, only one of which is initialized. You end up with very interesting crashes. Ronald > > -- > Gustavo J. A. M. Carneiro > > The universe is always one step beyond logic.
> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ > ronaldoussoren%40mac.com -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2157 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/918ae5ee/attachment.bin From martin at v.loewis.de Wed Feb 8 19:38:42 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2006 19:38:42 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <031401c62cbb$61810630$bf03030a@trilan> References: <43E92573.6090300@v.loewis.de> <031401c62cbb$61810630$bf03030a@trilan> Message-ID: <43EA3AB2.4020808@v.loewis.de> Giovanni Bajo wrote: >>I just came up with an idea how to resolve the VC versioning >>problems for good: Python should link with mscvrt.dll (which >>is part of the operating system), not with the CRT that the >>compiler provides. > > > Can you elaborate exactly on which versioning problems you think of? I could, but I don't want to elaborate too much. Please google for it - there has been written a lot about it. In short, you cannot really link two different versions of msvcrt (e.g. mscvrt.dll, msvcrt4.dll, msvcr7.dll, msvcr71.dll, mscvrtd.dll, msvcr71d.dll, ...) into a single program, plus you cannot redistribute the CRT unless you are a Visual Studio licensee. This causes problems for extension writers: they need to own the same version of visual studio that Python was built with. > It would complicate the build process and make Python lag behind CRT > development (including bugfixes and whatnot) that Microsoft does. There isn't really too much development in the CRT, and the little development I can see (e.g. in VS 2005) is rather counter-productive. 
So ideally, Python should drop usage of the CRT entirely (but getting there will be a long process). Hopefully, P3k will drop usage of stdio for file objects, which will be a big step forward. > You could > as well ask to always stick with GCC 2.95 to solve ABI problems, but I don't > think it's the correct long time solution. I expect more and more Windows > libraries (binary version) to be shipped with dependencies on MSVCR71.DLL. Now that VS2005 is out, I doubt that. More and more will also depend on msvcr80.dll. Then, when the next visual studio comes out, you can (probably) add msvcr81.dll to the list of libraries that might be used. This will go on forever, and we cannot win. It's really not using GCC 2.95 which I'm after. It's using /lib/libc.so that I want to. People should be free to use whatever compiler they have access to. Regards, Martin From martin at v.loewis.de Wed Feb 8 19:43:48 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2006 19:43:48 +0100 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> <43E9984F.9020502@v.loewis.de> Message-ID: <43EA3BE4.5000904@v.loewis.de> Steve Holden wrote: > In case the processing of events needed to block? If I'm processing web > requests in an async* dispatch loop and a request needs the results of a > (probably lengthy) database query in order to generate its output, how > do I give the dispatcher control again to process the next asynchronous > network event? I see. Ideally, you should obtain the socket for the connection to the database, and add it to the asyncore loop. That would require you have an async database API, of course. 
Regards, Martin From barry at python.org Wed Feb 8 19:45:55 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 08 Feb 2006 13:45:55 -0500 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <6949EC6CD39F97498A57E0FA55295B210179086B@ex9.hostedexchange.local> References: <6949EC6CD39F97498A57E0FA55295B210179086B@ex9.hostedexchange.local> Message-ID: <1139424355.15735.17.camel@geddy.wooz.org> On Wed, 2006-02-08 at 10:24 -0800, Robert Brewer wrote: > It'd be a huge loss for the random fellow who needs to write an email > fixup proxy between a broken client and Exim in a couple of hours. ;) Or the guy who needs to whip together an RFC-compliant minimal SMTP server to use in unit tests of some random Python implemented mailing list manager. Just fer instance. But still... > But I can't speak for how often this need comes up among users. Yeah, there is that. ;) -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/eaac7cbf/attachment.pgp From guido at python.org Wed Feb 8 19:49:14 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 8 Feb 2006 10:49:14 -0800 Subject: [Python-Dev] Old Style Classes Going in Py3K In-Reply-To: <43E9C184.5090608@voidspace.org.uk> References: <43E9C184.5090608@voidspace.org.uk> Message-ID: On 2/8/06, Fuzzyman wrote: > I understand that old style classes are slated to disappear in Python 3000. > > Does this mean that the following will be a syntax error : > > class Something: > pass > > *or* that instead it will automatically inherit from object ? The latter of course. I never even considered making this illegal.
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Wed Feb 8 19:49:22 2006 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2006 10:49:22 -0800 Subject: [Python-Dev] Old Style Classes Going in Py3K In-Reply-To: References: <43E9C184.5090608@voidspace.org.uk> Message-ID: On 2/8/06, Georg Brandl wrote: > Fuzzyman wrote: > > Hello all, > > > > I understand that old style classes are slated to disappear in Python 3000. > > > > Does this mean that the following will be a syntax error : > > > > class Something: > > pass > > > > *or* that instead it will automatically inherit from object ? > > Of course, I would say. There's no reason to forbid this in Py3k. > And you would be right. Guido has always said that classes would act as if they inherited from object by default. There are no plans to change the syntax of how you can specify inheritance in Python 3. All that is changing is what the default is when you specify no superclasses. -Brett From martin at v.loewis.de Wed Feb 8 19:55:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2006 19:55:38 +0100 Subject: [Python-Dev] Python modules should link to libpython In-Reply-To: <1139413624.10037.36.camel@localhost> References: <1139413624.10037.36.camel@localhost> Message-ID: <43EA3EAA.10303@v.loewis.de> Gustavo J. A. M. Carneiro wrote: > Any thoughts? Should I go ahead and open a bug report (maybe with > patch), or is this controversial? You should only link with libpython if there really is a shared libpython. In a standard Python installation, there is no libpython, but instead, symbols are in the executable. Notice that libpython isn't really supported: all changes to that code originate from contributions, and I refuse to develop changes to it myself. So you can file a bug report, but there likely won't be any reaction in the next few years (at least not from me).
OTOH, if a working patch was contributed, I could apply that fairly quickly: I agree that modules should link with libpython if libpython is shared. I can accept that the Mac does it differently, although I think the rationale for doing that is dangerous: you shouldn't really attempt to share extension modules across Python versions. Regards, Martin From brett at python.org Wed Feb 8 19:58:46 2006 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2006 10:58:46 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <5.1.1.6.0.20060208130827.0414dd70@mail.telecommunity.com> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <5.1.1.6.0.20060208130827.0414dd70@mail.telecommunity.com> Message-ID: On 2/8/06, Phillip J. Eby wrote: > At 10:07 AM 2/8/2006 -0800, Guido van Rossum wrote: > >On 2/8/06, Patrick Collison wrote: > > > And to think that people thought that keeping "lambda", but changing > > > the name, would avoid all the heated discussion... :-) > > > >Note that I'm not participating in any attempts to "improve" lambda. > > > >Just about the only improvement I'd like to see is to add parentheses > >around the arguments, so you'd write lambda(x, y): x**y instead of > >lambda x, y: x**y. > > lambda(x,y) looks like a function call until you hit the ':'; we don't > usually have keywords that work that way. > I agree with Phillip. Making it look more like a function definition, I think, is a bad move to make. The thing is quirky as-is, let's not partially mask that fact. > How about (lambda x,y: x**y)? It seems like all the recently added > constructs (conditionals, yield expressions, generator expressions) take on > this rather lisp-y look. :) > > Or, if you wanted to eliminate the "lambda" keyword, then "(from x,y return > x**y)" could be a "function expression", and it looks even more like most > of the recently-added expression constructs. 
> > Well, actually, I guess to mirror the style of conditionals and genexps > more closely, it would have to be something like "(return x**y from x,y)" > or "(x**y from x,y)". > > Ugh. Never mind, let's just leave it the way it is today. :) > ``(use x, y, in x**y)`` is the best I can think of off the top of my head. But if Guido is not budging on tweaking lambda in any way other than parentheses, then I say just leave the busted thing as it is and let it be the wart that was never removed. -Brett From ronaldoussoren at mac.com Wed Feb 8 20:02:40 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed, 8 Feb 2006 20:02:40 +0100 Subject: [Python-Dev] Python modules should link to libpython In-Reply-To: <43EA3EAA.10303@v.loewis.de> References: <1139413624.10037.36.camel@localhost> <43EA3EAA.10303@v.loewis.de> Message-ID: On 8-feb-2006, at 19:55, Martin v. Löwis wrote: > Gustavo J. A. M. Carneiro wrote: >> Any thoughts? Should I go ahead and open a bug report (maybe with >> patch), or is this controversial? > > I can accept that the Mac does it differently, although I think the > rationale for doing that is dangerous: you shouldn't really attempt > to share extension modules across Python versions. My explanation seems to be bad, I meant to say sharing extensions across different builds of the same Python version. One might install a normal unix build in /opt/python and a framework build in /Library/Frameworks. This is not as important now as it was when Python 2.3.x was state of the art, when you could have a python 2.3.x framework both in /System/Library/Frameworks (provided by Apple) and in /Library/Frameworks (built yourself or downloaded as the official MacPython binaries). Those would share the same site-packages directory (/Library/Python/2.3). Ronald -------------- next part -------------- A non-text attachment was scrubbed...
Name: smime.p7s Type: application/pkcs7-signature Size: 2157 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/98c407e0/attachment.bin From tim.peters at gmail.com Wed Feb 8 20:07:58 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 8 Feb 2006 14:07:58 -0500 Subject: [Python-Dev] small floating point number problem In-Reply-To: <001c01c62c86$d694beb0$b83efea9@RaymondLaptop1> References: <006d01c62c7d$d7bab730$1f2c4fca@csmith> <001c01c62c86$d694beb0$b83efea9@RaymondLaptop1> Message-ID: <1f7befae0602081107h621e299i2f2269386b41e656@mail.gmail.com> [Raymond Hettinger] > ... > The asymmetric handling of denormals by the atof() and ftoa() functions is > why you see a difference. A consequence of that asymmetry is the breakdown > of the expected eval(repr(f))==f invariant: Just noting that such behavior is a violation of the 754 standard for string->double conversion. But Microsoft's libraries don't _claim_ to support the 754 standard, so good luck suing them ;-). Python doesn't promise anything here either. From raymond.hettinger at verizon.net Wed Feb 8 20:16:10 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 08 Feb 2006 14:16:10 -0500 Subject: [Python-Dev] Let's just *keep* lambda References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie><869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <5.1.1.6.0.20060208130827.0414dd70@mail.telecommunity.com> Message-ID: <009401c62ce4$1f139320$b83efea9@RaymondLaptop1> > How about (lambda x,y: x**y)? The purpose of this thread was to conserve brain-power by bringing the issue to a close. Instead, it is turning into syntax/renaming fest. May I suggest that this be moved to comp.lang.python and return only if a community consensus emerges from the thousands of random variants? 
Raymond From bob at redivi.com Wed Feb 8 20:17:07 2006 From: bob at redivi.com (Bob Ippolito) Date: Wed, 8 Feb 2006 11:17:07 -0800 Subject: [Python-Dev] Python modules should link to libpython In-Reply-To: References: <1139413624.10037.36.camel@localhost> <43EA3EAA.10303@v.loewis.de> Message-ID: <7C490E16-52E7-4EEE-B4B7-6B2D817AC938@redivi.com> On Feb 8, 2006, at 11:02 AM, Ronald Oussoren wrote: > > On 8-feb-2006, at 19:55, Martin v. Löwis wrote: > >> Gustavo J. A. M. Carneiro wrote: >>> Any thoughts? Should I go ahead and open a bug report (maybe with >>> patch), or is this controversial? >> >> I can accept that the Mac does it differently, although I think the >> rationale for doing that is dangerous: you shouldn't really attempt >> to share extension modules across Python versions. > > My explanation seems to be bad, I meant to say sharing extensions > across > different builds of the same Python version. One might install a > normal > unix build in /opt/python and a framework build in /Library/ > Frameworks. > > This is not as important now as it was when Python 2.3.x was state > of the > art, then you could have a python 2.3.x framework both in > /System/Library/Frameworks (provided by Apple) and in /Library/ > Frameworks > (build yourself or downloaded the official MacPython binaries). > Those would > share the same site-packages directory (/Library/Python/2.3). They never shared the same site-packages directory... The major reason we use -undefined dynamic_lookup rather than linking directly to a particular Python is so that the framework can be moved around without everything going to hell, e.g., to the inside of an application bundle. At the time, we didn't have any tools that could do the Mach-O header rewriting that py2app does now. There isn't a whole lot of reason to use -undefined dynamic_lookup these days, but there also aren't many compelling reasons to go back to direct linking.
There is one use case that direct linking would support: having multiple distinct Python interpreters in the same process space, which could be useful for writing plug-ins to applications that are not Python based... Other than that, there's little reason to bother with it. -bob From martin at v.loewis.de Wed Feb 8 20:18:40 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2006 20:18:40 +0100 Subject: [Python-Dev] Python modules should link to libpython In-Reply-To: References: <1139413624.10037.36.camel@localhost> <43EA3EAA.10303@v.loewis.de> Message-ID: <43EA4410.7060808@v.loewis.de> Ronald Oussoren wrote: > My explanation seems to be bad, I meant to say sharing extensions across > different builds of the same Python version. One might install a normal > unix build in /opt/python and a framework build in /Library/Frameworks. Sorry, I didn't read your message carefully enough. This isn't a problem in Unix/ELF: you (normally) only put the name of the library into the resulting executable/library, not the absolute path. You then use the library search path (system-defined or LD_LIBRARY_PATH) to find the library. Regards, Martin From fumanchu at amor.org Wed Feb 8 20:20:23 2006 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 8 Feb 2006 11:20:23 -0800 Subject: [Python-Dev] Let's just *keep* lambda Message-ID: <6949EC6CD39F97498A57E0FA55295B21017909B9@ex9.hostedexchange.local> Raymond Hettinger wrote: > > How about (lambda x,y: x**y)? > > The purpose of this thread was to conserve brain-power by > bringing the issue to a close. Instead, it is turning into > syntax/renaming fest. May I suggest that this be moved to > comp.lang.python and return only if a community consensus > emerges from the thousands of random variants? I'd like to suggest this be moved to comp.lang.python and never return. Community consensus on syntax is a pipe dream. 
Robert Brewer System Architect Amor Ministries fumanchu at amor.org From steve at holdenweb.com Wed Feb 8 20:28:25 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 08 Feb 2006 14:28:25 -0500 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <009401c62ce4$1f139320$b83efea9@RaymondLaptop1> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie><869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <5.1.1.6.0.20060208130827.0414dd70@mail.telecommunity.com> <009401c62ce4$1f139320$b83efea9@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: >>How about (lambda x,y: x**y)? > > > The purpose of this thread was to conserve brain-power by bringing the issue > to a close. Instead, it is turning into syntax/renaming fest. May I > suggest that this be moved to comp.lang.python and return only if a > community consensus emerges from the thousands of random variants? > Right, then we can get back to important stuff like how to represent octal constants. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From keith at kdart.com Wed Feb 8 20:45:48 2006 From: keith at kdart.com (Keith Dart) Date: Wed, 8 Feb 2006 11:45:48 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <1139424355.15735.17.camel@geddy.wooz.org> References: <6949EC6CD39F97498A57E0FA55295B210179086B@ex9.hostedexchange.local> <1139424355.15735.17.camel@geddy.wooz.org> Message-ID: <20060208114548.56d8a5ac@leviathan.kdart.com> Barry Warsaw wrote the following on 2006-02-08 at 13:45 PST: === > Or the guy who needs to whip together an RFC-compliant minimal SMTP > server to use in unit tests of some random Python implemented mailing > list manager. Just fer instance. But still... > > > But I can't speak for how often this need comes up among users. > > Yeah, there is that. ;) === There are other, third-party, SMTP server objects available. You could always use one of those. 
Once the "Python egg" and PyPI improve and start widespread use perhaps the question of what is in the core library and what is not will become moot. Being a Gentoo Linux user I already enjoy having many modules available, with automatic dependency installation, on demand. So the idea of "core" library is already blurred for me. -- -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Keith Dart public key: ID: 19017044 ===================================================================== -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/9bdad0a3/attachment.pgp From keith at kdart.com Wed Feb 8 21:00:17 2006 From: keith at kdart.com (Keith Dart) Date: Wed, 8 Feb 2006 12:00:17 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> Message-ID: <20060208120017.09fcfb8c@leviathan.kdart.com> Guido van Rossum wrote the following on 2006-02-08 at 10:07 PST: === > Note that I'm not participating in any attempts to "improve" lambda. === FWIW, I like lambda. No need to change it. Thank you. -- -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Keith Dart public key: ID: 19017044 ===================================================================== -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/31903529/attachment.pgp From raymond.hettinger at verizon.net Wed Feb 8 21:02:21 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 08 Feb 2006 15:02:21 -0500 Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net> Message-ID: <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> [Armin Rigo] > It was replaced by an optional undocumented method called _length_cue(), > which would be used to guess the number of remaining items in an > iterator, for performance reasons. > > I'm worried about the name. There are now exactly two names that behave > like a special method without having the double-underscores around it. > IMHO for safety reasons we need to stick double-underscores around this > name too, e.g. __length_cue__(). The single underscore was meant to communicate that this method is private (which is why it is undocumented). Accordingly, the dir() function is smart enough to omit the method from its listing (which is a good thing). We follow similar naming conventions in pure Python library code. OTOH, this one is a bit different in that it is not truly private; rather, it is more like a friend method used internally for various tools to be able to communicate with each other. If you change to a double underscore convention, you're essentially making this a public protocol. IMHO, the "safety reasons" are imaginary -- the scenario would involve subclassing one of these builtin objects and attaching an identically named private method. All that being said, I don't feel strongly about it and you guys are welcome to change it if it offends your naming-convention sensibilities. [Andrew Koenig] > Might I suggest that at least you consider using "hint" instead of "cue"? Personally, I prefer "cue", which my dictionary defines as "a signal, hint, or suggestion".
The alternate definition of "a prompt for some action" applies equally well. Also, to my ear, length_hint doesn't sound right. I'm -0 on changing the name. If you must, then go ahead. [Armin Rigo] > BTW the reason I'm looking at this is that I'm considering adding > another undocumented internal-use-only method, maybe __getitem_cue__(), > that would try to guess what the nth item to be returned will be. This > would allow the repr of some iterators to display more helpful > information when playing around with them at the prompt, e.g.: > >>>> enumerate([3.1, 3.14, 3.141, 3.1415, 3.14159, 3.141596]) > At one point, I explored and then abandoned this idea. For objects like itertools.count(n), it worked fine -- the state was readily knowable and the eval(repr(obj)) round-trip was possible. However, for tools like enumerate(), it didn't make sense to have a preview that only applied in a tiny handful of (mostly academic) cases and was not evaluable in any case. I was really attracted to the idea of having more informative iterator representations but learned that even when it could be done, it wasn't especially useful. When someone creates an iterator at the interactive prompt, they almost always either wrap it in a consumer function or they assign it to a variable. The case of typing just "enumerate([1,2,3])" comes up only once, when first learning what enumerate() does. Raymond From barry at python.org Wed Feb 8 21:08:08 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 08 Feb 2006 15:08:08 -0500 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <20060208114548.56d8a5ac@leviathan.kdart.com> References: <6949EC6CD39F97498A57E0FA55295B210179086B@ex9.hostedexchange.local> <1139424355.15735.17.camel@geddy.wooz.org> <20060208114548.56d8a5ac@leviathan.kdart.com> Message-ID: <1139429288.15736.43.camel@geddy.wooz.org> On Wed, 2006-02-08 at 11:45 -0800, Keith Dart wrote: > There are other, third-party, SMTP server objects available.
You could > always use one of those. Very true. In fact, Twisted comes to the rescue again here. When I needed to test Mailman's NNTP integration I could either spend several weeks figuring out how to install and configure some traditional NNTP server, or I could just install Twisted and run exactly three commands (one of which was "sudo" :). > Once the "Python egg" and PyPI improve and start widespread use perhaps > the question of what is in the core library and what is not will become > moot. Indeed. > Being a Gentoo Linux user I already enjoy having many modules > available, with automatic dependency installation, on demand. So the > idea of "core" library is already blurred for me. Although I'm doing a lot more dev on the Mac these days, I definitely agree that this is what makes Gentoo so cool for Linux, and I can't wait for Gentoo-on-OSX to switch to doing things the Right Way (can you say bye-bye DarwinPorts?). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/15d90ae0/attachment.pgp From ateijelo at uh.cu Wed Feb 8 17:10:45 2006 From: ateijelo at uh.cu (Andy Teijelo =?iso-8859-1?q?P=E9rez?=) Date: Wed, 8 Feb 2006 11:10:45 -0500 Subject: [Python-Dev] Path PEP: some comments In-Reply-To: <030001c629c2$30345c90$bf03030a@trilan> References: <030001c629c2$30345c90$bf03030a@trilan> Message-ID: <200602081110.46754.ateijelo@uh.cu> On Saturday, 4 February 2006 at 2:35, Giovanni Bajo wrote: > Hello, > > my comments on the Path PEP: > > - Many methods contain the word 'path' in them. I suppose this is to help > transition from the old library to the new library. But in the context of a > new Python user, I don't think that Path.abspath() is optimal. Path.abs() > looks better.
> Maybe it's not so fundamental to have exactly the same names of the old > library, especially when thinking of future? If I rearrange my code to use > Path, I can as well rename methods to something more sound at the same time. I haven't revised the whole class to see exactly which methods contain the word path and which do not. But anyway, this is just a simple comment. It's clear to me that Path.abspath() looks redundant and Path.abs() tells clearly what the method does. But I think in most cases the method won't be used through the class, like 'Path.abs(instance)', but through an existing instance, like 'home.abs()'. In this case, I think 'home.abspath()' would be more readable than 'home.abs()'.
I've downloaded the sources with svn and now I'm studying it. There are 3 PEP accepted : . 308 : Conditional Expressions . 328 : Imports: Multi-Line and Absolute/Relative . 343 : The "with" Statement I've some questions : 1. For a newbie in the Python core development, what is the best PEP to begin with ? 2. PEP's "owner" is the one who submitted the proposal or the one who is working on it; 3. How do we know what are the developers working on the PEP ? Thanks in advance. Regards. Joao Macaiba (wavefunction). From brett at python.org Wed Feb 8 22:39:34 2006 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2006 13:39:34 -0800 Subject: [Python-Dev] Help on choosing a PEP to volunteer on it : 308, 328 or 343 In-Reply-To: <43EA632F.4010600@gmail.com> References: <43EA632F.4010600@gmail.com> Message-ID: On 2/8/06, Joao Macaiba wrote: > Hi. I'm interested in doing an undergraduate project under some Python > core PEP. > > I'm newbie to Python core. Program in C/C++. > > I've downloaded the sources with svn and now I'm studying it. > > > There are 3 PEP accepted : > > . 308 : Conditional Expressions > > . 328 : Imports: Multi-Line and Absolute/Relative > > . 343 : The "with" Statement > > > I've some questions : > > 1. For a newbie in the Python core development, what is the best PEP to > begin with ? > Wild guess? 308, but that still requires changing the grammar and editing the AST compiler. 328 will need playing with the import code which is known to be hairy. 343 has the same needs as 308, but I bet would be more complicated. > 2. PEP's "owner" is the one who submitted the proposal or the one who is > working on it; > Technically it is the person who drew up the proposal and agreed to carry it through. Usually, though, they are also the ones willing to implement it (or at least make sure that happens). > 3. How do we know what are the developers working on the PEP ? > You ask just like you are. 
=) Otherwise you just have to listen on python-dev for anyone to mention they are working on it. -Brett From martin at v.loewis.de Wed Feb 8 22:45:54 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2006 22:45:54 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43E92573.6090300@v.loewis.de> References: <43E92573.6090300@v.loewis.de> Message-ID: <43EA6692.2050502@v.loewis.de> Martin v. L?wis wrote: > To do that, we would need to compile and link with the SDK > header files and import libraries, not with the ones that > visual studio provides. I withdraw that idea. It appears that the platform SDK doesn't (any longer?) provide an import library for msvrt.dll, and Microsoft documents mscvrt as intended only for "system components". Regards, Martin From raymond.hettinger at verizon.net Wed Feb 8 22:58:01 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 08 Feb 2006 16:58:01 -0500 Subject: [Python-Dev] Help on choosing a PEP to volunteer on it : 308, 328 or 343 References: <43EA632F.4010600@gmail.com> Message-ID: <001f01c62cfa$bba337c0$b83efea9@RaymondLaptop1> [Joao Macaiba] > 1. For a newbie in the Python core development, what is the best PEP to > begin with ? I recommend, PEP 308: Conditional Expressions Raymond From nyamatongwe at gmail.com Wed Feb 8 23:38:11 2006 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Thu, 9 Feb 2006 09:38:11 +1100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EA6692.2050502@v.loewis.de> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> Message-ID: <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> Martin v. L?wis: > So ideally, Python should drop usage of the CRT entirely (but getting > there will be a long process). Hopefully, P3k will drop usage of > stdio for file objects, which will be a big step forward. 
You don't need to drop the CRT, just encapsulate it so there is one copy controlled by Python that hands out wrapped objects (file handles, file pointers, memory blocks, others?). These wrappers can only be manipulated through calls back to that owning code that then calls the CRT. Unfortunately this change would itself be incompatible with current extensions. Neil From arigo at tunes.org Thu Feb 9 00:51:56 2006 From: arigo at tunes.org (Armin Rigo) Date: Thu, 9 Feb 2006 00:51:56 +0100 Subject: [Python-Dev] _length_cue() In-Reply-To: <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> Message-ID: <20060208235156.GA29514@code0.codespeak.net> Hi Raymond, On Wed, Feb 08, 2006 at 03:02:21PM -0500, Raymond Hettinger wrote: > IMHO, the "safety reasons" are imaginary -- the scenario would involve > subclassing one of these builtin objects and attaching an identically named > private method. No, the senario applies to any user-defined iterator class, not necessary subclassing an existing one: >>> class MyIter(object): ... def __iter__(self): ... return self ... def next(self): ... return whatever ... def _length_cue(self): ... print "oups! please, CPython, don't call me unexpectedly" ... >>> list(MyIter()) oups! please, CPython, don't call me unexpectedly (...) This means that _length_cue() is at the moment a special method, in the sense that Python can invoke it implicitely. This said, do we vote for __length_hint__ or __length_cue__? :-) And does anyone objects about __getitem_hint__ or __getitem_cue__? Maybe __lookahead_hint__ or __lookahead_cue__? 
Armin From thomas at xs4all.net Thu Feb 9 01:08:01 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 9 Feb 2006 01:08:01 +0100 Subject: [Python-Dev] Help on choosing a PEP to volunteer on it : 308, 328 or 343 In-Reply-To: References: <43EA632F.4010600@gmail.com> Message-ID: <20060209000801.GF10226@xs4all.nl> On Wed, Feb 08, 2006 at 01:39:34PM -0800, Brett Cannon wrote: > On 2/8/06, Joao Macaiba wrote: > > 1. For a newbie in the Python core development, what is the best PEP to > > begin with ? > Wild guess? 308, but that still requires changing the grammar and > editing the AST compiler. 328 will need playing with the import code > which is known to be hairy. 343 has the same needs as 308, but I bet > would be more complicated. Joao brought up an interesting point on #python on freenode, though... Is there any documentation regarding the AST code? I started fiddling with it just to get to know it, adding some weird syntax just for the hell of it, and I *think* I understand how the AST is supposed to work. I haven't gotten around to actually coding it, though (just like I haven't gotten around to PEP 13 ;) so maybe I have it all wrong. A short description of the principles and design choices would be nice, maybe with a paragraph on how to add new syntax constructs. How tightly should the AST follow the grammar, for instance? (I pointed Joao to the augmented assignment patch for 2.0, which doesn't say anything about the AST but should be helpful hints in his quest to understand Python's internals. Lord knows that's how I learned it... By the time he groks it all, hopefully someone can help him with the AST parts ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From mwh at python.net Thu Feb 9 01:18:28 2006 From: mwh at python.net (Michael Hudson) Date: Thu, 09 Feb 2006 00:18:28 +0000 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: (Guido van Rossum's message of "Wed, 8 Feb 2006 10:07:01 -0800") References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> Message-ID: <2mbqxhe65n.fsf@starship.python.net> Guido van Rossum writes: > On 2/8/06, Patrick Collison wrote: >> And to think that people thought that keeping "lambda", but changing >> the name, would avoid all the heated discussion... :-) > > Note that I'm not participating in any attempts to "improve" lambda. > > Just about the only improvement I'd like to see is to add parentheses > around the arguments, so you'd write lambda(x, y): x**y instead of > lambda x, y: x**y. That would seem to be a bad idea, as it means something already: >>> f = lambda (x,y): x + y >>> t = (1,2) >>> f(t) 3 Cheers, mwh -- Solaris: Shire horse that dreams of being a race horse, blissfully unaware that its owners don't quite know whether to put it out to grass, to stud, or to the knackers yard. -- Jim's pedigree of operating systems, asr From edgimar at lycos.com Tue Feb 7 09:34:05 2006 From: edgimar at lycos.com (Mark Edgington) Date: Tue, 07 Feb 2006 09:34:05 +0100 Subject: [Python-Dev] threadsafe patch for asynchat Message-ID: <43E85B7D.8080203@lycos.com> Martin v. L?wis wrote: > That patch looks wrong. What does it mean to "run in a thread"? > All code runs in a thread, all the time: sometime, that thread > is the main thread. > > Furthermore, I can't see any presumed thread-unsafety in asynchat. Ok, perhaps the notation could be improved, but the idea of the semaphore in the patch is "Does it run inside of a multithreaded environment, and could its push() functions be called from a different thread?" I have verified that there is a problem with running it in such an environment. 
My results are more or less identical to those described in the following thread: http://mail.python.org/pipermail/medusa-dev/1999/000431.html (see also the reply message to this one regarding the solution -- if you look at the Zope source, Zope deals with the problem in the way I am suggesting asynchat be patched) It seems that somehow in line 271 (python 2.4) of asynchat.py, producer_fifo.list is not empty, and thus popleft() is executed. However, popleft() finds the deque empty. This means that somehow the deque (or list -- the bug is identical in python 2.3) is emptied between the if() and the popleft(), so perhaps asyncore.loop(), running in a different thread from the thread which calls async_chat.push(), empties it. The problem is typically exhibited when running in a multithreaded environment, and when calling the async_chat.push() function many (i.e. perhaps tens of thousands) times quickly in a row from a different thread. However, this behavior is avoided by creating a Lock for refill_buffer(), so that it cannot be executed simultaneously. It is also avoided by not executing initiate_send() at all (as is done by Zope in ZHTTPServer.zhttp_channel). > Sure, there is a lot of member variables in asynchat which aren't > specifically protected against mutual access from different threads. > So you shouldn't be accessing the same async_chat object from multiple > threads. If applying this patch does indeed make it safe to use async_chat.push() from other threads, why would it be a bad thing to have? It seems to make the code less cryptic (i.e. I don't need to override base classes in order to include code which processes a nonempty Queue object -- I simply make a call to the push() function of my instance of async_chat, and I'm done). 
-Mark (also, of course push_with_producer() would probably also need the same changes that would be made to push() ) From pedro.werneck at terra.com.br Wed Feb 8 18:11:38 2006 From: pedro.werneck at terra.com.br (Pedro Werneck) Date: Wed, 8 Feb 2006 15:11:38 -0200 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E66D98.3080008@lycos.com> <43E83E5D.3010000@v.loewis.de> <60ed19d40602071759v5144a163h28cd39a2853279f4@mail.gmail.com> Message-ID: <20060208151138.2316709d.pedro.werneck@terra.com.br> On Wed, 08 Feb 2006 15:42:39 +0100 Georg Brandl wrote: > Neal Norwitz wrote: > > On 2/7/06, Christopher Armstrong wrote: > >> > >> Twisted is wonderful, powerful, rich, and very large. Perhaps a > >small > > subset could be carefully extracted > >> > >The subject of putting (parts of) Twisted into the standard library > >comes up once every 6 months or so, at least on our mailing list. > >For all that I think asyncore is worthless, I'm still against > >copying Twisted into the stdlib. Or at least I'm not willing to > >maintain the necessary fork, and I fear the nightmares about > >versioning that can easily occur when you've got both standard > >library and third party versions of a project. > > > > I wouldn't be enthusiastic about putting all of Twisted in the > > stdlib either. Twisted is on a different release schedule than > > Python. However, isn't there a relatively small core subset like > > Alex mentioned that isn't changing much? Could we split up those > > components and have those live in the core, but the vast majority of > > Twisted live outside as it does now? > > +1. This would be very useful for simple networking applications. I have a simple library I wrote some time ago to make asynchronous TCP servers (honeypots), and I wrote it exactly for the reasons being discussed on this thread: the other developers were not very familiar with Python (they were planning to use Perl on the project) and a bit confused with asyncore. 
Twisted was the obvious answer, but I could not convince them to put it in the project because of the size and the work needed to put it in all machines they were planning to use. I used this library several times the last two years. The last two weeks I've been using it with another project, but yesterday (a coincidence ?) I decided to reduce all of it to a single module. It is roughly based on Twisted, the interface is similar, some parts are a copy of Twisted code (select code, LineProtocol is a copy of twisted's LineReceiver) but only 16k in size, everything is covered by unittests. It's intended for servers, but client support can be added with some effort too. Maybe it fits the needs of what is being discussed on this thread. It's available here: http://www.pythonbrasil.com.br/moin.cgi/HoneyPython?action=AttachFile&do=get&target=asiel.tar.bz2 -- Pedro Werneck From seojiwon at gmail.com Thu Feb 9 02:22:31 2006 From: seojiwon at gmail.com (Jiwon Seo) Date: Wed, 8 Feb 2006 17:22:31 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> Message-ID: On 2/8/06, Guido van Rossum wrote: > On 2/8/06, Patrick Collison wrote: > > And to think that people thought that keeping "lambda", but changing > > the name, would avoid all the heated discussion... :-) > > Note that I'm not participating in any attempts to "improve" lambda. Then, is there any chance anonymous function - or closure - is supported in python 3.0 ? Or at least have a discussion about it? (IMHO, closure is very handy for function like map, sort etc. And having to write a function for multiple statement is kind of good in that function name explains what it does. However, I sometimes feel that having no name at all is clearer. Also, having to define a function when it'll be used only once seemed inappropriate sometimes.) or is there already discussion about it (and closed)? 
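[Archive editor's note: Jiwon's question below bears on variable capture. Python closures do capture variables from the enclosing scope; what was missing in 2006 was a way to *rebind* them, which later arrived as `nonlocal` (PEP 3104, Python 3.0). A small sketch — function names here are illustrative:]

```python
def make_counter():
    count = 0  # captured by the closure below
    def increment():
        nonlocal count  # rebind the enclosing variable (Python 3 only)
        count += 1
        return count
    return increment

c = make_counter()
print(c(), c(), c())  # 1 2 3

# The pre-nonlocal workaround discussed in threads like this one:
# mutate a container instead of rebinding a name.
def make_counter_old():
    box = [0]
    def increment():
        box[0] += 1
        return box[0]
    return increment
```

Each call to `make_counter()` produces an independent closure with its own captured `count`.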
-Jiwon > > Just about the only improvement I'd like to see is to add parentheses > around the arguments, so you'd write lambda(x, y): x**y instead of > lambda x, y: x**y. > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/seojiwon%40gmail.com > From jcarlson at uci.edu Thu Feb 9 02:39:38 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 08 Feb 2006 17:39:38 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: Message-ID: <20060208173838.1143.JCARLSON@uci.edu> Jiwon Seo wrote: > > On 2/8/06, Guido van Rossum wrote: > > On 2/8/06, Patrick Collison wrote: > > > And to think that people thought that keeping "lambda", but changing > > > the name, would avoid all the heated discussion... :-) > > > > Note that I'm not participating in any attempts to "improve" lambda. > > Then, is there any chance anonymous function - or closure - is > supported in python 3.0 ? Or at least have a discussion about it? > > or is there already discussion about it (and closed)? Closures already exist in Python. >>> def foo(bar): ... return lambda: bar + 1 ... >>> a = foo(5) >>> a() 6 - Josiah From janssen at parc.com Thu Feb 9 02:54:51 2006 From: janssen at parc.com (Bill Janssen) Date: Wed, 8 Feb 2006 17:54:51 PST Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: Your message of "Wed, 08 Feb 2006 09:11:38 PST." <20060208151138.2316709d.pedro.werneck@terra.com.br> Message-ID: <06Feb8.175452pst."58633"@synergy1.parc.xerox.com> Not terrible. I think I may try re-working Medusa to use this.
Bill From python at rcn.com Thu Feb 9 03:21:02 2006 From: python at rcn.com (Raymond Hettinger) Date: Wed, 8 Feb 2006 21:21:02 -0500 Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> Message-ID: <009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1> [Armin Rigo] > Hi Raymond, . . . > This means that _length_cue() is at the moment a special method, in the > sense that Python can invoke it implicitely. Okay, that makes sense. Go ahead and make the swap. > This said, do we vote for __length_hint__ or __length_cue__? :-) I prefer __length_cue__ unless someone has a strong objection. > And does anyone objects about __getitem_hint__ or __getitem_cue__? > Maybe __lookahead_hint__ or __lookahead_cue__? No objections here though I do question the utility of the protocol. It is going to be difficult to find pairs of objects (one providing the lookahead value and the other consuming it) that can make good use of the protocol. Outside of those unique pairings, there is no value at all. Thinking back over all the code I've ever seen, I cannot think of one case where it would have been helpful (except for the ill-fated adventure of trying to make iterators have more informative __repr__ methods). Before putting this in production, it would probably be worthwhile to search for code where it would have been helpful. In the case of __length_cue__, there was an immediate payoff. Value pre-fetching has more utility in an environment where the concept is used everywhere (such as your lightning demo at PyCon last year where you ran iterators forwards/backwards and did tricks with infinite iterators). Outside of such an environment, I think it is going to be use-case challenged.
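[Archive editor's note: the protocol being debated here shipped as `__length_hint__` (formalized much later in PEP 424, and exposed as `operator.length_hint` from Python 3.4). A minimal sketch of the idea in modern Python — the `Countdown` class is an illustrative example, not code from the thread:]

```python
import operator

class Countdown:
    """Iterator that can estimate how many items remain."""
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return self
    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1
    def __length_hint__(self):
        # A hint, not a promise: consumers such as list() may
        # use it to presize their internal buffer.
        return self.n

it = Countdown(5)
print(operator.length_hint(it))  # 5
print(list(it))                  # [5, 4, 3, 2, 1]
```

This is exactly the producer/consumer pairing Raymond describes: the iterator provides the estimate, and a constructor like `list()` is the consumer that profits from it.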
Raymond From brett at python.org Thu Feb 9 03:45:01 2006 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2006 18:45:01 -0800 Subject: [Python-Dev] Help on choosing a PEP to volunteer on it : 308, 328 or 343 In-Reply-To: <20060209000801.GF10226@xs4all.nl> References: <43EA632F.4010600@gmail.com> <20060209000801.GF10226@xs4all.nl> Message-ID: On 2/8/06, Thomas Wouters wrote: > On Wed, Feb 08, 2006 at 01:39:34PM -0800, Brett Cannon wrote: > > On 2/8/06, Joao Macaiba wrote: > > > > 1. For a newbie in the Python core development, what is the best PEP to > > > begin with ? > > > Wild guess? 308, but that still requires changing the grammar and > > editing the AST compiler. 328 will need playing with the import code > > which is known to be hairy. 343 has the same needs as 308, but I bet > > would be more complicated. > > Joao brought up an interesting point on #python on freenode, though... Is > there any documentation regarding the AST code? I started fiddling with it > just to get to know it, adding some weird syntax just for the hell of it, > and I *think* I understand how the AST is supposed to work. I haven't gotten > around to actually coding it, though (just like I haven't gotten around to > PEP 13 ;) so maybe I have it all wrong. A short description of the > principles and design choices would be nice, maybe with a paragraph on how > to add new syntax constructs. How tightly should the AST follow the grammar, > for instance? > There is a Python/compile.txt that was originally started by Jeremy that I subsequently picked up and heavily fleshed out at the last PyCon sprint. It didn't get checked in during the merge because Jeremy was not sure where to put it. But I just checked it in since I realized I can delete it once PEP 339 is updated. It is slightly out of date, though, because of the lack of info on the arena API. 
> (I pointed Joao to the augmented assignment patch for 2.0, which doesn't say > anything about the AST but should be helpful hints in his quest to > understand Python's internals. Lord knows that's how I learned it... By the > time he groks it all, hopefully someone can help him with the AST parts ;) > Probably the best way to read it is to follow how an 'if' statement gets compiled. That's how I picked it up. -Brett From barry at python.org Thu Feb 9 04:15:14 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 08 Feb 2006 22:15:14 -0500 Subject: [Python-Dev] email 3.1 for Python 2.5 using PEP 8 module names Message-ID: <1139454914.10366.15.camel@geddy.wooz.org> I posted a message to the email-sig expressing my desire to change our module naming scheme to conform to PEP 8. This would entail a bump in the email version to 3.1, and would be included in Python 2.5. Of course, the old names would still work, for at least one Python release. All the responses so far have been favorable, and Fred Drake provided a nice hook to allow us to support both the old and new names. Code is now checked into the Python sandbox that implements this. Here's the top of the thread: http://mail.python.org/pipermail/email-sig/2006-February/000254.html I'd like to keep discussion on the email-sig, so please join us there if you care about this one way or the other. -Barry -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060208/77b4e921/attachment.pgp From greg.ewing at canterbury.ac.nz Thu Feb 9 04:24:05 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 09 Feb 2006 16:24:05 +1300 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EA6692.2050502@v.loewis.de> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> Message-ID: <43EAB5D5.6040104@canterbury.ac.nz> Martin v. L?wis wrote: > I withdraw that idea. It appears that the platform SDK doesn't > (any longer?) provide an import library for msvrt.dll, and > Microsoft documents mscvrt as intended only for "system > components". Insofar as it forms a base on which other separately- compiled pieces of code run, it seems to me that Python itself deserves to be classed as a "system component". Although I concede that's probably not quite what Microsoft mean by the term... -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 9 04:27:50 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 09 Feb 2006 16:27:50 +1300 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> Message-ID: <43EAB6B6.3050505@canterbury.ac.nz> Neil Hodgson wrote: > You don't need to drop the CRT, just encapsulate it so there is one > copy controlled by Python that hands out wrapped objects (file > handles, file pointers, memory blocks, others?). 
These wrappers can > only be manipulated through calls back to that owning code that then > calls the CRT. But that won't help when you need to deal with third-party code that knows nothing about Python or its wrapped file objects, and calls the CRT (or one of the myriad extant CRTs, chosen at random:-) directly. I can't see *any* solution to this that works in general. Even if Python itself and all its extensions completely avoid using the CRT, there's still the possibility that two different extensions will use two third-party libraries that were compiled with different CRTs. As far as I can see, Microsoft have created an intractable mess here. Their solution of "compile your whole program with the same CRT" completely misses the possibility that the "whole program" may consist of disparate separately- written and separately-compiled parts, and there may be no single person with the ability and/or legal right to compile and link the whole thing. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 9 04:27:54 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 09 Feb 2006 16:27:54 +1300 Subject: [Python-Dev] _length_cue() In-Reply-To: <20060208235156.GA29514@code0.codespeak.net> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> Message-ID: <43EAB6BA.6040003@canterbury.ac.nz> Armin Rigo wrote: > This said, do we vote for __length_hint__ or __length_cue__? :-) I prefer something containing "hint" rather than "cue" because it more explicitly says what we mean. I feel that __length_hint__ is a bit long, though. We have __len__, not __length__, so maybe it should be __len_hint__ or __lenhint__. 
> And does anyone objects about __getitem_hint__ or __getitem_cue__? I'm having trouble seeing widespread use cases for this. If an object is capable of computing arbitrary items on demand, seems to me it should be implemented as a lazily-evaluated sequence or mapping rather than an iterator. The iterator protocol is currently very simple and well-focused on a single task -- producing things one at a time, in sequence. Let's not clutter it up with too much more cruft. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 9 04:41:10 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 09 Feb 2006 16:41:10 +1300 Subject: [Python-Dev] Let's send lambda to the shearing shed (Re: Let's just *keep* lambda) In-Reply-To: <2mbqxhe65n.fsf@starship.python.net> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <2mbqxhe65n.fsf@starship.python.net> Message-ID: <43EAB9D6.6030507@canterbury.ac.nz> My thought on lambda at the moment is that it's too VERBOSE. If a syntax for anonymous functions is to pull its weight, it needs to be *very* concise. The only time I ever consider writing a function definition in-line is when the body is extremely short, otherwise it's clearer to use a def instead. Given that, I do *not* have the space to waste with 6 or 7 characters of geeky noise-word. So my vote for Py3k is to either 1) Replace lambda args: value with args -> value or something equivalently concise, or 2) Remove lambda entirely. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From seojiwon at gmail.com Thu Feb 9 05:03:31 2006 From: seojiwon at gmail.com (Jiwon Seo) Date: Wed, 8 Feb 2006 20:03:31 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <20060208173838.1143.JCARLSON@uci.edu> References: <20060208173838.1143.JCARLSON@uci.edu> Message-ID: On 2/8/06, Josiah Carlson wrote: > > Jiwon Seo wrote: > > > > On 2/8/06, Guido van Rossum wrote: > > > On 2/8/06, Patrick Collison wrote: > > > > And to think that people thought that keeping "lambda", but changing > > > > the name, would avoid all the heated discussion... :-) > > > > > > Note that I'm not participating in any attempts to "improve" lambda. > > > > Then, is there any chance anonymous function - or closure - is > > supported in python 3.0 ? Or at least have a discussion about it? > > > > or is there already discussion about it (and closed)? > > Closures already exist in Python. > > >>> def foo(bar): > ... return lambda: bar + 1 > ... > >>> a = foo(5) > >>> a() > 6 Not in that we don't have anonymous function (or closure) with multiple statements. Also, current limited closure does not capture programming context - or variables. -Jiwon From smiles at worksmail.net Thu Feb 9 03:07:59 2006 From: smiles at worksmail.net (Smith) Date: Wed, 8 Feb 2006 20:07:59 -0600 Subject: [Python-Dev] [BULK] Python-Dev Digest, Vol 31, Issue 37 References: Message-ID: <008b01c62d30$7f3dbb30$2b2c4fca@csmith> | From: Michael Hudson | Guido van Rossum writes: | || On 2/8/06, Patrick Collison wrote: ||| And to think that people thought that keeping "lambda", but changing ||| the name, would avoid all the heated discussion... :-) || || Note that I'm not participating in any attempts to "improve" lambda. || || Just about the only improvement I'd like to see is to add parentheses || around the arguments, so you'd write lambda(x, y): x**y instead of || lambda x, y: x**y. 
| | That would seem to be a bad idea, as it means something already: | |||| f = lambda (x,y): x + y |||| t = (1,2) |||| f(t) | 3 | | Cheers, | mwh Hey! I didn't know you could do that. I'm happy. My lambdas just grew parenthesis on the arguments: >>> f=lambda(x):x+1 >>> f(2) 3 >>> def go(f,x): ... print f(x) ... >>> go(lambda(x):x+1,1) 2 >>> go(lambda(x,y):x+y,(1,3)) 4 >>> /c From jcarlson at uci.edu Thu Feb 9 05:51:33 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 08 Feb 2006 20:51:33 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <20060208173838.1143.JCARLSON@uci.edu> Message-ID: <20060208204655.1146.JCARLSON@uci.edu> Jiwon Seo wrote: > On 2/8/06, Josiah Carlson wrote: > > Closures already exist in Python. > > > > >>> def foo(bar): > > ... return lambda: bar + 1 > > ... > > >>> a = foo(5) > > >>> a() > > 6 > > Not in that we don't have anonymous function (or closure) with > multiple statements. As already said, lambdas (Python's anonymous functions) are limited to a single expression. If you can't do what you want with a single expression, then it probably SHOULD have a name, so you should use a standard function definition. > Also, current limited closure does not capture > programming context - or variables. You should clarify yourself. According to my experience, you can do anything you want with Python closures, it just may take more work than you are used to. def environment(): env = {} def get_variable(name): return env[name] def set_variable(name, value): env[name] = value def del_variable(name): del env[name] return get_variable, set_variable, del_variable - Josiah From jcarlson at uci.edu Thu Feb 9 06:02:59 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 08 Feb 2006 21:02:59 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <43E85B7D.8080203@lycos.com> References: <43E85B7D.8080203@lycos.com> Message-ID: <20060208205442.1149.JCARLSON@uci.edu> Mark Edgington wrote: > > Martin v. 
Löwis wrote: > > That patch looks wrong. What does it mean to "run in a thread"? > > All code runs in a thread, all the time: sometime, that thread > > is the main thread. > > > > Furthermore, I can't see any presumed thread-unsafety in asynchat. > > Ok, perhaps the notation could be improved, but the idea of the > semaphore in the patch is "Does it run inside of a multithreaded > environment, and could its push() functions be called from a different > thread?" Asyncore is not threadsafe. The reason it is not threadsafe is because there was no effort made to make it threadsafe, because it is not uncommon for the idea of asynchronous sockets to be the antithesis of threaded socket servers. In any case, one must be very careful as (at least in older versions of Python on certain platforms), running sock.send(data) on two threads simultaneously for the same socket was a segfault. I understand that this is what you are trying to avoid, but have you considered just doing... q = Queue.Queue() def push(sock, data): q.put((sock, data)) def mainloop(): ... while not q.empty(): sock, data = q.get() sock.push(data) ... Wow, now we don't have to update the standard library to introduce a false sense of thread-safety into asyncore! - Josiah From greg.ewing at canterbury.ac.nz Thu Feb 9 03:13:25 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 09 Feb 2006 15:13:25 +1300 Subject: [Python-Dev] _length_cue() In-Reply-To: <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> Message-ID: <43EAA545.1060308@canterbury.ac.nz> Raymond Hettinger wrote: > [Andrew Koenig] > >>Might I suggest that at least you consider using "hint" instead of "cue"? > > Personally, I prefer "cue" which my dictionary defines as "a signal, hint, > or suggestion". The alternate definition of "a prompt for some action" > applies equally well. No, it doesn't, because it's in the wrong direction.
The caller isn't prompting the callee to perform an action, it's asking for some information. I agree that "hint" is a more precise name. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Thu Feb 9 06:28:40 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 09 Feb 2006 06:28:40 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EAB6B6.3050505@canterbury.ac.nz> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> Message-ID: <43EAD308.8080406@v.loewis.de> Greg Ewing wrote: > As far as I can see, Microsoft have created an intractable > mess here. Their solution of "compile your whole program > with the same CRT" completely misses the possibility that > the "whole program" may consist of disparate separately- > written and separately-compiled parts, and there may be no > single person with the ability and/or legal right to > compile and link the whole thing. Hence, Microsoft's suggestion is entirely different these days: use .NET, and you won't have these versioning problems anymore. I'm getting off-topic... Regards, Martin From martin at v.loewis.de Thu Feb 9 06:33:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 09 Feb 2006 06:33:01 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> Message-ID: <43EAD40D.30701@v.loewis.de> Jiwon Seo wrote: > Then, is there any chance anonymous function - or closure - is > supported in python 3.0 ? Or at least have a discussion about it?
That discussion appears to be closed (or, not really: everybody can discuss, but it likely won't change anything). > (IMHO, closure is very handy for function like map, sort etc. And > having to write a function for multiple statement is kind of good in > that function name explains what it does. However, I sometimes feel > that having no name at all is clearer. Also, having to define a > function when it'll be used only once seemed inappropriate sometimes.) Hmm. Can you give real-world examples (of existing code) where you needed this? Regards, Martin From oliphant.travis at ieee.org Thu Feb 9 08:18:40 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu, 09 Feb 2006 00:18:40 -0700 Subject: [Python-Dev] Help with Unicode arrays in NumPy In-Reply-To: <87r76epbb1.fsf@tleepslib.sk.tsukuba.ac.jp> References: <43E8FDC4.1010607@v.loewis.de> <87r76epbb1.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: Thank you, Martin and Stephen, for the suggestions and comments. For your information: We decided that all NumPy arrays of unicode strings will use UCS4 for internal representation. When an element of the array is selected, a unicodescalar (which inherits directly from the unicode builtin type but has attributes and methods of arrays) will be returned. On wide builds, the scalar is a perfect match. On narrow builds, surrogate pairs will be used if they are necessary as the data is copied over to the scalar. Best regards, -Travis From nyamatongwe at gmail.com Thu Feb 9 08:29:39 2006 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Thu, 9 Feb 2006 18:29:39 +1100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EACE52.7010205@v.loewis.de> References: <43E92573.6090300@v.loewis.de> <031401c62cbb$61810630$bf03030a@trilan> <43EA3AB2.4020808@v.loewis.de> <50862ebd0602081437j49da385fuc3a31237bbff3725@mail.gmail.com> <43EACE52.7010205@v.loewis.de> Message-ID: <50862ebd0602082329h54f6d3dfx3e5943364034771@mail.gmail.com> Martin v. 
Löwis: > I don't think this would be good enough. I then also need a way to > provide extension authors with an API that looks like the CRT, but > isn't: they cannot realistically change all their code to use the > wrapped objects. In a recent case, somebody tried to passed a FILE* > to a postrgres DLL linked with a different CRT; he shouldn't need > to change the entire postgres code to use the modified API. The postgres example is strange to me as I'd never consider passing a FILE* over a DLL boundary. Maybe this is a Unix/Windows cultural thing due to such practices being more dangerous on Windows. > Also, there is still the redistribution issue: to redistribute > msvcr71.dll, you need to own a MSVC license. People that want to > use py2exe (or some such) are in trouble: they need to distribute > both python25.dll, and msvcr71.dll. They are allowed to distribute > the former, but (formally) not allowed to distribute the latter. Link statically. Neil From nyamatongwe at gmail.com Thu Feb 9 08:29:59 2006 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Thu, 9 Feb 2006 18:29:59 +1100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EAB6B6.3050505@canterbury.ac.nz> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> Message-ID: <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> Greg Ewing: > But that won't help when you need to deal with third-party > code that knows nothing about Python or its wrapped file > objects, and calls the CRT (or one of the myriad extant > CRTs, chosen at random:-) directly. Can you explain exactly why there is a problem here? It's fairly normal under Windows to build applications that provide a generic plugin interface (think Netscape plugins or COM) that allow the plugins to be built with any compiler and runtime.
Neil From oliphant.travis at ieee.org Thu Feb 9 09:00:22 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu, 09 Feb 2006 01:00:22 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation Message-ID: <43EAF696.5070101@ieee.org> Guido seemed accepting to this idea about 9 months ago when I spoke to him. I finally got around to writing up the PEP. I'd really like to get this into Python 2.5 if possible. -Travis -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: PEP_index.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20060209/8116776b/attachment-0001.txt From smiles at worksmail.net Thu Feb 9 09:41:17 2006 From: smiles at worksmail.net (Smith) Date: Thu, 9 Feb 2006 02:41:17 -0600 Subject: [Python-Dev] py3k and not equal; re names Message-ID: <034901c62d54$cf30c320$2b2c4fca@csmith> I'm wondering if it's just "foolish consistency" (to quote a PEP 8) that is calling for the dropping of <> in preference of only !=. I've used the former since the beginning in everything from basic, fortran, claris works, excel, gnumeric, and python. I tried to find a rationale for the dropping--perhaps there is some other object that will be represented (like an empty set). I'm sure there must be some reason, but just want to put a vote in for keeping this variety. And another suggestion for py3k would be to increase the correspondence between string methods and re methods. e.g. since re.match and string.startswith are checking for the same thing, was there a reason to introduce the new names? The same question is asked for string.find and re.search. Instead of having to learn another set of method names to use re, it would be nice to have the only change be the pattern used for the method. 
Here is a side-by-side listing of methods in both modules that are candidates for consistency--hopefully not "foolish" ;-)

string       re
------       ------
find         search
startswith   match
split        split
replace      sub
NA           subn
NA           findall
NA           finditer

/c From eric.nieuwland at xs4all.nl Thu Feb 9 09:46:12 2006 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Thu, 9 Feb 2006 09:46:12 +0100 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43EAF696.5070101@ieee.org> References: <43EAF696.5070101@ieee.org> Message-ID: Travis Oliphant wrote: > PEP: ### > Title: Allowing any object to be used for slicing > [...] > Rationale > > Currently integers and long integers play a special role in slice > notation in that they are the only objects allowed in slice > syntax. In other words, if X is an object implementing the sequence > protocol, then X[obj1:obj2] is only valid if obj1 and obj2 are both > integers or long integers. There is no way for obj1 and obj2 to > tell Python that they could be reasonably used as indexes into a > sequence. This is an unnecessary limitation. > [...]
>> Rationale >> >> Currently integers and long integers play a special role in slice >> notation in that they are the only objects allowed in slice >> syntax. In other words, if X is an object implementing the sequence >> protocol, then X[obj1:obj2] is only valid if obj1 and obj2 are both >> integers or long integers. There is no way for obj1 and obj2 to >> tell Python that they could be reasonably used as indexes into a >> sequence. This is an unnecessary limitation. >> [...] > > I like the general idea from an academic point of view. > Just one question: could you explain what I should expect from x[ > slicer('spam') : slicer('eggs') ] when slicer implements this > protocol? > Specifically, I'd like to know how you want to define the interval > between two objects. Or is that for the sliced/indexed object to > decide? As I understand it: The sliced object will only see integers. The PEP wants to give arbitrary objects the possibility of pretending to be an integer that can be used for indexing. Georg From oliphant.travis at ieee.org Thu Feb 9 10:08:36 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu, 09 Feb 2006 02:08:36 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: <43EB0694.6050506@ieee.org> Eric Nieuwland wrote: > Travis Oliphant wrote: > >> PEP: ### >> Title: Allowing any object to be used for slicing >> [...] >> Rationale >> >> Currently integers and long integers play a special role in slice >> notation in that they are the only objects allowed in slice >> syntax. In other words, if X is an object implementing the sequence >> protocol, then X[obj1:obj2] is only valid if obj1 and obj2 are both >> integers or long integers. There is no way for obj1 and obj2 to >> tell Python that they could be reasonably used as indexes into a >> sequence. This is an unnecessary limitation. >> [...]
> > > I like the general idea from an academic point of view. > Just one question: could you explain what I should expect from x[ > slicer('spam') : slicer('eggs') ] when slicer implements this protocol? > Specifically, I'd like to know how you want to define the interval > between two objects. Or is that for the sliced/indexed object to decide? I'm not proposing to define that. The sequence protocol already provides the object with only a c-integer (currently it's int but there's a PEP to change it to ssize_t). Right now, only Python integers and Python long integers are allowed to be converted to this c-integer passed to the object that is implementing the slicing protocol. It's up to the object to deal with those integers as it sees fit. One possible complaint that is easily addressed is that the slot should really go into the PyNumber_Methods as nb_index because a number-like object is what would typically be easily convertible to a c-integer. Having to implement the sequence protocol (on the C-level) just to enable sq_index seems inappropriate. So, I would change the PEP to implement nb_index instead... -Travis From rhamph at gmail.com Thu Feb 9 10:47:40 2006 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 9 Feb 2006 02:47:40 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43EAF696.5070101@ieee.org> References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Travis Oliphant wrote: > > Guido seemed accepting of this idea about 9 months ago when I spoke to > him. I finally got around to writing up the PEP. I'd really like to > get this into Python 2.5 if possible. -1 I've detailed my reasons here: http://mail.python.org/pipermail/python-dev/2006-January/059851.html In short, there are purely math usages that want to convert to int while raising exceptions from inexact results.
The name __index__ seems inappropriate for this, and I feel it would be cleaner to fix float.__int__ to raise exceptions from inexact results (after a suitable warning period and with a trunc() function added to math.) -- Adam Olsen, aka Rhamphoryncus From seojiwon at gmail.com Thu Feb 9 10:51:58 2006 From: seojiwon at gmail.com (Jiwon Seo) Date: Thu, 9 Feb 2006 01:51:58 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43EAD40D.30701@v.loewis.de> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> Message-ID: On 2/8/06, "Martin v. Löwis" wrote: > Jiwon Seo wrote: > > Then, is there any chance anonymous function - or closure - is > > supported in python 3.0 ? Or at least have a discussion about it? > > That discussion appears to be closed (or, not really: everybody > can discuss, but it likely won't change anything). > > > (IMHO, closure is very handy for function like map, sort etc. And > > having to write a function for multiple statement is kind of good in > > that function name explains what it does. However, I sometimes feel > > that having no name at all is clearer. Also, having to define a > > function when it'll be used only once seemed inappropriate sometimes.) > > Hmm. Can you give real-world examples (of existing code) where you > needed this? Apparently, the simplest example is, collection.visit(lambda x: print x) which currently is not possible. Another example is, map(lambda x: if odd(x): return 1 else: return 0, listOfNumbers) (however, with the new if/else expression, that's not so much a problem any more.) Also, anything with exception handling code can't be written without an explicit function definition. collection.visit(lambda x: try: foo(x); except SomeError: error("error message")) Anyway, I was just curious whether anyone is interested in having a more closure-like closure in python 3.0 - in any form, not necessarily an extension of lambda.
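(For what it's worth, the exception-handling case can be spelled today with a small named helper -- foo, SomeError and error below are self-contained stand-ins for the hypothetical names in the example:)

```python
class SomeError(Exception):
    pass

def foo(x):
    # stand-in for the real per-item operation
    if x < 0:
        raise SomeError(x)
    return x * 2

def error(message):
    # stand-in error handler; returns a sentinel instead of raising
    print(message)
    return None

# the statement-carrying lambda above, written as a named function
def safe_foo(x):
    try:
        return foo(x)
    except SomeError:
        return error("error message")

print([safe_foo(x) for x in (1, -1, 3)])  # [2, None, 6]
```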
-Jiwon > > Regards, > Martin > From abo at minkirri.apana.org.au Thu Feb 9 11:32:56 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu, 09 Feb 2006 10:32:56 +0000 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <20060208141442.GA322@divmod.com> References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> <43E9984F.9020502@v.loewis.de> <1139405006.21021.40.camel@warna.dub.corp.google.com> <20060208141442.GA322@divmod.com> Message-ID: <1139481176.8106.22.camel@warna.dub.corp.google.com> On Wed, 2006-02-08 at 15:14 +0100, Valentino Volonghi aka Dialtone wrote: > On Wed, Feb 08, 2006 at 01:23:26PM +0000, Donovan Baarda wrote: > > I believe that Twisted does pretty much this with its "deferred" stuff. > > It shoves slow stuff off for processing in a separate thread that > > re-syncs with the event loop when it's finished. > > Deferreds are only an elaborate way to deal with a bunch of callbacks. > It's Twisted itself that provides a way to run something in a separate thread > and then fire a deferred (from the main thread) when the child thread > finishes (reactor.callInThread() to call stuff in a different thread, [...] I know they are more than just a way to run slow stuff in threads, but once you have them, simple as they are, they present an obvious solution to all sorts of things, including long computations in a thread. Note that once zope2 took the approach it did, blocking the async-loop didn't hurt so badly, so lots of zope add-ons just did it gratuitously. In many cases the slow event handlers were slow because they were waiting on IO that could in theory be serviced as yet another event handler in the async-loop. However, the Zope/Medusa async framework had become so scary hardly anyone knew how to do this without breaking Zope itself.
> > In the case of Zope/ZEO I'm not entirely sure but I think what happened > > was medusa (asyncore/asynchat based stuff Zope2 was based on) didn't > > have this deferred handler support. When they found some of the stuff > > Here I think you meant that medusa didn't handle computation in separate > threads instead. No, I pretty much meant what I said :-) Medusa didn't have any concept of a deferred, hence the idea of using one to collect the results of a long computation in another thread never occurred to them... remember the highly refactored OO beauty that is twisted was not even a twinkle in anyone's eye yet. In theory it would be just as easy to add twisted style deferToThread to Medusa, and IMHO it is a much better approach. Unfortunately at the time they went the other way and implemented multiple async-loops in separate threads. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From fredrik at pythonware.com Thu Feb 9 13:12:29 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 9 Feb 2006 13:12:29 +0100 Subject: [Python-Dev] threadsafe patch for asynchat References: <43E83E5D.3010000@v.loewis.de><20060207153758.1116.JCARLSON@uci.edu><1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com><43E9984F.9020502@v.loewis.de> <1139405006.21021.40.camel@warna.dub.corp.google.com><20060208141442.GA322@divmod.com> <1139481176.8106.22.camel@warna.dub.corp.google.com> Message-ID: Donovan Baarda wrote: >> Here I think you meant that medusa didn't handle computation in separate >> threads instead. > > No, I pretty much meant what I said :-) > > Medusa didn't have any concept of a deferred, hence the idea of using > one to collect the results of a long computation in another thread never > occurred to them... remember the highly refactored OO beauty that is > twisted was not even a twinkle in anyone's eye yet. > > In theory it would be just as easy to add twisted style deferToThread to > Medusa, and IMHO it is a much better approach. 
> Unfortunately at the time > they went the other way and implemented multiple async-loops in separate > threads. that doesn't mean that everyone using Medusa has done things in the wrong way, of course ;-) From barry at python.org Thu Feb 9 13:39:06 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 9 Feb 2006 07:39:06 -0500 Subject: [Python-Dev] py3k and not equal; re names In-Reply-To: <034901c62d54$cf30c320$2b2c4fca@csmith> References: <034901c62d54$cf30c320$2b2c4fca@csmith> Message-ID: <181F2BE2-8D38-4FC4-A69B-5F6F69C24FF7@python.org> On Feb 9, 2006, at 3:41 AM, Smith wrote: > I'm wondering if it's just "foolish consistency" (to quote PEP 8) > that is calling for the dropping of <> in preference to only !=. > I've used the former since the beginning in everything from basic, > fortran, claris works, excel, gnumeric, and python. I tried to find > a rationale for the dropping--perhaps there is some other object > that will be represented (like an empty set). I'm sure there must > be some reason, but just want to put a vote in for keeping this > variety. I've long advocated for keeping <> as I find it much more visually distinctive when reading code.
-Barry From p.f.moore at gmail.com Thu Feb 9 13:53:10 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 9 Feb 2006 12:53:10 +0000 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> Message-ID: <79990c6b0602090453m2a6766fcjffa16c488774a01f@mail.gmail.com> On 2/9/06, Neil Hodgson wrote: > Greg Ewing: > > > But that won't help when you need to deal with third-party > > code that knows nothing about Python or its wrapped file > > objects, and calls the CRT (or one of the myriad extant > > CRTs, chosen at random:-) directly. > > Can you explain exactly why there is a problem here? It's fairly > normal under Windows to build applications that provide a generic > plugin interface (think Netscape plugins or COM) that allow the > plugins to be built with any compiler and runtime. This has all been thrashed out before, but the issue is passing CRT-allocated objects across DLL boundaries. If you open a file (getting a FILE*) in one DLL, using one CRT, and pass it to a second DLL, linked with a different CRT, the FILE* is not valid in that second CRT, and operations on it will fail. At first glance, this is a minor issue - passing FILE* pointers across DLL boundaries isn't something I'd normally expect people to do - but look further and you find you're opening a real can of worms. For example, Python has public APIs which take FILE* parameters. Further, memory allocation is CRT-managed - allocate memory with one CRT's malloc, and deallocate it elsewhere, and you have issues. So *any* pointer could be CRT-managed, to some extent. Etc, etc...
As a counterexample, however, I've heard reports that you can do a binary edit of the DLLs in the Subversion Python bindings, to change references to python23.dll to python24.dll, and everything still works. Make of that what you will... Also, there are intractable cases, like mod_python. Apache is still built with MSVC6, where Python is built with MSVC7.1. And so, mod_python, as a bridge, has *no* CRT that is "officially" OK. And yet, it works. I don't know how, maybe the mod_python developers could comment. Anyway, that's the brief summary. Paul. From thomas at xs4all.net Thu Feb 9 14:49:57 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 9 Feb 2006 14:49:57 +0100 Subject: [Python-Dev] py3k and not equal; re names In-Reply-To: <181F2BE2-8D38-4FC4-A69B-5F6F69C24FF7@python.org> References: <034901c62d54$cf30c320$2b2c4fca@csmith> <181F2BE2-8D38-4FC4-A69B-5F6F69C24FF7@python.org> Message-ID: <20060209134957.GG10226@xs4all.nl> On Thu, Feb 09, 2006 at 07:39:06AM -0500, Barry Warsaw wrote: > I've long advocated for keeping <> as I find it much more visually > distinctive when reading code. +1. And, two years ago, in his PyCon keynote, Guido forgot to say <> was going away, so I think Barry and I are completely in our rights to demand it'd stay. <0.5 wink>-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip at pobox.com Thu Feb 9 15:38:58 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 9 Feb 2006 08:38:58 -0600 Subject: [Python-Dev] _length_cue() In-Reply-To: <43EAA545.1060308@canterbury.ac.nz> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <43EAA545.1060308@canterbury.ac.nz> Message-ID: <17387.21506.776343.95040@montanaro.dyndns.org> >> [Andrew Koenig] >> >>> Might I suggest that at least you consider using "hint" instead of "cue"? ... Greg> I agree that "hint" is a more precise name. Ditto. 
In addition, we already have queues. Do we really need to use a homonym that means something entirely different? (Hint: consider the added difficulty for non-native English speakers). Skip From abo at minkirri.apana.org.au Thu Feb 9 15:39:15 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu, 09 Feb 2006 14:39:15 +0000 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: References: <43E83E5D.3010000@v.loewis.de> <20060207153758.1116.JCARLSON@uci.edu> <1f7befae0602072115g4eb4c147h3be20a372af17c7e@mail.gmail.com> <43E9984F.9020502@v.loewis.de> <1139405006.21021.40.camel@warna.dub.corp.google.com> <20060208141442.GA322@divmod.com> <1139481176.8106.22.camel@warna.dub.corp.google.com> Message-ID: <1139495955.8106.70.camel@warna.dub.corp.google.com> On Thu, 2006-02-09 at 13:12 +0100, Fredrik Lundh wrote: > Donovan Baarda wrote: > > >> Here I think you meant that medusa didn't handle computation in separate > >> threads instead. > > > > No, I pretty much meant what I said :-) > > > > Medusa didn't have any concept of a deferred, hence the idea of using > > one to collect the results of a long computation in another thread never > > occurred to them... remember the highly refactored OO beauty that is > > twisted was not even a twinkle in anyone's eye yet. > > > > In theory it would be just as easy to add twisted style deferToThread to > > Medusa, and IMHO it is a much better approach. Unfortunately at the time > > they went the other way and implemented multiple async-loops in separate > > threads. > > that doesn't mean that everyone using Medusa has done things in the wrong > way, of course ;-) Of course... and even Zope2 was not necessarily the "wrong way"... it was a perfectly valid design decision, given that it was all new ground at the time. And it works really well... there were many consequences of that design that probably contributed to the robustness of other Zope components like ZODB... 
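(The deferToThread idea is easy to sketch -- here with a rough modern-stdlib analogue rather than the Medusa or Twisted API; note that a Future callback fires in the worker thread, where a Twisted deferred would re-sync with the main loop:)

```python
from concurrent.futures import ThreadPoolExecutor

def slow_computation(n):
    # stands in for the long-running handler that would otherwise
    # block the async event loop
    return sum(range(n))

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(slow_computation, 1000)  # hand the work to a thread
# the callback "fires" with the result once the worker finishes
future.add_done_callback(lambda f: print("result:", f.result()))
pool.shutdown(wait=True)
```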
-- Donovan Baarda http://minkirri.apana.org.au/~abo/ From skip at pobox.com Thu Feb 9 15:52:19 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 9 Feb 2006 08:52:19 -0600 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> Message-ID: <17387.22307.395057.121770@montanaro.dyndns.org> >> Hmm. Can you give real-world examples (of existing code) where you >> needed this? Jiwon> Apparently, simplest example is, Jiwon> collection.visit(lambda x: print x) Sure, but as several other people have indicated, statements are not expressions in Python as they are in C (or in Lisp, which doesn't have statements). You can't do this either: if print x: print 5 because "print x" is a statement, while the if statement only accepts expressions. Lambdas are expressions. Statements can't be embedded in expressions. That statements and expressions are different is a core feature of the language. That is almost certainly not going to change. Skip From oliphant.travis at ieee.org Thu Feb 9 16:23:05 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Thu, 09 Feb 2006 08:23:05 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: Adam Olsen wrote: > On 2/9/06, Travis Oliphant wrote: > >>Guido seemed accepting of this idea about 9 months ago when I spoke to >>him. I finally got around to writing up the PEP. I'd really like to >>get this into Python 2.5 if possible. > > > -1 > > I've detailed my reasons here: > http://mail.python.org/pipermail/python-dev/2006-January/059851.html > > In short, there are purely math usages that want to convert to int > while raising exceptions from inexact results.
> The name __index__ seems inappropriate for this, and I feel it would be cleaner to fix > float.__int__ to raise exceptions from inexact results (after a > suitable warning period and with a trunc() function added to math.) > I'm a little confused. Is your opposition solely due to the fact that you think float's __int__ method ought to raise exceptions and the apply_slice code should therefore use the __int__ slot? In theory I can understand this reasoning. In practice, however, the __int__ slot has been used for "coercion" and changing the semantics of int(3.2) at this stage seems like a recipe for lots of code breakage. I don't think something like that is possible until Python 3k. If that is not your opposition, please be more clear. Regardless of how it is done, it seems rather unPythonic to only allow 2 special types to be used in apply_slice and assign_slice. -Travis From jeremy at alum.mit.edu Thu Feb 9 16:29:36 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 9 Feb 2006 10:29:36 -0500 Subject: [Python-Dev] _length_cue() In-Reply-To: <17387.21506.776343.95040@montanaro.dyndns.org> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <43EAA545.1060308@canterbury.ac.nz> <17387.21506.776343.95040@montanaro.dyndns.org> Message-ID: Hint seems like the standard terminology in the field. I don't think it makes sense to invent our own terminology without some compelling reason. Jeremy On 2/9/06, skip at pobox.com wrote: > > >> [Andrew Koenig] > >> > >>> Might I suggest that at least you consider using "hint" instead of "cue"? > ... > > Greg> I agree that "hint" is a more precise name. > > Ditto. In addition, we already have queues. Do we really need to use a > homonym that means something entirely different? (Hint: consider the added > difficulty for non-native English speakers).
> > Skip > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From guido at python.org Thu Feb 9 16:43:28 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 07:43:28 -0800 Subject: [Python-Dev] _length_cue() In-Reply-To: <17387.21506.776343.95040@montanaro.dyndns.org> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <43EAA545.1060308@canterbury.ac.nz> <17387.21506.776343.95040@montanaro.dyndns.org> Message-ID: On 2/9/06, skip at pobox.com wrote: > Greg> I agree that "hint" is a more precise name. > > Ditto. In addition, we already have queues. Do we really need to use a > homonym that means something entirely different? (Hint: consider the added > difficulty for non-native English speakers). Right. As a non-native speaker I can confirm that for English learners, "cue" is a bit mysterious at first while "hint" is obvious. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Thu Feb 9 16:39:40 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 9 Feb 2006 10:39:40 -0500 Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net><00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1><43EAA545.1060308@canterbury.ac.nz> <17387.21506.776343.95040@montanaro.dyndns.org> Message-ID: wrote in message news:17387.21506.776343.95040 at montanaro.dyndns.org... > >>> Might I suggest that at least you consider using "hint" instead of > "cue"? > ... > > Greg> I agree that "hint" is a more precise name. > > Ditto. In addition, we already have queues. Do we really need to use a > homonym that means something entirely different? (Hint: consider the > added > difficulty for non-native English speakers). 
Even as a native English speaker, but without knowing the intended meaning, I did not understand or guess that length_cue meant length_hint. The primary meaning of cue is 'signal to begin some action', with 'hint, suggestion' being a secondary meaning. Even then, I would take it as referring to possible action rather than possible information. Cue is also short for queue, leading to cue stick (looks like a pigtail, long and tapering) and cue ball. From skip at pobox.com Thu Feb 9 16:54:59 2006 From: skip at pobox.com (skip at pobox.com) Date: Thu, 9 Feb 2006 09:54:59 -0600 Subject: [Python-Dev] _length_cue() In-Reply-To: References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <43EAA545.1060308@canterbury.ac.nz> <17387.21506.776343.95040@montanaro.dyndns.org> Message-ID: <17387.26067.994514.746486@montanaro.dyndns.org> >> Ditto. In addition, we already have queues. Do we really need to >> use a homonym that means something entirely different? (Hint: >> consider the added difficulty for non-native English speakers). Guido> Right. As a non-native speaker I can confirm that for English Guido> learners, "cue" is a bit mysterious at first while "hint" is Guido> obvious. Guido, you're hardly your typical non-native speaker. I think your English may be better than mine. ;-) At any rate, I was thinking of some of the posts I see on c.l.py where it requires a fair amount of detective work just to figure out what the poster has written, what with all the incorrect grammar and wild misspellings. For that sort of person I can believe that "cue", "queue" and "kew" might mean exactly the same thing... 
Skip From jack at performancedrivers.com Thu Feb 9 17:21:49 2006 From: jack at performancedrivers.com (Jack Diederich) Date: Thu, 9 Feb 2006 11:21:49 -0500 Subject: [Python-Dev] _length_cue() In-Reply-To: <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> Message-ID: <20060209162149.GC5942@performancedrivers.com> [Raymond Hettinger] > [Armin Rigo] > > BTW the reason I'm looking at this is that I'm considering adding > > another undocumented internal-use-only method, maybe __getitem_cue__(), > > that would try to guess what the nth item to be returned will be. This > > would allow the repr of some iterators to display more helpful > > information when playing around with them at the prompt, e.g.: > > > >>>> enumerate([3.1, 3.14, 3.141, 3.1415, 3.14159, 3.141596]) > > > > At one point, I explored and then abandoned this idea. For objects like > itertools.count(n), it worked fine -- the state was readily knowable and the > eval(repr(obj)) round-trip was possible. However, for tools like > enumerate(), it didn't make sense to have a preview that only applied in a > tiny handful of (mostly academic) cases and was not evaluable in any case. > That is my experience too. Even for knowable sequences people consume it in series and not just one element. My permutation module supports pulling out just the Nth canonical permutation. Lots of people have used the module and no one uses that feature. 
>>> import probstat >>> p = probstat.Permutation(range(4)) >>> p[0] [0, 1, 2, 3] >>> len(p) 24 >>> p[23] [3, 2, 1, 0] >>> -jackdied From martin at v.loewis.de Thu Feb 9 17:34:57 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 09 Feb 2006 17:34:57 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <50862ebd0602082329h54f6d3dfx3e5943364034771@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <031401c62cbb$61810630$bf03030a@trilan> <43EA3AB2.4020808@v.loewis.de> <50862ebd0602081437j49da385fuc3a31237bbff3725@mail.gmail.com> <43EACE52.7010205@v.loewis.de> <50862ebd0602082329h54f6d3dfx3e5943364034771@mail.gmail.com> Message-ID: <43EB6F31.8050804@v.loewis.de> Neil Hodgson wrote: > The postgres example is strange to me as I'd never consider passing > a FILE* over a DLL boundary. Maybe this is a Unix/Windows cultural > thing due to such practices being more dangerous on Windows. In the specific example, Postgres has a PQprint function that can print a query result to a file; the file was sys.stdout. >>Also, there is still the redistribution issue: to redistribute >>msvcr71.dll, you need to own a MSVC license. People that want to >>use py2exe (or some such) are in trouble: they need to distribute >>both python25.dll, and msvcr71.dll. They are allowed to distribute >>the former, but (formally) not allowed to distribute the latter. > > > Link statically. Not sure whether this was a serious suggestion. If pythonxy.dll was statically linked, you would get all the CRT duplication already in extension modules. Given that there are APIs in Python where you have to do malloc/free across the python.dll boundary, you get memory leaks.
Regards, Martin From martin at v.loewis.de Thu Feb 9 17:39:31 2006 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 09 Feb 2006 17:39:31 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> Message-ID: <43EB7043.50103@v.loewis.de> Jiwon Seo wrote: > Apparently, simplest example is, > > collection.visit(lambda x: print x) Ok. I remotely recall Guido suggesting that print should become a function. It's not a specific example though: what precise library provides the visit method? > which currently is not possible. Another example is, > > map(lambda x: if odd(x): return 1 > else: return 0, > listOfNumbers) Hmm.
Its fairly > normal under Windows to build applications that provide a generic > plugin interface (think Netscape plugins or COM) that allow the > plugins to be built with any compiler and runtime. COM really solves all problems people might have on Windows. Alas, it is not a cross-platform API. Standard C is cross-platform, so Python uses it in its own APIs. Regards, Martin From brett at python.org Thu Feb 9 18:42:44 2006 From: brett at python.org (Brett Cannon) Date: Thu, 9 Feb 2006 09:42:44 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43EAF696.5070101@ieee.org> References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Travis Oliphant wrote: > > Guido seemed accepting to this idea about 9 months ago when I spoke to > him. I finally got around to writing up the PEP. I'd really like to > get this into Python 2.5 if possible. > > -Travis > > > > > PEP: ### > Title: Allowing any object to be used for slicing Overally I am fine with the idea. Being used as an index is different than coercion into an int so adding this extra method seems reasonable. > Implementation Plan > > 1) Add the slots > > 2) Change the ISINT macro in ceval.c to accomodate objects with the > index slot defined. > Maybe the macro should also be renamed? Not exactly testing if something is an int anymore if it checks for __index__. > 3) Change the _PyEval_SliceIndex function to accomodate objects > with the index slot defined. 
> -Brett From raymond.hettinger at verizon.net Thu Feb 9 19:13:37 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 09 Feb 2006 13:13:37 -0500 Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net><00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1><43EAA545.1060308@canterbury.ac.nz><17387.21506.776343.95040@montanaro.dyndns.org> Message-ID: <003101c62da4$8d0a3260$b83efea9@RaymondLaptop1> > Hint seems like the standard terminology in the field. I don't think > it makes sense to invent our own terminology without some compelling > reason. Okay, I give, hint wins. Raymond From bokr at oz.net Thu Feb 9 19:24:43 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 09 Feb 2006 18:24:43 GMT Subject: [Python-Dev] Let's just *keep* lambda References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> Message-ID: <43eb771a.357018775@news.gmane.org> On Thu, 09 Feb 2006 17:39:31 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >Jiwon Seo wrote: >> Apparently, simplest example is, >> >> collection.visit(lambda x: print x) > >Ok. I remotely recall Guido suggesting that print should become >a function. > Even so, that one is so trivial to define (other than the >> part): >>> import sys >>> def printfun(*args): sys.stdout.write(' '.join(map(str,args))+'\n') ... >>> lamb = lambda x: printfun(x) >>> >>> lamb(123) 123 >>> printfun('How', 'about', 'that?') How about that? Also the quasi-C variant: >>> def printf(fmt, *args): sys.stdout.write(fmt%args) ... >>> (lambda x: printf('How about this: %s', x))('-- also a function\n(no \\n here ;-) ') How about this: -- also a function (no \n here ;-) >>> >It's not a specific example though: what precise library provides >the visit method? > >> which currently is not possible. Another example is, >> >> map(lambda x: if odd(x): return 1 >> else: return 0, >> listOfNumbers) > >Hmm. 
What's wrong with > >map(odd, listOfNumbers) > >or, if you really need ints: > >map(lambda x:int(odd(x)), listOfNumbers) > >> Also, anything with exception handling code can't be without explicit >> function definition. >> >> collection.visit(lambda x: try: foo(x); except SomeError: error("error >> message")) > >That's not a specific example. > >>> (lambda : """ ... I will say that the multi-line part ... of the argument against lambda suites ... is bogus, though ;-) ... """)( ... ).splitlines( ... )[-1].split()[1].capitalize( ... ).rstrip(',')+'! (though this is ridiculous ;-)' 'Bogus! (though this is ridiculous ;-)' And, as you know, you can abuse the heck out of lambda (obviously this is ridiculous**2 avoidance of external def) >>> lamb = lambda x: eval(compile("""if 1: ... def f(x): ... try: return 'zero one two three'.split()[x] ... except Exception,e:return 'No name for %r -- %s:%s'%(x,e.__class__.__name__, e) ... """,'','exec')) or locals()['f'](x) >>> lamb(2) 'two' >>> lamb(0) 'zero' >>> lamb(4) 'No name for 4 -- IndexError:list index out of range' >>> lamb('x') "No name for 'x' -- TypeError:list indices must be integers" But would e.g. [1] collection.visit(lambda x:: # double ':' to signify suite start try: return 'zero one two three'.split()[x] except Exception,e:return 'No name for %r -- %s:%s'%(x,e.__class__.__name__, e) ) be so bad an "improvement"? Search your heart for the purest answer ;-) (requires enclosing parens, and suite ends on closing ')' and if multiline, the first line after the :: defines the indent-one left edge, and explicit return of value required after ::). 
[1] (using the function body above just as example, not saying it makes sense for collection.visit) Regards, Bengt Richter From guido at python.org Thu Feb 9 19:33:10 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 10:33:10 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43EB7043.50103@v.loewis.de> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> Message-ID: Enough already. As has clearly been proven, lambda is already perfect. *** To those folks attempting to propose alternate syntax (e.g. x -> y): this is the wrong thread for that (see subject). Seriously, I've seen lots of proposals that just change the syntax, and none of them are so much better than what we have. My comments on some recent proposals: - for Smells too much like a loop. And what if there are no formals? Also the generalization from a generator without the "in " part is wrong; "f(x) for x in S" binds x, while the proposed "f(x) for x" has x as a free variable. Very odd. - -> The -> symbol is much easier to miss. Also it means something completely different in other languages. And it has some problems with multiple formals: (x, y -> x+y) isn't very clear on the binding -- since '->' is an uncommon operator, there's no strong intuition about whether ',' or '->' binds stronger. (x, y) -> x+y would make more sense, but has an ambiguity as long as we want to allow argument tuples (which I've wanted to take out, but that is also getting a lot of opposition). - lambda(): This was my own minimal proposal. I withdraw it -- I agree with the criticism that it looks too much like a function call. - Use a different keyword instead of lambda What is that going to solve? - If there were other proposals, I missed them, or they were too far out to left field to be taken seriously.
*** To those people complaining that Python's lambda misleads people into thinking that it is the same as Lisp's lambda: you better get used to it. Python has a long tradition of borrowing notations from other languages and changing the "deep" meaning -- for example, Python's assignment operator does something completely different from the same operator in C or C++. *** To those people who believe that lambda is required in some situations because it behaves differently with respect to the surrounding scope than def: it doesn't, and it never did. This is (still!) a surprisingly common myth. I have no idea where it comes from; does this difference exist in some other language that has lambda as well as some other function definition mechanism? *** To those people still complaining that lambda is crippled because it doesn't do statements: First, remember that adding statement capability wouldn't really add any power to the language; lambda is purely syntactic sugar for an anonymous function definition (see above myth debunking section). Second, years of attempts to overcome this haven't come up with a usable syntax (and yes, curly braces have been proposed and rejected like everything else). It's a hard problem because switching back to indentation-based parsing inside an expression is problematic. For example, consider this hypothetical example: a = foo(lambda x, y: print x print y) Should this be considered legal? Or should it be written as a = foo(lambda x, y: print x print y ) ??? (Indenting the prints so they start at a later column than the 'l' of 'lambda', and adding an explicit dedent before the close parenthesis.) Note that if the former were allowed, we'd have additional ambiguity if foo() took two parameters, e.g.: a = foo(lambda x, y: print x print y, 42) -- is 42 the second argument to foo() or is it printed? 
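[Editorial note: Guido's point above — that lambda behaves exactly like def with respect to the surrounding scope — is easy to verify. A small check, written in modern Python 3 rather than the 2.x of this thread:]

```python
def make_getters():
    count = 10

    # A named function closing over the enclosing scope...
    def get_with_def():
        return count

    # ...and a lambda closing over the very same scope.
    get_with_lambda = lambda: count

    count = 99  # rebinding in the enclosing scope is seen by both closures
    return get_with_def, get_with_lambda

f, g = make_getters()
print(f(), g())  # both see the final binding: 99 99
```

Both callables capture the same cell for `count`, so there is no scoping difference at all between the two forms — exactly the myth being debunked.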
I'd much rather avoid this snake's nest by giving the function a name and using existing statement syntax, like this: def callback(x, y): print x print y a = foo(callback) This is unambiguous, easier to parse (for humans as well as for computers), and doesn't actually span more text lines. Since this typically happens in a local scope, the name 'callback' disappears as soon as the scope is exited. BTW I use the same approach regularly for breaking up long expressions; for example instead of writing a = foo(some_call(another_call(some_long_argument, another_argument), and_more(1, 2, 3)), and_still_more()) I'll write x = another_call(some_long_argument, another_argument) a = foo(some_call(x, and_more(1, 2, 3)), and_still_more()) and suddenly my code is more compact and yet easier to read! (In real life, I'd use a more meaningful name than 'x', but since the example is nonsense it's hard to come up with a meaningful name here. :-) Regarding the leakage of temporary variable names in this case: I don't care; this typically happens in a local scope where a compiler could easily enough figure out that a variable is no longer in use. And for clarity we use local variables in this way all the time anyway. *** Parting shot: it appears that we're getting more and more expressionized versions of statements: first list comprehensions, then generator expressions, most recently conditional expressions, in Python 3000 print() will become a function... Seen this way, lambda was just ahead of its time! Perhaps we could add a try/except/finally expression, and allow assignments in expressions, and then we could get rid of statements altogether, turning Python into an expression language. Change the use of parentheses a bit, and... voila, Lisp!
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dialtone at divmod.com Thu Feb 9 19:47:51 2006 From: dialtone at divmod.com (Valentino Volonghi aka Dialtone) Date: Thu, 9 Feb 2006 19:47:51 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43EB7043.50103@v.loewis.de> Message-ID: <20060209184751.20077.352925131.divmod.quotient.1@ohm> On Thu, 09 Feb 2006 17:39:31 +0100, "\"Martin v. Löwis\"" wrote: >It's not a specific example though: what precise library provides >the visit method? I'll provide my own use case right now, which is event-driven programming of any kind (from GUI toolkits to network frameworks/libraries). From my experience these are the kind of use cases that suffer most from having to define functions every time and, worse, to define functions before their actual usage (which is responsible for part of the bad reputation that, for example, deferreds have). Let's consider this piece of code (actual code that works today and uses twisted for convenience): def do_stuff(result): if result == 'Initial Value': d2 = work_on_result_and_return_a_deferred(result) d2.addCallback(println) return d2 return 'No work on result' def println(something): print something d1 = some_operation_that_results_in_a_deferred() d1.addCallback(do_stuff) d1.addCallback(lambda _: reactor.stop()) reactor.run() As is evident, the execution order is almost upside-down, and this is because I have to define a function before using it (instead of defining and using a function inline).
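[Editorial note: for readers unfamiliar with Twisted, the callback-chaining machinery Valentino relies on can be sketched in a few lines. This `MiniDeferred` is an invented toy for illustration only — Twisted's real Deferred also handles errbacks, chained deferreds, delayed firing, and much more:]

```python
class MiniDeferred:
    """A toy stand-in for twisted.internet.defer.Deferred."""

    def __init__(self):
        self.callbacks = []   # queued (function, args, kwargs) tuples
        self.result = None
        self.fired = False

    def add_callback(self, func, *args, **kwargs):
        # Each callback receives the previous callback's return value.
        if self.fired:
            self.result = func(self.result, *args, **kwargs)
        else:
            self.callbacks.append((func, args, kwargs))
        return self

    def callback(self, initial_value):
        # Fire the chain with an initial value; run any queued callbacks.
        self.fired = True
        self.result = initial_value
        while self.callbacks:
            func, args, kwargs = self.callbacks.pop(0)
            self.result = func(self.result, *args, **kwargs)

d = MiniDeferred()
d.add_callback(lambda r: r + 1)
d.add_callback(lambda r: r * 2)
d.callback(20)
print(d.result)  # (20 + 1) * 2 = 42
```

The "upside-down" reading order complained about above comes from exactly this structure: the callbacks must exist (as named functions or lambdas) before the chain fires.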
However Python cannot have a statement inside an expression as has already been said, thus I think some new syntax to support this could be helpful, for example: when some_operation_that_results_in_a_deferred() -> result: if result == 'Initial Value': when work_on_result_and_return_a_deferred(result) -> inner_res: print inner_res else: print "No work on result" reactor.stop() reactor.run() In this case the execution order is correct and indentation helps in identifying which pieces of the execution will be run at a later time (depending on the when block). This way of coding could be useful for many kinds of event-driven frameworks like GUI toolkits that could do the following: when button.clicked() -> event, other_args: when some_dialog() -> result: if result is not None: window.set_title(result) IMHO similar considerations are valid for other libraries/frameworks like asyncore. What would this require? Python should basically support a protocol for a deferred-like object that could be used by a framework to support that syntax. Something like: callback(initial_value) add_callback(callback, *a, **kw) add_errback(callback, *a, **kw) (extra methods if needed) HTH -- Valentino Volonghi aka Dialtone Now Running MacOSX 10.4 Blog: http://vvolonghi.blogspot.com New Pet: http://www.stiq.it From bokr at oz.net Thu Feb 9 20:06:12 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 09 Feb 2006 19:06:12 GMT Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <43EAA545.1060308@canterbury.ac.nz> <17387.21506.776343.95040@montanaro.dyndns.org> <17387.26067.994514.746486@montanaro.dyndns.org> Message-ID: <43eb91dd.363869997@news.gmane.org> On Thu, 9 Feb 2006 09:54:59 -0600, skip at pobox.com wrote: > > >> Ditto. In addition, we already have queues. Do we really need to > >> use a homonym that means something entirely different?
(Hint: > >> consider the added difficulty for non-native English speakers). > > Guido> Right. As a non-native speaker I can confirm that for English > Guido> learners, "cue" is a bit mysterious at first while "hint" is > Guido> obvious. > >Guido, you're hardly your typical non-native speaker. I think your English >may be better than mine. ;-) At any rate, I was thinking of some of the >posts I see on c.l.py where it requires a fair amount of detective work just >to figure out what the poster has written, what with all the incorrect >grammar and wild misspellings. For that sort of person I can believe that >"cue", "queue" and "kew" might mean exactly the same thing... > FWIW, I first thought "cue" might be a typo mutation of "clue" ;-) +1 on something with "hint". Regards, Bengt Richter From guido at python.org Thu Feb 9 20:30:01 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 11:30:01 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43EAF696.5070101@ieee.org> References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Travis Oliphant wrote: > > Guido seemed accepting to this idea about 9 months ago when I spoke to > him. I finally got around to writing up the PEP. I'd really like to > get this into Python 2.5 if possible. Excellent! I was just going over the 2.5 schedule with Neal Norwitz last night, and looking back in my slides for OSCON 2005 I noticed this idea, and was wondering if you still wanted it. I'm glad the answer is yes! BTW do you also still want to turn ZeroDivisionError into a warning (that is changed into an error by default)? That idea shared a slide and I believe it was discussed in the same meeting you & I and some others had in San Mateo last summer. I'll comment on the PEP in-line. I've assigned it number 357 and checked it in. 
In the past, the protocol for acquiring a PEP number has been to ask the PEP coordinators (Barry Warsaw and David Goodger) to assign one. I believe that we could simplify this protocol to avoid necessary involvement of the PEP coordinators; all that is needed is someone with checkin privileges. I propose the following protocol: 1. In the peps directory, do a svn sync. 2. Look at the files that are there and the contents of pep-0000.txt. This should provide you with the last PEP number in sequence, ignoring the out-of-sequence PEPs (666, 754, and 3000). The reason to look in PEP 0 is that it is conceivable that a PEP number has been reserved in the index but not yet committed, so you should use the largest number. 3. Add 1 to the last PEP number. This gives your new PEP number, NNNN. 4. Using svn add and svn commit, check in the file pep-NNNN.txt (use %04d to format the number); the contents can be a minimal summary or even just headers. If this succeeds, you have successfully assigned yourself PEP number NNNN. Exit. 5. If you get an error from svn about the commit, someone else was carrying out the same protocol at the same time, and they won the race. Start over from step 1. I suspect the PEP coordinators have informally been using this protocol amongst themselves -- and amongst the occasional developer who bypassed the "official" protocol, like I've done in the past and like Neal Norwitz did last night with the Python 2.5 release schedule, PEP 356. I'm simply extending the protocol to all developers with checkin permissions. For PEP authors without checkin permissions, nothing changes, except that optionally if they don't get a timely response from the PEP coordinators, they can ask someone else with checkin permissions.
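[Editorial note: steps 4 and 5 of this protocol work because committing a *new* file to svn is atomic — the first committer wins and everyone else gets an error and retries. The same idea can be sketched against a plain filesystem, where `O_CREAT | O_EXCL` plays the role of the commit; `claim_pep_number` is a hypothetical helper invented for this illustration:]

```python
import os
import tempfile

def claim_pep_number(directory, number):
    """Atomically claim pep-NNNN.txt; analogous to the svn commit race:
    whoever creates the file first wins, everyone else gets an error."""
    path = os.path.join(directory, "pep-%04d.txt" % number)
    # O_CREAT | O_EXCL fails if the file already exists -- an atomic
    # test-and-set, so two claimants can never both succeed.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.write(fd, b"PEP: %d\n" % number)
    os.close(fd)
    return path

with tempfile.TemporaryDirectory() as peps_dir:
    claim_pep_number(peps_dir, 357)        # first claimant wins
    try:
        claim_pep_number(peps_dir, 357)    # second claimant loses the race...
    except FileExistsError:
        print("number taken, start over")  # ...and retries with the next number
```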
> PEP: ### > Title: Allowing any object to be used for slicing > Version: $Revision 1.1$ > Last Modified: $Date: 2006/02/09 $ > Author: Travis Oliphant > Status: Draft > Type: Standards Track > Created: 09-Feb-2006 > Python-Version: 2.5 > > Abstract > > This PEP proposes adding an sq_index slot in PySequenceMethods and > an __index__ special method so that arbitrary objects can be used > in slice syntax. > > Rationale > > Currently integers and long integers play a special role in slice > notation in that they are the only objects allowed in slice > syntax. In other words, if X is an object implementing the sequence > protocol, then X[obj1:obj2] is only valid if obj1 and obj2 are both > integers or long integers. There is no way for obj1 and obj2 to > tell Python that they could be reasonably used as indexes into a > sequence. This is an unnecessary limitation. > > In NumPy, for example, there are 8 different integer scalars > corresponding to unsigned and signed integers of 8, 16, 32, and 64 > bits. These type-objects could reasonably be used as indexes into > a sequence if there were some way for their typeobjects to tell > Python what integer value to use. > > Proposal > > Add a sq_index slot to PySequenceMethods, and a corresponding > __index__ special method. Objects could define a function to > place in the sq_index slot that returns a C-integer for use in > PySequence_GetSlice, PySequence_SetSlice, and PySequence_DelSlice. Shouldn't this slot be in the PyNumberMethods extension? It feels more like a property of numbers than of a property of sequences. Also, the slot name should then probably be nb_index. There's also an ambiguity when using simple indexing. When writing x[i] where x is a sequence and i an object that isn't int or long but implements __index__, I think i.__index__() should be used rather than bailing out.
I suspect that you didn't think of this because you've already special-cased this in your code -- when a non-integer is passed, the mapping API is used (mp_subscript). This is done to support extended slicing. The built-in sequences (list, str, unicode, tuple for sure, probably more) that implement mp_subscript should probe for nb_index before giving up. The generic code in PyObject_GetItem should also check for nb_index before giving up. > Implementation Plan > > 1) Add the slots > > 2) Change the ISINT macro in ceval.c to accommodate objects with the > index slot defined. > > 3) Change the _PyEval_SliceIndex function to accommodate objects > with the index slot defined. I think all sequence objects that implement mp_subscript should probably be modified according to the lines I sketched above. > Possible Concerns > > Speed: > > Implementation should not slow down Python because integers and long > integers used as indexes will complete in the same number of > instructions. The only change will be that what used to generate > an error will now be acceptable. > > Why not use nb_int which is already there? > > The nb_int, nb_oct, and nb_hex methods are used for coercion. > Floats have these methods defined and floats should not be used in > slice notation. > > Reference Implementation > > Available on PEP acceptance. This is very close to acceptance. I think I'd like to see the patch developed and submitted to SF (and assigned to me) prior to acceptance. > Copyright > > This document is placed in the public domain If you agree with the above comments, please send me an updated version of the PEP and I'll check it in over the old one, and approve it. Then just use SF to submit the patch etc.
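[Editorial note: this proposal was accepted as PEP 357 for Python 2.5, so the behavior under discussion can be demonstrated in any modern Python. The `UInt8` class below is an invented toy stand-in for a NumPy-style integer scalar, not NumPy's actual implementation:]

```python
import operator

class UInt8:
    """A toy stand-in for a NumPy-style unsigned 8-bit integer scalar."""

    def __init__(self, value):
        self.value = value & 0xFF

    def __index__(self):
        # Tells Python which integer to use for indexing and slicing.
        return self.value

seq = list(range(10))
a, b = UInt8(2), UInt8(5)
print(seq[a:b])       # [2, 3, 4] -- arbitrary objects now work in slice syntax
print(seq[UInt8(7)])  # 7         -- and in plain indexing, per Guido's comment

# operator.index() exposes the same protocol explicitly.
print(operator.index(UInt8(300)))  # 300 & 0xFF == 44
```

Note that floats deliberately do not grow `__index__`, which is exactly the concern raised under "Why not use nb_int": `seq[1.5:3]` still fails.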
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 9 20:31:27 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 11:31:27 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Brett Cannon wrote: > > 2) Change the ISINT macro in ceval.c to accommodate objects with the > > index slot defined. > > Maybe the macro should also be renamed? Not exactly testing if > something is an int anymore if it checks for __index__. Have you looked at the code? ceval.c uses this macro only in the slice processing code. I don't particularly care what it's called... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 9 20:37:48 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 11:37:48 -0800 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EB7134.8000307@v.loewis.de> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <43EB7134.8000307@v.loewis.de> Message-ID: On 2/9/06, "Martin v. Löwis" wrote: > COM really solves all problems people might have on Windows. Taken deliberately out of context, that sounds rather like a claim even Microsoft itself wouldn't make.
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 9 20:42:21 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 11:42:21 -0800 Subject: [Python-Dev] threadsafe patch for asynchat In-Reply-To: <43E85B7D.8080203@lycos.com> References: <43E85B7D.8080203@lycos.com> Message-ID: On 2/7/06, Mark Edgington wrote: > Ok, perhaps the notation could be improved, but the idea of the > semaphore in the patch is "Does it run inside of a multithreaded > environment, and could its push() functions be called from a different > thread?" The long-term fate of asyncore/asynchat aside, instead of wanting to patch asynchat, you should be able to subclass it easily to introduce the functionality you want. Given the disagreement over whether this is a good thing, I suggest that that's a much better way for you to solve your problem than to introduce yet another obscure confusing optional parameter. And you won't have to wait for Python 2.5. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Thu Feb 9 21:28:37 2006 From: brett at python.org (Brett Cannon) Date: Thu, 9 Feb 2006 12:28:37 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Guido van Rossum wrote: > On 2/9/06, Brett Cannon wrote: > > > 2) Change the ISINT macro in ceval.c to accommodate objects with the > > > index slot defined. > > > > Maybe the macro should also be renamed? Not exactly testing if > > something is an int anymore if it checks for __index__. > > Have you looked at the code? ceval.c uses this macro only in the slice > processing code. I don't particularly care what it's called... > Yeah, I looked. I just don't want a misnamed macro to start being abused for some odd reason. Might as well rename it while we are thinking about it rather than let it have a bad name.
-Brett From bokr at oz.net Thu Feb 9 21:53:39 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 09 Feb 2006 20:53:39 GMT Subject: [Python-Dev] Let's send lambda to the shearing shed (Re: Let's just *keep* lambda) References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <2mbqxhe65n.fsf@starship.python.net> <43EAB9D6.6030507@canterbury.ac.nz> Message-ID: <43eb949d.364573719@news.gmane.org> On Thu, 09 Feb 2006 16:41:10 +1300, Greg Ewing wrote: >My thought on lambda at the moment is that it's too VERBOSE. > >If a syntax for anonymous functions is to pull its weight, >it needs to be *very* concise. The only time I ever consider >writing a function definition in-line is when the body is >extremely short, otherwise it's clearer to use a def instead. > >Given that, I do *not* have the space to waste with 6 or 7 >characters of geeky noise-word. OTOH, it does stand out as a flag to indicate what is being done. > >So my vote for Py3k is to either > >1) Replace lambda args: value with > > args -> value > >or something equivalently concise, or > Yet another bike shed color chip: !(args:expr) # <==> lambda args:expr and !(args::suite) # <==> (lambda args::suite) (where the latter lambda form requires outer enclosing parens) But either "::" form allows full def suite, with indentation for multilines having left edge of single indent defined by first line following the "::"-containing line, and explicit returns for values required and top suite ending on closing outer paren) Probable uses for the "::" form would be for short inline suite definitions !(x::print x) # <==> (lambda x::print x) & etc. 
similarly !(::global_counter+=1;return global_counter) !(::raise StopIteration)() # more honest than iter([]).next() but the flexibility would be there for an in-context definition, e.g., sorted(seq, key= !(x:: try: return abs(x) except TypeError: return 0)) and closures could be spelled !(c0,c1:!(x:c0+c1*x))(3,5) # single use with constants is silly spelling of !(x:3+5*x) Hm, are the latter two really better for eliminating "lambda"? Cf: sorted(seq, key=(lambda x:: try:return abs(x) except TypeError: return 0)) and (lambda c0,c1:lambda x:c0+c1*x)(3,5) # also silly with constants I'm not sure. I think I kind of like lambda args:expr and (lambda args::suite) but sometimes super-concise is nice ;-) >2) Remove lambda entirely. > -1 Regards, Bengt Richter From rhamph at gmail.com Thu Feb 9 22:14:43 2006 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 9 Feb 2006 14:14:43 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Travis E. Oliphant wrote: > I'm a little confused. Is your opposition solely due to the fact that > you think float's __int__ method ought to raise exceptions and the > apply_slice code should therefore use the __int__ slot? > > In theory I can understand this reasoning. In practice, however, the > __int__ slot has been used for "coercion" and changing the semantics of > int(3.2) at this stage seems like a recipe for lots of code breakage. I > don't think something like that is possible until Python 3k. > > If that is not your opposition, please be more clear. Regardless of how > it is done, it seems rather unPythonic to only allow 2 special types to > be used in apply_slice and assign_slice. Yes, that is the basis of my opposition, and I do understand it would take a long time to change __int__. What is the recommended practice for python? I can think of three distinct categories of behavior: - float to str.
Some types converted to str might be lossy, but in general it's a very drastic conversion and unrelated to the others - float to Decimal. Raises an exception because it's usually lossy. - Decimal to int. Truncates, quite happily losing precision. I guess my confusion revolves around float to Decimal. Is lossless conversion a good thing in python, or is prohibiting float to Decimal conversion just a fudge to prevent people from initializing a Decimal from a float when they really want a str? -- Adam Olsen, aka Rhamphoryncus From bokr at oz.net Thu Feb 9 22:26:34 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 09 Feb 2006 21:26:34 GMT Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation References: <43EAF696.5070101@ieee.org> Message-ID: <43ebae93.371219906@news.gmane.org> On Thu, 09 Feb 2006 01:00:22 -0700, Travis Oliphant wrote: > >Abstract > > This PEP proposes adding an sq_index slot in PySequenceMethods and > an __index__ special method so that arbitrary objects can be used > in slice syntax. > >Rationale > > Currently integers and long integers play a special role in slice > notation in that they are the only objects allowed in slice > syntax. In other words, if X is an object implementing the sequence > protocol, then X[obj1:obj2] is only valid if obj1 and obj2 are both > integers or long integers. There is no way for obj1 and obj2 to > tell Python that they could be reasonably used as indexes into a > sequence. This is an unnecessary limitation. > > In NumPy, for example, there are 8 different integer scalars > corresponding to unsigned and signed integers of 8, 16, 32, and 64 > bits. These type-objects could reasonably be used as indexes into > a sequence if there were some way for their typeobjects to tell > Python what integer value to use. > >Proposal > > Add a sq_index slot to PySequenceMethods, and a corresponding > __index__ special method.
Objects could define a function to > place in the sq_index slot that returns a C-integer for use in > PySequence_GetSlice, PySequence_SetSlice, and PySequence_DelSlice. > How about if SLICE byte code interpretation would try to call obj.__int__() if passed a non-(int,long) obj ? Would that cover your use case? BTW the slice type happily accepts anything for start:stop:step I believe, and something[slice(whatever)] will call something.__getitem__ with the slice instance, though this is neither a fast nor nicely spelled way to customize. Regards, Bengt Richter From barry at python.org Thu Feb 9 22:20:16 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 09 Feb 2006 16:20:16 -0500 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: <1139520016.24122.26.camel@geddy.wooz.org> On Thu, 2006-02-09 at 11:30 -0800, Guido van Rossum wrote: > In the past, the protocol for acquiring a PEP number has been to ask > the PEP coordinators (Barry Warsaw and David Goodger) to assign one. I > believe that we could simplify this protocol to avoid necessary > involvement of the PEP coordinators; all that is needed is someone > with checkin privileges. I propose the following protocol: [omitted] In general, this is probably fine. Occasionally we reserve a PEP number for something special, or for a pre-request, but I think both are pretty rare. And because of svn and the commit messages we can at least catch those fairly quickly and fix them. Maybe we can add known reserved numbers to PEP 0 so they aren't taken accidentally. What I'm actually more concerned about is that we (really David) often review PEPs and reject first submissions on several grounds. I must say that David's done such a good job at keeping the quality of PEPs high that I'm leery of interfering with that.
OTOH, perhaps those with commit privileges should be expected to produce high quality PEPs on the first draft. Maybe we can amend your rules to those people who both have commit privileges and have successfully submitted a PEP before. PEP virgins should go through the normal process. -Barry From g.brandl at gmx.net Thu Feb 9 22:38:49 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 09 Feb 2006 22:38:49 +0100 Subject: [Python-Dev] Let's send lambda to the shearing shed (Re: Let's just *keep* lambda) In-Reply-To: <43eb949d.364573719@news.gmane.org> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <2mbqxhe65n.fsf@starship.python.net> <43EAB9D6.6030507@canterbury.ac.nz> <43eb949d.364573719@news.gmane.org> Message-ID: Bengt Richter wrote: >>1) Replace lambda args: value with >> >> args -> value >> >>or something equivalently concise, or >> > Yet another bike shed color chip: > > !(args:expr) # <==> lambda args:expr > and > !(args::suite) # <==> (lambda args::suite) Please drop it. Guido pronounced on it, it is _not_ going to change, and the introduction of new punctuation is clearly not improving anything. > (where the latter lambda form requires outer enclosing parens) But either "::" form > allows full def suite, with indentation for multilines having left edge of single indent > defined by first line following the "::"-containing line, and explicit returns for values > required and top suite ending on closing outer paren) > > Probable uses for the "::" form would be for short inline suite definitions > !(x::print x) # <==> (lambda x::print x) & etc. similarly Use sys.stdout.write.
> !(::global_counter+=1;return global_counter) > !(::raise StopIteration)() # more honest than iter([]).next() Use a function. > but the flexibility would be there for an in-context definition, e.g., > > sorted(seq, key= !(x:: > try: return abs(x) > except TypeError: return 0)) Bah! I can't parse this. In "!(x::" there's clearly too much noise. Georg From oliphant.travis at ieee.org Thu Feb 9 22:39:07 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Thu, 09 Feb 2006 14:39:07 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43ebae93.371219906@news.gmane.org> References: <43EAF696.5070101@ieee.org> <43ebae93.371219906@news.gmane.org> Message-ID: Bengt Richter wrote: >> > > How about if SLICE byte code interpretation would try to call > obj.__int__() if passed a non-(int,long) obj ? Would that cover your use case? > I believe that this is pretty much exactly what I'm proposing. The apply_slice and assign_slice functions in ceval.c are called for the SLICE and STORE_SLICE and DELETE_SLICE opcodes. > BTW the slice type happily accepts anything for start:stop:step I believe, > and something[slice(whatever)] will call something.__getitem__ with the slice > instance, though this is neither a fast nor nicely spelled way to customize. > Yes, the slice object itself takes whatever you want. However, Python special-cases what happens for X[a:b] *if* X has the sequence protocol defined. This is the code-path I'm trying to enhance. -Travis From oliphant.travis at ieee.org Thu Feb 9 22:40:29 2006 From: oliphant.travis at ieee.org (Travis E.
Oliphant) Date: Thu, 09 Feb 2006 14:40:29 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43ebae93.371219906@news.gmane.org> References: <43EAF696.5070101@ieee.org> <43ebae93.371219906@news.gmane.org> Message-ID: Bengt Richter wrote: >> > > How about if SLICE byte code interpretation would try to call > obj.__int__() if passed a non-(int,long) obj ? Would that cover your use case? > I believe that this is pretty much what I'm proposing (except I'm not proposing to use the __int__ method because it is already used as coercion and doing this would allow floats to be used in slices which is a bad thing). The apply_slice and assign_slice functions in ceval.c are called for the SLICE and STORE_SLICE and DELETE_SLICE opcodes. > BTW the slice type happily accepts anything for start:stop:step I believe, > and something[slice(whatever)] will call something.__getitem__ with the slice > instance, though this is neither a fast nor nicely spelled way to customize. > Yes, the slice object itself takes whatever you want. However, Python special-cases what happens for X[a:b] *if* X has the sequence protocol defined. This is the code-path I'm trying to enhance. -Travis From nyamatongwe at gmail.com Thu Feb 9 23:00:10 2006 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Fri, 10 Feb 2006 09:00:10 +1100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EB7134.8000307@v.loewis.de> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <43EB7134.8000307@v.loewis.de> Message-ID: <50862ebd0602091400h2a4e6c4fre5cb6c3181d5b921@mail.gmail.com> Martin v. Löwis: > COM really solves all problems people might have on Windows.
COM was partly just a continuation of the practices used for controls, VBXs and other forms of extension. Visual Basic never forced use of a particular compiler or runtime library for extensions so why should Python? It was also easy to debug an extension DLL inside release-mode VB (I can't recall if debug versions of VB were ever readily available) which is something that is more difficult than it should be for Python. > Alas, it is not a cross-platform API. Standard C is cross-platform, > so Python uses it in its own APIs. The old (pre-XPCOM) Netscape plugin interface was cross-platform and worked with any compiler on Windows. Neil From nyamatongwe at gmail.com Thu Feb 9 23:00:35 2006 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Fri, 10 Feb 2006 09:00:35 +1100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EB6F31.8050804@v.loewis.de> References: <43E92573.6090300@v.loewis.de> <031401c62cbb$61810630$bf03030a@trilan> <43EA3AB2.4020808@v.loewis.de> <50862ebd0602081437j49da385fuc3a31237bbff3725@mail.gmail.com> <43EACE52.7010205@v.loewis.de> <50862ebd0602082329h54f6d3dfx3e5943364034771@mail.gmail.com> <43EB6F31.8050804@v.loewis.de> Message-ID: <50862ebd0602091400q7cc4c240v41d23f6ae712a8ba@mail.gmail.com> Martin v. Löwis: > Not sure whether this was a serious suggestion. Yes it is. > If pythonxy.dll > was statically linked, you would get all the CRT duplication > already in extension modules. Given that there are APIs in Python > where you have to do malloc/free across the python.dll > boundary, you get memory leaks. Memory allocations across DLL boundaries will have to use wrapper functions.
Neil From nyamatongwe at gmail.com Thu Feb 9 23:00:51 2006 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Fri, 10 Feb 2006 09:00:51 +1100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <79990c6b0602090453m2a6766fcjffa16c488774a01f@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <79990c6b0602090453m2a6766fcjffa16c488774a01f@mail.gmail.com> Message-ID: <50862ebd0602091400y6e0b3bftb48fd5166acb8dcc@mail.gmail.com> Paul Moore: > This has all been thrashed out before, but the issue is passing > CRT-allocated objects across DLL boundaries. Yes, that was the first point I addressed through wrapping CRT objects. > At first glance, this is a minor issue - passing FILE* pointers across > DLL boundaries isn't something I'd normally expect people to do - but > look further and you find you're opening a real can of worms. For > example, Python has public APIs which take FILE* parameters. So convert them to taking PyWrappedFile * parameters. > Further, > memory allocation is CRT-managed - allocate memory with one CRT's > malloc, and dealloacte it elsewhere, and you have issues. So *any* > pointer could be CRT-managed, to some extent. Etc, etc... I thought PyMem_Malloc was the correct call to use for memory allocation now and avoided direct links to the CRT for memory management. 
Neil From brett at python.org Thu Feb 9 23:01:42 2006 From: brett at python.org (Brett Cannon) Date: Thu, 9 Feb 2006 14:01:42 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <1139520016.24122.26.camel@geddy.wooz.org> References: <43EAF696.5070101@ieee.org> <1139520016.24122.26.camel@geddy.wooz.org> Message-ID: On 2/9/06, Barry Warsaw wrote: > On Thu, 2006-02-09 at 11:30 -0800, Guido van Rossum wrote: > > > In the past, the protocol for aqcuiring a PEP number has been to ask > > the PEP coordinators (Barry Warsaw and David Goodger) to assign one. I > > believe that we could simplify this protocol to avoid necessary > > involvement of the PEP coordinators; all that is needed is someone > > with checkin privileges. I propose the following protocol: > > [omitted] > > In general, this is probably fine. Occasionally we reserve a PEP number > for something special, or for a pre-request, but I think both are pretty > rare. And because of svn and the commit messages we can at least catch > those fairly quickly and fix them. Maybe we can add known reserved > numbers to PEP 0 so they aren't taken accidentally. > > What I'm actually more concerned about is that we (really David) often > review PEPs and reject first submissions on several grounds. I must say > that David's done such a good job at keeping the quality of PEPs high > that I'm leery of interfering with that. OTOH, perhaps those with > commit privileges should be expected to produce high quality PEPs on the > first draft. > > Maybe we can amend your rules to those people who both have commit > privileges and have successfully submitted a PEP before. PEP virgins > should go through the normal process. > Sounds reasonable to me. Then again I don't think I would ever attempt to get a PEP accepted without at least a single pass over by python-dev or c.l.py . 
But making it simpler definitely would be nice when you can already check in yourself. -Brett From oliphant.travis at ieee.org Thu Feb 9 23:11:17 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Thu, 09 Feb 2006 15:11:17 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43EAF696.5070101@ieee.org> References: <43EAF696.5070101@ieee.org> Message-ID: Attached is an updated PEP for 357. I think the index concept is better situated in the PyNumberMethods structure. That way an object doesn't have to define the Sequence protocol just to be treated like an index. -Travis -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: PEP_index.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20060209/2abfd318/attachment.txt From martin at v.loewis.de Thu Feb 9 23:23:02 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 09 Feb 2006 23:23:02 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <50862ebd0602091400h2a4e6c4fre5cb6c3181d5b921@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <43EB7134.8000307@v.loewis.de> <50862ebd0602091400h2a4e6c4fre5cb6c3181d5b921@mail.gmail.com> Message-ID: <43EBC0C6.8040304@v.loewis.de> Neil Hodgson wrote: > COM was partly just a continuation of the practices used for > controls, VBXs and other forms of extension. Visual Basic never forced > use of a particular compiler or runtime library for extensions so why > should Python? Do you really not know? Because of API that happens to be defined the way it is. >>Alas, it is not a cross-platform API. Standard C is cross-platform, >>so Python uses it in its own APIs. 
> > > The old (pre-XPCOM) Netscape plugin interface was cross-platform > and worked with any compiler on Windows. Yes, and consequently, it avoids using standard C library types throughout. Regards, Martin From martin at v.loewis.de Thu Feb 9 23:24:58 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 09 Feb 2006 23:24:58 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <50862ebd0602091400q7cc4c240v41d23f6ae712a8ba@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <031401c62cbb$61810630$bf03030a@trilan> <43EA3AB2.4020808@v.loewis.de> <50862ebd0602081437j49da385fuc3a31237bbff3725@mail.gmail.com> <43EACE52.7010205@v.loewis.de> <50862ebd0602082329h54f6d3dfx3e5943364034771@mail.gmail.com> <43EB6F31.8050804@v.loewis.de> <50862ebd0602091400q7cc4c240v41d23f6ae712a8ba@mail.gmail.com> Message-ID: <43EBC13A.5010309@v.loewis.de> Neil Hodgson wrote: >>If pythonxy.dll >>was statically linked, you would get all the CRT duplication >>already in extension modules. Given that there are APIs in Python >>where you have to do malloc/free across the python.dll >>boundary, you get memory leaks. > > > Memory allocations across DLL boundaries will have to use wrapper functions. Sure, but that is a change to the API. Contributions are welcome, along with a plan how breakage of existing code can be minimized. 
Regards, Martin From martin at v.loewis.de Thu Feb 9 23:28:58 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 09 Feb 2006 23:28:58 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <50862ebd0602091400y6e0b3bftb48fd5166acb8dcc@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <79990c6b0602090453m2a6766fcjffa16c488774a01f@mail.gmail.com> <50862ebd0602091400y6e0b3bftb48fd5166acb8dcc@mail.gmail.com> Message-ID: <43EBC22A.20902@v.loewis.de> Neil Hodgson wrote: >>At first glance, this is a minor issue - passing FILE* pointers across >>DLL boundaries isn't something I'd normally expect people to do - but >>look further and you find you're opening a real can of worms. For >>example, Python has public APIs which take FILE* parameters. > > > So convert them to taking PyWrappedFile * parameters. Easy to say, hard to do. Regards, Martin From brett at python.org Thu Feb 9 23:32:47 2006 From: brett at python.org (Brett Cannon) Date: Thu, 9 Feb 2006 14:32:47 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: Looks good to me. Only change I might make is mention why __int__ doesn't work sooner (such as in the rationale). Otherwise +1 from me. -Brett On 2/9/06, Travis E. Oliphant wrote: > > Attached is an updated PEP for 357. I think the index concept is better > situated in the PyNumberMethods structure. That way an object doesn't > have to define the Sequence protocol just to be treated like an index. 
> > -Travis > > > PEP: 357 > Title: Allowing any object to be used for slicing > Version: Revision 1.2 > Last Modified: 02/09/2006 > Author: Travis Oliphant > Status: Draft > Type: Standards Track > Created: 09-Feb-2006 > Python-Version: 2.5 > > Abstract > > This PEP proposes adding an nb_index slot in PyNumberMethods and > an __index__ special method so that arbitrary objects can be used > in slice syntax. > > Rationale > > Currently integers and long integers play a special role in slice > notation in that they are the only objects allowed in slice > syntax. In other words, if X is an object implementing the sequence > protocol, then X[obj1:obj2] is only valid if obj1 and obj2 are both > integers or long integers. There is no way for obj1 and obj2 to > tell Python that they could be reasonably used as indexes into a > sequence. This is an unnecessary limitation. > > In NumPy, for example, there are 8 different integer scalars > corresponding to unsigned and signed integers of 8, 16, 32, and 64 > bits. These type-objects could reasonably be used as indexes into > a sequence if there were some way for their typeobjects to tell > Python what integer value to use. > > Proposal > > Add an nb_index slot to PyNumberMethods, and a corresponding > __index__ special method. Objects could define a function to > place in the nb_index slot that returns an appropriate > C-integer for use as ilow or ihigh in PySequence_GetSlice, > PySequence_SetSlice, and PySequence_DelSlice. > > Implementation Plan > > 1) Add the slots > > 2) Change the ISINT macro in ceval.c to ISINDEX and alter it to > accommodate objects with the index slot defined. > > 3) Change the _PyEval_SliceIndex function to accommodate objects > with the index slot defined. > > Possible Concerns > > Speed: > > Implementation should not slow down Python because integers and long > integers used as indexes will complete in the same number of > instructions.
The only change will be that what used to generate > an error will now be acceptable. > > Why not use nb_int which is already there? > > The nb_int, nb_oct, and nb_hex methods are used for coercion. > Floats have these methods defined and floats should not be used in > slice notation. > > Reference Implementation > > Available on PEP acceptance. > > Copyright > > This document is placed in the public domain > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > > > From oliphant.travis at ieee.org Thu Feb 9 23:38:16 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Thu, 09 Feb 2006 15:38:16 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: Guido van Rossum wrote: > On 2/9/06, Travis Oliphant wrote: > > > BTW do you also still want to turn ZeroDivisionError into a warning > (that is changed into an error by default)? That idea shared a slide > and I believe it was discussed in the same meeting you & I and some > others had in San Mateo last summer. I think that idea has some support, but I haven't been thinking about it for awhile. > > > Shouldn't this slot be in the PyNumberMethods extension? It feels more > like a property of numbers than of a property of sequences. Also, the > slot name should then probably be nb_index. Yes, definitely!!! > > There's also an ambiguity when using simple indexing. When writing > x[i] where x is a sequence and i an object that isn't int or long but > implements __index__, I think i.__index__() should be used rather than > bailing out. 
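Concretely, the behavior being asked for can be sketched at the Python level as follows (a hypothetical illustration only: `Int8` is an invented stand-in for a NumPy-style integer scalar, not a real NumPy type, and the PEP's actual hook is the C-level nb_index slot):

```python
# Hypothetical illustration of the proposed __index__ protocol.
# "Int8" is a made-up stand-in for a NumPy-style integer scalar.

class Int8:
    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Tells Python which integer to use when this object
        # appears as a sequence index or slice bound.
        return self.value

seq = list(range(10))

seq[Int8(2)]           # -> 2, instead of raising TypeError
seq[Int8(2):Int8(5)]   # -> [2, 3, 4]
```

The same object used as a float, via an `__int__`-style coercion, would be rejected here: only types that explicitly claim to be index-like define `__index__`.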
I suspect that you didn't think of this because you've > already special-cased this in your code -- when a non-integer is > passed, the mapping API is used (mp_subscript). This is done to > support extended slicing. The built-in sequences (list, str, unicode, > tuple for sure, probably more) that implement mp_subscript should > probe for nb_index before giving up. The generic code in > PyObject_GetItem should also check for nb_index before giving up. > I agree. These should also be changed. I'll change the PEP, too. > > I think all sequence objects that implement mp_subscript should > probably be modified according to the lines I sketched above. > True. > > This is very close to acceptance. I think I'd like to see the patch > developed and submitted to SF (and assigned to me) prior to > acceptance. > O.K. I'll work on it. -Travis From guido at python.org Thu Feb 9 23:42:15 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 14:42:15 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Adam Olsen wrote: > On 2/9/06, Travis Oliphant wrote: > > > > Guido seemed accepting of this idea about 9 months ago when I spoke to > > him. I finally got around to writing up the PEP. I'd really like to > > get this into Python 2.5 if possible. > > -1 > > I've detailed my reasons here: > http://mail.python.org/pipermail/python-dev/2006-January/059851.html I don't actually see anything relevant to this discussion in that post. > In short, there are purely math usages that want to convert to int > while raising exceptions from inexact results. The name __index__ > seems inappropriate for this, and I feel it would be cleaner to fix > float.__int__ to raise exceptions from inexact results (after a > suitable warning period and with a trunc() function added to math.)
Maybe cleaner, but a thousand times harder given the status quo. Travis has a need for this *today* and __index__ can be added without any incompatibilities. I'm not even sure that it's worth changing __int__ for Python 3.0. Even if float.__int__ raised an exception if the float isn't exactly an integer, I still think it's wrong to use it here. Suppose I naively write some floating point code that usually (or with sufficiently lucky inputs) produces exact results, but which can produce inaccurate (or at least approximate) results in general. If I use such a result as an index, your proposal would allow that -- but the program would suddenly crash with an ImpreciseConversionError exception if the inputs produced an approximated result. I'd rather be made aware of this problem on the first run. Then I can decide whether to use int() or int(round()) or whatever other appropriate conversion. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nyamatongwe at gmail.com Thu Feb 9 23:47:56 2006 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Fri, 10 Feb 2006 09:47:56 +1100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <43EBC0C6.8040304@v.loewis.de> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <43EB7134.8000307@v.loewis.de> <50862ebd0602091400h2a4e6c4fre5cb6c3181d5b921@mail.gmail.com> <43EBC0C6.8040304@v.loewis.de> Message-ID: <50862ebd0602091447q1748e78ap5b4089d4355af00b@mail.gmail.com> Martin v. Löwis: > > Visual Basic never forced > > use of a particular compiler or runtime library for extensions so why > > should Python? > > Do you really not know? Because of API that happens to be defined > the way it is. It was rhetorical: Why should Python be inferior to VB? I suppose the answer (hmm, am I allowed to answer my own rhetorical questions?)
is that it was originally developed on other operating systems and the Windows port has never been as much of a focus for most contributors. Neil From rhamph at gmail.com Thu Feb 9 23:52:06 2006 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 9 Feb 2006 15:52:06 -0700 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <20060209184751.20077.352925131.divmod.quotient.1@ohm> References: <43EB7043.50103@v.loewis.de> <20060209184751.20077.352925131.divmod.quotient.1@ohm> Message-ID: On 2/9/06, Valentino Volonghi aka Dialtone wrote: > Let's consider this piece of code (actual code that works today and uses > twisted for convenience): > > def do_stuff(result): > if result == 'Initial Value': > d2 = work_on_result_and_return_a_deferred(result) > d2.addCallback(println) > return d2 > return 'No work on result' > > def println(something): > print something > > d1 = some_operation_that_results_in_a_deferred() > d1.addCallback(do_stuff) > d1.addCallback(lambda _: reactor.stop()) > > reactor.run() PEP 342 provides a much better alternative: def do_stuff(): result = (yield some_operation()) something = (yield work_on_result(result)) print something reactor.stop() # Maybe unnecessary? reactor.run(do_stuff()) Apparently it's already been applied to Python 2.5: http://www.python.org/dev/doc/devel/whatsnew/node4.html Now that may not be the exact syntax that Twisted provides, but the point is that the layout (and the top-to-bottom execution order) is possible.
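A minimal self-contained sketch of that style, runnable as-is (hypothetical: the driver here is a toy synchronous trampoline in the spirit of PEP 342, written with modern print() syntax, and is not Twisted's actual reactor API):

```python
# Hypothetical PEP 342-style trampoline -- not Twisted's API.
# Each `yield` hands the driver a zero-argument operation; the driver
# runs it (synchronously here, where a real reactor would do it
# asynchronously) and sends the result back into the generator.

def some_operation():
    return "Initial Value"

def work_on_result(result):
    return "worked on: " + result

def do_stuff():
    result = yield some_operation
    something = yield (lambda: work_on_result(result))
    print(something)

def run(coroutine):
    """Drive a generator-based coroutine to completion."""
    try:
        op = next(coroutine)           # prime the generator
        while True:
            op = coroutine.send(op())  # run the op, feed the result back in
    except StopIteration:
        pass

run(do_stuff())   # prints: worked on: Initial Value
```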
-- Adam Olsen, aka Rhamphoryncus From martin at v.loewis.de Fri Feb 10 00:03:33 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 00:03:33 +0100 Subject: [Python-Dev] Linking with mscvrt In-Reply-To: <50862ebd0602091447q1748e78ap5b4089d4355af00b@mail.gmail.com> References: <43E92573.6090300@v.loewis.de> <43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <43EB7134.8000307@v.loewis.de> <50862ebd0602091400h2a4e6c4fre5cb6c3181d5b921@mail.gmail.com> <43EBC0C6.8040304@v.loewis.de> <50862ebd0602091447q1748e78ap5b4089d4355af00b@mail.gmail.com> Message-ID: <43EBCA45.5000308@v.loewis.de> Neil Hodgson wrote: > I suppose the answer (hmm, am I allowed to anser my own rhtorical > questions?) is that it was originally developed on other operating > systems and the Windows port has never been as much of a focus for > most contributors. That's certainly the case. It is all Mark Hammond's doing still; not much has happened since the original Windows port. The other reason, of course, is that adding *specific* support for Windows will break support of other platforms. Microsoft had no problems breaking support of VB on Linux :-) Regards, Martin From thomas at xs4all.net Fri Feb 10 00:27:34 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 00:27:34 +0100 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: <20060209232734.GH10226@xs4all.nl> On Thu, Feb 09, 2006 at 02:32:47PM -0800, Brett Cannon wrote: > Looks good to me. Only change I might make is mention why __int__ > doesn't work sooner (such as in the rationale). Otherwise +1 from me. I have a slight reservation about the name. 
On the one hand it's clear the canonical use will be for indexing sequences, and __index__ doesn't look enough like __int__ to get people confused on the difference. On the other hand, there are other places (in C) that want an actual int, and they could use __index__ too. Even more so if a PyArg_Parse* grew a format for 'the index-value for this object' ;) On the *other* one hand, I can't think of a good name... but on the other other hand, it would be awkward to have to support an old name just because the real use wasn't envisioned yet. One-time-machine-for-the-shortsighted-quadrumanus-please-ly y'r,s -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From aahz at pythoncraft.com Fri Feb 10 00:39:46 2006 From: aahz at pythoncraft.com (Aahz) Date: Thu, 9 Feb 2006 15:39:46 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <20060209232734.GH10226@xs4all.nl> References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> Message-ID: <20060209233946.GB1795@panix.com> On Fri, Feb 10, 2006, Thomas Wouters wrote: > > I have a slight reservation about the name. On the one hand it's clear the > canonical use will be for indexing sequences, and __index__ doesn't look > enough like __int__ to get people confused on the difference. On the other > hand, there are other places (in C) that want an actual int, and they could > use __index__ too. Even more so if a PyArg_Parse* grew a format for 'the > index-value for this object' ;) > > On the *other* one hand, I can't think of a good name... but on the other > other hand, it would be awkward to have to support an old name just because > the real use wasn't envisioned yet. Can you provide a couple of examples where you think you'd want __index__ functionality but the name would be inappropriate? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. 
A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From thomas at xs4all.net Fri Feb 10 01:03:48 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 01:03:48 +0100 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <20060209233946.GB1795@panix.com> References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> <20060209233946.GB1795@panix.com> Message-ID: <20060210000348.GW5045@xs4all.nl> On Thu, Feb 09, 2006 at 03:39:46PM -0800, Aahz wrote: > Can you provide a couple of examples where you think you'd want __index__ > functionality but the name would be inappropriate? Not really, or I wouldn't have had only a _slight_ reservation :) There are many function calls and method calls that only take integers, though, and they all currently use int() on their argument. file.read, socket.recv, signal.signal, str.zfill/center/ljust -- basically anything that uses the 'i' PyArg_Parse* format specifier, which is quite a lot. For a great many of them it will not make sense to pass objects that don't have an appropriate __int__, but who knows how many really *mean* to ask for __index__ instead. I mostly voice the reservation to lure out people with actual reservations ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From jimjjewett at gmail.com Fri Feb 10 01:10:38 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 9 Feb 2006 19:10:38 -0500 Subject: [Python-Dev] py3k and not equal; re names Message-ID: Smith asked: > I'm wondering if it's just "foolish consistency" (to quote PEP 8) > that is calling for the dropping of <> in preference of only !=. Logically, "<=" means the same as "< or =". "<>" does not mean the same as "< or >"; it might just mean that they aren't comparable. Whether that is a strong enough reason to remove it is another question.
-jJ From bokr at oz.net Fri Feb 10 01:16:30 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 10 Feb 2006 00:16:30 GMT Subject: [Python-Dev] Let's just *keep* lambda References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> Message-ID: <43ebb6a4.373284194@news.gmane.org> On Thu, 9 Feb 2006 10:33:10 -0800, Guido van Rossum wrote: >Enough already. > >As has clearly been proven, lambda is already perfect. > [...] > >To those people still complaining that lambda is crippled because it >doesn't do statements: First, remember that adding statement >capability wouldn't really add any power to the language; lambda is >purely syntactic sugar for an anonymous function definition (see above >myth debunking section). Second, years of attempts to overcome this >haven't come up with a usable syntax (and yes, curly braces have been Yes, but if you're an optimist, those years mean we're closer to the magic moment ;-) >proposed and rejected like everything else). It's a hard problem >because switching back to indentation-based parsing inside an >expression is problematic. For example, consider this hypothetical >example: > >a = foo(lambda x, y: > print x > print y) > >Should this be considered legal? Or should it be written as > >a = foo(lambda x, y: > print x > print y > ) > Neither. If I may ;-) First, please keep the existing expression lambda exactly as is. Second, allow a new lambda variant to have a suite. But this necessitates: 1. same suite syntax as a function def suite, with explicit returns of values except if falling out with a default None. Just like a function def. 2. differentiating the variant lambda, and providing for suite termination. 2a. differentiate by using doubled ':' a = foo(lambda x, y :: print x+y) 2b.
the lambda:: variant _requires_ enclosing parens, and the top suite ends at the closing ')'. A calling context may be sufficient parens, but sometimes, like tuple expressions, yet another pair of enclosing expression-parens may be needed. 2c. Single-line suites terminate on closing paren. Hence a = foo(lambda x, y :: print x; print y) # is ok 2d. For multiline suites, the first line after the one with the '::' defines the column of a single indent (COLSI), at the first non-whitespace character. Further indents work normally and terminate by dedent, or the closing ')' may be placed anywhere convenient to terminate the whole lambda suite. Any statement dedenting to left of the established single indent column (COLSI) before the closing ')' is a syntax error. I recognize that this requires keeping track of independent nested indentation contexts, but that part of tokenizing was always fun, I imagine. I'd volunteer to suffer appropriately if you like this (the lambda variant, I mean, not my suffering ;-) I think that's it, though I'm always prepared for a d'oh moment ;-) >??? (Indenting the prints so they start at a later column than the 'l' >of 'lambda', and adding an explicit dedent before the close >parenthesis.) Note that if the former were allowed, we'd have >additional ambiguity if foo() took two parameters, e.g.: > >a = foo(lambda x, y: > print x > print y, 42) > >-- is 42 the second argument to foo() or is it printed? To make 42 a second argument, it would be spelled a = foo((lambda x, y:: print x print y), 42) to have the "print y, 42" statement, you could move the closing paren like a = foo((lambda x, y:: print x print y, 42)) but that would have redundant parens with the same meaning as a = foo(lambda x, y:: print x print y, 42) Though perhaps requiring the redundant parens for _all_ (lambda::) expressions would make the grammar easier.
> >I'd much rather avoid this snake's nest by giving the function a name >and using existing statement syntax, like this: This is Python! How can a snake's nest be bad? ;-) Seriously, with the above indentation rules it seems straightforward to me. I do think it would be hard to do something reasonable without an explicitly differentiated lambda variant though. > >def callback(x, y): > print x > print y >a = foo(callback) vs a = foo(lambda x, y :: print x; print y) > >This is unambiguous, easier to parse (for humans as well as for >computers), and doesn't actually span more text lines. Since this Well, it does use more lines when :: allows simple statement suites ;-) >typically happens in a local scope, the name 'callback' disappears as >soon as as the scope is exited. > >BTW I use the same approach regularly for breaking up long >expressions; for example instead of writing > >a = foo(some_call(another_call(some_long_argument, > another_argument), > and_more(1, 2, 3), > and_still_more()) > >I'll write > >x = another_call(some_long_argument, another_argument) >a = foo(some_call(x, and_more(1, 2, 3)), and_still_more()) > >and suddenly my code is more compact and yet easier to read! (In real >life, I'd use a more meaningful name than 'x', but since the example >is nonsense it's hard to come up with a meaningful name here. :-) > I can't argue with any of that, except that I think I would like to be able to do both styles. Sometimes it's nice to define right in the context of one-shot use, e.g., I could see writing ss = sorted(seq, key=(lambda x:: try: return abs(x) except TypeError: return 0)) (unless I desperately wanted to avoid the LOAD_CONST, MAKE_FUNCTION overhead of using an inline lambda at all. I guess that does favor a global def done once). 
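For comparison, the same try/except key function spelled with a named def, the form available in today's Python (a runnable sketch with made-up sample data):

```python
# The try/except key from the example above, written as a plain named
# function -- today's spelling of what the inline lambda-suite would do.

def safe_abs(x):
    try:
        return abs(x)
    except TypeError:
        return 0   # un-abs-able objects sort first

seq = [3, -1, "spam", -7]
ss = sorted(seq, key=safe_abs)
# "spam" gets key 0 and sorts first: ['spam', -1, 3, -7]
```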
> >Parting shot: it appears that we're getting more and more >expressionized versions of statements: first list comprehensions, then >generator expressions, most recently conditional expressions, in >Python 3000 print() will become a function... Seen this way, lambda >was just ahead of its time! Perhaps we could add a try/except/finally >expression, and allow assignments in expressions, and then we could >get rid of statements altogether, turning Python into an expression >language. Change the use of parentheses a bit, and... voila, Lisp! :-) > Well, if you want to do it, (lambda args::suite) is perhaps a start. I promise not to use it immoderately ;-) Regards, Bengt Richter From thomas at xs4all.net Fri Feb 10 01:23:25 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 01:23:25 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43ebb6a4.373284194@news.gmane.org> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> <43ebb6a4.373284194@news.gmane.org> Message-ID: <20060210002325.GI10226@xs4all.nl> On Fri, Feb 10, 2006 at 12:16:30AM +0000, Bengt Richter wrote: > On Thu, 9 Feb 2006 10:33:10 -0800, Guido van Rossum wrote: > >Enough already. > Yes, but if you're an optimist, those years mean we're closer to the magic > moment ;-) Please stop. Discuss it elsewhere. I suggest not CC'ing Guido in that discussion, either, at least not if you want the proposals to still have a chance. Also don't CC me, please, although it's not as hazardous as pissing off Guido ;) Make-the-hurting-stop-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
From guido at python.org Fri Feb 10 01:27:35 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 16:27:35 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43ebb6a4.373284194@news.gmane.org> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> <43ebb6a4.373284194@news.gmane.org> Message-ID: [Bengt, on lambda :: suite] Since you probably won't stop until I give you an answer: I'm really not interested in a syntactic solution that allows multi-line lambdas. I don't think the complexity (in terms of users needing to learn them) is worth it. So please stop (as several people have already asked you). There's some text somewhere in the guidelines for python developers on when to know when to give up. Read it. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at holdenweb.com Fri Feb 10 02:03:40 2006 From: steve at holdenweb.com (Steve Holden) Date: Thu, 09 Feb 2006 20:03:40 -0500 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> <43ebb6a4.373284194@news.gmane.org> Message-ID: Guido van Rossum wrote: > [Bengt, on lambda :: suite] > > Since you probably won't stop until I give you an answer: I'm really > not interested in a syntactic solution that allows multi-line lambdas. > I don't think the complexity (in terms of users needing to learn them) > is worth it. So please stop (as several people have already asked > you). There's some text somewhere in the guidelines for python > developers on when to know when to give up. Read it. :-) > It's not just a matter of knowing when to give up. It's also a matter of actually *giving up* once you know it's time. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From bokr at oz.net Fri Feb 10 02:09:18 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 10 Feb 2006 01:09:18 GMT Subject: [Python-Dev] Let's just *keep* lambda References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> <43ebb6a4.373284194@news.gmane.org> <20060210002325.GI10226@xs4all.nl> Message-ID: <43ebde40.383424815@news.gmane.org> On Fri, 10 Feb 2006 01:23:25 +0100, Thomas Wouters wrote: >On Fri, Feb 10, 2006 at 12:16:30AM +0000, Bengt Richter wrote: >> On Thu, 9 Feb 2006 10:33:10 -0800, Guido van Rossum wrote: >> >Enough already. > [...some stuff snipped...] >> Yes, but if you're an optimist, those years mean we're closer to the magic >> moment ;-) [...some stuff snipped...] > >Please stop. Discuss it elsewhere. I suggest not CC'ing Guido in that >discussion, either, at least not if you want the proposals to still have a >chance. Also don't CC me, please, although it's not as hazardous as pissing >off Guido ;) Well, he presented a technical problem (indentation for lambda suites), and my main point was to address it with a suggestion he may not have seen (or why wouldn't he have mentioned it at least as a dumb failing attempt at solving the problem he was discussing?) IMHO it does solve the problem (modulo stupidities that I am prepared to have my nose rubbed in if I missed something) and was on topic. If a solution to a problem that Guido presents as an obstacle pisses him off, I'd be surprised, and disappointed. > >Make-the-hurting-stop-ly y'rs, I'm sorry you're hurting. That was not my intent ;-/ BTW, I never CC anyone unless they have asked me to. Unless gmane is doing it automatically, it shouldn't be happening. Regards, Bengt Richter From oliphant.travis at ieee.org Fri Feb 10 02:09:27 2006 From: oliphant.travis at ieee.org (Travis E. 
Oliphant) Date: Thu, 09 Feb 2006 18:09:27 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: Guido van Rossum wrote: > On 2/9/06, Travis Oliphant wrote: > > > This is very close to acceptance. I think I'd like to see the patch > developed and submitted to SF (and assigned to me) prior to > acceptance. > > >>Copyright >> >> This document is placed in the public domain > > > If you agree with the above comments, please send me an updated > version of the PEP and I'll check it in over the old one, and approve > it. Then just use SF to submit the patch etc. > I uploaded a patch to SF against current SVN. The altered code compiles and the functionality works with classes defined in Python. I have yet to test against a C-type that defines the method. The patch adds a new API function int PyObject_AsIndex(obj). This was not specifically in the PEP but probably should be. The name could also be PyNumber_AsIndex(obj) but I was following the nb_nonzero slot example to help write the code. -Travis From aleaxit at gmail.com Fri Feb 10 02:26:45 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 9 Feb 2006 17:26:45 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Travis E. Oliphant wrote: ... > The patch adds a new API function int PyObject_AsIndex(obj). > > This was not specifically in the PEP but probably should be. The name > could also be PyNumber_AsIndex(obj) but I was following the nb_nonzero > slot example to help write the code. Shouldn't that new API function (whatever its name) also be somehow exposed for easy access from Python code? I realize new builtins are unpopular, so a builtin 'asindex' might not be appropriate, but perhaps operator.asindex might be. 
My main point is that I don't think we want every Python-coded sequence to have to call x.__index__() instead. Alex From guido at python.org Fri Feb 10 02:34:22 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 17:34:22 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: On 2/9/06, Alex Martelli wrote: > Shouldn't that new API function (whatever its name) also be somehow > exposed for easy access from Python code? I realize new builtins are > unpopular, so a builtin 'asindex' might not be appropriate, but > perhaps operator.asindex might be. My main point is that I don't think > we want every Python-coded sequence to have to call x.__index__() > instead. Very good point; this is why we have a PEP discussion phase. If it's x.__index__(), I think it ought to be operator.index(x). I'm not sure we need a builtin (also not sure we don't). I wonder if hasattr(x, "__index__") can be used as the litmus test for int-ness? (Then int and long should have one that returns self.) Travis, can you send me additional PEP updates as context or unified diffs vs. the PEP in SVN? (or against python.org/peps/pep-0357.txt if you don't want to download the entire PEP directory). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From oliphant.travis at ieee.org Fri Feb 10 02:35:43 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Thu, 09 Feb 2006 18:35:43 -0700 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <20060209232734.GH10226@xs4all.nl> References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> Message-ID: Thomas Wouters wrote: > On Thu, Feb 09, 2006 at 02:32:47PM -0800, Brett Cannon wrote: > >>Looks good to me. Only change I might make is mention why __int__ >>doesn't work sooner (such as in the rationale). 
Otherwise +1 from me. > > > I have a slight reservation about the name. On the one hand it's clear the > canonical use will be for indexing sequences, and __index__ doesn't look > enough like __int__ to get people confused on the difference. On the other > hand, there are other places (in C) that want an actual int, and they could > use __index__ too. Even more so if a PyArg_Parse* grew a format for 'the > index-value for this object' ;) > There are other places in Python that check specifically for int objects and long integer objects and fail with anything else. Perhaps all of these should also call the __index__ slot. But, then it *should* be renamed to, e.g., "__true_int__". One such place is in abstract.c's sequence_repeat function. The patch I submitted, perhaps aggressively, changed that function to call the nb_index slot as well instead of raising an error. Perhaps the slot should be called nb_true_int? -Travis > On the *other* one hand, I can't think of a good name... but on the other > other hand, it would be awkward to have to support an old name just because > the real use wasn't envisioned yet. > > One-time-machine-for-the-shortsighted-quadrumanus-please-ly y'rs From guido at python.org Fri Feb 10 02:54:41 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2006 17:54:41 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> Message-ID: On 2/9/06, Travis E. Oliphant wrote: > Thomas Wouters wrote: > > I have a slight reservation about the name. On the one hand it's clear the > > canonical use will be for indexing sequences, and __index__ doesn't look > > enough like __int__ to get people confused on the difference. On the other > > hand, there are other places (in C) that want an actual int, and they could > > use __index__ too.
Even more so if a PyArg_Parse* grew a format for 'the > > index-value for this object' ;) I think we should just change all the existing formats that require int or long to support nb_as_index. > There are other places in Python that check specifically for int objects > and long integer objects and fail with anything else. Perhaps all of > these should also call the __index__ slot. Right, absolutely. > But, then it *should* be renamed to, e.g., "__true_int__". One such place > is in abstract.c sequence_repeat function. I don't like __true_int__ very much. Personally, I'm fine with calling it __index__ after the most common operation. (Well, I would be since I think I came up with the name in the first place. :-) Since naming is always so subjective *and* important, I'll wait a few days, but if nobody suggests something better then we should just go with __index__. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bokr at oz.net Fri Feb 10 03:07:28 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 10 Feb 2006 02:07:28 GMT Subject: [Python-Dev] Let's just *keep* lambda References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> <43ebb6a4.373284194@news.gmane.org> Message-ID: <43ebf231.388529786@news.gmane.org> On Thu, 9 Feb 2006 16:27:35 -0800, Guido van Rossum wrote: >[Bengt, on lambda :: suite] > >Since you probably won't stop until I give you an answer: I'm really >not interested in a syntactic solution that allows multi-line lambdas. >I don't think the complexity (in terms of users needing to learn them) >is worth it. So please stop (as several people have already asked >you). There's some text somewhere in the guidelines for python >developers on when to know when to give up. Read it. :-) > Thank you. I give up ;-) I will try to find it and read it. But no fair tempting the weak with """ It's a hard problem ... For example, consider this hypothetical example: ...
""" ;-) Regards, Bengt Richter From stephen at xemacs.org Fri Feb 10 03:43:41 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 10 Feb 2006 11:43:41 +0900 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: (Brett Cannon's message of "Thu, 9 Feb 2006 14:01:42 -0800") References: <43EAF696.5070101@ieee.org> <1139520016.24122.26.camel@geddy.wooz.org> Message-ID: <87lkwkkk6a.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Brett" == Brett Cannon writes: Brett> On 2/9/06, Barry Warsaw wrote: >> Maybe we can amend your rules to those people who both have >> commit privileges and have successfully submitted a PEP before. >> PEP virgins should go through the normal process. +1 Brett> Sounds reasonable to me. Then again I don't think I would Brett> ever attempt to get a PEP accepted without at least a Brett> single pass over by python-dev or c.l.py . But making it Brett> simpler definitely would be nice when you can already check Brett> in yourself. Besides Brett's point that in some sense most new authors *want* to go through the normal process, having the normal process means that there are a couple of people you can contact who are default mentor/editors, and TOOWDTI. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
From tim.peters at gmail.com Fri Feb 10 04:17:02 2006 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 9 Feb 2006 22:17:02 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows Message-ID: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> Noticed that various socket tests are failing today, WinXP, Python trunk: test_socket_ssl Exception in thread Thread-27: Traceback (most recent call last): File "C:\Code\python\lib\threading.py", line 444, in __bootstrap self.run() File "C:\Code\python\lib\threading.py", line 424, in run self.__target(*self.__args, **self.__kwargs) File "C:\Code\python\lib\test\test_socket_ssl.py", line 50, in listener s.accept() File "C:\Code\python\lib\socket.py", line 169, in accept sock, addr = self._sock.accept() error: unable to select on socket test test_socket_ssl crashed -- socket.error: (10061, 'Connection refused') test test_urllibnet failed -- errors occurred; run in verbose mode for details Running that in verbose mode shows 2 "ok" and 8 "ERROR". A typical ERROR: ERROR: test_basic (test.test_urllibnet.urlopenNetworkTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_urllibnet.py", line 43, in test_basic open_url = urllib.urlopen("http://www.python.org/") File "C:\Code\python\lib\urllib.py", line 82, in urlopen return opener.open(url) File "C:\Code\python\lib\urllib.py", line 190, in open return getattr(self, name)(url) File "C:\Code\python\lib\urllib.py", line 325, in open_http h.endheaders() File "C:\Code\python\lib\httplib.py", line 798, in endheaders self._send_output() File "C:\Code\python\lib\httplib.py", line 679, in _send_output self.send(msg) File "C:\Code\python\lib\httplib.py", line 658, in send self.sock.sendall(str) File "", line 1, in sendall IOError: [Errno socket error] unable to select on socket test_logging appears to consume 100% of a CPU now, and never finishes. 
This may be an independent error. test_asynchat Exception in thread Thread-1: Traceback (most recent call last): File "C:\Code\python\lib\threading.py", line 444, in __bootstrap self.run() File "C:\Code\python\lib\test\test_asynchat.py", line 18, in run conn, client = sock.accept() File "C:\Code\python\lib\socket.py", line 169, in accept sock, addr = self._sock.accept() error: unable to select on socket test_socket is a long-winded disaster. test_socketserver test test_socketserver crashed -- socket.error: (10061, 'Connection refused') There are others, but tests that use sockets hang a lot now & it's tedious to worm around that. I _suspect_ that rev 42253 introduced these problems. For example, that added: + /* Guard against socket too large for select*/ + if (s->sock_fd >= FD_SETSIZE) + return SOCKET_INVALID; to _ssl.c, and added +/* Can we call select() with this socket without a buffer overrun? */ +#define IS_SELECTABLE(s) ((s)->sock_fd < FD_SETSIZE) to socketmodule.c, but those appear to make no sense. FD_SETSIZE is the maximum number of distinct fd's an fdset can hold, and the numerical magnitude of any specific fd has nothing to do with that in general (they may be related in fact on Unix systems that implement an fdset as "a big bit vector" -- but Windows doesn't work that way, and neither do all Unix systems, and nothing in socket specs requires an implementation to work that way). From greg.ewing at canterbury.ac.nz Fri Feb 10 04:20:22 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 10 Feb 2006 16:20:22 +1300 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <17387.22307.395057.121770@montanaro.dyndns.org> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <17387.22307.395057.121770@montanaro.dyndns.org> Message-ID: <43EC0676.6030301@canterbury.ac.nz> skip at pobox.com wrote: > Lambdas are expressions. Statements can't be embedded in expressions. 
That > statements and expressions are different is a core feature of the language. > That is almost certainly not going to change. Although "print" may become a function in 3.0, so that this particular example would no longer be a problem. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From barry at python.org Fri Feb 10 00:25:29 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 09 Feb 2006 18:25:29 -0500 Subject: [Python-Dev] py3k and not equal; re names In-Reply-To: References: Message-ID: <1139527529.30596.4.camel@geddy.wooz.org> On Thu, 2006-02-09 at 19:10 -0500, Jim Jewett wrote: > Logically, "<=" means the same as "< or =" > > <> does not mean the same as "< or >"; it might just mean that > they aren't comparable. Whether that is a strong enough reason > to remove it is another question. Visually, "==" looks very symmetrical and stands out nicely, while "!=" is asymmetric and jarring. "<>" has a visual symmetry that is a nice counterpart to "==". For me, that's enough of a reason to keep it. -Barry
From greg.ewing at canterbury.ac.nz Fri Feb 10 04:49:13 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 10 Feb 2006 16:49:13 +1300 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <20060209184751.20077.352925131.divmod.quotient.1@ohm> References: <20060209184751.20077.352925131.divmod.quotient.1@ohm> Message-ID: <43EC0D39.6070506@canterbury.ac.nz> Valentino Volonghi aka Dialtone wrote: > when some_operation_that_results_in_a_deferred() -> result: > if result == 'Initial Value': > when work_on_result_and_return_a_deferred(result) -> inner_res: > print inner_res > else: > print "No work on result" > reactor.stop() Hmmm. This looks remarkably similar to something I got half way through dreaming up a while back, that I was going to call "Simple Continuations" (by analogy with "Simple Generators"). Maybe I should finish working out the details and write it up. On the other hand, it may turn out that it's subsumed by the new enhanced generators plus a trampoline. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Feb 10 04:59:38 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 10 Feb 2006 16:59:38 +1300 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <20060209232734.GH10226@xs4all.nl> References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> Message-ID: <43EC0FAA.3070300@canterbury.ac.nz> Thomas Wouters wrote: > I have a slight reservation about the name. ...
On the other > hand, there are other places (in C) that want an actual int, and they could > use __index__ too. Maybe __exactint__? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Feb 10 05:05:22 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 10 Feb 2006 17:05:22 +1300 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> Message-ID: <43EC1102.4040400@canterbury.ac.nz> Guido van Rossum wrote: > To those people who believe that lambda is required in some situations > because it behaves differently with respect to the surrounding scope > than def: it doesn't, and it never did. This is (still!) a > surprisingly common myth. I have no idea where it comes from; does > this difference exist in some other language that has lambda as well > as some other function definition mechanism? Not that I know of. Maybe it's because these people first encountered the concept of a closure when using lambda in Lisp or Scheme, and unconsciously assumed there was a dependency. > Parting shot: it appears that we're getting more and more > expressionized versions of statements: ... > Perhaps we could add a try/except/finally > expression, and allow assignments in expressions, and then we could > get rid of statements altogether, turning Python into an expression > language. Change the use of parentheses a bit, and... voila, Lisp! :-) > Or we could go the other way and provide means of writing all expressions as statements.
call: foo x lambda y,z: w =: +: y z print: "Result is" w -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From scott+python-dev at scottdial.com Fri Feb 10 06:02:25 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Fri, 10 Feb 2006 00:02:25 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> Message-ID: <43EC1E61.4050904@scottdial.com> Tim Peters wrote: > I _suspect_ that rev 42253 introduced these problems. For example, that added: > > + /* Guard against socket too large for select*/ > + if (s->sock_fd >= FD_SETSIZE) > + return SOCKET_INVALID; > > to _ssl.c, and added > > +/* Can we call select() with this socket without a buffer overrun? */ > +#define IS_SELECTABLE(s) ((s)->sock_fd < FD_SETSIZE) > > to socketmodule.c, but those appear to make no sense. FD_SETSIZE is > the maximum number of distinct fd's an fdset can hold, and the > numerical magnitude of any specific fd has nothing to do with that in > general (they may be related in fact on Unix systems that implement an > fdset as "a big bit vector" -- but Windows doesn't work that way, and > neither do all Unix systems, and nothing in socket specs requires an > implementation to work that way). Neal checked these changes in to address bug #876637 "Random stack corruption from socketmodule.c". But the Windows implementation of "select" is entirely different from other platforms, in so far as Windows uses an internal counter to assign fds to an fd_set, so the fd number itself has no relevance to where they are placed in an fd_set.
This stack corruption bug then does not exist on Windows, and so the code should not be used with Windows either. -- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From tjreedy at udel.edu Fri Feb 10 06:07:37 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 10 Feb 2006 00:07:37 -0500 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation References: <43EAF696.5070101@ieee.org> Message-ID: > Add a nb_index slot to PyNumberMethods, and a corresponding > __index__ special method. Objects could define a function to > place in the sq_index slot that returns an appropriate I presume 'sq_index' should also be 'nb_index' From tim.peters at gmail.com Fri Feb 10 06:36:09 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 00:36:09 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> Message-ID: <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> [Tim] > ... FD_SETSIZE is the maximum number of distinct fd's an fdset can > hold, and the numerical magnitude of any specific fd has nothing to do > with that in general (they may be related in fact on Unix systems that > implement an fdset as "a big bit vector" -- but Windows doesn't work > that way, and neither do all Unix systems, and nothing in socket > specs requires an implementation to work that way). Hmm. Looks like POSIX _does_ require that. Can't work on Windows, though. I have a distinct memory of a 64-bit Unix that didn't work that way either, but while that memory is younger than I am, it's too old for me to recall more than just that ;-). 
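Tim's fd_set point is visible from Python as well; a minimal sketch contrasting select(), whose POSIX fd_set is a bit vector indexed by fd number (hence the FD_SETSIZE cap), with poll(), which registers fds individually and so has no such cap (poll objects are unavailable on Windows):

```python
import select
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)

# select() builds an fd_set; on POSIX that is a bit vector indexed by
# fd number, which is why an fd >= FD_SETSIZE overruns it.
readable, _, _ = select.select([srv], [], [], 0)
print(readable)  # [] -- no connection is pending

# poll() registers fds one at a time, so large fd numbers are fine.
events = []
if hasattr(select, "poll"):  # poll objects are absent on Windows
    p = select.poll()
    p.register(srv, select.POLLIN)
    events = p.poll(0)
print(events)    # [] -- likewise nothing ready

srv.close()
```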
From dialtone at divmod.com Fri Feb 10 09:19:37 2006 From: dialtone at divmod.com (Valentino Volonghi aka Dialtone) Date: Fri, 10 Feb 2006 09:19:37 +0100 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43EC0D39.6070506@canterbury.ac.nz> Message-ID: <20060210081937.20077.1221015021.divmod.quotient.180@ohm> On Fri, 10 Feb 2006 16:49:13 +1300, Greg Ewing wrote: >Valentino Volonghi aka Dialtone wrote: > >> when some_operation_that_results_in_a_deferred() -> result: >> if result == 'Initial Value': >> when work_on_result_and_return_a_deferred(result) -> inner_res: >> print inner_res >> else: >> print "No work on result" >> reactor.stop() > >Hmmm. This looks remarkably similar to something I got half >way through dreaming up a while back, that I was going to >call "Simple Continuations" (by analogy with "Simple Generators"). >Maybe I should finish working out the details and write it up. > >On the other hand, it may turn out that it's subsumed by >the new enhanced generators plus a trampoline. This is only partially true. In fact, let's consider again twisted for the example, you can do something like this: @defgen def foo(): for url in urls: page = yield client.getPage(url) print page This has 2 disadvantages IMHO. First of all I have to use a function or a method decorated with @defgen to write that. But most important, that code, although correct, is serializing things that could be parallel. The solution is again simple but not really intuitive: @defgen def foo(): for d in map(client.getPage, urls): page = yield d print page Written in this way it will actually work in a parallel way but it is not really an intuitive solution. Using when instead: for url in urls: when client.getPage(url) -> page: print page This wouldn't have any problem and is quite readable. A similar construct is used in the E language, and http://www.skyhunter.com/marcs/ewalnut.html#SEC20 explains how when works for them and their promise object.
You can also have multiple things to wait for: when (client.getPage(url), cursor.execute(query)) -> (page, results): print page, results or l = [list, of, deferreds] when l -> *results: print results and we could catch errors in the following way: when client.getPage(url) -> page: print page except socket.error, e: print "something bad happened" HTH -- Valentino Volonghi aka Dialtone Now Running MacOSX 10.4 Blog: http://vvolonghi.blogspot.com New Pet: http://www.stiq.it From rasky at develer.com Fri Feb 10 09:43:10 2006 From: rasky at develer.com (Giovanni Bajo) Date: Fri, 10 Feb 2006 09:43:10 +0100 Subject: [Python-Dev] Linking with mscvrt References: <43E92573.6090300@v.loewis.de><43EA6692.2050502@v.loewis.de> <50862ebd0602081438q3d167cfbxa20c6c3d7cb7bedc@mail.gmail.com> <43EAB6B6.3050505@canterbury.ac.nz> <50862ebd0602082329g37f2506dg9fc4f1e6589d26e8@mail.gmail.com> <79990c6b0602090453m2a6766fcjffa16c488774a01f@mail.gmail.com><50862ebd0602091400y6e0b3bftb48fd5166acb8dcc@mail.gmail.com> <43EBC22A.20902@v.loewis.de> Message-ID: <018b01c62e1e$05a47b80$0e4d2597@bagio> Martin v. Löwis wrote: >>> At first glance, this is a minor issue - passing FILE* pointers >>> across >>> DLL boundaries isn't something I'd normally expect people to do - >>> but >>> look further and you find you're opening a real can of worms. For >>> example, Python has public APIs which take FILE* parameters. >> >> >> So convert them to taking PyWrappedFile * parameters. > Easy to say, hard to do. But *that's* the solution for this problem. It's always been like this under Windows and will always be. Changing back to msvcrt so that people must compile their extension with non-standard compilation options is really *worse* than just requiring msvcrt71 and punting. There's also a free compiler from Microsoft and tons of webpages which say how to compile with it. Or with mingw, even. So, I really believe that the situation is settling down.
People are doing what they want to, with some difficulties perhaps, but there's nothing really undoable. If another change has to be pursued, it is to abstract Python from CRT altogether, or at least across boundaries. -- Giovanni Bajo From python at rcn.com Fri Feb 10 09:57:14 2006 From: python at rcn.com (Raymond Hettinger) Date: Fri, 10 Feb 2006 03:57:14 -0500 Subject: [Python-Dev] Let's just *keep* lambda References: <20060210081937.20077.1221015021.divmod.quotient.180@ohm> Message-ID: <000e01c62e1f$fd2e9470$b83efea9@RaymondLaptop1> Die thread, die! From python at rcn.com Fri Feb 10 10:12:55 2006 From: python at rcn.com (Raymond Hettinger) Date: Fri, 10 Feb 2006 04:12:55 -0500 Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <016901c62e1d$db031760$0e4d2597@bagio> Message-ID: <002001c62e22$2de27760$b83efea9@RaymondLaptop1> [Giovanni] >> I was really attracted to the idea of having more informative iterator >> representations but learned that even when it could be done, it wasn't >> especially useful. When someone creates an iterator at the >> interactive >> prompt, they almost always either wrap it in a consumer function or >> they >> assign it to a variable. The case of typing just, >> "enumerate([1,2,3])", >> comes up only once, when first learning what enumerate() does. > On the other hand, it's very common to see the iterator in the debug > window > showing the locals or the watches. And it's pretty easy to add some > debugging > print statement to the code, run the program/test, find out that, hey, > that > function returns an iterator, go back and add a list() around it to find > out > what's inside. > > I would welcome if the iterator repr string could show, when possible, the > next > couple of elements. Sorry, that's a pipe-dream. Real use-cases for enumerate() don't usually have the luxury of having an argument that is a sequence.
Instead, you have to run the iteration a few steps to see what lies ahead. In general, this isn't always possible (stdin for example) or desirable (where the iterator is time consuming or memory intensive and so shouldn't be run unless the value is actually needed) or may even be a disaster (if the iterator participates in co-routine style code that expects to be passing control back and forth between multiple open iterators). IOW, you cannot safely run an iterator a few steps in advance, save-up the results for display, and then expect everything else to work right. I spent a good deal of time pursuing this mirage, but there was no water: http://mail.python.org/pipermail/python-dev/2004-April/044136.html AFAICT, the only way to achieve the effect you want is to get an environment where all iterators are designed around an API that supports being run forward and backward (such as the one demonstrated by Armin at PyCon last year). Raymond From thomas at xs4all.net Fri Feb 10 11:27:13 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 11:27:13 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> Message-ID: <20060210102713.GJ10226@xs4all.nl> On Fri, Feb 10, 2006 at 12:36:09AM -0500, Tim Peters wrote: > [Tim] > > ... FD_SETSIZE is the maximum number of distinct fd's an fdset can > > hold, and the numerical magnitude of any specific fd has nothing to do > > with that in general (they may be related in fact on Unix systems that > > implement an fdset as "a big bit vector" -- but Windows doesn't work > > that way, and neither do all Unix systems, and nothing in socket > > specs requires an implementation to work that way). > Hmm. Looks like POSIX _does_ require that. Can't work on Windows, > though. 
I have a distinct memory of a 64-bit Unix that didn't work > that way either, but while that memory is younger than I am, it's too > old for me to recall more than just that ;-). Perhaps the memory you have is of select-lookalikes, like poll(), or maybe of vendor-specific (and POSIX-breaking) extensions to select(). select() performs pretty poorly on large fdsets with holes in, and has the fixed size fdset problem, so poll() was added to fix that (by Linux and later by XPG4, IIRC.) poll() takes an array of structs containing the fd, the operations to watch for and an output parameter with seen events. Does that jar your memory? :) (The socketmodule has support for poll(), on systems that have it, by the way.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From ncoghlan at gmail.com Fri Feb 10 12:26:33 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2006 21:26:33 +1000 Subject: [Python-Dev] cProfile module In-Reply-To: <20060207104224.GA6204@code0.codespeak.net> References: <20060207104224.GA6204@code0.codespeak.net> Message-ID: <43EC7869.4050601@gmail.com> Armin Rigo wrote: > Hi all, > > As promised two months ago, I eventually finished the integration of the > 'lsprof' profiler. It's now in an internal '_lsprof' module that is > exposed via a 'cProfile' module with the same interface as 'profile', > producing compatible dump stats that can be inspected with 'pstats'. Hurrah! (trying to optimise the Decimal module before 2.4 was a painful exercise, because hotshot wasn't really up to the job and executing the tests and the benchmark under the normal profile module was horribly slow. . .). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Fri Feb 10 13:16:42 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2006 22:16:42 +1000 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> Message-ID: <43EC842A.2010304@gmail.com> Adam Olsen wrote: > I guess my confusion revolves around float to Decimal. Is lossless > conversion a good thing in python, or is prohibiting float to Decimal > conversion just a fudge to prevent people from initializing a Decimal > from a float when they really want a str? The general rule is that a lossy conversion is fine, so long as the programmer explicitly requests it. float to Decimal is a special case, which has more to do with the nature of Decimal and the guarantees it provides, than to do with general issues of lossless conversion. Specifically, what does Decimal(1.1) mean? Did you want Decimal("1.1") or Decimal("1.100000001")? Allowing direct conversion from float would simply infect the Decimal type with all of the problems of binary floating point representations, without providing any countervailing benefit. The idea of providing a special notation or separate method for float precision was toyed with, but eventually rejected in favour of the existing string formatting notation and a straight up type error. Facundo included the gory details in the final version of his PEP [1]. Cheers, Nick. 
[1] http://www.python.org/peps/pep-0327.html#from-float -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Fri Feb 10 13:45:44 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2006 22:45:44 +1000 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> Message-ID: <43EC8AF8.2000506@gmail.com> Guido van Rossum wrote: >> But, then it *should* be renamed to i.e. "__true_int__". One such place >> is in abstract.c sequence_repeat function. > > I don't like __true_int__ very much. Personally, I'm fine with calling > it __index__ after the most common operation. (Well, I would be since > I think I came up with the name in the first place. :-) Since naming > is always so subjective *and* important, I'll wait a few days, but if > nobody suggests something better then we should just go with > __index__. An alternative would be to call it "__discrete__", as that is the key characteristic of an indexing type - it consists of a sequence of discrete values that can be isomorphically mapped to the integers. Numbers conceptually representing continuously variable quantities (such as floats and decimals) are the ones that really shouldn't define this method. I wouldn't mind __index__ though, as even though some of the use cases won't be strictly using the result as an index, the shared characteristic of being isomorphic to the integers should be sufficient to allow the term to make some sort of sense. This would hardly be the first case where names of operators are overloaded using imprecise terminology, after all. 
'or', 'and', 'sub' and 'xor' aren't the right terms for set union, intersection, difference and disjunction, but they're close enough conceptually that the names still have meaning. Ditto for 'mul' and 'add' meaning repetition and concatenation for sequences (no comment on 'mod' and string formatting though. . .) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From arigo at tunes.org Fri Feb 10 14:08:06 2006 From: arigo at tunes.org (Armin Rigo) Date: Fri, 10 Feb 2006 14:08:06 +0100 Subject: [Python-Dev] _length_cue() In-Reply-To: <009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> <009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1> Message-ID: <20060210130806.GA29717@code0.codespeak.net> Hi Raymond, On Wed, Feb 08, 2006 at 09:21:02PM -0500, Raymond Hettinger wrote: > (... __getitem_cue__ ...) > Before putting this in production, it would probably be worthwhile to search > for code where it would have been helpful. In the case of __length_cue__, > there was an immediate payoff. Indeed, I don't foresee any place where it would help apart from the __repr__ of the iterators, which is precisely what I'm aiming at. The alternative here would be a kind of "smart" global function that knows about many built-in iterator types and is able to fish for the data inside automatically (but this hits problems of data structures being private). I thought that __getitem_cue__ would be a less dirty solution. I really think a better __repr__ would be generally helpful, and I cannot think of a 3rd solution at the moment... (Ideas welcome!) 
A bientot, Armin From ncoghlan at gmail.com Fri Feb 10 14:21:52 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2006 23:21:52 +1000 Subject: [Python-Dev] _length_cue() In-Reply-To: <20060210130806.GA29717@code0.codespeak.net> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> <009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1> <20060210130806.GA29717@code0.codespeak.net> Message-ID: <43EC9370.7060809@gmail.com> Armin Rigo wrote: > Indeed, I don't foresee any place where it would help apart from the > __repr__ of the iterators, which is precisely what I'm aiming at. The > alternative here would be a kind of "smart" global function that knows > about many built-in iterator types and is able to fish for the data > inside automatically (but this hits problems of data structures being > private). I thought that __getitem_cue__ would be a less dirty > solution. I really think a better __repr__ would be generally helpful, > and I cannot think of a 3rd solution at the moment... (Ideas welcome!) Do they really need anything more sophisticated than: def __repr__(self): return "%s(%r)" % (type(self).__name__, self._subiter) (modulo changes in the format of arguments, naturally. This simple one would work for things like enumerate and reversed, though) If the subiterators themselves have decent repr methods, the top-level repr should also look reasonable. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From arigo at tunes.org Fri Feb 10 14:25:41 2006 From: arigo at tunes.org (Armin Rigo) Date: Fri, 10 Feb 2006 14:25:41 +0100 Subject: [Python-Dev] _length_cue() In-Reply-To: <43EAB6BA.6040003@canterbury.ac.nz> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> <43EAB6BA.6040003@canterbury.ac.nz> Message-ID: <20060210132541.GB29717@code0.codespeak.net> Hi Greg, On Thu, Feb 09, 2006 at 04:27:54PM +1300, Greg Ewing wrote: > The iterator protocol is currently very simple and > well-focused on a single task -- producing things > one at a time, in sequence. Let's not clutter it up > with too much more cruft. Please refer to my original message: I intended these methods to be private and undocumented, not part of any official protocol in any way. A bientot, Armin From arigo at tunes.org Fri Feb 10 14:33:08 2006 From: arigo at tunes.org (Armin Rigo) Date: Fri, 10 Feb 2006 14:33:08 +0100 Subject: [Python-Dev] _length_cue() In-Reply-To: <43EC9370.7060809@gmail.com> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> <009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1> <20060210130806.GA29717@code0.codespeak.net> <43EC9370.7060809@gmail.com> Message-ID: <20060210133308.GC29717@code0.codespeak.net> Hi Nick, On Fri, Feb 10, 2006 at 11:21:52PM +1000, Nick Coghlan wrote: > Do they really need anything more sophisticated than: > > def __repr__(self): > return "%s(%r)" % (type(self).__name__, self._subiter) > > (modulo changes in the format of arguments, naturally. 
This simple one would > work for things like enumerate and reversed, though) My goal here is not primarily to help debugging, but to help playing around at the interactive command-line. Python's command-line should not be dismissed as "useless for real programmers"; I definitely use it all the time to try things out. It would be nicer if all these iterators I'm not familiar with would give me a hint about what they actually return, instead of: >>> itertools.count(17) count(17) # yes, thank you, not very helpful >>> enumerate("spam") enumerate("spam") # with your proposed extension -- not better However, if this kind of goal is considered "not serious enough" for adding a private special method, then I'm fine with trying out a fishing approach. A bientot, Armin. From ncoghlan at gmail.com Fri Feb 10 14:44:45 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2006 23:44:45 +1000 Subject: [Python-Dev] _length_cue() In-Reply-To: <20060210133308.GC29717@code0.codespeak.net> References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> <009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1> <20060210130806.GA29717@code0.codespeak.net> <43EC9370.7060809@gmail.com> <20060210133308.GC29717@code0.codespeak.net> Message-ID: <43EC98CD.5090205@gmail.com> Armin Rigo wrote: > Hi Nick, > > On Fri, Feb 10, 2006 at 11:21:52PM +1000, Nick Coghlan wrote: >> Do they really need anything more sophisticated than: >> >> def __repr__(self): >> return "%s(%r)" % (type(self).__name__, self._subiter) >> >> (modulo changes in the format of arguments, naturally. This simple one would >> work for things like enumerate and reversed, though) > > My goal here is not primarily to help debugging, but to help playing > around at the interactive command-line. Python's command-line should > not be dismissed as "useless for real programmers"; I definitely use it > all the time to try things out. 
It would be nicer if all these > iterators I'm not familiar with would give me a hint about what they > actually return, instead of: > >>>> itertools.count(17) > count(17) # yes, thank you, not very helpful >>>> enumerate("spam") > enumerate("spam") # with your proposed extension -- not better > > However, if this kind of goal is considered "not serious enough" for > adding a private special method, then I'm fine with trying out a fishing > approach. Ah, I see the use case now. You're right in thinking I was mainly considering the debugging element (and supporting even that would be an improvement on the current repr methods, which are just the 'type with instance ID' default repr). In terms of "what does it do" though, I'd tend to actually iterate the thing: Py> for x in enumerate("spam"): print x ... (0, 's') (1, 'p') (2, 'a') (3, 'm') Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From mwh at python.net Fri Feb 10 14:56:45 2006 From: mwh at python.net (Michael Hudson) Date: Fri, 10 Feb 2006 13:56:45 +0000 Subject: [Python-Dev] Post-PyCon PyPy Sprint: February 27th - March 2nd 2006 Message-ID: <2mwtg3co6a.fsf@starship.python.net> The next PyPy sprint is scheduled to take place right after PyCon 2006 in Dallas, Texas, USA. We hope to see lots of newcomers at this sprint, so we'll give friendly introductions. Note that during the Pycon conference we are giving PyPy talks which serve well as preparation. Goals and topics of the sprint ------------------------------ While attendees of the sprint are of course welcome to work on what they wish, we offer these ideas: - Work on an 'rctypes' module aiming at letting us use a ctypes implementation of an extension module from the compiled pypy-c. - Writing ctypes implementations of modules to be used by the above tool. - Experimenting with different garbage collection strategies. 
- Implementing Python 2.5 features in PyPy - Implementation of constraints solvers and integration of dataflow variables to PyPy. - Implement new features and improve the 'py' lib and py.test which are heavily used by PyPy (doctests/test selection/...). - Generally experiment with PyPy -- for example, play with transparent distribution of objects or coroutines and stackless features at application level. - Have fun! Location -------- The sprint will be held wherever the PyCon sprints end up being held, which is to say somewhere within the Dallas/Addison Marriott Quorum hotel. For more information see the PyCon 06 sprint pages: - http://us.pycon.org/TX2006/Sprinting - http://wiki.python.org/moin/PyCon2006/Sprints Exact times ----------- The PyPy sprint will run from Monday February 27th until Thursday March 2nd 2006. Hours will be from 10:00 until people have had enough. Registration, etc. ------------------ If you know before the conference that you definitely want to attend our sprint, please subscribe to the `PyPy sprint mailing list`_, introduce yourself and post a note that you want to come. Feel free to ask any questions or make suggestions there! There is a separate `PyCon 06 people`_ page tracking who is already planning to come. If you have commit rights on codespeak then you can modify yourself a checkout of http://codespeak.net/svn/pypy/extradoc/sprintinfo/pycon06/people.txt .. _`PyPy sprint mailing list`: http://codespeak.net/mailman/listinfo/pypy-sprint .. _`PyCon 06 people`: http://codespeak.net/pypy/extradoc/sprintinfo/pycon06/people.txt -- 42. You can measure a programmer's perspective by noting his attitude on the continuing vitality of FORTRAN. 
-- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From Jack.Jansen at cwi.nl Fri Feb 10 15:23:07 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri, 10 Feb 2006 15:23:07 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification Message-ID: I keep running into problems with the "const" modifications to PyArg_ParseTupleAndKeywords() (rev. 41638 by Jeremy). I have lots of code of the form char *kw[] = {"itself", 0}; if (PyArg_ParseTupleAndKeywords(_args, _kwds, "O&", kw, CFTypeRefObj_Convert, &itself)) ... which now no longer compiles, neither with C nor with C++ (gcc4, both MacOSX and Linux). Changing the kw declaration to "const char *kw[]" makes it compile again. I don't understand why it doesn't compile: even though the PyArg_ParseTupleAndKeywords signature promises that it won't change the "kw" argument I see no reason why I shouldn't be able to pass a non-const argument. And to make matters worse adding the "const" of course makes the code non-portable to previous versions of Python (where the C compiler rightly complains that I'm passing a const object through a non-const parameter). Can anyone enlighten me? -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From guido at python.org Fri Feb 10 16:39:53 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 07:39:53 -0800 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: OMG. Are we now adding 'const' modifiers to random places? I thought "const propagation hell" was a place we were happily avoiding by not falling for that meme. What changed? --Guido On 2/10/06, Jack Jansen wrote: > I keep running into problems with the "const" modifications to > PyArg_ParseTupleAndKeywords() (rev. 41638 by Jeremy). 
> > I have lots of code of the form > char *kw[] = {"itself", 0}; > > if (PyArg_ParseTupleAndKeywords(_args, _kwds, "O&", kw, > CFTypeRefObj_Convert, &itself)) ... > which now no longer compiles, neither with C nor with C++ (gcc4, both > MacOSX and Linux). Changing the kw declaration to "const char *kw[]" > makes it compile again. > > I don't understand why it doesn't compile: even though the > PyArg_ParseTupleAndKeywords signature promises that it won't change > the "kw" argument I see no reason why I shouldn't be able to pass a > non-const argument. > > And to make matters worse adding the "const" of course makes the code > non-portable to previous versions of Python (where the C compiler > rightly complains that I'm passing a const object through a non-const > parameter). > > Can anyone enlighten me? > -- > Jack Jansen, , http://www.cwi.nl/~jack > If I can't dance I don't want to be part of your revolution -- Emma > Goldman > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Fri Feb 10 17:30:30 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Feb 2006 11:30:30 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: On 2/10/06, Guido van Rossum wrote: > OMG. Are we now adding 'const' modifiers to random places? I thought > "const propagation hell" was a place we were happily avoiding by not > falling for that meme. What changed? I added some const to several API functions that take char* but typically called by passing string literals. In C++, a string literal is a const char* so you need to add a const_cast<> to every call site, which is incredibly cumbersome. 
After some discussion on python-dev, I made changes to a small set of API functions and chased the const-ness the rest of the way, as you would expect. There was nothing random about the places const was added. I admit that I'm also puzzled by Jack's specific question. I don't understand why an array passed to PyArg_ParseTupleAndKeywords() would need to be declared as const. I observed the problem in my initial changes but didn't think very hard about the cause of the problem. Perhaps someone with better C/C++ standards chops can explain. Jeremy > > --Guido > > On 2/10/06, Jack Jansen wrote: > > I keep running into problems with the "const" modifications to > > PyArg_ParseTupleAndKeywords() (rev. 41638 by Jeremy). > > > > I have lots of code of the form > > char *kw[] = {"itself", 0}; > > > > if (PyArg_ParseTupleAndKeywords(_args, _kwds, "O&", kw, > > CFTypeRefObj_Convert, &itself)) ... > > which now no longer compiles, neither with C nor with C++ (gcc4, both > > MacOSX and Linux). Changing the kw declaration to "const char *kw[]" > > makes it compile again. > > > > I don't understand why it doesn't compile: even though the > > PyArg_ParseTupleAndKeywords signature promises that it won't change > > the "kw" argument I see no reason why I shouldn't be able to pass a > > non-const argument. > > > > And to make matters worse adding the "const" of course makes the code > > non-portable to previous versions of Python (where the C compiler > > rightly complains that I'm passing a const object through a non-const > > parameter). > > > > Can anyone enlighten me? 
> > -- > > Jack Jansen, , http://www.cwi.nl/~jack > > If I can't dance I don't want to be part of your revolution -- Emma > > Goldman > > > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From keith at kdart.com Fri Feb 10 17:37:34 2006 From: keith at kdart.com (Keith Dart) Date: Fri, 10 Feb 2006 08:37:34 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <43EB7043.50103@v.loewis.de> <43ebb6a4.373284194@news.gmane.org> Message-ID: <20060210083734.0618ac1d@leviathan.kdart.com> Guido van Rossum wrote the following on 2006-02-09 at 16:27 PST: === > Since you probably won't stop until I give you an answer: I'm really > not interested in a syntactic solution that allows multi-line lambdas. === Fuzzy little lambdas, wouldn't hurt a fly. Object of much derision, one has to wonder why? Docile little lambdas, so innocent and pure Only wants to function with finality and closure. Cute little lambdas, they really are so sweet When ingested by a Python they make a tasty treat. -- -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Keith Dart public key: ID: 19017044 ===================================================================== -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060210/b9ad9956/attachment.pgp From thomas at xs4all.net Fri Feb 10 17:53:39 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 17:53:39 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: <20060210165339.GK10226@xs4all.nl> On Fri, Feb 10, 2006 at 11:30:30AM -0500, Jeremy Hylton wrote: > On 2/10/06, Guido van Rossum wrote: > > OMG. Are we now adding 'const' modifiers to random places? I thought > > "const propagation hell" was a place we were happily avoiding by not > > falling for that meme. What changed? > > I added some const to several API functions that take char* but > typically called by passing string literals. In C++, a string literal > is a const char* so you need to add a const_cast<> to every call site, > which is incredibly cumbersome. After some discussion on python-dev, > I made changes to a small set of API functions and chased the > const-ness the rest of the way, as you would expect. There was > nothing random about the places const was added. > > I admit that I'm also puzzled by Jack's specific question. I don't > understand why an array passed to PyArg_ParseTupleAndKeywords() would > need to be declared as const. I observed the problem in my initial > changes but didn't think very hard about the cause of the problem. > Perhaps someone with better C/C++ standards chops can explain. Well, it's counter-intuitive, but a direct result of how pointer equivalence is defined in C. I'm rusty in this part, so I will get some terminology wrong, but IIRC, a variable A is of an equivalent type of variable B if they hold the same type of data. So, a 'const char *' is equivalent to a 'char *' because they both hold the memory of a 'char'. 
But a 'const char**' (or 'const *char[]') is not equivalent to a 'char **' (or 'char *[]') because the first holds the address of a 'const char *', and the second the address of a 'char *'. A 'char * const *' is equivalent to a 'char **' though. As I said, I got some of the terminology wrong, but the end result is exactly that: a 'const char **' is not equivalent to a 'char **', even though a 'const char *' is equivalent to a 'char *'. Equivalence, in this case, means 'can be automatically downcasted'. Peter v/d Linden explains this quite well in "Expert C Programming" (aka 'Deep C Secrets'), but unfortunately I'm working from home and I left my copy at a coworkers' desk. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin at v.loewis.de Fri Feb 10 18:02:03 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 18:02:03 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: <43ECC70B.8030501@v.loewis.de> Jeremy Hylton wrote: > I admit that I'm also puzzled by Jack's specific question. I don't > understand why an array passed to PyArg_ParseTupleAndKeywords() would > need to be declared as const. I observed the problem in my initial > changes but didn't think very hard about the cause of the problem. > Perhaps someone with better C/C++ standards chops can explain. Please take a look at this code: void foo(const char** x, const char*s) { x[0] = s; } void bar() { char *kwds[] = {0}; const char *s = "Text"; foo(kwds, s); kwds[0][0] = 't'; } If it was correct, you would be able to modify the const char array in the string literal, without any compiler errors. The assignment x[0] = s; is kosher, because you are putting a const char* into a const char* array, and the assigment kwds[0][0] = 't'; is ok, because you are modifying a char array. 
So the place where it has to fail is the passing of the pointer-pointer. Regards, Martin From jeremy at alum.mit.edu Fri Feb 10 18:06:21 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Feb 2006 12:06:21 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43ECC70B.8030501@v.loewis.de> References: <43ECC70B.8030501@v.loewis.de> Message-ID: It looks like a solution may be to define it as "const char * const *" rather than "const char **". I'll see if that works. Jeremy On 2/10/06, "Martin v. L?wis" wrote: > Jeremy Hylton wrote: > > I admit that I'm also puzzled by Jack's specific question. I don't > > understand why an array passed to PyArg_ParseTupleAndKeywords() would > > need to be declared as const. I observed the problem in my initial > > changes but didn't think very hard about the cause of the problem. > > Perhaps someone with better C/C++ standards chops can explain. > > Please take a look at this code: > > void foo(const char** x, const char*s) > { > x[0] = s; > } > > void bar() > { > char *kwds[] = {0}; > const char *s = "Text"; > foo(kwds, s); > kwds[0][0] = 't'; > } > > If it was correct, you would be able to modify the const char > array in the string literal, without any compiler errors. The > assignment > > x[0] = s; > > is kosher, because you are putting a const char* into a > const char* array, and the assigment > > kwds[0][0] = 't'; > > is ok, because you are modifying a char array. So the place > where it has to fail is the passing of the pointer-pointer. 
> > Regards, > Martin > From martin at v.loewis.de Fri Feb 10 18:07:24 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 18:07:24 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: <43ECC84C.9020002@v.loewis.de> Jeremy Hylton wrote: > I added some const to several API functions that take char* but > typically called by passing string literals. In C++, a string literal > is a const char* so you need to add a const_cast<> to every call site, That's not true. A string literal of length N is of type const char[N+1]. However, a (deprecated) conversion of string literals to char* is provided in the language. So assigning a string literal to char* or passing it in a char* parameter is compliant with standard C++, no const_cast is required. Regards, Martin From jeremy at alum.mit.edu Fri Feb 10 18:14:28 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Feb 2006 12:14:28 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43ECC84C.9020002@v.loewis.de> References: <43ECC84C.9020002@v.loewis.de> Message-ID: On 2/10/06, "Martin v. L?wis" wrote: > Jeremy Hylton wrote: > > I added some const to several API functions that take char* but > > typically called by passing string literals. In C++, a string literal > > is a const char* so you need to add a const_cast<> to every call site, > > That's not true. > > A string literal of length N is of type const char[N+1]. However, > a (deprecated) conversion of string literals to char* is provided > in the language. So assigning a string literal to char* or passing > it in a char* parameter is compliant with standard C++, no > const_cast is required. Ok. 
I reviewed the original problem and you're right, the problem was not that it failed outright but that it produced a warning about the deprecated conversion:

    warning: deprecated conversion from string constant to 'char*'

I work at a place that takes the same attitude as python-dev about warnings: They're treated as errors and you can't check in code that the compiler generates warnings for. Nonetheless, the consensus on the c++ sig and python-dev at the time was to fix Python. If we don't allow warnings in our compilations, we shouldn't require our users to accept warnings in theirs. Jeremy From tim.peters at gmail.com Fri Feb 10 18:19:01 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 12:19:01 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> [Jeremy Hylton] > ... > I admit that I'm also puzzled by Jack's specific question. I don't > understand why an array passed to PyArg_ParseTupleAndKeywords() would > need to be declared as const. I observed the problem in my initial > changes but didn't think very hard about the cause of the problem. > Perhaps someone with better C/C++ standards chops can explain. Oh, who cares? I predict "Jack's problem" would go away if we changed the declaration of PyArg_ParseTupleAndKeywords to what you intended to begin with:

    PyAPI_FUNC(int) PyArg_ParseTupleAndKeywords(PyObject *, PyObject *,
                                                const char *,
                                                const char * const *, ...);

That is, declare the keywords argument as a pointer to const pointer to const char, rather than the current pointer to pointer to const char. How about someone on a Linux box try that with gcc, and check it in if it solves Jack's problem (meaning that gcc stops whining about the original spelling of his original example).
From jeremy at alum.mit.edu Fri Feb 10 18:22:24 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Feb 2006 12:22:24 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <43ECC70B.8030501@v.loewis.de> Message-ID: On 2/10/06, Jeremy Hylton wrote: > It looks like a solution may be to define it as "const char * const *" > rather than "const char **". I'll see if that works. No. It doesn't work. I'm not sure about this one either, but some searching suggests that you can pass a char** to a function taking const char* const* in C++ but not in C. Sigh. I don't see any way to avoid a warning in Jack's case. Jeremy > > Jeremy > > On 2/10/06, "Martin v. L?wis" wrote: > > Jeremy Hylton wrote: > > > I admit that I'm also puzzled by Jack's specific question. I don't > > > understand why an array passed to PyArg_ParseTupleAndKeywords() would > > > need to be declared as const. I observed the problem in my initial > > > changes but didn't think very hard about the cause of the problem. > > > Perhaps someone with better C/C++ standards chops can explain. > > > > Please take a look at this code:
> >
> >     void foo(const char **x, const char *s)
> >     {
> >         x[0] = s;
> >     }
> >
> >     void bar()
> >     {
> >         char *kwds[] = {0};
> >         const char *s = "Text";
> >         foo(kwds, s);
> >         kwds[0][0] = 't';
> >     }
> >
> > If it was correct, you would be able to modify the const char array in the string literal, without any compiler errors. The assignment
> >
> >     x[0] = s;
> >
> > is kosher, because you are putting a const char* into a const char* array, and the assignment
> >
> >     kwds[0][0] = 't';
> >
> > is ok, because you are modifying a char array. So the place where it has to fail is the passing of the pointer-pointer.
> > > > Regards, > > Martin > > > From tim.peters at gmail.com Fri Feb 10 18:27:35 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 12:27:35 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <43ECC70B.8030501@v.loewis.de> Message-ID: <1f7befae0602100927g56911490y166fa183c0c72568@mail.gmail.com> [Jeremy] >> It looks like a solution may be to define it as "const char * const *" >> rather than "const char **". I'll see if that works. [Jeremy] > No. It doesn't work. I'm not sure about this one either, but some > searching suggests that you can pass a char** to a function taking > const char* const* in C++ but not in C. Oops! I think that's right. > Sigh. I don't see any way to avoid a warning in Jack's case. Martin's turn ;-) From guido at python.org Fri Feb 10 18:29:42 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 09:29:42 -0800 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: On 2/10/06, Jeremy Hylton wrote: > I added some const to several API functions that take char* but > typically called by passing string literals. In C++, a string literal > is a const char* so you need to add a const_cast<> to every call site, > which is incredibly cumbersome. After some discussion on python-dev, > I made changes to a small set of API functions and chased the > const-ness the rest of the way, as you would expect. There was > nothing random about the places const was added. I still don't understand *why* this was done, nor how the set of functions was chosen if not randomly. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Fri Feb 10 18:43:00 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 12:43:00 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <20060210102713.GJ10226@xs4all.nl> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> Message-ID: <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> [Thomas Wouters] > Perhaps the memory you have is of select-lookalikes, like poll(), No, it was definitely select(), and on a 64-bit Unix (probably _not_ Linux) that allowed for an enormous number of sockets. > or maybe of vendor-specific (and POSIX-breaking) extensions to select(). Yes, it must have been non-POSIX. > select() performs pretty poorly on large fdsets with holes in, and has the fixed > size fdset problem, so poll() was added to fix that (by Linux and later by XPG4, > IIRC.) poll() takes an array of structs containing the fd, the operations to > watch for and an output parameter with seen events. Does that jar your > memory? :) No more than it had been jarred ;-) Well, a bit more: it was possible to pass a first argument to select() that was larger than FD_SETSIZE. In effect, FD_SETSIZE had no meaning. > (The socketmodule has support for poll(), on systems that have it, by the > way.) Yup. From tim.peters at gmail.com Fri Feb 10 18:54:17 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 12:54:17 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: Message-ID: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> [Jeremy] >> I added some const to several API functions that take char* but >> typically called by passing string literals. 
In C++, a string literal >> is a const char* so you need to add a const_cast<> to every call site, >> which is incredibly cumbersome. After some discussion on python-dev, >> I made changes to a small set of API functions and chased the >> const-ness the rest of the way, as you would expect. There was >> nothing random about the places const was added. [Guido] > I still don't understand *why* this was done, Primarily to make life easier for C++ programmers using Python's C API. But didn't Jeremy just say that? Some people (including me) have been adding const to char* API arguments for years, but in much slower motion, and at least I did it only when someone complained about a specific function. > nor how the set of functions was chosen if not randomly. [Jeremy] I added some const to several API functions that take char* but typically called by passing string literals. If he had _stuck_ to that, we wouldn't be having this discussion :-) (that is, nobody passes string literals to PyArg_ParseTupleAndKeywords's kws argument). From jeremy at alum.mit.edu Fri Feb 10 19:05:41 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Feb 2006 13:05:41 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> Message-ID: On 2/10/06, Tim Peters wrote: > [Jeremy] > I added some const to several API functions that take char* but > typically called by passing string literals. > > If he had _stuck_ to that, we wouldn't be having this discussion :-) > (that is, nobody passes string literals to > PyArg_ParseTupleAndKeywords's kws argument). They are passing arrays of string literals. In my mind, that was a nearly equivalent use case. I believe the C++ compiler complains about passing an array of string literals to char**. 
Jeremy From guido at python.org Fri Feb 10 19:07:45 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 10:07:45 -0800 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> Message-ID: On 2/10/06, Tim Peters wrote: > [Jeremy] > >> I added some const to several API functions that take char* but > >> typically called by passing string literals. In C++, a string literal > >> is a const char* so you need to add a const_cast<> to every call site, > >> which is incredibly cumbersome. After some discussion on python-dev, > >> I made changes to a small set of API functions and chased the > >> const-ness the rest of the way, as you would expect. There was > >> nothing random about the places const was added. > > [Guido] > > I still don't understand *why* this was done, > > Primarily to make life easier for C++ programmers using Python's C > API. But didn't Jeremy just say that? I didn't connect the dots. > Some people (including me) have been adding const to char* API > arguments for years, but in much slower motion, and at least I did it > only when someone complained about a specific function. > > > nor how the set of functions was chosen if not randomly. > > [Jeremy] > I added some const to several API functions that take char* but > typically called by passing string literals. > > If he had _stuck_ to that, we wouldn't be having this discussion :-) > (that is, nobody passes string literals to > PyArg_ParseTupleAndKeywords's kws argument). Is it too late to revert this one? Is there another way to make C++ programmers happy (e.g. my having a macro that expands to const when compiled with C++ but vanishes when compiled with C?) 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Fri Feb 10 19:27:35 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 13:27:35 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> Message-ID: <1f7befae0602101027s64e292a4p9e42dd77eea3b00d@mail.gmail.com> [Jeremy] >>> I added some const to several API functions that take char* but >>> typically called by passing string literals. [Tim] >> If he had _stuck_ to that, we wouldn't be having this discussion :-) >> (that is, nobody passes string literals to >> PyArg_ParseTupleAndKeywords's kws argument). [Jeremy] > They are passing arrays of string literals. In my mind, that was a > nearly equivalent use case. I believe the C++ compiler complains > about passing an array of string literals to char**. It's the consequences: nobody complains about tacking "const" on to a former honest-to-God "char *" argument that was in fact not modified, because that's not only helpful for C++ programmers, it's _harmless_ for all programmers. For example, nobody could sanely object (and nobody did :-)) to adding const to the attribute-name argument in PyObject_SetAttrString(). Sticking to that creates no new problems for anyone, so that's as far as I ever went. From jeremy at alum.mit.edu Fri Feb 10 19:32:51 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Feb 2006 13:32:51 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> Message-ID: On 2/10/06, Guido van Rossum wrote: > On 2/10/06, Tim Peters wrote: > > [Jeremy] > > I added some const to several API functions that take char* but > > typically called by passing string literals. 
> > > > If he had _stuck_ to that, we wouldn't be having this discussion :-) > > (that is, nobody passes string literals to > > PyArg_ParseTupleAndKeywords's kws argument). > > Is it too late to revert this one? The change is still beneficial to C++ programmers, so my initial preference is to keep it. There are still some benefits to the other changes, so it isn't a complete loss if we revert it. > Is there another way to make C++ programmers happy (e.g. my having a macro that expands to const when compiled with C++ but vanishes when compiled with C?) Sounds icky. Are we pretty sure there is no way to do the right thing in plain C? That is, declare the argument as taking a set of const strings and still allow non-const strings to be passed without warning. Jeremy From martin at v.loewis.de Fri Feb 10 20:18:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 20:18:53 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <43ECC84C.9020002@v.loewis.de> Message-ID: <43ECE71D.2050402@v.loewis.de> Jeremy Hylton wrote: > Ok. I reviewed the original problem and you're right, the problem was > not that it failed outright but that it produced a warning about the > deprecated conversion: > warning: deprecated conversion from string constant to 'char*' > > I work at a place that takes the same attitude as python-dev about > warnings: They're treated as errors and you can't check in code that > the compiler generates warnings for. In that specific case, I think the compiler's warning should be turned off; it is a bug in the compiler if that specific warning cannot be turned off separately. While it is true that the conversion is deprecated, the C++ standard defines this as "Normative for the current edition of the Standard, but not guaranteed to be part of the Standard in future revisions." The current version is from 1998.
I haven't been following closely, but I believe there are no plans to actually remove the feature in the next revision. FWIW, Annex D also defines these features as deprecated:

- the use of "static" for objects in namespace scope (AFAICT including C file-level static variables and functions)
- C library headers (i.e. <stdio.h>)

Don't you get a warning when including Python.h, because that includes <stdio.h>? > Nonetheless, the consensus on the c++ sig and python-dev at the time > was to fix Python. If we don't allow warnings in our compilations, we > shouldn't require our users to accept warnings in theirs. We don't allow warnings for "major compilers". This specific compiler appears flawed (or your configuration of it). Regards, Martin From martin at v.loewis.de Fri Feb 10 20:33:42 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 20:33:42 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <1f7befae0602100927g56911490y166fa183c0c72568@mail.gmail.com> References: <43ECC70B.8030501@v.loewis.de> <1f7befae0602100927g56911490y166fa183c0c72568@mail.gmail.com> Message-ID: <43ECEA96.1070406@v.loewis.de> Tim Peters wrote: >>Sigh. I don't see any way to avoid a warning in Jack's case. > > > Martin's turn ;-) I see two options:

1. Revert the change for the const char** keywords argument (but leave the change for everything else). C++ users should only see a problem if they have a const char* variable, not if they use literals (Jeremy's compiler's warning is insensate). For keyword arguments, people typically don't have char* variables; instead, they have an array of string literals.

2.
Only add the const in C++:

    #ifdef __cplusplus
    #define Py_cxxconst const
    #else
    #define Py_cxxconst
    #endif

    PyAPI_FUNC(int) PyArg_ParseTupleAndKeywords(PyObject *, PyObject *,
                                                const char *,
                                                Py_cxxconst char *Py_cxxconst*,
                                                ...);

This might look like it could break C/C++ interoperability on platforms that take an inventive interpretation of the standard (e.g. if they would mangle even C symbols). However, I believe it won't make things worse: The C++ standard doesn't guarantee interoperability of C and C++ implementations at all, and the platforms I'm aware of support the above construct (since PA_PTAK is extern "C"). Regards, Martin From python at rcn.com Fri Feb 10 20:52:09 2006 From: python at rcn.com (Raymond Hettinger) Date: Fri, 10 Feb 2006 14:52:09 -0500 Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net><00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1><20060208235156.GA29514@code0.codespeak.net><009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1><20060210130806.GA29717@code0.codespeak.net><43EC9370.7060809@gmail.com> <20060210133308.GC29717@code0.codespeak.net> Message-ID: <001501c62e7b$7ab86dc0$b83efea9@RaymondLaptop1> [Armin] > It would be nicer if all these iterators I'm not familiar with would give me a hint about what they actually return, instead of:
>
> >>> itertools.count(17)
> count(17) # yes, thank you, not very helpful

I prefer that the repr() of count() be left alone. It follows the style used by xrange() and other repr's that can be run through eval(). Also, the existing repr keeps its information up-to-date to reflect the current state of the iterator:

    >>> it = count(10)
    >>> it.next()
    10
    >>> it
    count(11)

A good deal of thought and discussion went into these repr forms. See the python-dev discussions in April 2004. Please don't randomly go in and change those choices. For most of the tools like enumerate(), there are very few assumptions you can make about the input without actually running the iteration.
So, I don't see how you can change enumerate's repr method unless adopting a combination of styles, switching back and forth depending on the input:

    >>> enumerate('abcde')
    <(0, 'a'), (1, 'b'), ...>
    >>> enumerate(open('tmp.txt'))
    <enumerate object at 0x...>

IMO, switching back and forth is an especially bad idea. Hence, enumerate's repr ought to be left alone too. Raymond From guido at python.org Fri Feb 10 21:21:26 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 12:21:26 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: On 2/7/06, Neal Norwitz wrote: > On 2/7/06, Jeremy Hylton wrote: > > It looks like we need a Python 2.5 Release Schedule PEP. > > Very draft: http://www.python.org/peps/pep-0356.html > > Needs lots of work and release managers. Anthony, Martin, Fred, Sean > are all mentioned with TBDs and question marks. Before he went off to a boondoggle^Woff-site at a Mexican resort, Neal made me promise that I'd look at this and try to get the 2.5 release plan going for real. First things first: we need a release manager. Anthony, do you want to do the honors again, or are you ready for retirement? Next, the schedule. Neal's draft of the schedule has us releasing 2.5 in October. That feels late -- nearly two years after 2.4 (which was released on Nov 30, 2004). Do people think it's reasonable to strive for a more aggressive (by a month) schedule, like this:

    alpha 1: May 2006
    alpha 2: June 2006
    beta 1:  July 2006
    beta 2:  August 2006
    rc 1:    September 2006
    final:   September 2006 ???

Would anyone want to be even more aggressive (e.g. alpha 1 right after PyCon???). We could always do three alphas. There's a bunch of sections (some very long) towards the end of the PEP of questionable use; Neal just copied these from the 2.4 release schedule (PEP 320):

- Ongoing tasks
- Carryover features from Python 2.4
- Carryover features from Python 2.3 (!)

Can someone go over these and suggest which we should keep, which we should drop?
(I may do this later, but I have other priorities below.) Then, the list of features that ought to be in 2.5. Quoting Neal's draft:

> PEP 308: Conditional Expressions

Definitely. Don't we have a volunteer doing this now?

> PEP 328: Absolute/Relative Imports

Yes, please.

> PEP 343: The "with" Statement

Didn't Michael Hudson have a patch?

> PEP 352: Required Superclass for Exceptions

I believe this is pretty much non-controversial; it's a much weaker version of PEP 348 which was rightfully rejected for being too radical. I've tweaked some text in this PEP and approved it. Now we need to make it happen. It might be quite a tricky thing, since Exception is currently implemented in C as a classic class. If Brett wants to sprint on this at PyCon I'm there to help (Mon/Tue only). Fortunately we have MWH's patch 1104669 as a starting point.

> PEP 353: Using ssize_t as the index type

Neal tells me that this is in progress in a branch, but that the code is not yet flawless (tons of warnings etc.). Martin, can you tell us more? When do you expect this to land? Maybe aggressively merging into the HEAD and then releasing it as alpha would be a good way to shake out the final issues???

Other PEPs I'd like comment on:

PEP 357 (__index__): the patch isn't on SF yet, but otherwise I'm all for this, and I'd like to accept it ASAP to get it in 2.5. It doesn't look like it'll cause any problems.

PEP 314 (metadata v1.1): this is marked as completed, but there's a newer PEP available: PEP 345 (metadata v1.2). That PEP has 2.5 as its target date. Shouldn't we implement it? (This is a topic that I haven't followed closely.) There's also the question whether 314 should be marked final. Andrew or Richard?

PEP 355 (path module): I still haven't reviewed this, because I'm -0 on adding what appears to me duplicate functionality. But if there's a consensus building perhaps it should be allowed to go forward (and then I *will* review it carefully).
I found a few more PEPs slated for 2.5 but that haven't seen much action lately:

PEP 351 - freeze protocol. I'm personally -1; I don't like the idea of freezing arbitrary mutable data structures. Are there champions who want to argue this?

PEP 349 - str() may return unicode. Where is this? I'm not at all sure the PEP is ready. It would probably be a lot of work to make this work everywhere in the C code, not to mention the stdlib .py code. Perhaps this should be targeted for 2.6 instead? The consequences seem potentially huge.

PEP 315 - do while. A simple enough syntax proposal, albeit one introducing a new keyword (which I'm fine with). I kind of like it but it doesn't strike me as super important -- if we put this off until Py3k I'd be fine with that too. Opinions? Champions?

Ouch, a grep produced tons more. Quick rundown:

PEP 246 - adaptation. I'm still as lukewarm as ever; it needs interfaces, promises to cause a paradigm shift, and the global map worries me.

PEP 323 - copyable iterators. Seems stalled. Alex, do you care?

PEP 332 - byte vectors. Looks incomplete. Put off until 2.6?

PEP 337 - logging in the stdlib. What of it? This seems a good idea but potentially disruptive (because backwards incompatible). Also it could be done piecemeal on an opportunistic basis. Any volunteers?

PEP 338 - support -m for modules in packages. I believe Nick Coghlan is close to implementing this. I'm fine with accepting it.

PEP 344 - exception chaining. There are deep problems with this due to circularities; perhaps we should drop this, or revisit it for Py3k.

That's the "pep parade" for now. It would be appropriate to start a new topic to discuss specific PEPs; a response to this thread referencing the new thread would be appropriate.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From scott+python-dev at scottdial.com Fri Feb 10 21:24:28 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Fri, 10 Feb 2006 15:24:28 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> Message-ID: <43ECF67C.1020700@scottdial.com> Tim Peters wrote: > No more than it had been jarred ;-) Well, a bit more: it was > possible to pass a first argument to select() that was larger than > FD_SETSIZE. In effect, FD_SETSIZE had no meaning. This begs the question then whether the check that is implemented has any relevance to any platform other than Linux. I am no portability guru, but I have to think there are other platforms where this patch will cause problems. For now at least, can we at least do some preprocessing magic to not use this code with Windows? 
-- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From thomas at xs4all.net Fri Feb 10 21:40:29 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 21:40:29 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ECF67C.1020700@scottdial.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> Message-ID: <20060210204029.GY5045@xs4all.nl> On Fri, Feb 10, 2006 at 03:24:28PM -0500, Scott Dial wrote: > Tim Peters wrote: > >No more than it had been jarred ;-) Well, a bit more: it was > >possible to pass a first argument to select() that was larger than > >FD_SETSIZE. In effect, FD_SETSIZE had no meaning. > > any relevance to any platform other than Linux. I am no portability > guru, but I have to think there are other platforms where this patch > will cause problems. For now at least, can we at least do some > preprocessing magic to not use this code with Windows? I doubt it will have problems on other platforms. As Tim said, FD_SETSIZE is mandated by POSIX. Perhaps some platforms do allow larger sizes, by replacing the FD_* macros with functions that dynamically grow whatever magic is the 'fdset' datatype. I sincerely doubt it's a common approach, though, and for them to be POSIX they would need to have FD_SETSIZE set to some semi-sane value. So at worst, on those platforms (if any), we're reducing the number of sockets you can actually select() on, from some undefined platform maximum to whatever the platform *claims* is the maximum. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From martin at v.loewis.de Fri Feb 10 21:40:59 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 21:40:59 +0100 Subject: [Python-Dev] ssize_t status (Was: release plan for 2.5 ?) In-Reply-To: References: Message-ID: <43ECFA5B.6030900@v.loewis.de> Guido van Rossum wrote: >> PEP 353: Using ssize_t as the index type > > > Neal tells me that this is in progress in a branch, but that the code > is not yet flawless (tons of warnings etc.). Martin, can you tell us > more? "It works", in a way. You only get the tons of warnings with the right compiler, and you don't actually need to fix them all to get something useful. Not all modules need to be converted to support more than 2**31 elements for all containers they operate on, so this could also be based on user feedback. Some users (so far, just Marc-Andre) have complained that this breaks backwards compatibility. Some improvements can be made still, but for some aspects (tp_as_sequence callbacks), I think the best we can hope for is compiler warnings about incorrect function pointer types. > When do you expect this to land? Maybe aggressively merging into > the HEAD and then releasing it as alpha would be a good way to shake > out the final issues??? Sure: I hope to complete this all in March. 
Regards, Martin From martin at v.loewis.de Fri Feb 10 21:47:26 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 21:47:26 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ECF67C.1020700@scottdial.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> Message-ID: <43ECFBDE.40609@v.loewis.de> Scott Dial wrote: > This begs the question then whether the check that is implemented has > any relevance to any platform other than Linux. I am no portability > guru, but I have to think there are other platforms where this patch > will cause problems. The patch is right on all platforms conforming to the POSIX standard. POSIX says that FD_ISSET and friends have undefined behaviour if the file descriptor is larger than FD_SETSIZE. For platforms not conforming to the POSIX standard, the patch errs on the conservative side: it refuses to do something that POSIX says has undefined behaviour, yet may be well-defined on that platform. 
Disabling this for Windows is fine with me; I also think there should be some kind of documentation that quickly shows the potential cause of the exception Regards, Martin From martin at v.loewis.de Fri Feb 10 22:00:46 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 22:00:46 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <20060210204029.GY5045@xs4all.nl> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <20060210204029.GY5045@xs4all.nl> Message-ID: <43ECFEFE.5060102@v.loewis.de> Thomas Wouters wrote: > I doubt it will have problems on other platforms. As Tim said, FD_SETSIZE is > mandated by POSIX. Perhaps some platforms do allow larger sizes, by > replacing the FD_* macros with functions that dynamically grow whatever > magic is the 'fdset' datatype. I sincerely doubt it's a common approach, > though, and for them to be POSIX they would need to have FD_SETSIZE set to > some semi-sane value. So at worst, on those platforms (if any), we're > reducing the number of sockets you can actually select() on, from some > undefined platform maximum to whatever the platform *claims* is the maximum. I think the Windows interpretation is actually well-designed: FD_SETSIZE shouldn't be the number of the largest descriptor, but instead be the maximum size of the set. So FD_SETSIZE is 64 on Windows, but you still can have much larger file descriptor numbers. The implementation strategy of Windows is to use an array of integers, rather than the bit mask, and an index telling you how many slots have already been filled. With FD_SETSIZE being 64, the fd_set requires 256 bytes. 
This strategy has a number of interesting implications:

- a naive implementation of FD_SET is not idempotent; old winsock implementations were so naive. So you might fill the set by setting the same descriptor 64 times. Current implementations use a linear search to make the operation idempotent.
- FD_CLR needs to perform a linear scan for the descriptor, and then shift all subsequent entries by one (it could actually just move the very last entry to the deleted slot, but doesn't).

In any case, POSIX makes it undefined what FD_SET does when the socket is larger than FD_SETSIZE, and apparently clearly expects an fd_set to be a bit mask. Regards, Martin From fabianosidler at gmail.com Fri Feb 10 22:03:59 2006 From: fabianosidler at gmail.com (Fabiano Sidler) Date: Fri, 10 Feb 2006 22:03:59 +0100 Subject: [Python-Dev] compiler.pyassem Message-ID: Hi folks! Do I see things as they are, and compiler.pyassem generates bytecode straight without involving any C code, i.e. code from the VM or the compiler? How is this achieved? I took a look at Python/compile.c as mentioned in compiler.pyassem and I'm trying to get into it, but about 6500 lines of C code are too much for me in one file. Could someone here please give me some hints on how one can do what compiler.pyassem does? Greetings, Fips From raymond.hettinger at verizon.net Fri Feb 10 22:05:35 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 10 Feb 2006 16:05:35 -0500 Subject: [Python-Dev] release plan for 2.5 ? References: Message-ID: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> [Guido van Rossum] > PEP 351 - freeze protocol. I'm personally -1; I don't like the idea of > freezing arbitrary mutable data structures. Are there champions who > want to argue this? It has at least one anti-champion. I think it is a horrible idea and would like to see it rejected in a way that brings finality. If needed, I can elaborate in a separate thread. > PEP 315 - do while.
A simple enough syntax proposal, albeit one > introducing a new keyword (which I'm fine with). I kind of like it but > it doesn't strike me as super important -- if we put this off until > Py3k I'd be fine with that too. Opinions? Champions? I helped tweak a few issues with the PEP and got added as a co-author. I didn't push for it because the syntax is a little odd if nothing appears in the while suite:

do:
    val = source.read(1)
    process(val)
while val != lastitem:
    pass

I never found a way to improve this. Dropping the final colon and post-while steps improved the looks but diverged too far from the rest of the language:

do:
    val = source.read(1)
    process(val)
while val != lastitem

So, unless another champion arises, putting this off until Py3k is fine with me. > PEP 323 - copyable iterators. Seems stalled. Alex, do you care? I installed the underlying mechanism in support of itertools.tee() in Py2.4. So, if anyone really wants to make xrange() copyable, it is now a trivial task -- likewise for any other iterator that has a potentially copyable state. I've yet to find a use case for it, so I never pushed for the rest of the PEP to be implemented. There's nothing wrong with the idea, but there doesn't seem to be much interest. > PEP 344 - exception chaining. There are deep problems with this due to > circularities; perhaps we should drop this, or revisit it for Py3k. I wouldn't hold up Py2.5 for this. My original idea for this was somewhat simpler. Essentially, a high-level function would concatenate extra string information onto the result of an exception raised at a lower level. That strategy was applied to an existing problem for type objects and has met with good success. IOW, there is a simpler alternative on the table, but resolution won't take place until we collectively take interest in it again. At this point, it seems to be low on everyone's priority list (including mine).
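The simpler alternative Raymond mentions, concatenating high-level context onto the message of an exception raised at a lower level, can be sketched with a small helper. The decorator below is hypothetical and written in modern syntax for illustration; it is not the mechanism PEP 344 proposes:

```python
def annotate_errors(context):
    # Hypothetical decorator: prepend high-level context to the message
    # of any exception raised at a lower level, then re-raise it.
    def decorator(func):
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                exc.args = ("%s: %s" % (context, exc),) + exc.args[1:]
                raise
        return wrapper
    return decorator

@annotate_errors("while loading config")
def load_config():
    raise ValueError("bad port number")

try:
    load_config()
except ValueError as exc:
    print(exc)   # -> while loading config: bad port number
```

The original traceback is preserved because the same exception instance is re-raised; only its message gains context, which is the lightweight effect described above.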
Raymond From pje at telecommunity.com Fri Feb 10 22:07:50 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 10 Feb 2006 16:07:50 -0500 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: <5.1.1.6.0.20060210154235.02007160@mail.telecommunity.com> At 12:21 PM 2/10/2006 -0800, Guido van Rossum wrote: > > PEP 343: The "with" Statement > >Didn't Michael Hudson have a patch? PEP 343's "Accepted" status was reverted to "Draft" in October, and then changed back to "Accepted". I believe the latter change is an error, since you haven't pronounced on the changes. Have you reviewed the __context__ stuff that was added? In any case Michael's patch was pre-AST branch merge, and no longer reflects the current spec. >PEP 332 - byte vectors. Looks incomplete. Put off until 2.6? Wasn't the plan to just make this a builtin version of array.array for bytes, plus a .decode method and maybe a few other tweaks? We presumably won't be able to .encode() to bytes or get bytes from sockets and files until 3.0, but having the type and being able to write it to files and sockets would be nice. I'm not sure about the b"" syntax, ISTR it was controversial but I don't remember if there was a resolution. >PEP 314 (metadata v1.1): this is marked as completed, but there's a >newer PEP available: PEP 345 (metadata v1.2). That PEP has 2.5 as its >target date. Shouldn't we implement it? (This is a topic that I >haven't followed closely.) There's also the question whether 314 >should be marked final. Andrew or Richard? I'm concerned that both metadata PEPs push to define syntax for things that have undefined semantics. And worse, to define incompatible syntax in some cases. PEP 345, for example, dictates the use of StrictVersion syntax for the required version of Python and the version of external requirements, but Python's own version numbers don't conform to strict version syntax.
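The mismatch Phillip points out is easy to demonstrate. The regular expression below is only an approximation of the strict-version grammar (major.minor, optional patch level, optional a/b pre-release tag); it mirrors the documented rules rather than distutils' actual code:

```python
import re

# Approximation of the "strict" version grammar: N.N[.N][aN|bN].
# Illustrative only; distutils' StrictVersion uses its own regex.
strict_version = re.compile(r'^\d+ \. \d+ (\. \d+)? ([ab]\d+)?$', re.VERBOSE)

for candidate in ("1.0", "2.4.2", "2.5a1", "2.4.2c1"):
    print(candidate, bool(strict_version.match(candidate)))
# "2.4.2c1" fails because the grammar only allows a/b pre-release tags,
# yet Python itself numbers its release candidates with "c".
```

A release-candidate number such as 2.4.2c1 is rejected by the strict grammar, which is exactly the non-conformance being objected to.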
ISTM that the metadata standard needs more work, especially since PyPI doesn't actually support using all of the metadata provided by the implemented version of the standard. There's no way to search for requires/provides, for example (which is one reason why I went with distribution names for dependency resolution in setuptools). Also, the specs don't allow for a Maintainer distinct from the package Author, even though the distutils themselves allow this. IMO, 345 needs to go back to the drawing board, and I'm not really thrilled with the currently-useless "requires/provides" stuff in PEP 314. If we do anything with the package metadata in Python 2.5, I'd like it to be *installing* PKG-INFO files alongside the packages, using a filename of the form "distributionname-version-py2.5.someext". Setuptools supports such files currently under the ".egg-info" extension, but I'd be just as happy with '.pkg-info' if it becomes a Python standard addition to the installation. Having this gives most of the benefits of PEP 262 (database of installed packages), although I wouldn't mind extending the PKG-INFO file format to include some of the PEP 262 additional data. These are probably distutils-sig and/or catalog-sig topics; I just mainly wanted to point out that 314, 345, and 262 need at least some tweaking and possibly rethinking before any push to implementation. From thomas at xs4all.net Fri Feb 10 22:11:31 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 22:11:31 +0100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: <20060210211131.GL10226@xs4all.nl> On Fri, Feb 10, 2006 at 12:21:26PM -0800, Guido van Rossum wrote: > ??? Would anyone want to be even more aggressive (e.g. alpha 1 right > after PyCon???). We could always do three alphas. Well, PyCon might be a nice place to finish any PEP patches.
I know I'll be available to do such work on the sprint days ;) I don't think that means we'll have a working repository with all 2.5 features right after, though. > > PEP 308: Conditional Expressions > Definitely. Don't we have a volunteer doing this now? There is a volunteer, but he's new at this, so he probably needs a bit of time to work through the intricacies of the AST, the compiler and the eval loop. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From jeremy at alum.mit.edu Fri Feb 10 22:14:12 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Feb 2006 16:14:12 -0500 Subject: [Python-Dev] compiler.pyassem In-Reply-To: References: Message-ID: On 2/10/06, Fabiano Sidler wrote: > Am I seeing things right that compiler.pyassem generates bytecode > directly, without involving any C code, i.e. code from the VM or the > compiler? How is this achieved? I took a look at Python/compile.c as > mentioned in compiler.pyassem and I'm trying to get into it, but about > 6500 lines of C code are too much for me in one file. Could someone > here please give me some hints on how one can do what compiler.pyassem > does? I'm not sure what exactly you want to know. The compiler package implements most of a Python bytecode compiler in Python. It re-uses the parser written in C, but otherwise does the entire transformation in Python. The "how is this achieved?" question is hard to answer without saying "read the source." There are about 6000 lines of Python code in the compiler package, but you can largely ignore ast.py and transformer.py if you just want to study the compiler. Perhaps your specific question is: How does the interpreter create new bytecode or function objects from a program instead of compiling from source or importing a module? At some level, bytecode is simply a string representation of a program.
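That view is easy to confirm from Python itself. The snippet below uses the compile() builtin for illustration; the compiler package's pyassem assembles exactly these pieces (raw bytecode, constants table, names) before constructing the code object:

```python
# Inspect the code object that compilation produces: a raw bytecode
# string plus metadata such as constants and the names the code uses.
code = compile("x = 6 * 7", "<example>", "exec")

print(type(code.co_code).__name__)  # the raw bytecode (bytes in Python 3)
print(code.co_consts)               # constants table
print(code.co_names)                # names assigned or referenced
```

Constructing a code object by hand means supplying all of this metadata yourself, which is what the newCodeObject() step mentioned below does.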
The new module takes the bytecode plus a lot of meta-data including the names of variables and a list of constants, and produces a new code object. See the newCodeObject() method. I suspect further discussion on this topic might be better done on python-list, unless you have some discussion that is relevant for Python implementors. Jeremy From tim.peters at gmail.com Fri Feb 10 22:14:38 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 16:14:38 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ECF67C.1020700@scottdial.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> Message-ID: <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> [Scott Dial] > This begs the question then whether the check that is implemented has > any relevance to any platform other than Linux. I am no portability > guru, but I have to think there are other platforms where this patch > will cause problems. For now at least, can we at least do some > preprocessing magic to not use this code with Windows? We _have_ to gut this patch on Windows, because Python code using sockets on Windows no longer works. That can't stand. Indeed, I'm half tempted to revert the checkin right now since Python's test suite fails or hangs on Windows in test after test now. This at least blocks me from doing work I wanted to do (instead I spent the time allocated for that staring at test failures). I suggest skipping the new crud conditionalized on a symbol like Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE The Windows pyconfig.h can #define that, and other platforms can ignore its possible existence. If it applies to some Unix variant too, fine, that variant can also #define it. No idea here what the story is on, e.g., Cygwin or OS2. 
From guido at python.org Fri Feb 10 22:29:30 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 13:29:30 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: <5.1.1.6.0.20060210154235.02007160@mail.telecommunity.com> References: <5.1.1.6.0.20060210154235.02007160@mail.telecommunity.com> Message-ID: On 2/10/06, Phillip J. Eby wrote: I'm not following up to anything that Phillip wrote (yet), but his response reminded me of two more issues: - wsgiref, an implementation of PEP 333 (Web Server Gateway Interface). I think this might make a good addition to the standard library. The web-sig has been discussing additional things that might be proposed for addition but I believe there's no consensus -- in any case we ought to be conservative. - setuplib? Wouldn't it make sense to add this to the 2.5 stdlib? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Feb 10 22:33:28 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 22:33:28 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> Message-ID: <43ED06A8.9000400@v.loewis.de> Tim Peters wrote: > I suggest skipping the new crud conditionalized on a symbol like > > Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE > Hmm...
How about this patch: Index: Modules/socketmodule.c =================================================================== --- Modules/socketmodule.c (Revision 42308) +++ Modules/socketmodule.c (Arbeitskopie) @@ -396,7 +396,14 @@ static PyTypeObject sock_type; /* Can we call select() with this socket without a buffer overrun? */ +#ifdef MS_WINDOWS +/* Everything is selectable on Windows */ +#define IS_SELECTABLE(s) 1 +#else +/* POSIX says selecting descriptors above FD_SETSIZE is undefined + behaviour. */ #define IS_SELECTABLE(s) ((s)->sock_fd < FD_SETSIZE) +#endif static PyObject* select_error(void) Regards, Martin From tim.peters at gmail.com Fri Feb 10 22:35:49 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 16:35:49 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ECFEFE.5060102@v.loewis.de> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <20060210204029.GY5045@xs4all.nl> <43ECFEFE.5060102@v.loewis.de> Message-ID: <1f7befae0602101335se386a43oaeb5716620225f8c@mail.gmail.com> [Martin v. L?wis] > I think the Windows interpretation is actually well-designed: FD_SETSIZE > shouldn't be the number of the largest descriptor, but instead be the > maximum size of the set. It's more that the fdset macros were well designed: correct code using FD_SET() etc is portable across Windows and Linux, and that's so because the macros define an interface rather than an implementation. BTW, note that the first argument to select() is ignored on Windows. > So FD_SETSIZE is 64 on Windows, In Python FD_SETSIZE is 512 on Windows (see the top of selectmodule.c). > but you still can have much larger file descriptor numbers. 
Which is the _source_ of "the problem" on Windows: Windows socket handles aren't file descriptors (if they were, they'd be little integers ;-)). > ... > In any case, POSIX makes it undefined what FD_SET does when the > socket is larger than FD_SETSIZE, and apparently clearly expects > an fd_set to be a bit mask. Yup -- although the people who designed the fdset macros to begin with didn't appear to have this assumption. From scott+python-dev at scottdial.com Fri Feb 10 22:41:41 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Fri, 10 Feb 2006 16:41:41 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ED06A8.9000400@v.loewis.de> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> Message-ID: <43ED0895.1040502@scottdial.com> Martin v. L?wis wrote: > Tim Peters wrote: >> I suggest skipping the new crud conditionalized on a symbol like >> >> Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE >> > > Hmm... How about this patch: > > Index: Modules/socketmodule.c > =================================================================== > --- Modules/socketmodule.c (Revision 42308) > +++ Modules/socketmodule.c (Arbeitskopie) > @@ -396,7 +396,14 @@ > static PyTypeObject sock_type; > > /* Can we call select() with this socket without a buffer overrun? */ > +#ifdef MS_WINDOWS > +/* Everything is selectable on Windows */ > +#define IS_SELECTABLE(s) 1 > +#else > +/* POSIX says selecting descriptors above FD_SETSIZE is undefined > + behaviour. 
*/ > #define IS_SELECTABLE(s) ((s)->sock_fd < FD_SETSIZE) > +#endif > > static PyObject* > select_error(void) > > Regards, > Martin That is the exact patch I applied, but you also need to patch _ssl.c --- C:/python-trunk/Modules/_ssl.c (revision 42305) +++ C:/python-trunk/Modules/_ssl.c (working copy) @@ -376,9 +376,11 @@ if (s->sock_fd < 0) return SOCKET_HAS_BEEN_CLOSED; +#ifndef MS_WINDOWS /* Guard against socket too large for select*/ if (s->sock_fd >= FD_SETSIZE) return SOCKET_INVALID; +#endif /* Construct the arguments to select */ tv.tv_sec = (int)s->sock_timeout; But then that leaves whether to go with the Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE symbol instead of MS_WINDOWS. -- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From scott+python-dev at scottdial.com Fri Feb 10 22:46:03 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Fri, 10 Feb 2006 16:46:03 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602101335se386a43oaeb5716620225f8c@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <20060210204029.GY5045@xs4all.nl> <43ECFEFE.5060102@v.loewis.de> <1f7befae0602101335se386a43oaeb5716620225f8c@mail.gmail.com> Message-ID: <43ED099B.1000209@scottdial.com> Tim Peters wrote: > [Martin v. L?wis] >> So FD_SETSIZE is 64 on Windows, > > In Python FD_SETSIZE is 512 on Windows (see the top of selectmodule.c). > Although I agree, in terms of the socketmodule, there was no such define overriding the default FD_SETSIZE, so you are both right. 
-- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From tim.peters at gmail.com Fri Feb 10 22:49:09 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 16:49:09 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ED06A8.9000400@v.loewis.de> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> Message-ID: <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> [Tim] >> I suggest skipping the new crud conditionalized on a symbol like >> >> Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE [Martin] > Hmm... How about this patch: I don't know. Of course it misses similar new tests added to _ssl.c (see the msg that started this thread), so it spreads beyond just this. Does it do the right thing for Windows variants like Cygwin, and OS/2? Don't know. If the initial #ifdef MS_WINDOWS here gets duplicated in multiple modules (and looks like it must -- or IS_SELECTABLE should be given a _Py name and defined once in pyport.h instead), and gets hairier over time, then I'd rather have a name like the one I suggested (to describe the _intent_ rather than paste together a growing collection of "which platform do I think I'm being compiled on?" names). > Index: Modules/socketmodule.c > =================================================================== > --- Modules/socketmodule.c (Revision 42308) > +++ Modules/socketmodule.c (Arbeitskopie) > @@ -396,7 +396,14 @@ > static PyTypeObject sock_type; > > /* Can we call select() with this socket without a buffer overrun? 
*/ > +#ifdef MS_WINDOWS > +/* Everything is selectable on Windows */ > +#define IS_SELECTABLE(s) 1 > +#else > +/* POSIX says selecting descriptors above FD_SETSIZE is undefined > + behaviour. */ > #define IS_SELECTABLE(s) ((s)->sock_fd < FD_SETSIZE) > +#endif > > static PyObject* > select_error(void) From tim.peters at gmail.com Fri Feb 10 22:55:18 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 16:55:18 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ED099B.1000209@scottdial.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <20060210204029.GY5045@xs4all.nl> <43ECFEFE.5060102@v.loewis.de> <1f7befae0602101335se386a43oaeb5716620225f8c@mail.gmail.com> <43ED099B.1000209@scottdial.com> Message-ID: <1f7befae0602101355h44ed680apf1175f22d2def4d7@mail.gmail.com> [Martin v. Löwis] >>> So FD_SETSIZE is 64 on Windows, [Tim Peters] >> In Python FD_SETSIZE is 512 on Windows (see the top of selectmodule.c). [Scott Dial] > Although I agree, in terms of the socketmodule, there was no such define > overriding the default FD_SETSIZE, so you are both right. ? Sorry, don't know what you're talking about here. Python's selectmodule.c #defines FD_SETSIZE before it includes winsock.h on Windows, so Microsoft's default is irrelevant to Python. The reason selectmodule.c uses "!defined(FD_SETSIZE)" in its #if defined(MS_WINDOWS) && !defined(FD_SETSIZE) #define FD_SETSIZE 512 #endif is explained in the comment right before that code. From barry at python.org Fri Feb 10 23:00:23 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 10 Feb 2006 17:00:23 -0500 Subject: [Python-Dev] release plan for 2.5 ?
In-Reply-To: References: Message-ID: <756C8711-C177-4313-8375-F6279F01FD37@python.org> On Feb 10, 2006, at 3:21 PM, Guido van Rossum wrote: > > PEP 351 - freeze protocol. I'm personally -1; I don't like the idea of > freezing arbitrary mutable data structures. Are there champions who > want to argue this? I have no interest in it any longer, and wouldn't shed a tear if it were rejected. One other un-PEP'd thing. I'd like to put email 3.1 in Python 2.5 with the new module naming scheme. The old names will still work, and all the unit tests pass. Do we need a PEP for that? -Barry From mal at egenix.com Fri Feb 10 23:06:24 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 10 Feb 2006 23:06:24 +0100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: <43ED0E60.6070004@egenix.com> Guido van Rossum wrote: >> PEP 328: Absolute/Relative Imports > > Yes, please. +0 for adding relative imports. -1 for raising errors for in-package relative imports using the current notation in Python 2.6. See: http://mail.python.org/pipermail/python-dev/2004-September/048695.html for a previous discussion. The PEP still doesn't have any mention of the above discussion or later follow-ups. The main argument is that the strategy to make absolute imports mandatory and offer relative imports as work-around breaks the possibility to produce packages that work in e.g. Python 2.4 and 2.6, simply because Python 2.4 doesn't support the needed relative import syntax. The only strategy left would be to use absolute imports throughout, which isn't all that bad, except when it comes to relocating a package or moving a set of misc. modules into a package - which is not all that uncommon in larger projects, e.g. 
to group third-party top-level modules into a package to prevent cluttering up the top-level namespace or to simply make a clear distinction in your code that you are relying on a third-party module, e.g from thirdparty import tool I don't mind having to deal with a warning for these, but don't want to see this raise an error before Py3k. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 10 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From thomas at xs4all.net Fri Feb 10 23:38:42 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 23:38:42 +0100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: <43ED0E60.6070004@egenix.com> References: <43ED0E60.6070004@egenix.com> Message-ID: <20060210223842.GN10226@xs4all.nl> On Fri, Feb 10, 2006 at 11:06:24PM +0100, M.-A. Lemburg wrote: > Guido van Rossum wrote: > >> PEP 328: Absolute/Relative Imports > > > > Yes, please. > +0 for adding relative imports. -1 for raising errors for > in-package relative imports using the current notation > in Python 2.6. +1/-1 for me. Being able to explicitly demand relative imports is good, breaking things soon bad. I'll happily shoehorn this in at the sprints after PyCon ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido at python.org Fri Feb 10 23:45:54 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 14:45:54 -0800 Subject: [Python-Dev] release plan for 2.5 ? 
In-Reply-To: <20060210223842.GN10226@xs4all.nl> References: <43ED0E60.6070004@egenix.com> <20060210223842.GN10226@xs4all.nl> Message-ID: On 2/10/06, Thomas Wouters wrote: > On Fri, Feb 10, 2006 at 11:06:24PM +0100, M.-A. Lemburg wrote: > > Guido van Rossum wrote: > > >> PEP 328: Absolute/Relative Imports > > > > > > Yes, please. > > > +0 for adding relative imports. -1 for raising errors for > > in-package relative imports using the current notation > > in Python 2.6. > > +1/-1 for me. Being able to explicitly demand relative imports is good, > breaking things soon bad. I'll happily shoehorn this in at the sprints after > PyCon ;) The PEP has the following timeline (my interpretation):

2.4: implement new behavior with from __future__ import absolute_import
2.5: deprecate old-style relative import unless future statement present
2.6: disable old-style relative import, future statement no longer necessary

Since it wasn't implemented in 2.4, I think all these should be bumped by one release. Aahz, since you own the PEP, can you do that (and make any other updates that might result)? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Fri Feb 10 23:45:54 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 10 Feb 2006 17:45:54 -0500 Subject: [Python-Dev] release plan for 2.5 ? References: <756C8711-C177-4313-8375-F6279F01FD37@python.org> Message-ID: <000801c62e93$c0de4460$b83efea9@RaymondLaptop1> [Barry Warsaw] > I'd like to put email 3.1 in Python 2.5 > with the new module naming scheme. The old names will still work, > and all the unit tests pass. Do we need a PEP for that? +1 From guido at python.org Fri Feb 10 23:47:01 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 14:47:01 -0800 Subject: [Python-Dev] release plan for 2.5 ?
In-Reply-To: <000801c62e93$c0de4460$b83efea9@RaymondLaptop1> References: <756C8711-C177-4313-8375-F6279F01FD37@python.org> <000801c62e93$c0de4460$b83efea9@RaymondLaptop1> Message-ID: On 2/10/06, Raymond Hettinger wrote: > [Barry Warsaw] > I'd like to put email 3.1 in Python 2.5 > > with the new module naming scheme. The old names will still work, > > and all the unit tests pass. Do we need a PEP for that? > > +1 I don't know if Raymond meant "we need a PEP" or "go ahead with the feature" but my own feeling is that this doesn't need a PEP and Barry can Just Do It. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Feb 10 23:49:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2006 23:49:53 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> Message-ID: <43ED1891.5070907@v.loewis.de> Tim Peters wrote: > I don't know. Of course it misses similar new tests added to _ssl.c > (see the msg that started this thread), so it spreads beyond just > this. Does it do the right thing for Windows variants like Cygwin, > and OS/2? Don't know. I see. How does Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE help here? Does defining it in PC/pyconfig.h do the right thing? I guess I'm primarily opposed to the visual ugliness of the define. Why does it spell out "can be", but abbreviates "greater than or equal to"? What about Py_CHECK_FD_SETSIZE?
Regards, Martin From aleaxit at gmail.com Fri Feb 10 23:54:25 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 10 Feb 2006 14:54:25 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: On 2/10/06, Guido van Rossum wrote: ... > Next, the schedule. Neal's draft of the schedule has us releasing 2.5 > in October. That feels late -- nearly two years after 2.4 (which was > released on Nov 30, 2004). Do people think it's reasonable to strive > for a more aggressive (by a month) schedule, like this: October would seem to me to be just about right. I don't see that one month either way should make any big difference, though. > ??? Would anyone want to be even more aggressive (e.g. alpha 1 right > after PyCon???). We could always do three alphas. If I could have a definitive frozen list of features by the first week of April at the latest, that could make it (as a "2.5 preview") into the 2nd edition of "Python in a Nutshell". But since alphas are not feature-frozen, it wouldn't make much of a difference to me, I think. > Other PEPs I'd like comment on: > > PEP 357 (__index__): the patch isn't on SF yet, but otherwise I'm all > for this, and I'd like to accept it ASAP to get it in 2.5. It doesn't > look like it'll cause any problems. It does look great, and by whatever name I support it most heartily. Do, however, notice that it's "yet another specialpurpose adaptation protocol" and that such specific restricted solutions to the general problem, with all of their issues, will just keep piling up forever (and need legacy support ditto) until and unless your temperature wrt 246 (or any variation thereof) should change. > PEP 355 (path module): I still haven't reviewed this, because I'm -0 > on adding what appears to me duplicate functionality. But if there's a I feel definitely -0 towards it too. > PEP 315 - do while. A simple enough syntax proposal, albeit one > introducing a new keyword (which I'm fine with). 
I kind of like it but > it doesn't strike me as super important -- if we put this off until > Py3k I'd be fine with that too. Opinions? Champions? Another -0 from me. I suggest we shelve it for now and revisit in 3k (maybe PEPs in that state, "not in any 2.* but revisit for 3.0", need a special status value). > PEP 246 - adaptation. I'm still as lukewarm as ever; it needs > interfaces, promises to cause a paradigm shift, and the global map > worries me. Doesn't _need_ interfaces as a concept -- any unique markers as "protocol names" would do, even strings, although obviously the "stronger" the markers the better (classes/types for example would be just perfect). It was written on the assumption of interfaces just because they were being proposed just before it. The key "paradigm shift" is to offer a way to unify what's already being widely done, in haphazard and dispersed manners. And I'll be quite happy to rewrite it in terms of a more nuanced hierarchy of maps (e.g. builtin / per-module / lexically nested, or whatever) if that's what it takes to warm you to it -- I just think it would be over-engineering it, since in practice the global-on-all-modules map would cover by far most usage (both for "blessed" protocols that come with Python, and for the use of "third party" adapting framework A to consume stuff that framework B produces, global is the natural "residence"); other uses are far less important. > PEP 323 - copyable iterators. Seems stalled. Alex, do you care? Sure, I'd like to make this happen, particularly since Raymond appears to have already done the hard part. What would you like to see happening to bless it for 2.5? > PEP 332 - byte vectors. Looks incomplete. Put off until 2.6? Ditto -- I'd like at least SOME of it to be in 2.5. What needs to happen for that? Alex From thomas at xs4all.net Fri Feb 10 23:55:36 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 10 Feb 2006 23:55:36 +0100 Subject: [Python-Dev] release plan for 2.5 ?
In-Reply-To: References: <43ED0E60.6070004@egenix.com> <20060210223842.GN10226@xs4all.nl> Message-ID: <20060210225536.GO10226@xs4all.nl> On Fri, Feb 10, 2006 at 02:45:54PM -0800, Guido van Rossum wrote: > The PEP has the following timeline (my interpretation): > > 2.4: implement new behavior with from __future__ import absolute_import > 2.5: deprecate old-style relative import unless future statement present > 2.6: disable old-style relative import, future statement no longer necessary > Since it wasn't implemented in 2.4, I think all these should be bumped > by one release. Aahz, since you own the PEP, can you do that (and make > any other updates that might result)? Bumping is fine (of course), but I'd like a short discussion on the actual disabling before it happens (rather than the disabling happening without anyone noticing until beta2.) There seem to be a lot of users still using 2.3, at the moment, in spite of its age. Hopefully, by the time 2.7 comes out, everyone will have switched to 2.5, but if not, it could still be a major annoyance to conscientious module-writers, like MAL. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From brett at python.org Sat Feb 11 00:06:45 2006 From: brett at python.org (Brett Cannon) Date: Fri, 10 Feb 2006 15:06:45 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: On 2/10/06, Guido van Rossum wrote: > On 2/7/06, Neal Norwitz wrote: > > On 2/7/06, Jeremy Hylton wrote: > > > It looks like we need a Python 2.5 Release Schedule PEP. > > > > Very draft: http://www.python.org/peps/pep-0356.html > > > > Needs lots of work and release managers. Anthony, Martin, Fred, Sean > > are all mentioned with TBDs and question marks. > > Before he went off to a boondoggle^Woff-site at a Mexican resort, Neal > made me promise that I'd look at this and try to get the 2.5 release > plan going for real. > > First things first: we need a release manager. 
Anthony, do you want to > do the honors again, or are you ready for retirement? > > Next, the schedule. Neal's draft of the schedule has us releasing 2.5 > in October. That feels late -- nearly two years after 2.4 (which was > released on Nov 30, 2004). Do people think it's reasonable to strive > for a more aggressive (by a month) schedule, like this: > > alpha 1: May 2006 > alpha 2: June 2006 > beta 1: July 2006 > beta 2: August 2006 > rc 1: September 2006 > final: September 2006 > > ??? Would anyone want to be even more aggressive (e.g. alpha 1 right > after PyCon???). We could always do three alphas. > I think that schedule is fine, but going alpha after PyCon is too fast with the number of PEPs that need implementing. [SNIP] > > PEP 352: Required Superclass for Exceptions > > I believe this is pretty much non-controversial; it's a much weaker > version of PEP 348 which was rightfully rejected for being too > radical. I've tweaked some text in this PEP and approved it. Now we > need to make it happen. It might be quite a tricky thing, since > Exception is currently implemented in C as a classic class. If Brett > wants to sprint on this at PyCon I'm there to help (Mon/Tue only). > Fortunately we have MWH's patch 1104669 as a starting point. > I might sprint on it. It's either this or I will work on the AST stuff (the PyObject branch is still not finished, and thus it has not been decided whether that solution or the current approach will be the final way of implementing the compiler; I would like to see this settled). Either way I take responsibility to make sure the PEP gets implemented so you can take that question off of the schedule PEP. [SNIP] > PEP 351 - freeze protocol. I'm personally -1; I don't like the idea of > freezing arbitrary mutable data structures. Are there champions who > want to argue this? > If Barry doesn't even care anymore I say kill it. [SNIP] > PEP 315 - do while.
A simple enough syntax proposal, albeit one > introducing a new keyword (which I'm fine with). I kind of like it but > it doesn't strike me as super important -- if we put this off until > Py3k I'd be fine with that too. Opinions? Champions? > Eh, seems okay but I am not jumping up and down for it. Waiting until Python 3 is fine with me if a discussion is warranted (don't really remember it coming up before). [SNIP] > PEP 332 - byte vectors. Looks incomplete. Put off until 2.6? > I say put off. This could be discussed at PyCon since this might be an important type to get right. [SNIP] > PEP 344 - exception chaining. There are deep problems with this due to > circularities; perhaps we should drop this, or revisit it for Py3k. > I say revisit issues later. Raymond says he has an idea for chaining just the messages which could be enough help for developers. But either way I don't think this has been hashed out enough to go in as-is. I suspect a simpler solution will work, such as ditching the traceback and only keeping either the text that would have been printed or just the exception instance (and thus also its message). -Brett From barry at python.org Sat Feb 11 00:26:51 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 10 Feb 2006 18:26:51 -0500 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: <756C8711-C177-4313-8375-F6279F01FD37@python.org> <000801c62e93$c0de4460$b83efea9@RaymondLaptop1> Message-ID: <62C86CE4-EF4F-4AEE-81EC-6C7557F38693@python.org> On Feb 10, 2006, at 5:47 PM, Guido van Rossum wrote: > On 2/10/06, Raymond Hettinger wrote: >> [Barry Warsaw] I'd like to put email 3.1 in Python 2.5 >>> with the new module naming scheme. The old names will still work, >>> and all the unit tests pass. Do we need a PEP for that? >> >> +1 > > I don't know if Raymond meant "we need a PEP" or "go ahead with the > feature" but my own feeling is that this doesn't need a PEP and Barry > can Just Do It. I was going to ask the same thing. :) Cool.
So far there have been no objections on the email-sig, so I'll try to move the sandbox to the trunk this weekend. That should give us plenty of time to shake out any nastiness. -Barry From raymond.hettinger at verizon.net Sat Feb 11 00:32:06 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 10 Feb 2006 18:32:06 -0500 Subject: [Python-Dev] release plan for 2.5 ? References: <756C8711-C177-4313-8375-F6279F01FD37@python.org> <000801c62e93$c0de4460$b83efea9@RaymondLaptop1> Message-ID: <001901c62e9a$351be660$b83efea9@RaymondLaptop1> Just do it. ----- Original Message ----- From: "Guido van Rossum" To: "Raymond Hettinger" Cc: "Barry Warsaw" ; Sent: Friday, February 10, 2006 5:47 PM Subject: Re: [Python-Dev] release plan for 2.5 ? On 2/10/06, Raymond Hettinger wrote: > [Barry Warsaw] I'd like to put email 3.1 in Python 2.5 > > with the new module naming scheme. The old names will still work, > > and all the unit tests pass. Do we need a PEP for that? > > +1 I don't know if Raymond meant "we need a PEP" or "go ahead with the feature" but my own feeling is that this doesn't need a PEP and Barry can Just Do It. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Sat Feb 11 00:46:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 11 Feb 2006 12:46:20 +1300 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43ECE71D.2050402@v.loewis.de> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> Message-ID: <43ED25CC.3080607@canterbury.ac.nz> Martin v. Löwis wrote: > FWIW, Annex D also defines these features as deprecated: > - the use of "static" for objects in namespace scope (AFAICT > including C file-level static variables and functions) > - C library headers (i.e. <stdio.h>) Things like this are really starting to get on my goat. It used to be that C++ was very nearly a superset of C, so it was easy to write code that would compile as either.
But C++ seems to be evolving into a different language altogether. (And an obnoxiously authoritarian one at that. If I want to write some C++ code that uses stdio because I happen to like it better, why the heck shouldn't I be allowed to? It's MY program, not the C++ standards board's!) Sorry, I just had to say that. Greg From greg.ewing at canterbury.ac.nz Sat Feb 11 01:14:23 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 11 Feb 2006 13:14:23 +1300 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602101335se386a43oaeb5716620225f8c@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <20060210204029.GY5045@xs4all.nl> <43ECFEFE.5060102@v.loewis.de> <1f7befae0602101335se386a43oaeb5716620225f8c@mail.gmail.com> Message-ID: <43ED2C5F.2060504@canterbury.ac.nz> Tim Peters wrote: > [Martin v. L?wis] > > In any case, POSIX makes it undefined what FD_SET does when the > > socket is larger than FD_SETSIZE, and apparently clearly expects > > an fd_set to be a bit mask. > > Yup -- although the people who designed the fdset macros to begin with > didn't appear to have this assumption. I don't agree. I rather think the entire purpose of the fdset interface was simply to allow more than 32 items in the set (which the original select() in BSD was limited to). The whole thing still seems totally bitmask-oriented, down to the confusion between set size and file descriptor number. The MacOSX man page for select() (which seems fairly closely BSD-based) even explicitly says "The descriptor sets are stored as bit fields in arrays of integers." 
Greg From tim.peters at gmail.com Sat Feb 11 02:48:30 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 10 Feb 2006 20:48:30 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43ED1891.5070907@v.loewis.de> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de> Message-ID: <1f7befae0602101748l163ee73dka68d18628261ccfa@mail.gmail.com> [Martin v. L?wis] > I see. How does Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE help here? By naming a logical condition as opposed to a list of platform-specific symbols that aren't documented anywhere. For example, I have no idea exactly which compiler+OS combinations define MS_WINDOWS, so "#ifdef MS_WINDOWS" is always something of a mystery. I don't want to see mystery-symbols inside modules -- to the extent that they must be used, I want to hide them in .h files clearly dedicated to wrestling with portability headaches (like pyconfig.h and pyport.h). > Does defining it in PC/pyconfig.h do the right thing? That much would stop the test failures _I_ see, which is what I need to get unstuck. If POSIX systems simply ignore it, it would do the right thing for them too. Documentation in pyport.h would serve to guide others (in the "Config #defines referenced here:" comments near the top of that file). I don't know what other systems need, so assuming "we have to do something" _at all_ here, the best I can do is provide documented macros and config symbols to deal with it. I think the relationship between SIGNED_RIGHT_SHIFT_ZERO_FILLS and pyport.h's Py_ARITHMETIC_RIGHT_SHIFT macro is a good analogy here. 
Almost everyone ignores SIGNED_RIGHT_SHIFT_ZERO_FILLS, and that's fine, because almost all C compilers generate code to do sign-extending right shifts. If someone has a box that doesn't, fine, it's up to them to get SIGNED_RIGHT_SHIFT_ZERO_FILLS #define'd in their pyconfig.h, and everything else "just works" for them then. All other platforms can remain blissfully ignorant. > I guess I'm primarily opposed to the visual ugliness of the define. I don't much care how it's spelled. > Why does it spell out "can be", but abbreviates > "greater than or equal to"? Don't care. I don't know of a common abbrevation for "can be", but GE same-as >= is in my Fortran-trained blood :-) > What about Py_CHECK_FD_SETSIZE? That's fine, except I think it would be pragmatically better to make it Py_DONT_CHECK_FD_SETSIZE, since most platforms want to check it. The platforms that don't want this check (like Windows) are the oddballs, so it's better to default to checking, making the oddballs explicitly do something to stop such checking. It's no problem to add a #define to PC/pyconfig.h, since that particular config file is 100% hand-written (and always will be). From bokr at oz.net Sat Feb 11 03:02:10 2006 From: bokr at oz.net (Bengt Richter) Date: Sat, 11 Feb 2006 02:02:10 GMT Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification References: <20060210165339.GK10226@xs4all.nl> Message-ID: <43ed41b5.474421562@news.gmane.org> On Fri, 10 Feb 2006 17:53:39 +0100, Thomas Wouters wrote: >On Fri, Feb 10, 2006 at 11:30:30AM -0500, Jeremy Hylton wrote: >> On 2/10/06, Guido van Rossum wrote: >> > OMG. Are we now adding 'const' modifiers to random places? I thought >> > "const propagation hell" was a place we were happily avoiding by not >> > falling for that meme. What changed? >> >> I added some const to several API functions that take char* but >> typically called by passing string literals. 
In C++, a string literal >> is a const char* so you need to add a const_cast<> to every call site, >> which is incredibly cumbersome. After some discussion on python-dev, >> I made changes to a small set of API functions and chased the >> const-ness the rest of the way, as you would expect. There was >> nothing random about the places const was added. >> >> I admit that I'm also puzzled by Jack's specific question. I don't >> understand why an array passed to PyArg_ParseTupleAndKeywords() would >> need to be declared as const. I observed the problem in my initial >> changes but didn't think very hard about the cause of the problem. >> Perhaps someone with better C/C++ standards chops can explain. > >Well, it's counter-intuitive, but a direct result of how pointer equivalence >is defined in C. I'm rusty in this part, so I will get some terminology >wrong, but IIRC, a variable A is of an equivalent type of variable B if they >hold the same type of data. So, a 'const char *' is equivalent to a 'char *' >because they both hold the memory of a 'char'. But a 'const char**' (or >'const *char[]') is not equivalent to a 'char **' (or 'char *[]') because >the first holds the address of a 'const char *', and the second the address >of a 'char *'. A 'char * const *' is equivalent to a 'char **' though. > >As I said, I got some of the terminology wrong, but the end result is >exactly that: a 'const char **' is not equivalent to a 'char **', even >though a 'const char *' is equivalent to a 'char *'. Equivalence, in this >case, means 'can be automatically downcasted'. Peter v/d Linden explains >this quite well in "Expert C Programming" (aka 'Deep C Secrets'), but >unfortunately I'm working from home and I left my copy at a coworkers' desk. > Would it make sense to use a typedef for readability's sake? E.g., typedef const char * p_text_literal; and then use p_text_literal, const p_text_literal * in the signature, for read-only access to the data? (hope I got that right). 
(also testing whether I have been redirected to /dev/null ;-) Regards, Bengt Richter From scott+python-dev at scottdial.com Sat Feb 11 03:31:52 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Fri, 10 Feb 2006 21:31:52 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602101355h44ed680apf1175f22d2def4d7@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <20060210204029.GY5045@xs4all.nl> <43ECFEFE.5060102@v.loewis.de> <1f7befae0602101335se386a43oaeb5716620225f8c@mail.gmail.com> <43ED099B.1000209@scottdial.com> <1f7befae0602101355h44ed680apf1175f22d2def4d7@mail.gmail.com> Message-ID: <43ED4C98.9070301@scottdial.com> Tim Peters wrote: > ? Sorrry, don't know what you're talking about here. Python's > selectmodule.c #defines FD_SETSIZE before it includes winsock.h on > Windows, so Microsoft's default is irrelevant to Python. The reason > selectmodule.c uses "!defined(FD_SETSIZE)" in its Not that this is really that important, but if we are talking about as the code stands right now, IS_SELECTABLE uses FD_SETSIZE with no such define ever appearing. That is what I meant, and I am pretty sure that is where Martin came up with saying it was 64. But like I say.. it's not that important. Sorry for the noise. 
-- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From scott+python-dev at scottdial.com Sat Feb 11 03:42:46 2006 From: scott+python-dev at scottdial.com (Scott Dial) Date: Fri, 10 Feb 2006 21:42:46 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> Message-ID: <43ED4F26.2000906@scottdial.com> Tim Peters wrote: > Does it do the right thing for Windows variants like Cygwin, and OS/2? I can at least say that the Cygwin implements a full POSIX facade in front of Windows sockets, so it would be important that the code in question is used to protect it as well. Also, MS_WINDOWS is not defined for a Cygwin compile, so it is fine to be using that. But I realize there is a whole 'nother discussion about that. -- Scott Dial scott at scottdial.com dialsa at rose-hulman.edu From nas at arctrix.com Sat Feb 11 06:08:09 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Sat, 11 Feb 2006 05:08:09 +0000 (UTC) Subject: [Python-Dev] release plan for 2.5 ? References: Message-ID: Guido van Rossum wrote: > PEP 349 - str() may return unicode. Where is this? Does that mean you didn't find and read the PEP or was it written so badly that it answered none of your questions? The PEP is on python.org with all the rest. I set the status to "Deferred" because it seemed that no one was interested in the change. > I'm not at all sure the PEP is ready. it would probably be a lot > of work to make this work everywhere in the C code, not to mention > the stdlib .py code. 
Perhaps this should be targeted for 2.6 > instead? The consequences seem potentially huge. The backwards compatibility problems *seem* to be relatively minor. I only found one instance of breakage in the standard library. Note that my patch does not change PyObject_Str(); that would break massive amounts of code. Instead, I introduce a new function: PyString_New(). I'm not crazy about the name but I couldn't think of anything better. Neil From guido at python.org Sat Feb 11 06:25:21 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 21:25:21 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: On 2/10/06, Neil Schemenauer wrote: > Guido van Rossum wrote: > > PEP 349 - str() may return unicode. Where is this? > > Does that mean you didn't find and read the PEP or was it written so > badly that it answered none of your questions? The PEP is on > python.org with all the rest. I set the status to "Deferred" > because it seemed that no one was interested in the change. Sorry -- it was an awkward way to ask "what's the status"? You've answered that. > > I'm not at all sure the PEP is ready. it would probably be a lot > > of work to make this work everywhere in the C code, not to mention > > the stdlib .py code. Perhaps this should be targeted for 2.6 > > instead? The consequences seem potentially huge. > > The backwards compatibility problems *seem* to be relatively minor. > I only found one instance of breakage in the standard library. Note > that my patch does not change PyObject_Str(); that would break > massive amounts of code. Instead, I introduce a new function: > PyString_New(). I'm not crazy about the name but I couldn't think > of anything better. So let's think about this more post 2.5. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From bokr at oz.net Sat Feb 11 06:30:00 2006 From: bokr at oz.net (Bengt Richter) Date: Sat, 11 Feb 2006 05:30:00 GMT Subject: [Python-Dev] release plan for 2.5 ? References: Message-ID: <43ed7605.487813468@news.gmane.org> On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer wrote: >Guido van Rossum wrote: >> PEP 349 - str() may return unicode. Where is this? > >Does that mean you didn't find and read the PEP or was it written so >badly that it answered none of your questions? The PEP is on >python.org with all the rest. I set the status to "Deferred" >because it seemed that no one was interested in the change. > >> I'm not at all sure the PEP is ready. it would probably be a lot >> of work to make this work everywhere in the C code, not to mention >> the stdlib .py code. Perhaps this should be targeted for 2.6 >> instead? The consequences seem potentially huge. > >The backwards compatibility problems *seem* to be relatively minor. >I only found one instance of breakage in the standard library. Note >that my patch does not change PyObject_Str(); that would break >massive amounts of code. Instead, I introduce a new function: >PyString_New(). I'm not crazy about the name but I couldn't think >of anything better. > Should this not be coordinated with PEP 332? Regards, Bengt Richter From guido at python.org Sat Feb 11 06:35:26 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 10 Feb 2006 21:35:26 -0800 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: <43ed7605.487813468@news.gmane.org> References: <43ed7605.487813468@news.gmane.org> Message-ID: > On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer > >The backwards compatibility problems *seem* to be relatively minor. > >I only found one instance of breakage in the standard library. Note > >that my patch does not change PyObject_Str(); that would break > >massive amounts of code. 
Instead, I introduce a new function: > >PyString_New(). I'm not crazy about the name but I couldn't think > >of anything better. On 2/10/06, Bengt Richter wrote: > Should this not be coordinated with PEP 332? Probably.. But that PEP is rather incomplete. Wanna work on fixing that? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bokr at oz.net Sat Feb 11 09:20:27 2006 From: bokr at oz.net (Bengt Richter) Date: Sat, 11 Feb 2006 08:20:27 GMT Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] References: <43ed7605.487813468@news.gmane.org> Message-ID: <43ed8aaf.493103945@news.gmane.org> On Fri, 10 Feb 2006 21:35:26 -0800, Guido van Rossum wrote: >> On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer > >The backwards compatibility problems *seem* to be relatively minor. >> >I only found one instance of breakage in the standard library. Note >> >that my patch does not change PyObject_Str(); that would break >> >massive amounts of code. Instead, I introduce a new function: >> >PyString_New(). I'm not crazy about the name but I couldn't think >> >of anything better. > >On 2/10/06, Bengt Richter wrote: >> Should this not be coordinated with PEP 332? > >Probably.. But that PEP is rather incomplete. Wanna work on fixing that? > I'd be glad to add my thoughts, but first of course it's Skip's PEP, and Martin casts a long shadow when it comes to character coding issues that I suspect will have to be considered. (E.g., if there is a b'...' literal for bytes, the actual characters of the source code itself that the literal is being expressed in could be ascii or latin-1 or utf-8 or utf16le a la Microsoft, etc. UIAM, I read that the source is at least temporarily normalized to Unicode, and then re-encoded (except now for string literals?) per coding cookie or other encoding inference. (I may be out of date, gotta catch up). 
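[Editorial note: for concreteness, the round trip Bengt is describing -- text becoming bytes only through an explicit encoding, with a b'...' literal to spell the result -- can be sketched with the semantics Python eventually adopted. This is an illustration with hindsight, not anything PEP 332 or PEP 349 specified at the time.]

```python
# Hedged sketch: the text/bytes split as later settled, for illustration only.
text = u'abc\xe9'                      # unicode text with a non-ASCII character
data = text.encode('latin-1')          # bytes exist only via an explicit codec
assert data == b'abc\xe9'              # the b'...' literal form under discussion
assert data.decode('latin-1') == text  # the same codec round-trips the text
```

The open questions in the thread -- which codec is implied, and whether the byte type is mutable -- are exactly the parameters this sketch has to spell out explicitly.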
If one way or the other a string literal is in Unicode, then presumably so is a byte string b'...' literal -- i.e. internally u"b'...'" just before being turned into bytes. Should that then be an internal straight u"b'...'".encode('byte') with default ascii + escapes for non-ascii and non-printables, to define the full 8 bits without encoding error? Should unicode be encodable into byte via a specific encoding? E.g., u'abc'.encode('byte','latin1'), to distinguish producing a mutable byte string vs an immutable str type as with u'abc'.encode('latin1'). (but how does this play with str being able to produce unicode? And when do these changes happen?) I guess I'm getting ahead of myself ;-) So I would first ask Skip what he'd like to do, and Martin for some hints on reading, to avoid going down paths he already knows lead to brick walls ;-) And I need to think more about PEP 349. I would propose to do the reading they suggest, and edit up a new version of pep-0332.txt that anyone could then improve further. I don't know about an early deadline. I don't want to over-commit, as time and energies vary. OTOH, as you've noticed, I could be spending my time more effectively ;-) I changed the thread title, and will wait for some signs from you, Skip, Martin, Neil, and I don't know who else might be interested... 
Regards, Bengt Richter From martin at v.loewis.de Sat Feb 11 09:30:52 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 11 Feb 2006 09:30:52 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43ED25CC.3080607@canterbury.ac.nz> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> <43ED25CC.3080607@canterbury.ac.nz> Message-ID: <43EDA0BC.3080809@v.loewis.de> Greg Ewing wrote: >>FWIW, Annex D also defines these features as deprecated: >>- the use of "static" for objects in namespace scope (AFAICT >> including C file-level static variables and functions) >>- C library headers (i.e. <stdio.h>) > > Things like this are really starting to get on my goat. > It used to be that C++ was very nearly a superset of C, > so it was easy to write code that would compile as either. > But C++ seems to be evolving into a different language > altogether. Not at all. People appear to completely fail to grasp the notion of "deprecated" in this context. It just means "it may go away in a future version", implying that the rest of it may *not* go away in a future version. That future version might get published in 2270, when everybody has switched to C++, and compatibility with C is no longer required. So the compiler is wrong for warning about it (or the user is wrong for asking to get warned), and you are wrong for getting upset about this. > (And an obnoxiously authoritarian one at that. If I want > to write some C++ code that uses stdio because I happen > to like it better, why the heck shouldn't I be allowed > to? It's MY program, not the C++ standards board's!) Again, you are misunderstanding what precisely is deprecated. Sure you can still use stdio, and it is never going away (it isn't deprecated). However, you have to spell the header as #include <cstdio> and then refer to the functions as std::printf, std::stderr, etc. What is really being deprecated here is the global namespace.
That's also the reason to deprecate file-level static: you should use anonymous namespaces instead. (Also, just in case this is misunderstood again: it is *not* that programs cannot put stuff in the global namespace anymore. It's just that the standard library should not put stuff in the global namespace). Regards, Martin From martin at v.loewis.de Sat Feb 11 09:33:23 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 11 Feb 2006 09:33:23 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43ed41b5.474421562@news.gmane.org> References: <20060210165339.GK10226@xs4all.nl> <43ed41b5.474421562@news.gmane.org> Message-ID: <43EDA153.7010809@v.loewis.de> Bengt Richter wrote: > Would it make sense to use a typedef for readability's sake? E.g., > > typedef const char * p_text_literal; > > and then use > > p_text_literal, const p_text_literal * > > in the signature, for read-only access to the data? (hope I got that right). > > (also testing whether I have been redirected to /dev/null ;-) Nearly. Please try your proposals out in a sandbox before posting. How does this contribute to solving the PyArg_ParseTupleAndKeywords issue? Readability is not the problem that puzzled Jack. 
Regards, Martin From bokr at oz.net Fri Feb 10 18:35:10 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 10 Feb 2006 17:35:10 GMT Subject: [Python-Dev] _length_cue() References: <20060208142034.GA1292@code0.codespeak.net> <00a101c62cea$9315b2c0$b83efea9@RaymondLaptop1> <20060208235156.GA29514@code0.codespeak.net> <009001c62d1f$79a6abc0$b83efea9@RaymondLaptop1> <20060210130806.GA29717@code0.codespeak.net> <43EC9370.7060809@gmail.com> <20060210133308.GC29717@code0.codespeak.net> Message-ID: <43ecc1f7.441719018@news.gmane.org> On Fri, 10 Feb 2006 14:33:08 +0100, Armin Rigo wrote: >Hi Nick, > >On Fri, Feb 10, 2006 at 11:21:52PM +1000, Nick Coghlan wrote: >> Do they really need anything more sophisticated than: >> >> def __repr__(self): >> return "%s(%r)" % (type(self).__name__, self._subiter) >> >> (modulo changes in the format of arguments, naturally. This simple one would >> work for things like enumerate and reversed, though) > >My goal here is not primarily to help debugging, but to help playing >around at the interactive command-line. Python's command-line should >not be dismissed as "useless for real programmers"; I definitely use it >all the time to try things out. It would be nicer if all these >iterators I'm not familiar with would give me a hint about what they >actually return, instead of: > >>>> itertools.count(17) >count(17) # yes, thank you, not very helpful >>>> enumerate("spam") >enumerate("spam") # with your proposed extension -- not better > >However, if this kind of goal is considered "not serious enough" for >adding a private special method, then I'm fine with trying out a fishing >approach. > For enhancing interactive usage, how about putting the special info and smarts in help? Or even a specialized part of help, e.g., help.explain(itertools.count(17)) or maybe help.explore(itertools.count(17)) leading to an interactive prompt putting handy cmdwords in a line to get easily to type, mro, non-underscore methods, attribute name list, etc. E.g. 
I often find myself typing stuff like [x for x in dir(obj) if not x.startswith('_')] or [k for k,v in type(obj).__dict__.items() if callable(v) and not k.startswith('_')] that I would welcome being able to do easily with a specialized help.plaindir(obj) or help.plainmethods(obj) or help.mromethods(obj) etc. Hm, now that I think of it, I guess I could do stuff like that in site.py, since >>> help.plaindir = lambda x: sorted([x for x in dir(x) if not x.startswith('_')]) >>> help.plaindir(int) [] >>> help.plaindir([]) ['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort'] But some kind of standards would probably be nice for everyone if they like the general idea. I'll leave it to someone else as to whether and where a thread re help enhancements might be ok. My .02USD ;-) Regards, Bengt Richter From g.brandl at gmx.net Fri Feb 10 22:09:52 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 10 Feb 2006 22:09:52 +0100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: Guido van Rossum wrote: > Next, the schedule.
Neal's draft of the schedule has us releasing 2.5 > in October. That feels late -- nearly two years after 2.4 (which was > released on Nov 30, 2004). Do people think it's reasonable to strive > for a more aggressive (by a month) schedule, like this: > > alpha 1: May 2006 > alpha 2: June 2006 > beta 1: July 2006 > beta 2: August 2006 > rc 1: September 2006 > final: September 2006 > > ??? Would anyone want to be even more aggressive (e.g. alpha 1 right > after PyCon???). We could always do three alphas. I am not experienced in releasing, but with the multitude of new things introduced in Python 2.5, could it be a good idea to release an early alpha not long after all (most of?) the desired features are in the trunk? That way people would get to testing sooner and the number of non-obvious bugs may be reduced (I'm thinking of the import PEP, the implementation of which is bound to be hairy, or "with" in its full extent). Georg From g.brandl at gmx.net Fri Feb 10 22:38:59 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 10 Feb 2006 22:38:59 +0100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: <5.1.1.6.0.20060210154235.02007160@mail.telecommunity.com> Message-ID: Guido van Rossum wrote: > - setuplib? Wouldn't it make sense to add this to the 2.5 stdlib? If you mean setuptools, I'm a big +1 (if it's production-ready by that time). Together with a whipped up cheese shop we should finally be able to put up something equal to cpan/rubygems. Georg From g.brandl at gmx.net Fri Feb 10 22:32:23 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 10 Feb 2006 22:32:23 +0100 Subject: [Python-Dev] The decorator(s) module Message-ID: Hi, it has been proposed before, but there was no conclusive answer last time: is there any chance for 2.5 to include commonly used decorators in a module? Of course not everything that jumps around should go in, only pretty basic stuff that can be widely used. Candidates are: - @decorator. 
This properly wraps up a decorator function to change the signature of the new function according to the decorated one's.
- @contextmanager, see PEP 343.
- @synchronized/@locked/whatever, for thread safety.
- @memoize
- Others from wiki:PythonDecoratorLibrary and Michele Simionato's decorator module at .

Unfortunately, a @property decorator is impossible...

regards,
Georg

From ncoghlan at gmail.com Sat Feb 11 12:04:41 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 11 Feb 2006 21:04:41 +1000
Subject: [Python-Dev] PEP 338 - Executing Modules as Scripts
Message-ID: <43EDC4C9.3030405@gmail.com>

I finally finished updating PEP 338 to comply with the flexible importing system in PEP 302. The result is a not-yet-thoroughly-tested module that should allow the -m switch to execute any module written in Python that is accessible via an absolute import statement.

The PEP now uses runpy for the module name, and run_module for the function used to locate and execute scripts. There's probably some discussion to be had in relation to the Design Decisions section of the PEP, relating to the way I wrote the module (the handling of locals dictionaries in particular deserves consideration).

Tracker items for the runpy module [1] and its documentation [2] are on Sourceforge (the interesting parts of the documentation are in the PEP, so I suggest reading that rather than the LaTeX version). Still missing from the first tracker item are a patch to update '-m' to invoke the new module and some unit tests (the version on SF has only had ad hoc testing from the interactive prompt at this stage). I hope to have those up shortly, though.

Cheers,
Nick.
[1] http://www.python.org/sf/1429601 [2] http://www.python.org/sf/1429605 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sat Feb 11 12:04:53 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2006 21:04:53 +1000 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: <43EDC4D5.5040505@gmail.com> Guido van Rossum wrote: > PEP 338 - support -m for modules in packages. I believe Nick Coghlan > is close to implementing this. I'm fine with accepting it. I just checked in a new version of PEP 338 that cleans up the approach so that it provides support for any PEP 302 compliant packaging mechanism as well as normal filesystem packages. I've started a new thread for the discussion: PEP 338 - Executing Modules as Scripts Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From thomas at xs4all.net Sat Feb 11 13:51:02 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sat, 11 Feb 2006 13:51:02 +0100 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: Message-ID: <20060211125102.GP10226@xs4all.nl> On Fri, Feb 10, 2006 at 10:32:23PM +0100, Georg Brandl wrote: > Unfortunately, a @property decorator is impossible... Depends. 
You can do, e.g.,

    def propertydef(propertydesc):
        data = propertydesc()
        if not data:
            raise ValueError, "Invalid property descriptors"
        getter, setter, deller = (data + (None, None))[:3]
        return property(fget=getter, fset=setter, fdel=deller,
                        doc=propertydesc.__doc__)

and use it like:

    class X(object):
        def __init__(self):
            self._prop = None

        @propertydef
        def prop():
            "Public, read-only access to self._prop"
            def getter(self):
                return self._prop
            return (getter,)

        @propertydef
        def rwprop():
            "Public read-write access to self._prop"
            def getter(self):
                return self._prop
            def setter(self, val):
                self._prop = val
            def deller(self):
                self._prop = None
            return (getter, setter, deller)

        @propertydef
        def hiddenprop():
            "Public access to a value stored in a closure"
            prop = [None]
            def getter(self):
                return prop[0]
            def setter(self, val):
                prop[0] = val
            def deller(self):
                prop[0] = None
            return (getter, setter, deller)

-- 
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From bokr at oz.net Fri Feb 10 21:36:15 2006
From: bokr at oz.net (Bengt Richter)
Date: Fri, 10 Feb 2006 20:36:15 GMT
Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification
References: <43ECC70B.8030501@v.loewis.de>
Message-ID: <43ecf843.455619215@news.gmane.org>

On Fri, 10 Feb 2006 18:02:03 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote:
>Jeremy Hylton wrote:
>> I admit that I'm also puzzled by Jack's specific question. I don't
>> understand why an array passed to PyArg_ParseTupleAndKeywords() would
>> need to be declared as const. I observed the problem in my initial
>> changes but didn't think very hard about the cause of the problem.
>> Perhaps someone with better C/C++ standards chops can explain.
>
>Please take a look at this code:
>
>void foo(const char** x, const char*s)
>{
>  x[0] = s;
>}
>
>void bar()
>{
>  char *kwds[] = {0};
>  const char *s = "Text";
>  foo(kwds, s);
>  kwds[0][0] = 't';
>}
>
>If it was correct, you would be able to modify the const char
>array in the string literal, without any compiler errors. The
>assignment
>
>  x[0] = s;
>
>is kosher, because you are putting a const char* into a
>const char* array, and the assignment
>
>  kwds[0][0] = 't';
>
>is ok, because you are modifying a char array. So the place
>where it has to fail is the passing of the pointer-pointer.
>

Will a typedef help?

----< martin.c >-------------------------------------------
#include <stdio.h>
typedef const char *ptext;
void foo(ptext *kw)
{
    const char *s = "Text";
    ptext *p;
    for(p=kw;*p;p++){ printf("foo:%s\n", *p);}
    kw[0] = s;
    for(p=kw;*p;p++){ printf("foo2:%s\n", *p);}
    kw[0][0] = 't'; /* comment this out and it compiles and runs */
    for(p=kw;*p;p++){ printf("foo3:%s\n", *p);}
}
int main()
{
    char *kwds[] = {"Foo","Bar",0};
    char **p;
    for(p=kwds;*p;p++){ printf("%s\n", *p);}
    foo(kwds);
    for(p=kwds;*p;p++){ printf("%s\n", *p);}
}
-----------------------------------------------------------

[12:32] C:\pywk\pydev>cl martin.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8168 for 80x86
Copyright (C) Microsoft Corp 1984-1998. All rights reserved.

martin.c
martin.c(10) : error C2166: l-value specifies const object

But after commenting out:

[12:32] C:\pywk\pydev>cl martin.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8168 for 80x86
Copyright (C) Microsoft Corp 1984-1998. All rights reserved.

martin.c
Microsoft (R) Incremental Linker Version 6.00.8168
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.
/out:martin.exe
martin.obj

[12:34] C:\pywk\pydev>martin
Foo
Bar
foo:Foo
foo:Bar
foo2:Text
foo2:Bar
foo3:Text
foo3:Bar
Text
Bar

Regards,
Bengt Richter

From martin at v.loewis.de Sat Feb 11 14:14:00 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 11 Feb 2006 14:14:00 +0100
Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification
In-Reply-To: <43ecf843.455619215@news.gmane.org>
References: <43ECC70B.8030501@v.loewis.de> <43ecf843.455619215@news.gmane.org>
Message-ID: <43EDE318.7030708@v.loewis.de>

Bengt Richter wrote:
> Will a typedef help?

A typedef can never help. It is always possible to reformulate a program using typedefs to one that doesn't use typedefs.

Compiling your program with the const modification line removed gives

martin.c: In function 'int main()':
martin.c:18: error: invalid conversion from 'char**' to 'const char**'
martin.c:18: error: initializing argument 1 of 'void foo(const char**)'

Regards,
Martin

From duncan.booth at suttoncourtenay.org.uk Sat Feb 11 14:29:07 2006
From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth)
Date: Sat, 11 Feb 2006 07:29:07 -0600
Subject: [Python-Dev] The decorator(s) module
References:
Message-ID:

Georg Brandl wrote in news:dsj0p7$tk3$1 at sea.gmane.org:
> Unfortunately, a @property decorator is impossible...
>

It all depends what you want (and whether you want the implementation to be portable to other Python implementations).
Here's one possible but not exactly portable example:

    from inspect import getouterframes, currentframe
    import unittest

    class property(property):
        @classmethod
        def get(cls, f):
            locals = getouterframes(currentframe())[1][0].f_locals
            prop = locals.get(f.__name__, property())
            return cls(f, prop.fset, prop.fdel, prop.__doc__)

        @classmethod
        def set(cls, f):
            locals = getouterframes(currentframe())[1][0].f_locals
            prop = locals.get(f.__name__, property())
            return cls(prop.fget, f, prop.fdel, prop.__doc__)

        @classmethod
        def delete(cls, f):
            locals = getouterframes(currentframe())[1][0].f_locals
            prop = locals.get(f.__name__, property())
            return cls(prop.fget, prop.fset, f, prop.__doc__)

    class PropTests(unittest.TestCase):
        def test_setgetdel(self):
            class C(object):
                def __init__(self, colour):
                    self._colour = colour
                @property.set
                def colour(self, value):
                    self._colour = value
                @property.get
                def colour(self):
                    return self._colour
                @property.delete
                def colour(self):
                    self._colour = 'none'

            inst = C('red')
            self.assertEquals(inst.colour, 'red')
            inst.colour = 'green'
            self.assertEquals(inst._colour, 'green')
            del inst.colour
            self.assertEquals(inst._colour, 'none')

    if __name__=='__main__':
        unittest.main()

From ronaldoussoren at mac.com Sat Feb 11 14:48:46 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sat, 11 Feb 2006 14:48:46 +0100
Subject: [Python-Dev] Pervasive socket failures on Windows
In-Reply-To: <43ED1891.5070907@v.loewis.de>
References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de>
Message-ID:

On 10-feb-2006, at 23:49, Martin v. Löwis wrote:
> Tim Peters wrote:
>> I don't know.
Of course it misses similar new tests added to _ssl.c
>> (see the msg that started this thread), so it spreads beyond just
>> this. Does it do the right thing for Windows variants like Cygwin,
>> and OS/2? Don't know.
>
> I see. How does Py_SOCKET_FD_CAN_BE_GE_FD_SETSIZE help here?
> Does defining it in PC/pyconfig.h do the right thing?
>
> I guess I'm primarily opposed to the visual ugliness of the
> define. Why does it spell out "can be", but abbreviates
> "greater than or equal to"? What about Py_CHECK_FD_SETSIZE?

If I understand this discussion correctly, the code that would be conditionalized using this define is the IS_SELECTABLE macro in selectmodule.c and very similar code in other modules. I'd say that calling the test _Py_IS_SELECTABLE and putting it into pyport.h as Tim mentioned in an aside seems to be a good solution. At the very least it is a lot nicer than defining a very long name in pyconfig.h and then having very similar code in several #if blocks.

Ronald

>
> Regards,
> Martin
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/
> ronaldoussoren%40mac.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s Type: application/pkcs7-signature Size: 2157 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060211/93193833/attachment.bin From keith at kdart.com Sat Feb 11 14:51:52 2006 From: keith at kdart.com (Keith Dart) Date: Sat, 11 Feb 2006 05:51:52 -0800 Subject: [Python-Dev] Let's just *keep* lambda In-Reply-To: <43EC0676.6030301@canterbury.ac.nz> References: <869F21D3-0CE6-447E-B02E-045604F895E7@collison.ie> <43EAD40D.30701@v.loewis.de> <17387.22307.395057.121770@montanaro.dyndns.org> <43EC0676.6030301@canterbury.ac.nz> Message-ID: <20060211055152.36a8f6ae@leviathan.kdart.com> Greg Ewing wrote the following on 2006-02-10 at 16:20 PST: === > Although "print" may become a function in 3.0, so that this > particular example would no longer be a problem. === You can always make your own Print function. The pyNMS framework adds many new builtins, as well as a Print function, when it is installed. http://svn.dartworks.biz/svn/repos/pynms/trunk/lib/nmsbuiltins.py -- -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Keith Dart public key: ID: 19017044 ===================================================================== -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060211/00d9efdc/attachment-0001.pgp From martin at v.loewis.de Sat Feb 11 14:59:26 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 11 Feb 2006 14:59:26 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602092136n40b7bb8em131046e3ba96f1cd@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de> Message-ID: <43EDEDBE.5010104@v.loewis.de> Ronald Oussoren wrote: > If I understand this discussion correctly that code that would be > conditionalized using this define is the IS_SELECTABLE macro in > selectmodule.c and very simular code in other modules. I'd say that > calling the test _Py_IS_SELECTABLE and putting it into pyport.h > as Tim mentioned in an aside seems to be a good solution. At the > very least it is a lot nicer than defining a very long name in > pyconfig.h and then having very simular code in several #if blocks. For the moment, I have committed Tim's original proposal. Moving the macro into pyport.h could be done in addition. That should be done only if selectmodule is also adjusted; this currently tests for _MSC_VER. 
Regards, Martin From ncoghlan at gmail.com Sat Feb 11 14:59:40 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2006 23:59:40 +1000 Subject: [Python-Dev] PEP 338 - Executing Modules as Scripts In-Reply-To: <43EDC4C9.3030405@gmail.com> References: <43EDC4C9.3030405@gmail.com> Message-ID: <43EDEDCC.3010209@gmail.com> Nick Coghlan wrote: > The PEP now uses runpy for the module name, and run_module for the function > used to locate and execute scripts. There's probably some discussion to be had > in relation to the Design Decisions section of the PEP, relating to the way I > wrote the module (the handling of locals dictionaries in particular deserves > consideration). Huh. Speaking of not-thoroughly-tested, exec + function code objects doesn't seem to work anything like I expected, so some of my assumptions in the PEP relating to the way the locals dictionary should be handled are clearly wrong. As I discovered, the name binding operations in a function code object have no effect whatsoever on the dictionaries passed to an invocation of exec. I'll update the PEP to drop run_function_code, and make run_code a simple wrapper around the exec statement that always returns the dictionary used as 'locals' (which may happen to be the same dictionary used as 'globals'). If the way exec handles function code objects and provision of a locals dictionary ever changes, then run_code will pick up the new semantics automatically. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From dave at boost-consulting.com Sat Feb 11 15:11:26 2006 From: dave at boost-consulting.com (David Abrahams) Date: Sat, 11 Feb 2006 09:11:26 -0500 Subject: [Python-Dev] How to get the Python-2.4.2 sources from SVN? Message-ID: It isn't completely clear which branch or tag to get, and Google turned up no obvious documentation. 
Thanks, -- Dave Abrahams Boost Consulting www.boost-consulting.com From martin at v.loewis.de Sat Feb 11 16:02:02 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 11 Feb 2006 16:02:02 +0100 Subject: [Python-Dev] How to get the Python-2.4.2 sources from SVN? In-Reply-To: References: Message-ID: <43EDFC6A.1090005@v.loewis.de> David Abrahams wrote: > It isn't completely clear which branch or tag to get, and Google > turned up no obvious documentation. http://svn.python.org/projects/python/tags/r242/ Regards, Martin From skip at pobox.com Sat Feb 11 16:10:41 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 11 Feb 2006 09:10:41 -0600 Subject: [Python-Dev] How to get the Python-2.4.2 sources from SVN? In-Reply-To: References: Message-ID: <17389.65137.504788.2865@montanaro.dyndns.org> Dave> It isn't completely clear which branch or tag to get, and Google Dave> turned up no obvious documentation. On subversion, you want releaseXY-maint for the various X.Y releases. For 2.4.2, release24-maint is what you want, though it may have a few bug fixes since 2.4.2 was released. With CVS I used to use "cvs log README" to see what all the tags and branches were. I don't know what the equivalent svn command is. Skip From raveendra-babu.m at hp.com Fri Feb 10 13:36:34 2006 From: raveendra-babu.m at hp.com (M, Raveendra Babu (STSD)) Date: Fri, 10 Feb 2006 18:06:34 +0530 Subject: [Python-Dev] To know how to set "pythonpath" Message-ID: I am a newbe to python. While I am running some scripts it reports some errors because of PYTHONPATH variable. Can you send me information of how to set PYTHONPATH. I am using python 2.1.3 on aix 5.2. 
Regards -Raveendrababu From mrussell at verio.net Fri Feb 10 14:08:39 2006 From: mrussell at verio.net (Mark Russell) Date: Fri, 10 Feb 2006 13:08:39 +0000 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <43EC8AF8.2000506@gmail.com> References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> <43EC8AF8.2000506@gmail.com> Message-ID: <054578D1-518D-4566-A15A-114B3BB85790@verio.net> On 10 Feb 2006, at 12:45, Nick Coghlan wrote: > An alternative would be to call it "__discrete__", as that is the key > characteristic of an indexing type - it consists of a sequence of > discrete > values that can be isomorphically mapped to the integers. Another alternative: __as_ordinal__. Wikipedia describes ordinals as "numbers used to denote the position in an ordered sequence" which seems a pretty precise description of the intended result. The "as_" prefix also captures the idea that this should be a lossless conversion. Mark Russell -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060210/cfc13b05/attachment.htm From thomas at xs4all.net Sat Feb 11 16:29:57 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sat, 11 Feb 2006 16:29:57 +0100 Subject: [Python-Dev] How to get the Python-2.4.2 sources from SVN? In-Reply-To: <17389.65137.504788.2865@montanaro.dyndns.org> References: <17389.65137.504788.2865@montanaro.dyndns.org> Message-ID: <20060211152957.GQ10226@xs4all.nl> On Sat, Feb 11, 2006 at 09:10:41AM -0600, skip at pobox.com wrote: > Dave> It isn't completely clear which branch or tag to get, and Google > Dave> turned up no obvious documentation. > On subversion, you want releaseXY-maint for the various X.Y releases. For > 2.4.2, release24-maint is what you want, though it may have a few bug fixes > since 2.4.2 was released. 
With CVS I used to use "cvs log README" to see > what all the tags and branches were. I don't know what the equivalent svn > command is. The 'cvs log' trick only works if the file you log is actually part of the branch. Not an issue with Python or any other project that always branches sanely, fortunately, but there's always wackos out there ;) You get the list of branches in SVN with: svn ls http://svn.python.org/projects/python/branches/ And similarly, tags with: svn ls http://svn.python.org/projects/python/tags -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin at v.loewis.de Sat Feb 11 16:32:23 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 11 Feb 2006 16:32:23 +0100 Subject: [Python-Dev] How to get the Python-2.4.2 sources from SVN? In-Reply-To: <17389.65137.504788.2865@montanaro.dyndns.org> References: <17389.65137.504788.2865@montanaro.dyndns.org> Message-ID: <43EE0387.8080200@v.loewis.de> skip at pobox.com wrote: > On subversion, you want releaseXY-maint for the various X.Y releases. For > 2.4.2, release24-maint is what you want, though it may have a few bug fixes > since 2.4.2 was released. With CVS I used to use "cvs log README" to see > what all the tags and branches were. I don't know what the equivalent svn > command is. The easiest is to open either http://svn.python.org/projects/python/tags/ or http://svn.python.org/projects/python/branches/ in a web browser. If you want to use the subversion command line, do svn ls http://svn.python.org/projects/python/tags/ Regards, Martin From g.brandl at gmx.net Sat Feb 11 16:33:25 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 11 Feb 2006 16:33:25 +0100 Subject: [Python-Dev] Where to put "post-it notes"? Message-ID: I just updated the general copyright notice to include the year 2006. 
This is scattered in at least 6 files (I found that many searching for 2004 and 2005) which would be handy to record somewhere so that next year it's easier. Where does this belong? Georg From aahz at pythoncraft.com Sat Feb 11 16:37:34 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 11 Feb 2006 07:37:34 -0800 Subject: [Python-Dev] To know how to set "pythonpath" In-Reply-To: References: Message-ID: <20060211153734.GA26575@panix.com> On Fri, Feb 10, 2006, M, Raveendra Babu (STSD) wrote: > > I am a newbe to python. While I am running some scripts it reports some > errors because of PYTHONPATH variable. > > Can you send me information of how to set PYTHONPATH. > I am using python 2.1.3 on aix 5.2. Sorry, this is the wrong place. Please use another place, such as comp.lang.python, and read http://www.catb.org/~esr/faqs/smart-questions.html -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From aahz at pythoncraft.com Sat Feb 11 16:40:35 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 11 Feb 2006 07:40:35 -0800 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43EDA0BC.3080809@v.loewis.de> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> <43ED25CC.3080607@canterbury.ac.nz> <43EDA0BC.3080809@v.loewis.de> Message-ID: <20060211154035.GB26575@panix.com> On Sat, Feb 11, 2006, "Martin v. L?wis" wrote: > > Not at all. People appear to completely fail to grasp the notion of > "deprecated" in this context. It just means "it may go away in a > future version", implying that the rest of it may *not* go away in a > future version. > > That future version might get published in 2270, when everybody has > switched to C++, and compatibility with C is no longer required. 
Just for the clarification of those of us who are not C/C++ programmers, are you saying that this is different from the meaning in Python, where "deprecated" means that something *IS* going away? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From p.f.moore at gmail.com Sat Feb 11 16:50:35 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 11 Feb 2006 15:50:35 +0000 Subject: [Python-Dev] PEP 338 - Executing Modules as Scripts In-Reply-To: <43EDC4C9.3030405@gmail.com> References: <43EDC4C9.3030405@gmail.com> Message-ID: <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com> On 2/11/06, Nick Coghlan wrote: > I finally finished updating PEP 338 to comply with the flexible importing > system in PEP 302. > > The result is a not-yet-thoroughly-tested module that should allow the -m > switch to execute any module written in Python that is accessible via an > absolute import statement. Does this implementation resolve http://www.python.org/sf/1250389 as well? A reading of the PEP would seem to imply that it does, but the SF patches you mention don't include any changes to the core, so I'm not sure... Paul. From thomas at xs4all.net Sat Feb 11 16:56:51 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sat, 11 Feb 2006 16:56:51 +0100 Subject: [Python-Dev] file.next() vs. file.readline() In-Reply-To: <20060105183008.GG18916@xs4all.nl> References: <20060104163419.GF18916@xs4all.nl> <20060105183008.GG18916@xs4all.nl> Message-ID: <20060211155651.GR10226@xs4all.nl> On Thu, Jan 05, 2006 at 07:30:08PM +0100, Thomas Wouters wrote: > On Wed, Jan 04, 2006 at 10:10:07AM -0800, Guido van Rossum wrote: > > I'd say go right ahead and submit a change to SF (and then after it's > > reviewed you can check it in yourself :-). > http://www.python.org/sf?id=1397960 So, any objections to me checking this in? 
It doesn't break anything that wasn't already broken, but neither does it fix it; it just makes the error more apparent. I don't think it'd be a bugfix candidate, since it changes the effect of the error (rather than silently delivering data out of order, it complains.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From ncoghlan at gmail.com Sat Feb 11 17:06:54 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2006 02:06:54 +1000 Subject: [Python-Dev] PEP 338 - Executing Modules as Scripts In-Reply-To: <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com> References: <43EDC4C9.3030405@gmail.com> <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com> Message-ID: <43EE0B9E.2040208@gmail.com> Paul Moore wrote: > On 2/11/06, Nick Coghlan wrote: >> I finally finished updating PEP 338 to comply with the flexible importing >> system in PEP 302. >> >> The result is a not-yet-thoroughly-tested module that should allow the -m >> switch to execute any module written in Python that is accessible via an >> absolute import statement. > > Does this implementation resolve http://www.python.org/sf/1250389 as > well? A reading of the PEP would seem to imply that it does, but the > SF patches you mention don't include any changes to the core, so I'm > not sure... It will. I haven't updated the command line switch itself yet, so you'd need to do "-m runpy ". I do plan on fixing the switch, but at the moment there's a bug in the module's handling of nested packages, so I want to sort that out first. Cheers, Nick. 
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From bokr at oz.net Sat Feb 11 17:23:16 2006
From: bokr at oz.net (Bengt Richter)
Date: Sat, 11 Feb 2006 16:23:16 GMT
Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification
References: <43ECC70B.8030501@v.loewis.de> <43ecf843.455619215@news.gmane.org> <43EDE318.7030708@v.loewis.de>
Message-ID: <43edff38.522936172@news.gmane.org>

On Sat, 11 Feb 2006 14:14:00 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote:
>Bengt Richter wrote:
>> Will a typedef help?
>
>A typedef can never help. It is always possible to reformulate
>a program using typedefs to one that doesn't use typedefs.

I realize that's true for a correct compiler, and should have reflected that you aren't just trying to appease a particular possibly quirky one.

>
>Compiling your program with the const modification line
>removed gives
>
>martin.c: In function 'int main()':
>martin.c:18: error: invalid conversion from 'char**' to 'const char**'
>martin.c:18: error: initializing argument 1 of 'void foo(const char**)'
>

Sorry, I should have tried it with gcc, which does complain:

[07:16] /c/pywk/pydev>gcc martin.c
martin.c: In function `main':
martin.c:19: warning: passing arg 1 of `foo' from incompatible pointer type

also g++, but not just warning (no a.exe generated)

[07:16] /c/pywk/pydev>g++ martin.c
martin.c: In function `int main()':
martin.c:19: invalid conversion from `char**' to `const char**'

[07:17] /c/pywk/pydev>gcc -v
gcc version 3.2.3 (mingw special 20030504-1)

But Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8168 for 80x86 didn't complain. But then it doesn't complain about const char** x either. I wonder if I have complaints accidentally turned off someplace ;-/

Sorry.
Regards, Bengt Richter From ncoghlan at gmail.com Sat Feb 11 18:02:25 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2006 03:02:25 +1000 Subject: [Python-Dev] PEP 338 - Executing Modules as Scripts In-Reply-To: <43EE0B9E.2040208@gmail.com> References: <43EDC4C9.3030405@gmail.com> <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com> <43EE0B9E.2040208@gmail.com> Message-ID: <43EE18A1.3050906@gmail.com> Nick Coghlan wrote: > Paul Moore wrote: >> On 2/11/06, Nick Coghlan wrote: >>> I finally finished updating PEP 338 to comply with the flexible importing >>> system in PEP 302. >>> >>> The result is a not-yet-thoroughly-tested module that should allow the -m >>> switch to execute any module written in Python that is accessible via an >>> absolute import statement. >> Does this implementation resolve http://www.python.org/sf/1250389 as >> well? A reading of the PEP would seem to imply that it does, but the >> SF patches you mention don't include any changes to the core, so I'm >> not sure... > > It will. I haven't updated the command line switch itself yet, so you'd need > to do "-m runpy ". I do plan on fixing the switch, but at the moment > there's a bug in the module's handling of nested packages, so I want to sort > that out first. OK, nested packages now work right (I'd managed to make the common mistake that's highlighted quite clearly in the docs for __import__). Running from inside a zipfile also appears to be working, but I don't have zlib in my Python 2.5 build to be 100% certain of that (I could check for certain with Python 2.4, but that would involve enough mucking around that I don't want to do it right now). My aim is to have a patch up for the command line switch tomorrow. It shouldn't be too tricky, since it is just a matter of retrieving and calling the function from the module. 
That should supply the last missing piece for the PEP implementation (aside from figuring out how to integrate my current manual test setup for runpy.run_module into the unit tests - it shouldn't be that hard to create a temp directory and add some files to it, similar to what test_pkg already does). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sat Feb 11 18:08:40 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2006 03:08:40 +1000 Subject: [Python-Dev] Where to put "post-it notes"? In-Reply-To: References: Message-ID: <43EE1A18.4020201@gmail.com> Georg Brandl wrote: > I just updated the general copyright notice to include the > year 2006. This is scattered in at least 6 files (I found that many searching > for 2004 and 2005) which would be handy to record somewhere so that next year > it's easier. Where does this belong? PEP 101 maybe? Checking the copyright notices can be done independently of releases, but they should *definitely* be checked before a release goes out. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Sat Feb 11 19:28:25 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 11 Feb 2006 19:28:25 +0100 Subject: [Python-Dev] Where to put "post-it notes"? In-Reply-To: <43EE1A18.4020201@gmail.com> References: <43EE1A18.4020201@gmail.com> Message-ID: Nick Coghlan wrote: > Georg Brandl wrote: >> I just updated the general copyright notice to include the >> year 2006. This is scattered in at least 6 files (I found that many searching >> for 2004 and 2005) which would be handy to record somewhere so that next year >> it's easier. Where does this belong? > > PEP 101 maybe? 
Checking the copyright notices can be done independently of > releases, but they should *definitely* be checked before a release goes out. Ah! They were already there. I added two more files. By the way, PEP 101 will need to be rewritten to reflect the move to SVN. Georg From crutcher at gmail.com Sat Feb 11 19:33:45 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sat, 11 Feb 2006 10:33:45 -0800 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: Message-ID: +1, and we could maybe include tail_call_optimized? http://littlelanguages.com/2006/02/tail-call-optimization-as-python.html On 2/11/06, Georg Brandl wrote: > Hi, > > it has been proposed before, but there was no conclusive answer last time: > is there any chance for 2.5 to include commonly used decorators in a module? > > Of course not everything that jumps around should go in, only pretty basic > stuff that can be widely used. > > Candidates are: > - @decorator. This properly wraps up a decorator function to change the > signature of the new function according to the decorated one's. > > - @contextmanager, see PEP 343. > > - @synchronized/@locked/whatever, for thread safety. > > - @memoize > > - Others from wiki:PythonDecoratorLibrary and Michele Simionato's decorator > module at . > > Unfortunately, a @property decorator is impossible... > > regards, > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/crutcher%40gmail.com > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From dave at boost-consulting.com Sat Feb 11 19:54:50 2006 From: dave at boost-consulting.com (David Abrahams) Date: Sat, 11 Feb 2006 13:54:50 -0500 Subject: [Python-Dev] How to get the Python-2.4.2 sources from SVN? 
References: <17389.65137.504788.2865@montanaro.dyndns.org> <20060211152957.GQ10226@xs4all.nl> Message-ID: Thomas Wouters writes: > On Sat, Feb 11, 2006 at 09:10:41AM -0600, skip at pobox.com wrote: > >> Dave> It isn't completely clear which branch or tag to get, and Google >> Dave> turned up no obvious documentation. > >> On subversion, you want releaseXY-maint for the various X.Y releases. For >> 2.4.2, release24-maint is what you want, though it may have a few bug fixes >> since 2.4.2 was released. With CVS I used to use "cvs log README" to see >> what all the tags and branches were. I don't know what the equivalent svn >> command is. > > The 'cvs log' trick only works if the file you log is actually part of the > branch. Not an issue with Python or any other project that always branches > sanely, fortunately, but there's always wackos out there ;) > You get the list of branches in SVN with: > > svn ls http://svn.python.org/projects/python/branches/ > > And similarly, tags with: > > svn ls http://svn.python.org/projects/python/tags Yes, that's easy enough, but being sure of the meaning of any given tag or branch name is less easy. -- Dave Abrahams Boost Consulting www.boost-consulting.com From aleaxit at gmail.com Sat Feb 11 21:55:10 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Sat, 11 Feb 2006 12:55:10 -0800 Subject: [Python-Dev] PEP 351 In-Reply-To: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> Message-ID: <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> On Feb 10, 2006, at 1:05 PM, Raymond Hettinger wrote: > [Guido van Rossum] >> PEP 351 - freeze protocol. I'm personally -1; I don't like the >> idea of >> freezing arbitrary mutable data structures. Are there champions who >> want to argue this? > > It has at least one anti-champion. I think it is a horrible idea > and would > like to see it rejected in a way that brings finality. If needed, > I can > elaborate in a separate thread. 
Could you please do that? I'd like to understand all of your objections. Thanks! Alex From arigo at tunes.org Sat Feb 11 21:57:35 2006 From: arigo at tunes.org (Armin Rigo) Date: Sat, 11 Feb 2006 21:57:35 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> References: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> Message-ID: <20060211205735.GA24548@code0.codespeak.net> Hi Tim, On Fri, Feb 10, 2006 at 12:19:01PM -0500, Tim Peters wrote: > Oh, who cares? I predict "Jack's problem" would go away if we changed > the declaration of PyArg_ParseTupleAndKeywords to what you intended > to begin with: > > PyAPI_FUNC(int) PyArg_ParseTupleAndKeywords(PyObject *, PyObject *, > const char *, const > char * const *, ...); Alas, this doesn't make gcc happy either. (I'm trying gcc 3.4.4.) In theory, it prevents the const-bypassing trick showed by Martin, but apparently the C standard (or gcc) is not smart enough to realize that. I don't see a way to spell it in C so that the same extension module compiles with 2.4 and 2.5 without a warning, short of icky macros. A bientot, Armin From barry at python.org Sat Feb 11 22:18:59 2006 From: barry at python.org (Barry Warsaw) Date: Sat, 11 Feb 2006 16:18:59 -0500 Subject: [Python-Dev] PEP 351 In-Reply-To: <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> Message-ID: On Feb 11, 2006, at 3:55 PM, Alex Martelli wrote: > > On Feb 10, 2006, at 1:05 PM, Raymond Hettinger wrote: > >> [Guido van Rossum] >>> PEP 351 - freeze protocol. I'm personally -1; I don't like the >>> idea of >>> freezing arbitrary mutable data structures. Are there champions who >>> want to argue this? >> >> It has at least one anti-champion. I think it is a horrible idea >> and would >> like to see it rejected in a way that brings finality. 
If needed, >> I can >> elaborate in a separate thread. > > Could you please do that? I'd like to understand all of your > objections. Thanks! Better yet, add them to the PEP. -Barry From greg.ewing at canterbury.ac.nz Sat Feb 11 22:48:05 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 12 Feb 2006 10:48:05 +1300 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43EDA0BC.3080809@v.loewis.de> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> <43ED25CC.3080607@canterbury.ac.nz> <43EDA0BC.3080809@v.loewis.de> Message-ID: <43EE5B95.3060005@canterbury.ac.nz> Martin v. L?wis wrote: > That future version might get published in 2270, There are *already* differences which make C and C++ annoyingly incompatible. One is the const char * const * issue that appeared here. Another is that it no longer seems to be permissible to forward-declare static things, which has caused me trouble with Pyrex. That's not just a deprecation -- some compilers refuse to compile it at all. Personally I wouldn't mind about these things, as I currently don't care if I never write another line of C++ in my life. But if e.g. Pyrex-generated code is to interoperate with other people's C++ code, I need to worry about these issues. > when everybody has switched to C++, and compatibility > with C is no longer required. Yeeks, I hope not! The world needs *less* C++, not more... > Sure you can still use stdio, and it is > never going away (it isn't deprecated). However, you > have to spell the header as > > #include > > and then refer to the functions as std::printf, > std::stderr, etc. Which makes it a very different language from C in this area. That's my point. 
Greg From python at rcn.com Sat Feb 11 23:04:43 2006 From: python at rcn.com (Raymond Hettinger) Date: Sat, 11 Feb 2006 17:04:43 -0500 Subject: [Python-Dev] PEP 351 References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> Message-ID: <001501c62f57$2b070b60$6a01a8c0@RaymondLaptop1> ----- Original Message ----- From: "Alex Martelli" To: "Raymond Hettinger" Cc: Sent: Saturday, February 11, 2006 3:55 PM Subject: PEP 351 > > On Feb 10, 2006, at 1:05 PM, Raymond Hettinger wrote: > >> [Guido van Rossum] >>> PEP 351 - freeze protocol. I'm personally -1; I don't like the idea of >>> freezing arbitrary mutable data structures. Are there champions who >>> want to argue this? >> >> It has at least one anti-champion. I think it is a horrible idea and >> would >> like to see it rejected in a way that brings finality. If needed, I can >> elaborate in a separate thread. > > Could you please do that? I'd like to understand all of your objections. > Thanks! Here was one email on the subject: http://mail.python.org/pipermail/python-dev/2005-October/057586.html I have a number of comp.lang.python posts on the subject also. The presence of frozenset() tempts this sort of hypergeneralization. The first stumbling block comes with dictionaries. Even if you skip past the question of why you would want to freeze a dictionary (do you really want to use it as a key?), one finds that dicts are not naturally freezable -- dicts compare using both keys and values; hence, if you want to hash a dict, you need to hash both the keys and values, which means that the values have to be hashable, a new and surprising requirement -- also, the values cannot be mutated or else an equality comparison will fail when searching for a frozen dict that has been used as a key.
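The value-hashability trap is easy to demonstrate with a toy frozen dict (hypothetical code, purely for illustration, not a proposal):

```python
class frozendict(dict):
    """Toy frozen dict -- illustrative only, not proposed code."""
    def _blocked(self, *args, **kwds):
        raise TypeError('frozendict is immutable')
    __setitem__ = __delitem__ = clear = update = _blocked
    pop = popitem = setdefault = _blocked

    def __hash__(self):
        # dicts compare by keys *and* values, so the hash must cover
        # both -- which quietly requires every value to be hashable.
        return hash(frozenset(self.items()))

hash(frozendict(a=1, b=2))        # fine: int values are hashable
try:
    hash(frozendict(a=[1, 2]))    # a list value breaks the new rule
except TypeError:
    pass
```

And since values reached through an alias can still be mutated, a lookup for a frozen dict used as a key can silently start failing, which is the second half of the objection.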
One person who experimented with an implementation dealt with the problem by recursively freezing all the components (perhaps one of the dict's values is another dict which then needs to be frozen too). Executive summary: freezing dicts is a can of worms and not especially useful. Another thought is that PEP 351 reflects a world view of wanting to treat all containers polymorphically. I would suggest that they aren't designed that way (i.e. you use different methods to add elements to lists, dicts, and sets). Also, it is not especially useful to shovel around mutable containers without respect to their type. Further, even if they were polymorphic and freezable, treating them generically is likely to reflect bad design -- the soul of good programming is the correct choice of appropriate data structures. Another PEP 351 world view is that tuples can serve as frozenlists; however, that view represents a Liskov violation (tuples don't support the same methods). This idea resurfaces and has been shot down again every few months. More important than all of the above is the thought that auto-freezing is like a bad C macro, it makes too much implicit and hides too much -- the supported methods change, there is an issue keeping in sync with the non-frozen original, etc. In my experience with frozensets, I've learned that freezing is not an incidental downstream effect; instead, it is an intentional, essential part of the design and needs to be explicit. If more is needed on the subject, I'll hunt down my old posts and organize them. I hope we don't offer a freeze() builtin. If it is there, it will be tempting to use it and I think it will steer people away from good design and have a net harmful effect. Raymond P.S. The word "freezing" is itself misleading because it suggests an in-place change. However, it really means that a new object is created (just like tuple(somelist)).
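The P.S. is worth a concrete few lines -- the existing builtins already behave this way, producing an independent copy rather than changing the original in place:

```python
lst = [1, 2, 3]
frozen_copy = tuple(lst)       # a new object; lst itself is untouched
lst.append(4)
assert frozen_copy == (1, 2, 3)
assert lst == [1, 2, 3, 4]

s = {1, 2}
fs = frozenset(s)              # likewise a copy, not an in-place change
s.add(3)
assert fs == frozenset([1, 2])
```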
From tim.peters at gmail.com Sat Feb 11 23:11:20 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 11 Feb 2006 17:11:20 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <43EDEDBE.5010104@v.loewis.de> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de> <43EDEDBE.5010104@v.loewis.de> Message-ID: <1f7befae0602111411u1e9e01adn77077cdcfdb43d45@mail.gmail.com> [Martin v. L?wis] > For the moment, I have committed Tim's original proposal. Thank you! I checked, and that fixed all the test failures I was seeing on Windows. > Moving the macro into pyport.h could be done in addition. That > should be done only if selectmodule is also adjusted; this currently > tests for _MSC_VER. It's a nice illustration of why platform-dependent code sprayed across modules sucks, too. Why _MSC_VER instead of MS_WINDOWS? What's the difference, exactly? Who knows? I see that selectmodule.c has this comment near the top: Under BeOS, we suffer the same dichotomy as Win32; sockets can be anything >= 0. but there doesn't appear to be any _code_ matching that comment in that module -- unless on BeOS _MSC_VER is defined. Beats me whether it is, but doubt it. The code in selectmodule when _MSC_VER is _not_ defined complains if a socket fd is >= FD_SETSIZE _or_ is < 0. But the new code in socketmodule on non-Windows boxes is happy with negative fds, saying "fine" whenever fd < FD_SETSIZE. Is that right or wrong? "The answer" isn't so important to me as that this kind of crap always happens when platform-specific logic ends up getting defined in multiple modules. 
Much better to define macros to hide this junk, exactly once; pyport.h is the natural place for it. From martin at v.loewis.de Sat Feb 11 23:52:45 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 11 Feb 2006 23:52:45 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <20060211154035.GB26575@panix.com> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> <43ED25CC.3080607@canterbury.ac.nz> <43EDA0BC.3080809@v.loewis.de> <20060211154035.GB26575@panix.com> Message-ID: <43EE6ABD.5040103@v.loewis.de> Aahz wrote: >>That future version might get published in 2270, when everybody has >>switched to C++, and compatibility with C is no longer required. > > > Just for the clarification of those of us who are not C/C++ programmers, > are you saying that this is different from the meaning in Python, where > "deprecated" means that something *IS* going away? To repeat the literal words from the standard: Annex D [depr]: 1 This clause describes features of the C++ Standard that are specified for compatibility with existing implementations. 2 These are deprecated features, where deprecated is defined as: Normative for the current edition of the Standard, but not guaranteed to be part of the Standard in future revisions. Regards, Martin From martin at v.loewis.de Sun Feb 12 00:02:32 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Feb 2006 00:02:32 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43EE5B95.3060005@canterbury.ac.nz> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> <43ED25CC.3080607@canterbury.ac.nz> <43EDA0BC.3080809@v.loewis.de> <43EE5B95.3060005@canterbury.ac.nz> Message-ID: <43EE6D08.4090909@v.loewis.de> Greg Ewing wrote: > There are *already* differences which make C and C++ > annoyingly incompatible. 
One is the const char * const * > issue that appeared here. Of course there are differences. C++ has classes, C doesn't. C++ has function overloading, C doesn't. C++ has assignment from char** to const char*const*, C doesn't. Why is it annoying that C++ extends C? > Another is that it no longer > seems to be permissible to forward-declare static things, Not sure what you are referring to. You can forward-declare static functions in C++ just fine. >>when everybody has switched to C++, and compatibility >>with C is no longer required. > > > Yeeks, I hope not! The world needs *less* C++, not more... I'm sure the committee waits until you retire before deciding that compatibility with C is not needed anymore :-) >>Sure you can still use stdio, and it is >>never going away (it isn't deprecated). However, you >>have to spell the header as >> >>#include >> >>and then refer to the functions as std::printf, >>std::stderr, etc. > > > Which makes it a very different language from C in > this area. That's my point. That future version of C++ to be published in 2270, yes, it will be different from C, because the last C programmer will have died 20 years ago. Regards, Martin From dave at boost-consulting.com Sun Feb 12 00:04:00 2006 From: dave at boost-consulting.com (David Abrahams) Date: Sat, 11 Feb 2006 18:04:00 -0500 Subject: [Python-Dev] How to get the Python-2.4.2 sources from SVN? References: <43EDFC6A.1090005@v.loewis.de> Message-ID: "Martin v. L?wis" writes: > David Abrahams wrote: >> It isn't completely clear which branch or tag to get, and Google >> turned up no obvious documentation. > > http://svn.python.org/projects/python/tags/r242/ Thanks. 
-- Dave Abrahams Boost Consulting www.boost-consulting.com From noamraph at gmail.com Sun Feb 12 00:15:12 2006 From: noamraph at gmail.com (Noam Raphael) Date: Sun, 12 Feb 2006 01:15:12 +0200 Subject: [Python-Dev] PEP 351 In-Reply-To: <001501c62f57$2b070b60$6a01a8c0@RaymondLaptop1> References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> <001501c62f57$2b070b60$6a01a8c0@RaymondLaptop1> Message-ID: Hello, I just wanted to say this: you can reject PEP 351, please don't reject the idea of frozen objects completely. I'm working on an idea similar to that of the PEP, and I think that it can be done elegantly, without the concrete problems that Raymond pointed out. I didn't work on it in the last few weeks, because of my job, but I hope to come back to it soon and post a PEP and a reference implementation in CPython. My quick responses, mostly to try to convince you that I know a bit about what I'm talking about: First about the last point: I suggest that the function will be named frozen(x), which suggests that nothing happens to x, you only get a "frozen x". I suggest that this operation won't be called "freezing x", but "making a frozen copy of x". Now, following the original order. Frozen dicts - if you want, you can decide that dicts aren't frozenable, and that's ok. But if you do want to make frozen copies of dicts, it isn't really such a problem - it's similar to hashing a tuple, which requires recursive hashing of all its elements; for making a frozen copy of a dict, you make a frozen copy of all its values. Treating all containers polymorphically - I don't suggest that. In my suggestion, you may have frozen lists, frozen tuples (which are normal tuples with frozen elements), frozen sets and frozen dicts. Treating tuples as frozen lists - I don't suggest doing that. But if my suggestion is accepted, there would be no need for tuples - frozen lists would be just as useful.
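A rough sketch of the behaviour described above -- frozen(x) hands back a hashable copy and leaves x alone (illustrative only; the actual proposal uses dedicated frozen types rather than the tuple/frozenset stand-ins below):

```python
def frozen(x):
    """Return a frozen (hashable) copy of x; x itself is not changed."""
    if isinstance(x, list):
        return tuple(frozen(item) for item in x)
    if isinstance(x, set):
        return frozenset(frozen(item) for item in x)
    if isinstance(x, dict):
        # a frozen copy of a dict makes frozen copies of all its values
        return frozenset((k, frozen(v)) for k, v in x.items())
    hash(x)      # anything else must already be hashable
    return x

cfg = {'hosts': ['a', 'b'], 'retries': 3}
key = frozen(cfg)
cache = {key: 'result'}          # usable as a dictionary key
assert cfg == {'hosts': ['a', 'b'], 'retries': 3}   # original untouched
```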
And about the other concerns: > More important than all of the above is the thought that auto-freezing is > like a bad C macro, it makes too much implicit and hides too much -- the > supported methods change, there is a issue keeping in sync with the > non-frozen original, etc. > > In my experience with frozensets, I've learned that freezing is not an > incidental downstream effect; instead, it is an intentional, essential part > of the design and needs to be explicit. I think these concerns can only be judged given a real suggestion, along with an implementation. I have already implemented most of my idea in CPython, and I think it's elegant and doesn't cause problems. Of course, I may not be objective about the subject, but I only ask to wait for the real suggestion before dropping it down. To summarize, I see the faults in PEP 351. I think that another, fairly similar idea might be a good one. Have a good week, Noam From martin at v.loewis.de Sun Feb 12 00:45:59 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Feb 2006 00:45:59 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <20060211205735.GA24548@code0.codespeak.net> References: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> <20060211205735.GA24548@code0.codespeak.net> Message-ID: <43EE7737.4080503@v.loewis.de> Armin Rigo wrote: > Alas, this doesn't make gcc happy either. (I'm trying gcc 3.4.4.) In > theory, it prevents the const-bypassing trick showed by Martin, but > apparently the C standard (or gcc) is not smart enough to realize that. It appears to be language-defined. Looking at the assignment char **a; const char* const* b; b = a; then, in C++, 4.4p4 [conv.qual] has a rather longish formula to decide that the assignment is well-formed. In essence, it goes like this: - the pointers are "similar": they have the same levels of indirection, and the same underlying type. 
- In all places where the type of a has const/volatile qualification, the type of b also has these qualifications (i.e. none in the example) - Starting from the first point where the qualifications differ (from left to right), all later levels also have const. I'm unsure about C; I think the rule comes from 6.3.2.3p2: [#2] For any qualifier q, a pointer to a non-q-qualified type may be converted to a pointer to the q-qualified version of the type; the values stored in the original and converted pointers shall compare equal. So it is possible to convert a non-const pointer to a const pointer, but only if the the target types are the same. In the example, they are not: the target type of a is char*, the target of b is const char*. Regards, Martin From martin at v.loewis.de Sun Feb 12 00:54:41 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Feb 2006 00:54:41 +0100 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: <1f7befae0602111411u1e9e01adn77077cdcfdb43d45@mail.gmail.com> References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <20060210102713.GJ10226@xs4all.nl> <1f7befae0602100943u5f6ef73eoece1609a8ecb81e1@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de> <43EDEDBE.5010104@v.loewis.de> <1f7befae0602111411u1e9e01adn77077cdcfdb43d45@mail.gmail.com> Message-ID: <43EE7941.30802@v.loewis.de> Tim Peters wrote: > The code in selectmodule when _MSC_VER is _not_ defined complains if a > socket fd is >= FD_SETSIZE _or_ is < 0. But the new code in > socketmodule on non-Windows boxes is happy with negative fds, saying > "fine" whenever fd < FD_SETSIZE. Is that right or wrong? I think it is right: the code just "knows" that negative values cannot happen. 
The socket handles originate from system calls (socket(2), accept(2)), and a negative value returned there is an error. However, the system might (and did) return handles larger than FD_SETSIZE (as the kernel often won't know what value FD_SETSIZE has). > "The answer" isn't so important to me as that this kind of crap always > happens when platform-specific logic ends up getting defined in > multiple modules. Much better to define macros to hide this junk, > exactly once; pyport.h is the natural place for it. That must be done carefully, though. For example, how should the line max = 0; /* not used for Win32 */ be treated? Should we introduce a #define Py_SELECT_NUMBER_OF_FDS_PARAMETER_IS_IRRELEVANT? Regards, Martin From ncoghlan at gmail.com Sun Feb 12 03:05:17 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2006 12:05:17 +1000 Subject: [Python-Dev] PEP 338 - Executing Modules as Scripts In-Reply-To: <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com> References: <43EDC4C9.3030405@gmail.com> <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com> Message-ID: <43EE97DD.1080207@gmail.com> Paul Moore wrote: > On 2/11/06, Nick Coghlan wrote: >> I finally finished updating PEP 338 to comply with the flexible importing >> system in PEP 302. >> >> The result is a not-yet-thoroughly-tested module that should allow the -m >> switch to execute any module written in Python that is accessible via an >> absolute import statement. > > Does this implementation resolve http://www.python.org/sf/1250389 as > well? A reading of the PEP would seem to imply that it does, but the > SF patches you mention don't include any changes to the core, so I'm > not sure... I copied the module and test packages over to my Python 2.4 site packages, and running modules from inside zip packages does indeed work as intended (with an explicit redirection through runpy, naturally). 
Kudos to the PEP 302 folks - I only tested with runpy.py's Python emulation of PEP 302 style imports for the normal file system initially, but zipimport still worked correctly on the first go. For Python 2.5, this redirection from the command line switch to runpy.run_module should be automatic. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From raymond.hettinger at verizon.net Sun Feb 12 03:49:47 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 11 Feb 2006 21:49:47 -0500 Subject: [Python-Dev] PEP 351 References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> <001501c62f57$2b070b60$6a01a8c0@RaymondLaptop1> Message-ID: <001c01c62f7e$fd086b50$b83efea9@RaymondLaptop1> [Noam] > I just wanted to say this: you can reject PEP 351, please don't reject > the idea of frozen objects completely. I'm working on an idea similar > to that of the PEP, . . . > I think these concerns can only be judged given a real suggestion, > along with an implementation. I have already implemented most of my > idea in CPython, and I think it's elegant and doesn't cause problems. > Of course, I may not be objective about the subject, but I only ask to > wait for the real suggestion before dropping it down I was afraid of this -- the freezing concept is a poison that will cause some good minds to waste a good deal of their time. Once frozensets were introduced, it was like lighting a flame drawing moths to their doom. At first, it seems like such a natural, obvious extension to generically freeze anything that is mutable. People exploring it seem to lose sight of motivating use cases and get progressively turned around. 
It doesn't take long to suddenly start thinking it is a good idea to have mutable strings, to recursively freeze components of a dictionary, to introduce further list/tuple variants, etc. Perhaps a consistent solution can be found, but it no longer resembles Python; rather, it is a new language, one that is not grounded in real-world use cases. Worse, I think a frozen() built-in would be hazardous to users, drawing them away from better solutions to their problems. Expect writing and defending a PEP to consume a month of your life. Before devoting more of your valuable time, here's a checklist of questions to ask yourself (sort of a mid-project self-assessment and reality check): 1. It is already possible to turn many objects into key strings -- perhaps by marshaling, pickling, or making a custom repr such as repr(sorted(mydict.items())). Have you ever had occasion to use this? IOW, have you ever really needed to use a dictionary as a key to another dictionary? Has there been any clamor for a frozendict(), not as a toy recipe but as a real user need that cannot be met by other Python techniques? If the answer is no, it should be a hint that a generalized freezing protocol will rot in the basement. 2. Before introducing a generalized freezing protocol, wouldn't it make sense to write a third-party extension for just frozendicts, just to see if anyone can possibly make productive use of it? One clue would be to search for code that exercises the existing code in dict.__eq__(). If you rarely have occasion to compare dicts, then it is certainly even more rare to want to be able to hash them. If not, then is this project being pursued because it is interesting or because there's a burning need that hasn't surfaced before? 3. Does working out the idea entail recursive freezing of a dictionary? Does that impose limits on generality (you can freeze some dicts but not others)? Does working out the idea lead you to mutable strings? If so, don't count on Guido's support.. 4. 
Leaving reality behind (meaning actual problems that aren't readily solvable with the existing language), try to contrive some hypothetical use cases. Are there any that are not readily met by the simple recipe in the earlier email: http://mail.python.org/pipermail/python-dev/2005-October/057586.html ? 5. How extensively does the rest of Python have to change to support the new built-in? If the patch ends up touching many objects and introducing new rules, then the payoff needs to be pretty darned good. I presume that for frozen(x) to work a lot of types have to be modified. Python seems to fare quite well without frozendicts and frozenlists, so do we need to introduce them just to make the new frozen() built-in work with more than just sets? Raymond From bokr at oz.net Sun Feb 12 04:24:17 2006 From: bokr at oz.net (Bengt Richter) Date: Sun, 12 Feb 2006 03:24:17 GMT Subject: [Python-Dev] PEP 351 References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> Message-ID: <43ee8092.556050558@news.gmane.org> On Sat, 11 Feb 2006 12:55:10 -0800, Alex Martelli wrote: > >On Feb 10, 2006, at 1:05 PM, Raymond Hettinger wrote: > >> [Guido van Rossum] >>> PEP 351 - freeze protocol. I'm personally -1; I don't like the >>> idea of >>> freezing arbitrary mutable data structures. Are there champions who >>> want to argue this? >> >> It has at least one anti-champion. I think it is a horrible idea >> and would >> like to see it rejected in a way that brings finality. If needed, >> I can >> elaborate in a separate thread. > >Could you please do that? I'd like to understand all of your >objections. Thanks! > > PMJI. I just read PEP 351, and had an idea for doing the same without pre-instantiating protected subclasses, and doing the wrapping on demand instead. Perhaps of interest? (Or if already considered and rejected, shouldn't this be mentioned in the PEP?) The idea is to factor out freezing from the objects to be frozen.
If it's going to involve copying anyway, feeding the object to a wrapping class constructor doesn't seem like much extra overhead. The examples in the PEP were very amenable to this approach, but I don't know how it would apply to whatever Alex's use cases might be. Anyhow, why shouldn't you be able to call freeze(an_ordinary_list) and get back freeze(xlist(an_ordinary_list)) automatically, based e.g. on a freeze_registry_dict[type(an_ordinary_list)] => xlist lookup, if plain hash fails? Common types that might be usefully freezable could be pre-registered, and when a freeze fails on a user object (presumably inheriting a __hash__ that bombs or because he wants it to) the programmer's solution would be to define a suitable callable to produce the frozen object, and register that, but not modify his unwrapped pre-freeze-mods object types and instantiations. BTW, xlist wouldn't need to exist, since freeze_registry_dict[type(alist)] could just return the tuple type. Otherwise the programmer would make a wrapper class taking the object as an __init__ (or maybe __new__) arg, and intercepting the mutating methods etc., and stuff that in the freeze_registry_dict. IWT some metaclass stuff might make it possible to parameterize a lot of wrapper class aspects, e.g., if you gave it a __mutator_method_name_list__ to work with. Perhaps freeze builtin could be a callable object with __call__ for the freeze "function" call and with e.g. freeze.register(objtype, wrapper_class) as a registry API. I am +0 on any of this in any case, not having had a use case to date, but I thought taking the __freeze__ out of the objects (by not forcing them to be pre-instantiated as wrapped instances) and letting registered freeze wrappers do it on demand instead might be interesting to someone.
If not, or if it's been discussed (no mention in the PEP tho) feel free to ignore ;-)

BTW, freeze as just described might be an instance of:

class Freezer(object):
    def __init__(self):
        self._registry_dict = {set: frozenset, list: tuple, dict: imdict}
    def __call__(self, obj):
        try:
            hash(obj)           # already hashable => already frozen
            return obj
        except TypeError:
            freezer = self._registry_dict.get(type(obj))
            if freezer:
                return freezer(obj)
            raise TypeError('object is not freezable')
    def register(self, objtype, wrapper):
        self._registry_dict[objtype] = wrapper

(above refers to imdict from PEP 351)

Usage example:

>>> import alt351
>>> freeze = alt351.Freezer()
(well, pretend freeze is builtin)
>>> fr5 = freeze(range(5))
>>> fr5
(0, 1, 2, 3, 4)
>>> d = dict(a=1, b=2)
>>> d
{'a': 1, 'b': 2}
>>> fd = freeze(d)
>>> fd
{'a': 1, 'b': 2}
>>> fd['a']
1
>>> fd['a'] = 3
Traceback (most recent call last):
  File "", line 1, in ?
  File "alt351.py", line 7, in _immutable
    raise TypeError('object is immutable')
TypeError: object is immutable
>>> type(fd)

+0 ;-)

Regards,
Bengt Richter

From tim.peters at gmail.com Sun Feb 12 04:35:35 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 11 Feb 2006 22:35:35 -0500
Subject: [Python-Dev] Pervasive socket failures on Windows
In-Reply-To: <43EE7941.30802@v.loewis.de>
References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <43ECF67C.1020700@scottdial.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de> <43EDEDBE.5010104@v.loewis.de> <1f7befae0602111411u1e9e01adn77077cdcfdb43d45@mail.gmail.com> <43EE7941.30802@v.loewis.de>
Message-ID: <1f7befae0602111935l625f667bjd6ad6d95b5945a30@mail.gmail.com>

[Tim]
>> The code in selectmodule when _MSC_VER is _not_ defined complains if a
>> socket fd is >= FD_SETSIZE _or_ is < 0. But the new code in
>> socketmodule on non-Windows boxes is happy with negative fds, saying
>> "fine" whenever fd < FD_SETSIZE.
Is that right or wrong?

[Martin]
> I think it is right: the code just "knows" that negative values
> cannot happen. The socket handles originate from system calls
> (socket(2), accept(2)), and a negative value returned there is
> an error. However, the system might (and did) return handles
> larger than FD_SETSIZE (as the kernel often won't know what
> value FD_SETSIZE has).

Since the new code was just added, you can remember that now. No comments record the reasoning, though, and over time it's likely to become another mass of micro-optimized "mystery code". If it's true that negative values can't happen (and I believe that), then it doesn't hurt to verify that they're >= 0 either (except from a micro-efficiency view), and it would simplify the code to do so.

>> "The answer" isn't so important to me as that this kind of crap always
>> happens when platform-specific logic ends up getting defined in
>> multiple modules. Much better to define macros to hide this junk,
>> exactly once; pyport.h is the natural place for it.

> That must be done carefully, though. For example, how should
> the line
>
>     max = 0; /* not used for Win32 */
>
> be treated? Should we introduce a
> #define Py_SELECT_NUMBER_OF_FDS_PARAMETER_IS_IRRELEVANT?

I wouldn't: I'd simply throw away the current confusing avoidance of computing "max" on Windows. That's another case where platform-specific micro-efficiency seems the only justification (select() on Windows ignores its first argument; there's nothing special about "0" here, despite that the code currently makes 0 _look_ special on Windows somehow). So fine by me if the current:

    #if defined(_MSC_VER)
        max = 0; /* not used for Win32 */
    #else /* !_MSC_VER */
        if (v < 0 || v >= FD_SETSIZE) {
            PyErr_SetString(PyExc_ValueError,
                            "filedescriptor out of range in select()");
            goto finally;
        }
        if (v > max)
            max = v;
    #endif /* _MSC_VER */

block got replaced by, e.g.,:

    max = 0;
    if (!Py_IS_SOCKET_FD_OK(v)) {
        PyErr_SetString(PyExc_ValueError,
                        "filedescriptor out of range in select()");
        goto finally;
    }
    if (v > max)
        max = v;

Unlike the current code, that would, for example, also allow for the _possibility_ of checking that v != INVALID_SOCKET on Windows, by fiddling the Windows expansion of Py_IS_SOCKET_FD_OK (and of course all users of that macro would grow the same new smarts).

I'm not really a macro fan: I'm a fan of centralizing portability hacks in config header files, and hiding them under abstractions. C macros are usually strong enough to support this, and are all the concession to micro-efficiency I'm eager ;-) to make.

From ncoghlan at gmail.com Sun Feb 12 04:48:20 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 12 Feb 2006 13:48:20 +1000
Subject: [Python-Dev] PEP 338 - Executing Modules as Scripts
In-Reply-To: <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com>
References: <43EDC4C9.3030405@gmail.com> <79990c6b0602110750n4dcb5e9ctedd03dd363a2d3f6@mail.gmail.com>
Message-ID: <43EEB004.7030004@gmail.com>

Paul Moore wrote:
> On 2/11/06, Nick Coghlan wrote:
>> I finally finished updating PEP 338 to comply with the flexible importing
>> system in PEP 302.
>>
>> The result is a not-yet-thoroughly-tested module that should allow the -m
>> switch to execute any module written in Python that is accessible via an
>> absolute import statement.
>
> Does this implementation resolve http://www.python.org/sf/1250389 as
> well? A reading of the PEP would seem to imply that it does, but the
> SF patches you mention don't include any changes to the core, so I'm
> not sure...

I've uploaded a patch with the necessary changes to main.c to the PEP 338 implementation tracker item.

Cheers,
Nick.
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From nnorwitz at gmail.com Sun Feb 12 06:59:23 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sat, 11 Feb 2006 21:59:23 -0800
Subject: [Python-Dev] Pervasive socket failures on Windows
In-Reply-To: <1f7befae0602111935l625f667bjd6ad6d95b5945a30@mail.gmail.com>
References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de> <43EDEDBE.5010104@v.loewis.de> <1f7befae0602111411u1e9e01adn77077cdcfdb43d45@mail.gmail.com> <43EE7941.30802@v.loewis.de> <1f7befae0602111935l625f667bjd6ad6d95b5945a30@mail.gmail.com>
Message-ID: 

On 2/11/06, Tim Peters wrote:
>> [Tim telling how I broke Python]
> [Martin fixing it]

Sorry for the breakage (I didn't know about the Windows issues). Thank you Martin for fixing it. I agree with the solution.

I was away from mail, ahem, "working".

n

From nnorwitz at gmail.com Sun Feb 12 07:32:58 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sat, 11 Feb 2006 22:32:58 -0800
Subject: [Python-Dev] release plan for 2.5 ?
In-Reply-To: 
References: 
Message-ID: 

On 2/10/06, Guido van Rossum wrote:
>
> Next, the schedule. Neal's draft of the schedule has us releasing 2.5
> in October. That feels late -- nearly two years after 2.4 (which was
> released on Nov 30, 2004). Do people think it's reasonable to strive
> for a more aggressive (by a month) schedule, like this:
>
> alpha 1: May 2006
> alpha 2: June 2006
> beta 1: July 2006
> beta 2: August 2006
> rc 1: September 2006
> final: September 2006

I think this is very reasonable. Based on Martin's message, and if we can get everyone fired up and implementing, it would be possible to start in April. I'll update the PEP for starting in May now.
We can revise further later.

> ??? Would anyone want to be even more aggressive (e.g. alpha 1 right
> after PyCon???). We could always do three alphas.

I think PyCon is too early, but 3 alphas is a good idea. I'll add this as well. Probably separated by 3-4 weeks so it doesn't change the schedule much. The exact schedule will still change based on release manager availability and other stuff that needs to be implemented.

> > PEP 353: Using ssize_t as the index type
>
> Neal tells me that this is in progress in a branch, but that the code
> is not yet flawless (tons of warnings etc.). Martin, can you tell us
> more? When do you expect this to land? Maybe aggressively merging into
> the HEAD and then releasing it as alpha would be a good way to shake
> out the final issues???

I'm tempted to say we should merge now. I know the branch works on 64-bit boxes. I can test on a 32-bit box if Martin hasn't already. There will be a lot of churn fixing problems, but maybe we can get more people involved.

n

From nnorwitz at gmail.com Sun Feb 12 07:38:10 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sat, 11 Feb 2006 22:38:10 -0800
Subject: [Python-Dev] release plan for 2.5 ?
In-Reply-To: 
References: 
Message-ID: 

On 2/10/06, Georg Brandl wrote:
>
> I am not experienced in releasing, but with the multitude of new things
> introduced in Python 2.5, could it be a good idea to release an early alpha
> not long after all (most of?) the desired features are in the trunk?

In the past, all new features had to be in before beta 1 IIRC (it could have been beta 2 though). The goal is to get things in sooner, preferably prior to alpha. For 2.5, we should strive really hard to get features implemented prior to alpha 1. Some of the changes (AST, ssize_t) are pervasive. AST, while localized, ripped the guts out of something every script needs (more or less). ssize_t touches just about everything it seems.
n From thomas at xs4all.net Sun Feb 12 11:51:41 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sun, 12 Feb 2006 11:51:41 +0100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: Message-ID: <20060212105141.GS10226@xs4all.nl> On Sat, Feb 11, 2006 at 10:38:10PM -0800, Neal Norwitz wrote: > On 2/10/06, Georg Brandl wrote: > > I am not experienced in releasing, but with the multitude of new things > > introduced in Python 2.5, could it be a good idea to release an early alpha > > not long after all (most of?) the desired features are in the trunk? > In the past, all new features had to be in before beta 1 IIRC (it > could have been beta 2 though). The goal is to get things in sooner, > preferably prior to alpha. Well, in the past, features -- even syntax changes -- have gone in between the last beta and the final release (but reminding Guido might bring him to tears of regret. ;) Features have also gone into what would have been 'bugfix releases' if you looked at the numbering alone (1.5 -> 1.5.1 -> 1.5.2, for instance.) "The past" doesn't have a very impressive track record... However, beta 1 is a very good ultimate deadline, and it's been stuck by for the last few years, AFAIK. But I concur with: > For 2.5, we should strive really hard to get features implemented > prior to alpha 1. Some of the changes (AST, ssize_t) are pervasive. > AST while localized, ripped the guts out of something every script > needs (more or less). ssize_t touches just about everything it seems. that as many features as possible, in particular the broad-touching ones, should be in alpha 1. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin at v.loewis.de Sun Feb 12 12:13:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Feb 2006 12:13:53 +0100 Subject: [Python-Dev] ssize_t branch (Was: release plan for 2.5 ?) 
In-Reply-To: 
References: 
Message-ID: <43EF1871.9030304@v.loewis.de>

Neal Norwitz wrote:
> I'm tempted to say we should merge now. I know the branch works on
> 64-bit boxes. I can test on a 32-bit box if Martin hasn't already.
> There will be a lot of churn fixing problems, but maybe we can get
> more people involved.

The ssize_t branch has now all the API I want it to have. I just posted the PEP to comp.lang.python; maybe people have additional things they consider absolutely necessary.

There are two aspects left, and both can be done after the merge:
- a lot of modules still need adjustments, to really support 64-bit collections. This shouldn't cause any API changes, AFAICT.
- the printing of Py_ssize_t values should be supported. I think Tim proposed to provide the 'z' formatter across platforms. This is a new API, but it's a pure extension, so it can be done in the trunk.

I would like to avoid changing APIs after the merge to the trunk has happened; I remember Guido saying (a few years ago) that this change must be a single large change, rather than many small incremental changes. I agree, and I hope I have covered everything that needs to be covered.

Regards,
Martin

From smiles at worksmail.net Sun Feb 12 19:44:51 2006
From: smiles at worksmail.net (Smith)
Date: Sun, 12 Feb 2006 12:44:51 -0600
Subject: [Python-Dev] nice()
Message-ID: <038701c63004$733603c0$132c4fca@csmith>

I've been thinking about a function that was recently proposed at python-dev named 'areclose'. It is a function that is meant to tell whether two (or possibly more) numbers are close to each other. It is a function similar to one that exists in Numeric. One such implementation is:

def areclose(x, y, abs_tol=1e-8, rel_tol=1e-5):
    diff = abs(x - y)
    return diff <= abs_tol or diff <= rel_tol*max(abs(x), abs(y))

(This is the form given by Scott Daniels on python-dev.)
Anyway, one of the rationales for including such a function was:

    When teaching some programming to total newbies, a common frustration
    is how to explain why a==b is False when a and b are floats computed
    by different routes which ``should'' give the same results (if
    arithmetic had infinite precision). Decimals can help, but another
    approach I've found useful is embodied in Numeric.allclose(a,b) --
    which returns True if all items of the arrays are ``close'' (equal to
    within certain absolute and relative tolerances)

The problem with the above function, however, is that it *itself* has a comparison between floats, and it will give an undesired result for something like the following test:

###
>>> print areclose(2, 2.1, .1, 0)  # see if 2 and 2.1 are within 0.1 of each other
False
>>>
###

Here is an alternative that might be a nice companion to the repr() and round() functions: nice(). It is a combination of Tim Peters' delightful 'case closed' presentation in the thread, "Rounding to n significant digits?" [1] and the hidden magic of "print"'s simplification of floating point numbers when being asked to show them.

Its default behavior is to return a number in the form that the number would have when being printed. An optional argument, however, allows the user to specify the number of digits to round the number to as counted from the most significant digit. (An alternative name, then, could be 'lround', but I think there is less baggage for the new user to think about if the name is something like nice()--a function that makes the floating point numbers "play nice." And I also think the name...sounds nice.)
Here it is in action:

###
>>> 3*1.1 == 3.3
False
>>> nice(3*1.1) == nice(3.3)
True
>>> x = 3.21/0.65; print x
4.93846153846
>>> print nice(x, 2)
4.9
>>> x = x*1e5; print nice(x, 2)
490000.0
###

Here's the function:

###
def nice(x, leadingDigits=0):
    """Return x either as 'print' would show it (the default) or rounded
    to the specified digit as counted from the leftmost non-zero digit
    of the number, e.g. nice(0.00326, 2) --> 0.0033"""
    assert leadingDigits >= 0
    if leadingDigits == 0:
        return float(str(x))  # just give it back like 'print' would give it
    leadingDigits = int(leadingDigits)
    # %e precision counts digits *after* the first, so use leadingDigits-1
    # to round to leadingDigits significant digits
    return float('%.*e' % (leadingDigits - 1, x))
###

Might something like this be useful? For new users, no arguments are needed other than x, and floating points suddenly seem to behave in tests made using nice() values. It's also useful for those computing who want to show a physically meaningful value that has been rounded to the appropriate digit as counted from the most significant digit rather than from the decimal point.

Some time back I had worked on the significant digit problem and had several math calls to figure out what the exponent was. The beauty of Tim's solution is that you just use built-in string formatting to do the work. Nice.

/c

[1] http://mail.python.org/pipermail/tutor/2004-July/030324.html

From mwh at python.net Mon Feb 13 00:30:27 2006
From: mwh at python.net (Michael Hudson)
Date: Sun, 12 Feb 2006 23:30:27 +0000
Subject: [Python-Dev] release plan for 2.5 ?
In-Reply-To: <5.1.1.6.0.20060210154235.02007160@mail.telecommunity.com> (Phillip J. Eby's message of "Fri, 10 Feb 2006 16:07:50 -0500")
References: <5.1.1.6.0.20060210154235.02007160@mail.telecommunity.com>
Message-ID: <2mirrkcfzg.fsf@starship.python.net>

"Phillip J.
Eby" writes:
> At 12:21 PM 2/10/2006 -0800, Guido van Rossum wrote:
>> > PEP 343: The "with" Statement
>>
>> Didn't Michael Hudson have a patch?
>
> PEP 343's "Accepted" status was reverted to "Draft" in October, and then
> changed back to "Accepted". I believe the latter change is an error, since
> you haven't pronounced on the changes. Have you reviewed the __context__
> stuff that was added?
>
> In any case Michael's patch was pre-AST branch merge, and no longer
> reflects the current spec.

It also never quite reflected the spec at the time, although I forget the detail it didn't support :/

Cheers,
mwh

-- 
81. In computing, turning the obvious into the useful is a living definition of the word "frustration". -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html

From kxroberto at googlemail.com Sun Feb 12 21:46:50 2006
From: kxroberto at googlemail.com (Robert)
Date: Sun, 12 Feb 2006 21:46:50 +0100
Subject: [Python-Dev] Fwd: Ruby/Python Continuations: Turning a block callback into a read()-method ?
Message-ID: <43EF9EBA.2060609@googlemail.com>

Fwd: news:

After failing on a yield/iterator-continuation problem in Python (see below), I tried the Ruby (1.8.2) language for the first time on that construct. The example tries to convert a block callback interface (Net::FTP.retrbinary) into a read()-like iterator function, in order to virtualize the existing FTP class as a kind of file system -- 4 bytes max per read in this first simple test below. But it fails with a ThreadError on the second continuation, even though that second continuation really executes!?

Any ideas how to make this work/correct? (The question is not about the specific FTP example as such -- e.g. about a rewrite of FTP/retrbinary, or use of OS tricks, real threads with polling etc... -- but about the continuation language trick needed to get the execution flow right, in order to turn any callback interface into an "enslaved callable iterator".
Python can do such things in simple situations with yield-generator functions/iter.next()... But Python obviously fails by a hair when there is a function-context barrier for "yield". Ruby's block/yield mechanism seems not to have the power of real generator-continuation as in Python at all, but in principle to be only what a normal callback would be in Python. Yet "callcc" seemed to be promising - I thought so far :-( )

=== Ruby callcc Pattern : execution fails with ThreadError!? ===========

require 'net/ftp'

module Net
  class FTPFile
    def initialize(ftp, path)
      @ftp = ftp
      @path = path
      @flag = true
      @iter = nil
    end
    def read
      if @iter
        puts "@iter.call"
        @iter.call
      else
        puts "RETR " + @path
        @ftp.retrbinary("RETR " + @path, 4) do |block|
          print "CALLBACK ", block, "\n"
          callcc { |@iter| @flag = true } if @flag
          @flag = false
          return block
        end
      end
    end
  end
end

ftp = Net::FTP.new("localhost", 'user', 'pass')
ff = Net::FTPFile.new(ftp, 'data.txt')
puts ff.read()
puts ff.read()

=== Output/Error ====

vs:~/test$ ruby ftpfile.rb
RETR data.txt
CALLBACK robe
robe
@iter.call
CALLBACK rt
/usr/lib/ruby/1.8/monitor.rb:259:in `mon_check_owner': current thread not owner (ThreadError)
        from /usr/lib/ruby/1.8/monitor.rb:211:in `mon_exit'
        from /usr/lib/ruby/1.8/monitor.rb:231:in `synchronize'
        from /usr/lib/ruby/1.8/net/ftp.rb:399:in `retrbinary'
        from ftpfile.rb:17:in `read'
        from ftpfile.rb:33
vs:~/test$

=== Python Pattern : I cannot write down the idea because of a barrier ===

#### I tried a pattern like:

....
    def open(self, ftppath, mode='rb'):
        class FTPFile:
            ...
            def iter_retr():
                ...
                def callback(blk):
                    how-to-yield-from-here-as-iter_retr blk???
                self.ftp.retrbinary("RETR %s" % self.relpath, callback)
            def read(self, bytes=-1):
                ...
                self.buf += self.iter.next()
                ...
            ...
....
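The callback-to-iterator inversion asked about here can be done in Python without continuations by running the callback-driven producer in a worker thread and feeding blocks through a queue. A sketch (my illustration, not from the thread; iterate_callback and fake_retrbinary are invented names, with fake_retrbinary standing in for ftp.retrbinary):

```python
import queue
import threading

def iterate_callback(producer):
    """Invert a callback-style producer into a plain iterator.

    `producer` is any function that takes a one-argument callback and
    calls it once per block of data. A worker thread runs the producer
    and feeds each block through a bounded queue to the consumer."""
    q = queue.Queue(maxsize=1)
    done = object()  # sentinel marking the end of the stream

    def run():
        try:
            producer(q.put)  # every callback invocation enqueues a block
        finally:
            q.put(done)

    threading.Thread(target=run, daemon=True).start()
    while True:
        block = q.get()
        if block is done:
            return
        yield block

# A fake retrbinary that delivers data 4 bytes at a time, as in the test above.
def fake_retrbinary(callback):
    data = b"robert"
    for i in range(0, len(data), 4):
        callback(data[i:i + 4])

print(list(iterate_callback(fake_retrbinary)))  # [b'robe', b'rt']
```

The bounded queue makes the producer thread block until each chunk is consumed, so memory use stays constant no matter how large the transfer; this is one of the "real threads" approaches Robert set aside, but it needs no language support.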
===== Robert From alan.gauld at freenet.co.uk Mon Feb 13 00:24:45 2006 From: alan.gauld at freenet.co.uk (Alan Gauld) Date: Sun, 12 Feb 2006 23:24:45 -0000 Subject: [Python-Dev] [Tutor] nice() References: <038701c63004$733603c0$132c4fca@csmith> Message-ID: <00a001c6302b$82d51f10$0b01a8c0@xp> I have no particularly strong view on the concept (except that I usually see the "problem" as a valuable opportunity to introduce a concept that has far wider reaching consequences than floating point numbers!). However I do dislike the name nice() - there is already a nice() in the os module with a fairly well understood function. But I'm sure some time with a thesaurus can overcome that single mild objection. :-) Alan G Author of the learn to program web tutor http://www.freenetpages.co.uk/hp/alan.gauld ----- Original Message ----- From: "Smith" To: Cc: ; Sent: Sunday, February 12, 2006 6:44 PM Subject: [Tutor] nice() I've been thinking about a function that was recently proposed at python-dev named 'areclose'. It is a function that is meant to tell whether two (or possible more) numbers are close to each other. It is a function similar to one that exists in Numeric. One such implementation is def areclose(x,y,abs_tol=1e-8,rel_tol=1e-5): diff = abs(x-y) return diff <= ans_tol or diff <= rel_tol*max(abs(x),abs(y)) (This is the form given by Scott Daniels on python-dev.) Anyway, one of the rationales for including such a function was: When teaching some programming to total newbies, a common frustration is how to explain why a==b is False when a and b are floats computed by different routes which ``should'' give the same results (if arithmetic had infinite precision). 
Decimals can help, but another approach I've found useful is embodied in Numeric.allclose(a,b) -- which returns True if all items of the arrays are ``close'' (equal to within certain absolute and relative tolerances) The problem with the above function, however, is that it *itself* has a comparison between floats and it will give undesired result for something like the following test: ### >>> print areclose(2, 2.1, .1, 0) #see if 2 and 2.1 are within 0.1 of each >>> other False >>> ### Here is an alternative that might be a nice companion to the repr() and round() functions: nice(). It is a combination of Tim Peter's delightful 'case closed' presentation in the thread, "Rounding to n significant digits?" [1] and the hidden magic of "prints" simplification of floating point numbers when being asked to show them. It's default behavior is to return a number in the form that the number would have when being printed. An optional argument, however, allows the user to specify the number of digits to round the number to as counted from the most significant digit. (An alternative name, then, could be 'lround' but I think there is less baggage for the new user to think about if the name is something like nice()--a function that makes the floating point numbers "play nice." And I also think the name...sounds nice.) Here it is in action: ### >>> 3*1.1==3.3 False >>> nice(3*1.1)==nice(3.3) True >>> x=3.21/0.65; print x 4.93846153846 >>> print nice(x,2) 4.9 >>> x=x*1e5; print nice(x,2) 490000.0 ### Here's the function: ### def nice(x,leadingDigits=0): """Return x either as 'print' would show it (the default) or rounded to the specified digit as counted from the leftmost non-zero digit of the number, e.g. 
nice(0.00326,2) --> 0.0033""" assert leadingDigits>=0 if leadingDigits==0: return float(str(x)) #just give it back like 'print' would give it leadingDigits=int(leadingDigits) return float('%.*e' % (leadingDigits,x)) #give it back as rounded by the %e format ### Might something like this be useful? For new users, no arguments are needed other than x and floating points suddenly seem to behave in tests made using nice() values. It's also useful for those computing who want to show a physically meaningful value that has been rounded to the appropriate digit as counted from the most significant digit rather than from the decimal point. Some time back I had worked on the significant digit problem and had several math calls to figure out what the exponent was. The beauty of Tim's solution is that you just use built in string formatting to do the work. Nice. /c [1] http://mail.python.org/pipermail/tutor/2004-July/030324.html From jcarlson at uci.edu Mon Feb 13 01:14:50 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 12 Feb 2006 16:14:50 -0800 Subject: [Python-Dev] [Tutor] nice() In-Reply-To: <00a001c6302b$82d51f10$0b01a8c0@xp> References: <038701c63004$733603c0$132c4fca@csmith> <00a001c6302b$82d51f10$0b01a8c0@xp> Message-ID: <20060212161410.5F02.JCARLSON@uci.edu> "Alan Gauld" wrote: > However I do dislike the name nice() - there is already a nice() in the > os module with a fairly well understood function. But I'm sure some > time with a thesaurus can overcome that single mild objection. :-) Presumably it would be located somewhere like the math module. - Josiah > Alan G > Author of the learn to program web tutor > http://www.freenetpages.co.uk/hp/alan.gauld > > > > ----- Original Message ----- > From: "Smith" > To: > Cc: ; > Sent: Sunday, February 12, 2006 6:44 PM > Subject: [Tutor] nice() > > > I've been thinking about a function that was recently proposed at python-dev > named 'areclose'. 
It is a function that is meant to tell whether two (or > possible more) numbers are close to each other. It is a function similar to > one that exists in Numeric. One such implementation is > > def areclose(x,y,abs_tol=1e-8,rel_tol=1e-5): > diff = abs(x-y) > return diff <= ans_tol or diff <= rel_tol*max(abs(x),abs(y)) > > (This is the form given by Scott Daniels on python-dev.) > > Anyway, one of the rationales for including such a function was: > > When teaching some programming to total newbies, a common frustration > is how to explain why a==b is False when a and b are floats computed > by different routes which ``should'' give the same results (if > arithmetic had infinite precision). Decimals can help, but another > approach I've found useful is embodied in Numeric.allclose(a,b) -- > which returns True if all items of the arrays are ``close'' (equal to > within certain absolute and relative tolerances) > The problem with the above function, however, is that it *itself* has a > comparison between floats and it will give undesired result for something > like the following test: > > ### > >>> print areclose(2, 2.1, .1, 0) #see if 2 and 2.1 are within 0.1 of each > >>> other > False > >>> > ### > > Here is an alternative that might be a nice companion to the repr() and > round() functions: nice(). It is a combination of Tim Peter's delightful > 'case closed' presentation in the thread, "Rounding to n significant > digits?" [1] and the hidden magic of "prints" simplification of floating > point numbers when being asked to show them. > > It's default behavior is to return a number in the form that the number > would have when being printed. An optional argument, however, allows the > user to specify the number of digits to round the number to as counted from > the most significant digit. 
(An alternative name, then, could be 'lround' > but I think there is less baggage for the new user to think about if the > name is something like nice()--a function that makes the floating point > numbers "play nice." And I also think the name...sounds nice.) > > Here it is in action: > > ### > >>> 3*1.1==3.3 > False > >>> nice(3*1.1)==nice(3.3) > True > >>> x=3.21/0.65; print x > 4.93846153846 > >>> print nice(x,2) > 4.9 > >>> x=x*1e5; print nice(x,2) > 490000.0 > ### > > Here's the function: > ### > def nice(x,leadingDigits=0): > """Return x either as 'print' would show it (the default) or rounded to the > specified digit as counted from the leftmost non-zero digit of the number, > > e.g. nice(0.00326,2) --> 0.0033""" > assert leadingDigits>=0 > if leadingDigits==0: > return float(str(x)) #just give it back like 'print' would give it > leadingDigits=int(leadingDigits) > return float('%.*e' % (leadingDigits,x)) #give it back as rounded by the %e > format > ### > > Might something like this be useful? For new users, no arguments are needed > other than x and floating points suddenly seem to behave in tests made using > nice() values. It's also useful for those computing who want to show a > physically meaningful value that has been rounded to the appropriate digit > as counted from the most significant digit rather than from the decimal > point. > > Some time back I had worked on the significant digit problem and had several > math calls to figure out what the exponent was. The beauty of Tim's solution > is that you just use built in string formatting to do the work. Nice. 
> > /c > > [1] http://mail.python.org/pipermail/tutor/2004-July/030324.html > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jcarlson%40uci.edu From kd5bjo at gmail.com Mon Feb 13 01:19:46 2006 From: kd5bjo at gmail.com (Eric Sumner) Date: Sun, 12 Feb 2006 18:19:46 -0600 Subject: [Python-Dev] PEP 343: Context managers a superset of decorators? Message-ID: Forgive me if someone has already come up with this; I know I am coming to the party several months late. All of the proposals for decorators (including the accepted one) seemed a bit kludgey to me, and I couldn't figure out why. When I read PEP 343, I realized that they all provide a solution for an edge case without addressing the larger problem. If context managers are provided access to the contained and containing namespaces of their with statement, they can perform the same function that decorators do now. 
A transforming class could be implemented as: ## Code Start ------------------------------------------------- class DecoratorContext(object): def __init__(self, func): self.func = func def __context__(self): return self def __enter__(self, contained, containing): pass def __exit__(self, contained, containing): for k,v in contained.iteritems(): containing[k] = self.func(v) ## Code End --------------------------------------------------- With this in place, decorators can be used with the with statement: ## Code Start ------------------------------------------------- classmethod = DecoratorContext(classmethod) class foo: def __init__(self, ...): pass with classmethod: def method1(cls, ...): pass def method2(cls, ...): pass ## Code End --------------------------------------------------- The extra level of indention could be avoided by dealing with multiple block-starting statements on a line by stating that all except the last block contain only one statement: ## Code Start ------------------------------------------------- classmethod = DecoratorContext(classmethod) class foo: def __init__(self, ...): pass with classmethod: def method1(cls, ...): pass with classmethod: def method2(cls, ...): pass ## Code End --------------------------------------------------- I will readily admit that I have no idea how difficult either of these suggestions would be to implement, or if it would be a good idea to do so. At this point, they are just something to think about -- Eric Sumner From jcarlson at uci.edu Mon Feb 13 03:24:18 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 12 Feb 2006 18:24:18 -0800 Subject: [Python-Dev] PEP 343: Context managers a superset of decorators? In-Reply-To: References: Message-ID: <20060212181838.5F07.JCARLSON@uci.edu> Eric Sumner wrote: > Forgive me if someone has already come up with this; I know I am > coming to the party several months late. 
All of the proposals for > decorators (including the accepted one) seemed a bit kludgey to me, > and I couldn't figure out why. When I read PEP 343, I realized that > they all provide a solution for an edge case without addressing the > larger problem. [snip code samples] > I will readily admit that I have no idea how difficult either of these > suggestions would be to implement, or if it would be a good idea to do > so. At this point, they are just something to think about Re-read the decorator PEP: http://www.python.org/peps/pep-0318.html to understand why both of these options (indentation and prefix notation) are undesirable for a general decorator syntax. The desire for context managers to have access to its enclosing scope is another discussion entirely, though it may do so without express permission via stack frame manipulation. - Josiah From martin at v.loewis.de Mon Feb 13 05:06:21 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 13 Feb 2006 05:06:21 +0100 Subject: [Python-Dev] Fwd: Ruby/Python Continuations: Turning a block callback into a read()-method ? In-Reply-To: <43EF9EBA.2060609@googlemail.com> References: <43EF9EBA.2060609@googlemail.com> Message-ID: <43F005BD.2080504@v.loewis.de> Robert wrote: > Any ideas how to make this work/correct? Why is that a question for python-dev?
Regards, Martin From steve at holdenweb.com Mon Feb 13 05:39:10 2006 From: steve at holdenweb.com (Steve Holden) Date: Sun, 12 Feb 2006 23:39:10 -0500 Subject: [Python-Dev] Pervasive socket failures on Windows In-Reply-To: References: <1f7befae0602091917t6ea68d8ah6cc990846921f158@mail.gmail.com> <1f7befae0602101314u308909aeha3c8c636d2bbac64@mail.gmail.com> <43ED06A8.9000400@v.loewis.de> <1f7befae0602101349x19721c6bp87dff31d4a2461ee@mail.gmail.com> <43ED1891.5070907@v.loewis.de> <43EDEDBE.5010104@v.loewis.de> <1f7befae0602111411u1e9e01adn77077cdcfdb43d45@mail.gmail.com> <43EE7941.30802@v.loewis.de> <1f7befae0602111935l625f667bjd6ad6d95b5945a30@mail.gmail.com> Message-ID: Neal Norwitz wrote: > On 2/11/06, Tim Peters wrote: > >>>[Tim telling how I broke python] >> >>[Martin fixing it] > > > Sorry for the breakage (I didn't know about the Windows issues). > Thank you Martin for fixing it. I agree with the solution. > > I was away from mail, ahem, "working". > yeah, right, at your off-site boondoggle south of the border. we know. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From greg.ewing at canterbury.ac.nz Mon Feb 13 07:10:18 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 13 Feb 2006 19:10:18 +1300 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43EE7737.4080503@v.loewis.de> References: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> <20060211205735.GA24548@code0.codespeak.net> <43EE7737.4080503@v.loewis.de> Message-ID: <43F022CA.1090200@canterbury.ac.nz> Martin v. Löwis wrote: > then, in C++, 4.4p4 [conv.qual] has a rather longish formula to > decide that the assignment is well-formed. In essence, it goes > like this: > > [A large head-exploding set of rules] Blarg. Const - Just Say No.
Greg From greg.ewing at canterbury.ac.nz Mon Feb 13 07:34:57 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 13 Feb 2006 19:34:57 +1300 Subject: [Python-Dev] nice() In-Reply-To: <038701c63004$733603c0$132c4fca@csmith> References: <038701c63004$733603c0$132c4fca@csmith> Message-ID: <43F02891.5080908@canterbury.ac.nz> Smith wrote: > When teaching some programming to total newbies, a common frustration > is how to explain why a==b is False when a and b are floats computed > by different routes which ``should'' give the same results (if > arithmetic had infinite precision). This is just a special case of the problems inherent in the use of floating point. As with all of these, papering over this particular one isn't going to help in the long run -- another one will pop up in due course. Seems to me it's better to educate said newbies not to use algorithms that require comparing floats for equality at all. In my opinion, if you ever find yourself trying to do this, you're not thinking about the problem correctly, and your algorithm is simply wrong, even if you had infinitely precise floats. Greg From greg.ewing at canterbury.ac.nz Mon Feb 13 07:35:07 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 13 Feb 2006 19:35:07 +1300 Subject: [Python-Dev] PEP 351 In-Reply-To: <43ee8092.556050558@news.gmane.org> References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> <43ee8092.556050558@news.gmane.org> Message-ID: <43F0289B.1070006@canterbury.ac.nz> Bengt Richter wrote: > Anyhow, why shouldn't you be able to call freeze(an_ordinary_list) and get back freeze(xlist(an_ordinary_list)) > automatically, based e.g. on a freeze_registry_dict[type(an_ordinary_list)] => xlist lookup, if plain hash fails? [Cue: sound of loud alarm bells going off in Greg's head] -1 on having any kind of global freezing registry. 
If we need freezing at all, I think it would be quite sufficient to have a few types around such as frozenlist(), frozendict(), etc. I would consider it almost axiomatic that code needing to freeze something will know what type of thing it is freezing. If it doesn't, it has no business attempting to do so. If you need to freeze something not covered by the standard frozen types, write your own class or function to handle it, and invoke it explicitly where appropriate. Greg From kd5bjo at gmail.com Mon Feb 13 11:52:28 2006 From: kd5bjo at gmail.com (Eric Sumner) Date: Mon, 13 Feb 2006 04:52:28 -0600 Subject: [Python-Dev] PEP 343: Context managers a superset of decorators? In-Reply-To: <20060212181838.5F07.JCARLSON@uci.edu> References: <20060212181838.5F07.JCARLSON@uci.edu> Message-ID: On 2/12/06, Josiah Carlson wrote: [paragraphs swapped] > The desire for context managers to have access to its enclosing scope is > another discussion entirely, though it may do so without express > permission via stack frame manipulation. My main point was that, with relatively small changes to 343, it can replace the decorator syntax with a more general solution that matches the style of the rest of the language better. The main change (access to scopes) makes this possible, and the secondary change (altering the block syntax) mitigates (but does not remove) the syntax difficulties presented. I realize that I made an assumption that may not be valid; namely, that a new scope is generated by the 'with' statement. Stack frame manipulation would not be able to provide access to a scope that no longer exists. > Re-read the decorator PEP: http://www.python.org/peps/pep-0318.html to > understand why both of these options (indentation and prefix notation) > are undesireable for a general decorator syntax. With the changes that I propose, both syntaxes are equivalent and can be used interchangeably. 
While each of them has problems, I believe that in situations where one has a problem, the other usually does not. From this point on, I provide a point-by-point reaction to the most applicable syntax objections listed in PEP 318. If you're not interested in this, bail out now. In the PEP, there is no discussion of a prefix notation in which the decorator is placed before the 'def' on the same line. The most similar example has the decorator between the 'def' and the parameter list. It mentions two problems: > There are a couple of objections to this form. The first is that it breaks > easily 'greppability' of the source -- you can no longer search for 'def foo(' > and find the definition of the function. The second, more serious, objection > is that in the case of multiple decorators, the syntax would be extremely > unwieldy. The first does not apply, as this syntax does not separate 'def' and the function name. The second is still a valid concern, but the decorator list can easily be broken across multiple lines. The main objection to an indented syntax seems to be that it requires decorated functions to be indented an extra level. For simple decorators, the compacted syntax could be used to sidestep this problem. The main complaints about the J2 proposal don't quite apply: the code in the block is a sequence of statements and 'with' is already going to be added to the language as a compound statement. -- Eric From ncoghlan at gmail.com Mon Feb 13 12:15:29 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Feb 2006 21:15:29 +1000 Subject: [Python-Dev] PEP 343: Context managers a superset of decorators? In-Reply-To: References: <20060212181838.5F07.JCARLSON@uci.edu> Message-ID: <43F06A51.206@gmail.com> Eric Sumner wrote: > I realize that I made an assumption that may not be valid; > namely, that a new scope is generated by the 'with' statement. The with statement uses the existing scope - it's just a way of factoring out try/finally boilerplate code.
No more, and, in fact, fractionally less (the 'less' being the fact that just like any other Python function, you only get to supply one value to be bound to a name in the invoking scope). Trying to link this with the function definition pipelining provided by decorators seems like a bit of a stretch. It certainly isn't a superset of the decorator functionality - if you want a statement that manipulates the namespace it contains, that's what class statements are for :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Mon Feb 13 16:03:29 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 13 Feb 2006 16:03:29 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available Message-ID: The above docs are from August 2005 while docs.python.org/dev is current. Shouldn't the old docs be removed? Georg From kd5bjo at gmail.com Mon Feb 13 17:42:30 2006 From: kd5bjo at gmail.com (Eric Sumner) Date: Mon, 13 Feb 2006 10:42:30 -0600 Subject: [Python-Dev] PEP 343: Context managers a superset of decorators? In-Reply-To: <43F06A51.206@gmail.com> References: <20060212181838.5F07.JCARLSON@uci.edu> <43F06A51.206@gmail.com> Message-ID: On 2/13/06, Nick Coghlan wrote: > Eric Sumner wrote: > > I realize that I made an assumption that may not be valid; > > namely, that a new scope is generated by the 'with' statement. > > The with statement uses the existing scope - its just a way of factoring out > try/finally boilerplate code. No more, and, in fact, fractionally less (the > 'less' being the fact that just like any other Python function, you only get > to supply one value to be bound to a name in the invoking scope). Ok. These changes are more substantial than I thought, then. > Trying to link this with the function definition pipelining provided by > decorators seems like a bit of a stretch. 
It certainly isn't a superset of the > decorator functionality - if you want a statement that manipulates the > namespace it contains, that's what class statements are for :) Several examples of how the 'with' block would be used involve transactions which are either rolled back or confirmed. All of these use the transaction capabilities of some external database. With separate scopes, the '__exit__' function can decide which names to export outwards to the containing scope. Unlike class statements, the contained scope is used temporarily and can be discarded when the 'with' statement is completed. This would allow a context manager to provide a local transaction handler. To me, it is not much of a leap from copying data between scopes to modifying it as it goes through, which is exactly what decorators do. The syntax that this provides for decorators seems reasonable enough (to me) to make the '@' syntax redundant. However, this is a larger change than I thought, and maybe not worth the effort to implement. -- Eric From smiles at worksmail.net Mon Feb 13 18:10:28 2006 From: smiles at worksmail.net (Smith) Date: Mon, 13 Feb 2006 11:10:28 -0600 Subject: [Python-Dev] nice() References: Message-ID: <004f01c630c0$f051e1f0$5f2c4fca@csmith> | From: Josiah Carlson | "Alan Gauld" wrote: || However I do dislike the name nice() - there is already a nice() in || the || os module with a fairly well understood function. perhaps trim(), nearly(), about(), defer_the_pain_of() :-) I've waited to think of names until after writing this. The reason for the last name option may become apparent after reading the rest of this post. || But I'm sure some || time with a thesaurus can overcome that single mild objection. :-) | | Presumably it would be located somewhere like the math module. I would like to see it as accessible as round, int, float, and repr. I really think a round-from-the-left is a nice tool to have. 
It's obviously very easy to build your own if you know what tools to use. Not everyone is going to be reading the python-dev or similar lists, however, and so having it handy would be nice. | From: Greg Ewing | Smith wrote: | || When teaching some programming to total newbies, a common || frustration is how to explain why a==b is False when a and b are || floats computed by different routes which ``should'' give the || same results (if arithmetic had infinite precision). | | This is just a special case of the problems inherent | in the use of floating point. As with all of these, | papering over this particular one isn't going to help | in the long run -- another one will pop up in due | course. | | Seems to me it's better to educate said newbies not | to use algorithms that require comparing floats for | equality at all. I think that having a helper function like nice() is a middle ground solution to the problem, falling short of using only decimal or rational values for numbers and doing better than requiring a test of error between floating values that should be equal but aren't because of alternate methods of computation. Just like the argument for having true division being the default behavior for the computational environment, it seems a little unfriendly to expect the more casual user to have to worry that 3*0.1 is not the same as 3/10.0. I know--they really are different, and one should (eventually) understand why, but does anyone really want the warts of floating point representation to be popping up in their work if they could be avoided, or at least easily circumvented? I know you know why the following numbers show up as not equal, but this would be an example of the pain in working with a reasonably simple exercise of, say, computing the bin boundaries for a histogram where bins are a width of 0.1:

###
>>> for i in range(20):
...     if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)):
...         print i,repr(i*.1),repr(i/10.),i*.1,i/10.
...
3 0.30000000000000004 0.29999999999999999 0.3 0.3
6 0.60000000000000009 0.59999999999999998 0.6 0.6
7 0.70000000000000007 0.69999999999999996 0.7 0.7
12 1.2000000000000002 1.2 1.2 1.2
14 1.4000000000000001 1.3999999999999999 1.4 1.4
17 1.7000000000000002 1.7 1.7 1.7
19 1.9000000000000001 1.8999999999999999 1.9 1.9
###

For, say, garden variety numbers that aren't full of garbage digits resulting from fp computation, the boundaries computed as 0.1*i are not going to agree with such simple numbers as 1.4 and 0.7. Would anyone (and I truly don't know the answer) really mind if all floating point values were filtered through whatever lies behind the str() manipulation of floats before the computation was made? I'm not saying that strings would be compared, but that float(str(x)) would be compared to float(str(y)) if x were being compared to y as in x<=y. If this could be done, wouldn't a lot of grief just go away and not require the use of decimal or rational types for many users? I understand that the above really is just a patch over the problem, but I'm wondering if it moves the problem far enough away that most users wouldn't have to worry about it. Here, for example, are the first values where the running sum doesn't equal the straight multiple of some step size:

###
>>> def go(x,n=1000):
...     s=0;i=0
...     while s<n:
...         s+=x;i+=1
...         if nice(s)<>nice(i*x):
...             return i,s,i*x,`s`,`i*x`
...
>>> for i in range(1,100):
...     print i, go(i/1000.)
...     print
...
1 (60372 60.3719999999 60.372 60.371999999949999 60.372)

2 (49645 99.2899999999 99.29 99.289999999949998 99.290000000000006)
###
nice() gives a simple way to think about making a comparison of floats. You just have to ask yourself at what "part per X" do you no longer care whether the numbers are different or not. e.g., for approximately 1 part in 100, use nice(x,2) and nice(y,2) to make the comparison between x and y. Replacing nice(*) with nice(*,6) in the go() defined above produces no discrepancy in values computed the two different ways. Since the cost of str() and '%.*e' is nearly the same, perhaps a default value of leadingDigits=9 would be a good default value, and the float(str()) option could be eliminated from nice. Isn't nice() sort of a poor-man's decimal-type without all the extra baggage? | In my opinion, if you ever find | yourself trying to do this, you're not thinking about | the problem correctly, and your algorithm is simply | wrong, even if you had infinitely precise floats. | As for real world examples of when this would be nice I will have to rely on others to justify this more heavily. Some quick examples that come to mind are: * Creating histograms of physical measurements with limited significant digits (i.e., not lots of digits from computation) * Collecting sets of points within a certain range of a given value (all points within 10% of a given value) * Stopping iterations when computed errors have fallen below a certain threshold. (For this, getting the stopping condition "right" is not so critical because doing one more iteration usually isn't a problem if an error happens to be a tiny bit larger than the required tolerance. However, the leadingDigits option on nice() allows one to even get this stopping condition right to a limited precision, something like ### tol = 1e-5 while 1: #do something and compute err if nice(err,3)<=nice(tol,3): break ### By specifying the leadingDigits value of 3, the user is saying that it's fine to quit when the err >= 0.9995. Since there is no additional cost in specifying more digits, a value of 9 could be used as well. 
| Ismael at tutor wrote: | How about overloading Float comparison? I'm not so adept at such things--how easy is this to do for all comparisons in a script? in an interactive session? For the latter, if it were easy, perhaps it could be part of a "newbie" mode that could be loaded. I think that some (one above has said so) would rather not have an issue pushed away, they would want to leave things as they are and just learn to work around it, not be given a hand-holding device that is eventually going to let them down anyway. I'm wondering if just having to use the function to make a comparison will be like putting your helmet on before you cycle--a reminder that there may be hazards ahead, proceed with caution. If so, then overloading the Float comparison would be better not done, requiring the "buckling" of the floats within nice(). | | If I have understood correctly, float to float comparison must be done | comparing relative errors, so that when dealing with small but rightly | represented numbers it won't tell "True" just because they're | "close". I | think your/their solution only covers the case when dealing with "big" | numbers. Think of two small numbers that you think might fail the nice test and then use the leadingDigits option (set at something like 6) and see if the problem doesn't disappear. If I understand you correctly, is this such a case: x and y defined below are truly close and nice()'s default comparison would say they are different, but nice(*,6) would say they are the same--the same to the first 6 digits of the exponential representation:

###
>>> x=1.234567e-7
>>> y=1.234568e-7
>>> nice(x)==nice(y)
False
>>> nice(x,6)==nice(y,6)
True
###

| Chuck Allison wrote on edu-sig: | There is a reliable way to compute the exact number of floating-point | "intervals" (one less than the number of FP numbers) between any two | FP numbers. It is a long-ago solved problem. I have attached a C++ | version.
You can't define closeness by a "distance" in a FP system - | you should use this measure instead (called "ulps" - units in the | last place). The distance between large FP numbers may always be | greater than the tolerance you prescribe. The spacing between | adjacent FP numbers at the top of the scale for IEEE double precision | numbers is 2^(972) (approx. 10^(293))! I doubt you're going to make | your tolerance this big. I don't believe newbies can grasp this, but | they can be taught to get a "feel" for floating-point number systems. | You can't write reliable FP code without this understanding. See | http://uvsc.freshsources.com/decimals.pdf. A very readable 13 page introduction to some floating point issues. Thanks for the reference. The author concludes with, "Computer science students don't need to be numerical analysts, but they may be called upon to write mathematical software. Indeed, scientists and engineers use tools like Matlab and Mathematica, but who implements these systems? It takes the expertise that only CS graduates have to write such sophisticated software. Without knowledge of the intricacies of floating-point computation, they will make a mess of things. In this paper I have surveyed the basics that every CS graduate should have mastered before they can be trusted in a workplace that does any kind of computing with real numbers." So perhaps this brings us back to the original comment that "fp issues are a learning opportunity." They are. The question I have is "how soon do they need to run into them?" Is decreasing the likelihood that they will see the problem (but not eliminate it) a good thing for the python community or not? /c From guido at python.org Mon Feb 13 18:55:56 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 09:55:56 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <43ed8aaf.493103945@news.gmane.org> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> Message-ID: One recommendation: for starters, I'd much rather see the bytes type standardized without a literal notation. There should be lots of ways to create bytes objects from string objects, with specific explicit encodings, and those should suffice, at least initially. I also wonder if having a b"..." literal would just add more confusion -- bytes are not characters, but b"..." makes it appear as if they are. --Guido On 2/11/06, Bengt Richter wrote: > On Fri, 10 Feb 2006 21:35:26 -0800, Guido van Rossum wrote: > > >> On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer > >The backwards compatibility problems *seem* to be relatively minor. > >> >I only found one instance of breakage in the standard library. Note > >> >that my patch does not change PyObject_Str(); that would break > >> >massive amounts of code. Instead, I introduce a new function: > >> >PyString_New(). I'm not crazy about the name but I couldn't think > >> >of anything better. > > > >On 2/10/06, Bengt Richter wrote: > >> Should this not be coordinated with PEP 332? > > > >Probably.. But that PEP is rather incomplete. Wanna work on fixing that? > > > I'd be glad to add my thoughts, but first of course it's Skip's PEP, > and Martin casts a long shadow when it comes to character coding issues > that I suspect will have to be considered. > > (E.g., if there is a b'...' literal for bytes, the actual characters of > the source code itself that the literal is being expressed in could be ascii > or latin-1 or utf-8 or utf16le a la Microsoft, etc. UIAM, I read that the source > is at least temporarily normalized to Unicode, and then re-encoded (except now > for string literals?) per coding cookie or other encoding inference. (I may be > out of date, gotta catch up). 
> > If one way or the other a string literal is in Unicode, then presumably so is > a byte string b'...' literal -- i.e. internally u"b'...'" just before > being turned into bytes. > > Should that then be an internal straight u"b'...'".encode('byte') with default ascii + escapes > for non-ascii and non-printables, to define the full 8 bits without encoding error? > Should unicode be encodable into byte via a specific encoding? E.g., u'abc'.encode('byte','latin1'), > to distinguish producing a mutable byte string vs an immutable str type as with u'abc'.encode('latin1'). > (but how does this play with str being able to produce unicode? And when do these changes happen?) > I guess I'm getting ahead of myself ;-) > > So I would first ask Skip what he'd like to do, and Martin for some hints on reading, to avoid > going down paths he already knows lead to brick walls ;-) And I need to think more about PEP 349. > > I would propose to do the reading they suggest, and edit up a new version of pep-0332.txt > that anyone could then improve further. I don't know about an early deadline. I don't want > to over-commit, as time and energies vary. OTOH, as you've noticed, I could be spending my > time more effectively ;-) > > I changed the thread title, and will wait for some signs from you, Skip, Martin, Neil, and I don't > know who else might be interested... > > Regards, > Bengt Richter > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Mon Feb 13 19:12:18 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 13 Feb 2006 19:12:18 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> Message-ID: <43F0CC02.50905@egenix.com> Guido van Rossum wrote: > One recommendation: for starters, I'd much rather see the bytes type > standardized without a literal notation. There should be are lots of > ways to create bytes objects from string objects, with specific > explicit encodings, and those should suffice, at least initially. > > I also wonder if having a b"..." literal would just add more confusion > -- bytes are not characters, but b"..." makes it appear as if they > are. Agreed. Given that we have a source code encoding which would need to be honored, b"..." doesn't really make all that much sense (unless you always use hex escapes). Note that if we drop the string type, all codecs which currently return strings will have to return bytes. This gives you a pretty exhaustive way of defining your binary literals in Python :-) Here's one: data = "abc".encode("latin-1") To simplify things we might want to have bytes("abc") do the above encoding per default. > --Guido > > On 2/11/06, Bengt Richter wrote: >> On Fri, 10 Feb 2006 21:35:26 -0800, Guido van Rossum wrote: >> >>>> On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer > >The backwards compatibility problems *seem* to be relatively minor. >>>>> I only found one instance of breakage in the standard library. Note >>>>> that my patch does not change PyObject_Str(); that would break >>>>> massive amounts of code. Instead, I introduce a new function: >>>>> PyString_New(). I'm not crazy about the name but I couldn't think >>>>> of anything better. >>> On 2/10/06, Bengt Richter wrote: >>>> Should this not be coordinated with PEP 332? >>> Probably.. But that PEP is rather incomplete. Wanna work on fixing that? >>> >> I'd be glad to add my thoughts, but first of course it's Skip's PEP, >> and Martin casts a long shadow when it comes to character coding issues >> that I suspect will have to be considered. 
>> >> (E.g., if there is a b'...' literal for bytes, the actual characters of >> the source code itself that the literal is being expressed in could be ascii >> or latin-1 or utf-8 or utf16le a la Microsoft, etc. UIAM, I read that the source >> is at least temporarily normalized to Unicode, and then re-encoded (except now >> for string literals?) per coding cookie or other encoding inference. (I may be >> out of date, gotta catch up). >> >> If one way or the other a string literal is in Unicode, then presumably so is >> a byte string b'...' literal -- i.e. internally u"b'...'" just before >> being turned into bytes. >> >> Should that then be an internal straight u"b'...'".encode('byte') with default ascii + escapes >> for non-ascii and non-printables, to define the full 8 bits without encoding error? >> Should unicode be encodable into byte via a specific encoding? E.g., u'abc'.encode('byte','latin1'), >> to distinguish producing a mutable byte string vs an immutable str type as with u'abc'.encode('latin1'). >> (but how does this play with str being able to produce unicode? And when do these changes happen?) >> I guess I'm getting ahead of myself ;-) >> >> So I would first ask Skip what he'd like to do, and Martin for some hints on reading, to avoid >> going down paths he already knows lead to brick walls ;-) And I need to think more about PEP 349. >> >> I would propose to do the reading they suggest, and edit up a new version of pep-0332.txt >> that anyone could then improve further. I don't know about an early deadline. I don't want >> to over-commit, as time and energies vary. OTOH, as you've noticed, I could be spending my >> time more effectively ;-) >> >> I changed the thread title, and will wait for some signs from you, Skip, Martin, Neil, and I don't >> know who else might be interested... 
>> >> Regards, >> Bengt Richter >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org >> > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mal%40egenix.com -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 13 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From pje at telecommunity.com Mon Feb 13 19:19:04 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 13 Feb 2006 13:19:04 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> Message-ID: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: >One recommendation: for starters, I'd much rather see the bytes type >standardized without a literal notation. There should be are lots of >ways to create bytes objects from string objects, with specific >explicit encodings, and those should suffice, at least initially. > >I also wonder if having a b"..." literal would just add more confusion >-- bytes are not characters, but b"..." makes it appear as if they >are. 
Why not just have the constructor be: bytes(initializer [,encoding]) Where initializer must be either an iterable of suitable integers, or a unicode/string object. If the latter (i.e., it's a basestring), the encoding argument would then be required. Then, there's no need for special codec support for the bytes type, since you call bytes on the thing to be encoded. And of course, no need for a 'b' literal. From guido at python.org Mon Feb 13 19:52:03 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 10:52:03 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: <054578D1-518D-4566-A15A-114B3BB85790@verio.net> References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> <43EC8AF8.2000506@gmail.com> <054578D1-518D-4566-A15A-114B3BB85790@verio.net> Message-ID: On 2/10/06, Mark Russell wrote: > > On 10 Feb 2006, at 12:45, Nick Coghlan wrote: > > An alternative would be to call it "__discrete__", as that is the key > > characteristic of an indexing type - it consists of a sequence of discrete > > values that can be isomorphically mapped to the integers. > Another alternative: __as_ordinal__. Wikipedia describes ordinals as > "numbers used to denote the position in an ordered sequence" which seems a > pretty precise description of the intended result. The "as_" prefix also > captures the idea that this should be a lossless conversion. Aren't ordinals generally assumed to be non-negative? The numbers used as slice or sequence indices can be negative! Also, I don't buy the reason for 'as'. I don't see how this word would require the conversion to be lossless. The PEP continues to use __index__ and I'm happy with that.
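[Editor's note: as a concrete illustration of the protocol discussed above, here is a sketch using the `__index__` behavior that later shipped with PEP 357. The `Offset` class is hypothetical, not code from the thread; it shows how an int-like object that does not subclass `int` can drive indexing and slicing, including the negative values Guido points out.]

```python
import operator

class Offset:
    """Hypothetical int-like object that is not an int subclass."""
    def __init__(self, value):
        self.value = value
    def __index__(self):
        # Lossless conversion to a real integer; negative values are fine.
        return self.value

data = ['a', 'b', 'c', 'd']
print(data[Offset(1)])             # -> 'b'
print(data[Offset(-1)])            # -> 'd' (negative indices work)
print(data[Offset(1):Offset(3)])   # -> ['b', 'c']

# Unlike __int__, this conversion refuses lossy input:
print(operator.index(Offset(2)))   # -> 2
try:
    operator.index(3.5)            # floats define __int__ but not __index__
except TypeError:
    print("floats are rejected")
```

Note that indexing and slicing consult `__index__` exactly as the proposed sq_index slot intended, so the "lossless conversion" Mark Russell wanted the name to capture is enforced by the protocol rather than the spelling.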
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 13 20:12:42 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 11:12:42 -0800 Subject: [Python-Dev] ssize_t branch (Was: release plan for 2.5 ?) In-Reply-To: <43EF1871.9030304@v.loewis.de> References: <43EF1871.9030304@v.loewis.de> Message-ID: On 2/12/06, "Martin v. L?wis" wrote: > Neal Norwitz wrote: > > I'm tempted to say we should merge now. I know the branch works on > > 64-bit boxes. I can test on a 32-bit box if Martin hasn't already. > > There will be a lot of churn fixing problems, but maybe we can get > > more people involved. > > The ssize_t branch has now all the API I want it to have. I just > posted the PEP to comp.lang.python, maybe people have additional > things they consider absolutely necessary. > > There are two aspects left, and both can be done after the merge: > - a lot of modules still need adjustments, to really support > 64-bit collections. This shouldn't cause any API changes, AFAICT. > > - the printing of Py_ssize_t values should be supported. I think > Tim proposed to provide the 'z' formatter across platforms. > This is a new API, but it's a pure extension, so it can be > done in the trunk. Great news. I'm looking forward to getting this over with! > I would like to avoid changing APIs after the merge to the trunk > has happened; I remember Guido saying (a few years ago) that this > change must be a single large change, rather many small incremental > changes. I agree, and I hope I have covered everything that needs > to be covered. Let me qualify that a bit -- I'd be okay with one honking big change followed by some minor adjustments. I'd say that, since you've already done so much in the branch, we're quickly approaching the point where the extra testing we get from merging soon out-benefits the problems some folks may experience due to the branch not being perfect yet. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Mon Feb 13 21:27:28 2006 From: python at rcn.com (Raymond Hettinger) Date: Mon, 13 Feb 2006 15:27:28 -0500 Subject: [Python-Dev] nice() References: <004f01c630c0$f051e1f0$5f2c4fca@csmith> Message-ID: <005a01c630db$e93fa030$b83efea9@RaymondLaptop1> Please do not spam multiple mail lists with these posts (edu-sig, python-dev, and tutor). Raymond ----- Original Message ----- From: "Smith" To: Cc: ; Sent: Monday, February 13, 2006 12:10 PM Subject: Re: [Python-Dev] nice() From jimjjewett at gmail.com Mon Feb 13 21:28:37 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 13 Feb 2006 15:28:37 -0500 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation Message-ID: Guido: > I don't like __true_int__ very much. Personally, > I'm fine with calling it __index__ index is OK, but is there a reason __integer__ would be rejected? __int__ roughly follows the low-level C implementation, and may do odd things on unusual input. __integer__ properly creates a conceptual integer, so it won't lose or corrupt information (unless the class writer does this intentionally). -jJ From guido at python.org Mon Feb 13 21:32:15 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 12:32:15 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: Message-ID: On 2/13/06, Jim Jewett wrote: > Guido: > > > I don't like __true_int__ very much. Personally, > > I'm fine with calling it __index__ > > index is OK, but is there a reason __integer__ would be > rejected? > > __int__ roughly follows the low-level C implementation, > and may do odd things on unusual input. > > __integer__ properly creates a conceptual integer, so > it won't lose or corrupt information (unless the class > writer does this intentionally). 
Given the number of folks who misappreciate the difference between __getattr__ and __getattribute__, I'm not sure I'd want to encourage using abbreviated and full forms of the same term in the same context. When confronted with the existence of __int__ and __integer__ I can see plenty of confusion ahead. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 13 21:34:52 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 12:34:52 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> Message-ID: On 2/13/06, Phillip J. Eby wrote: > At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: > >One recommendation: for starters, I'd much rather see the bytes type > >standardized without a literal notation. There should be are lots of > >ways to create bytes objects from string objects, with specific > >explicit encodings, and those should suffice, at least initially. > > > >I also wonder if having a b"..." literal would just add more confusion > >-- bytes are not characters, but b"..." makes it appear as if they > >are. > > Why not just have the constructor be: > > bytes(initializer [,encoding]) > > Where initializer must be either an iterable of suitable integers, or a > unicode/string object. If the latter (i.e., it's a basestring), the > encoding argument would then be required. Then, there's no need for > special codec support for the bytes type, since you call bytes on the thing > to be encoded. And of course, no need for a 'b' literal. It'd be cruel and unusual punishment though to have to write bytes("abc", "Latin-1") I propose that the default encoding (for basestring instances) ought to be "ascii" just like everywhere else. 
(Meaning, it should really be the system default encoding, which defaults to "ascii" and is intentionally hard to change.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 13 21:40:57 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 12:40:57 -0800 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: Message-ID: Shouldn't docs.python.org be removed? It seems to add more confusion than anything, especially since most links on python.org continue to point to python.org/doc/. On 2/13/06, Georg Brandl wrote: > The above docs are from August 2005 while docs.python.org/dev is current. > Shouldn't the old docs be removed? (Now that I work for Google I realize more than ever before the importance of keeping URLs stable; PageRank(tm) numbers don't get transferred as quickly as contents. I have this worry too in the context of the python.org redesign; 301 permanent redirect is *not* going to help PageRank of the new page.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Mon Feb 13 21:52:44 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 13 Feb 2006 15:52:44 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: Message-ID: <200602131552.44424.fdrake@acm.org> On Monday 13 February 2006 10:03, Georg Brandl wrote: > The above docs are from August 2005 while docs.python.org/dev is current. > Shouldn't the old docs be removed? I'm afraid I've generally been too busy to chime in much on this topic, but I've spent a bit of time thinking about it, and would like to keep on top of the issue still. The automatically-maintained version of the development docs is certainly preferable to the manually-maintained-by-me version, and I've updated the link from www.python.org/doc/ to refer to that version for now.
However, I do have some concerns about how this is all structured still. One of the goals of docs.python.org was to be able to do a Google site-search and only see the current version. Having multiple versions on that site is contrary to that purpose. I'd like to see the development version(s) move back to being in the www.python.org/dev/doc/ hierarchy. What I would also like to see is to have an automatically-updated version for each of the maintainer versions of Python, as well as the development trunk. That would mean two versions at this point (2.4.x, 2.5.x); only one of those is currently handled automatically. -Fred -- Fred L. Drake, Jr. From fredrik at pythonware.com Mon Feb 13 21:53:58 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 13 Feb 2006 21:53:58 +0100 Subject: [Python-Dev] moving content around (Re: http://www.python.org/dev/doc/devel still available) References: Message-ID: Guido van Rossum wrote: > (Now that I work for Google I realize more than ever before the > importance of keeping URLs stable; PageRank(tm) numbers don't get > transferred as quickly as contents. I have this worry too in the > context of the python.org redesign; 301 permanent redirect is *not* > going to help PageRank of the new page.) so what's the best way to move stuff around? wikipedia seems to display the content from the "new" location under the old URL, but with a small blurb at the top that says "redirected from ", e.g. http://en.wikipedia.org/wiki/F_Scott_Fitzgerald (not sure if it's done that way to avoid HTTP roundtrips, or for some obscure googlerank reason...) From jimjjewett at gmail.com Mon Feb 13 21:55:14 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 13 Feb 2006 15:55:14 -0500 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: Message-ID: Is there a reason __integer__ would be rejected? 
Guido van Rossum answered: > Given the number of folks who misappreciate the difference between > __getattr__ and __getattribute__, I'm not sure I'd want to encourage > using abbreviated and full forms of the same term in the same context. > When confronted with the existence of __int__ and __integer__ I can > see plenty of confusion ahead. I see this case as slightly different. getattr and getattribute are both things you might reasonably want to do. __int__ is something you probably shouldn't be doing very often anymore; it is being kept for backwards compatibility. Switching getattr and getattribute will cause bugs, which may be hard to diagnose, even for people who might reasonably be using the hooks. Switching __int__ and (newname) won't matter, unless __int__ was already doing something unexpected. Since backwards compatibility means we can't prevent __int__ from doing the unexpected, a similar name might be *good* -- at least it would tip people off that __int__ might not be what they want. I can't think of any way to associate getattr vs getattribute with timing or precedence. I already associate int with a specific C datatype and integer with something more abstract. (I'm not sure the new method is a better match for my integer concept, and it probably isn't a better match for java.lang.Integer, but ... the separation is there.) -jJ From guido at python.org Mon Feb 13 22:03:07 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 13:03:07 -0800 Subject: [Python-Dev] moving content around (Re: http://www.python.org/dev/doc/devel still available) In-Reply-To: References: Message-ID: On 2/13/06, Fredrik Lundh wrote: > Guido van Rossum wrote: > > > (Now that I work for Google I realize more than ever before the > > importance of keeping URLs stable; PageRank(tm) numbers don't get > > transferred as quickly as contents. 
I have this worry too in the > > context of the python.org redesign; 301 permanent redirect is *not* > > going to help PageRank of the new page.) > > so what's the best way to move stuff around? I don't know; my point was to avoid needless moving rather than giving a best practice for moving. > wikipedia seems to display the content from the "new" location under > the old URL, but with a small blurb at the top that says "redirected > from ", e.g. > > http://en.wikipedia.org/wiki/F_Scott_Fitzgerald > > (not sure if it's done that way to avoid HTTP roundtrips, or for some > obscure googlerank reason...) Can't say I understand that particular example. Wikipedia has different requirements though; there are aliases (e.g. homonyms, synonyms) that won't go away. For python.org we're looking at minimizing the URL space churn. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 13 22:09:59 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 13:09:59 -0800 Subject: [Python-Dev] PEP 351 In-Reply-To: <001501c62f57$2b070b60$6a01a8c0@RaymondLaptop1> References: <000a01c62e85$bd081770$b83efea9@RaymondLaptop1> <73080DA6-08F4-431A-A027-B9D15299A0A7@gmail.com> <001501c62f57$2b070b60$6a01a8c0@RaymondLaptop1> Message-ID: I've rejected PEP 351, with a reference to this thread as the rationale. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 13 22:11:36 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 13:11:36 -0800 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43F022CA.1090200@canterbury.ac.nz> References: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> <20060211205735.GA24548@code0.codespeak.net> <43EE7737.4080503@v.loewis.de> <43F022CA.1090200@canterbury.ac.nz> Message-ID: On 2/12/06, Greg Ewing wrote: > > [A large head-exploding set of rules] > > Blarg. > > Const - Just Say No. 
+1 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jimjjewett at gmail.com Mon Feb 13 22:16:17 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 13 Feb 2006 16:16:17 -0500 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation Message-ID: Travis wrote: > The patch adds a new API function int PyObject_AsIndex(obj) How did you decide between int and long? Why not ssize_t? Also, if index is being added as a builtin, should the failure result be changed? I'm thinking that this may become a replacement for isinstance(val, (int, long)). If so, it might be nice not to raise errors, or at least to raise a more specific subclass. (Catching a TypeError and then checking the message string ... does not seem clean.) -jJ From guido at python.org Mon Feb 13 22:30:26 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 13:30:26 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: Message-ID: On 2/13/06, Jim Jewett wrote: > Travis wrote: > > > The patch adds a new API function int PyObject_AsIndex(obj) > > How did you decide between int and long? > > Why not ssize_t? It should be the same type used everywhere for indexing. In the svn HEAD that's int. Once PEP 353 lands it should be ssize_t. I've made Travis aware of this issue already. > Also, if index is being added as a builtin, should the failure > result be changed?
I don't like to add a built-in index() at this point; mostly because of Occam's razor (we haven't found a need). > I'm thinking that this may become a > replacement for isinstance(val, (int, long)). But only if it's okay if values > sys.maxint (or some other constant indicating the limit of ssize_t) are not required to be supported. > If so, it might > be nice not to raise errors, or at least to raise a more > specific subclass. (Catching a TypeError and then > checking the message string ... does not seem clean.) I'm not sure what you mean. How could index(x) ever replace isinstance(x, (int, long)) without raising an exception? Surely index("abc") *should* raise an exception. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From amk at amk.ca Mon Feb 13 23:41:00 2006 From: amk at amk.ca (A.M. Kuchling) Date: Mon, 13 Feb 2006 17:41:00 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <200602131552.44424.fdrake@acm.org> References: <200602131552.44424.fdrake@acm.org> Message-ID: <20060213224100.GB9576@rogue.amk.ca> On Mon, Feb 13, 2006 at 03:52:44PM -0500, Fred L. Drake, Jr. wrote: > What I would also like to see is to have an automatically-updated > version for each of the maintainer versions of Python, as well as > the development trunk. That would mean two versions at this point > (2.4.x, 2.5.x); only one of those is currently handled > automatically. If Thomas could set up a wildcard DNS of some sort, would it be a good idea to have lots of hostnames, e.g. docs-24.python.org, docs-25.python.org, etc.? We could probably make it work in Apache with mod_rewrite so that we aren't endlessly tweaking the config file as new versions are released. 
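[Editor's note: returning to the sq_index thread above — the int-likeness check Jim and Guido are debating is, in later Python versions, spelled `operator.index()`, and it does raise `TypeError` for non-integers exactly as Guido argues it should. A sketch of the isinstance-replacement idea using that modern behavior; the `is_integer_like` helper is hypothetical, not from the thread:]

```python
import operator

def is_integer_like(obj):
    """Replacement for isinstance(obj, (int, long)) via the index protocol."""
    try:
        operator.index(obj)
    except TypeError:
        return False
    return True

print(is_integer_like(42))      # True
print(is_integer_like(True))    # True (bool is int-like)
print(is_integer_like("abc"))   # False -- index("abc") *should* raise
print(is_integer_like(3.5))     # False -- lossy conversions are refused
```

The try/except wrapper is exactly the cost Jim is pointing at: callers who only want a yes/no answer have to catch the `TypeError` themselves.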
--amk From aahz at pythoncraft.com Mon Feb 13 22:43:45 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 13 Feb 2006 13:43:45 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: Message-ID: <20060213214345.GA5074@panix.com> On Mon, Feb 13, 2006, Jim Jewett wrote: > > getattr and getattribute are both things you might reasonably want to > do. __int__ is something you probably shouldn't be doing very often > anymore; it is being kept for backwards compatibility. And how do you convert a float to an int? __int__ is NOT going away; the sole purpose of __index__ is to enable sequence index functionality and similar use-cases for int-like objects that do not subclass from int. (For example, one might want to allow an enumeration type to index into a list.) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From jeremy at alum.mit.edu Mon Feb 13 22:49:44 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 13 Feb 2006 16:49:44 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> <20060211205735.GA24548@code0.codespeak.net> <43EE7737.4080503@v.loewis.de> <43F022CA.1090200@canterbury.ac.nz> Message-ID: It sounds like the right answer for Python is to change the signature of PyArg_ParseTupleAndKeywords() back. We'll fix it when C fixes its const rules . Jeremy On 2/13/06, Guido van Rossum wrote: > On 2/12/06, Greg Ewing wrote: > > > [A large head-exploding set of rules] > > > > Blarg. > > > > Const - Just Say No. 
> > +1 > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From guido at python.org Mon Feb 13 22:52:54 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 13:52:54 -0800 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <1f7befae0602100919nc360b69u26bae73baa200aaf@mail.gmail.com> <20060211205735.GA24548@code0.codespeak.net> <43EE7737.4080503@v.loewis.de> <43F022CA.1090200@canterbury.ac.nz> Message-ID: +1 On 2/13/06, Jeremy Hylton wrote: > It sounds like the right answer for Python is to change the signature > of PyArg_ParseTupleAndKeywords() back. We'll fix it when C fixes its > const rules . > > Jeremy > > On 2/13/06, Guido van Rossum wrote: > > On 2/12/06, Greg Ewing wrote: > > > > [A large head-exploding set of rules] > > > > > > Blarg. > > > > > > Const - Just Say No. > > > > +1 > > > > -- > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Mon Feb 13 22:55:01 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 13 Feb 2006 22:55:01 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> Message-ID: <43F10035.8080207@egenix.com> Guido van Rossum wrote: > On 2/13/06, Phillip J. Eby wrote: >> At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: >>> One recommendation: for starters, I'd much rather see the bytes type >>> standardized without a literal notation. There should be are lots of >>> ways to create bytes objects from string objects, with specific >>> explicit encodings, and those should suffice, at least initially. >>> >>> I also wonder if having a b"..." literal would just add more confusion >>> -- bytes are not characters, but b"..." makes it appear as if they >>> are. >> Why not just have the constructor be: >> >> bytes(initializer [,encoding]) >> >> Where initializer must be either an iterable of suitable integers, or a >> unicode/string object. If the latter (i.e., it's a basestring), the >> encoding argument would then be required. Then, there's no need for >> special codec support for the bytes type, since you call bytes on the thing >> to be encoded. And of course, no need for a 'b' literal. > > It'd be cruel and unusual punishment though to have to write > > bytes("abc", "Latin-1") > > I propose that the default encoding (for basestring instances) ought > to be "ascii" just like everywhere else. (Meaning, it should really be > the system default encoding, which defaults to "ascii" and is > intentionally hard to change.) We're talking about Py3k here: "abc" will be a Unicode string, so why restrict the conversion to 7 bits when you can have 8 bits without any conversion problems ? While we're at it: I'd suggest that we remove the auto-conversion from bytes to Unicode in Py3k and the default encoding along with it. 
In Py3k the standard lib will have to be Unicode compatible anyway and string parser markers like "s#" will have to go away as well, so there's not much need for this anymore. (Maybe a bit radical, but I guess that's what Py3k is meant for.) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 13 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From jeremy at alum.mit.edu Mon Feb 13 22:58:33 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 13 Feb 2006 16:58:33 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43ECE71D.2050402@v.loewis.de> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> Message-ID: On 2/10/06, "Martin v. L?wis" wrote: > Jeremy Hylton wrote: > > Ok. I reviewed the original problem and you're right, the problem was > > not that it failed outright but that it produced a warning about the > > deprecated conversion: > > warning: deprecated conversion from string constant to 'char*'' > > > > I work at a place that takes the same attitude as python-dev about > > warnings: They're treated as errors and you can't check in code that > > the compiler generates warnings for. > > In that specific case, I think the compiler's warning should be turned > off; it is a bug in the compiler if that specific warning cannot be > turned off separately. The compiler in question is gcc and the warning can be turned off with -Wno-write-strings. I think we'd be better off leaving that option on, though. This warning will help me find places where I'm passing a string literal to a function that does not take a const char*. That's valuable, not insensate. 
Jeremy > While it is true that the conversion is deprecated, the C++ standard > defines this as > > "Normative for the current edition of the Standard, but not guaranteed > to be part of the Standard in future revisions." > > The current version is from 1998. I haven't been following closely, > but I believe there are no plans to actually remove the feature > in the next revision. > > FWIW, Annex D also defines these features as deprecated: > - the use of "static" for objects in namespace scope (AFAICT > including C file-level static variables and functions) > - C library headers (i.e. ) > > Don't you get a warning when including Python.h, because that > include ? > > > Nonetheless, the consensus on the c++ sig and python-dev at the time > > was to fix Python. If we don't allow warnings in our compilations, we > > shouldn't require our users at accept warnings in theirs. > > We don't allow warnings for "major compilers". This specific compiler > appears flawed (or your configuration of it). > > Regards, > Martin > From thomas at xs4all.net Mon Feb 13 23:04:55 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 13 Feb 2006 23:04:55 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <20060213224100.GB9576@rogue.amk.ca> References: <200602131552.44424.fdrake@acm.org> <20060213224100.GB9576@rogue.amk.ca> Message-ID: <20060213220455.GU10226@xs4all.nl> On Mon, Feb 13, 2006 at 05:41:00PM -0500, A.M. Kuchling wrote: > On Mon, Feb 13, 2006 at 03:52:44PM -0500, Fred L. Drake, Jr. wrote: > > What I would also like to see is to have an automatically-updated > > version for each of the maintainer versions of Python, as well as > > the development trunk. That would mean two versions at this point > > (2.4.x, 2.5.x); only one of those is currently handled > > automatically. > If Thomas could set up a wildcard DNS of some sort, That wouldn't be a problem. I fear what it'll do to the PageRank though ;-) -- Thomas Wouters Hi! 
I'm a .signature virus! copy me into your .signature file to help me spread! From fuzzyman at voidspace.org.uk Mon Feb 13 23:14:23 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 13 Feb 2006 22:14:23 +0000 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: Message-ID: <43F104BF.2080109@voidspace.org.uk> Guido van Rossum wrote: > Shouldn't docs.python.org be removed? It seems to add mroe confusion > than anything, especially since most links on python.org continue to > point to python.org/doc/. > > All the web says about 1200 links into the docs.python.org subdomain. (Different to the google link feature, which only shows links to a specific URL I believe.) http://www.alltheweb.com/search?cat=web&cs=utf8&q=link%3Adocs.python.org&rys=0&itag=crv&_sb_lang=pref It's where I link to as well. Be a shame to lose it. ;-) Michael Foord > On 2/13/06, Georg Brandl wrote: > >> The above docs are from August 2005 while docs.python.org/dev is current. >> Shouldn't the old docs be removed? >> > > (Now that I work for Google I realize more than ever before the > importance of keeping URLs stable; PageRank(tm) numbers don't get > transferred as quickly as contents. I have this worry too in the > context of the python.org redesign; 301 permanent redirect is *not* > going to help PageRank of the new page.) > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > From pje at telecommunity.com Mon Feb 13 23:15:05 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 13 Feb 2006 17:15:05 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <43F10035.8080207@egenix.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> At 10:55 PM 2/13/2006 +0100, M.-A. Lemburg wrote: >Guido van Rossum wrote: > > On 2/13/06, Phillip J. Eby wrote: > >> At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote: > >>> One recommendation: for starters, I'd much rather see the bytes type > >>> standardized without a literal notation. There should be are lots of > >>> ways to create bytes objects from string objects, with specific > >>> explicit encodings, and those should suffice, at least initially. > >>> > >>> I also wonder if having a b"..." literal would just add more confusion > >>> -- bytes are not characters, but b"..." makes it appear as if they > >>> are. > >> Why not just have the constructor be: > >> > >> bytes(initializer [,encoding]) > >> > >> Where initializer must be either an iterable of suitable integers, or a > >> unicode/string object. If the latter (i.e., it's a basestring), the > >> encoding argument would then be required. Then, there's no need for > >> special codec support for the bytes type, since you call bytes on the > thing > >> to be encoded. And of course, no need for a 'b' literal. > > > > It'd be cruel and unusual punishment though to have to write > > > > bytes("abc", "Latin-1") > > > > I propose that the default encoding (for basestring instances) ought > > to be "ascii" just like everywhere else. (Meaning, it should really be > > the system default encoding, which defaults to "ascii" and is > > intentionally hard to change.) > >We're talking about Py3k here: "abc" will be a Unicode string, >so why restrict the conversion to 7 bits when you can have 8 bits >without any conversion problems ? Actually, I thought we were talking about adding bytes() in 2.5. 
However, now that you've brought this up, it actually makes perfect sense to just use latin-1 as the effective encoding for both strings and unicode. In Python 2.x, strings are byte strings by definition, so it's only in 3.0 that an encoding would be required. And again, latin1 is a reasonable, roundtrippable default encoding. So, it sounds like making the encoding default to latin-1 would be a reasonably safe approach in both 2.x and 3.x. >While we're at it: I'd suggest that we remove the auto-conversion >from bytes to Unicode in Py3k and the default encoding along with >it. In Py3k the standard lib will have to be Unicode compatible >anyway and string parser markers like "s#" will have to go away >as well, so there's not much need for this anymore. I thought all this was already in the plan for 3.0, but maybe I assume too much. :) From mal at egenix.com Mon Feb 13 23:18:23 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 13 Feb 2006 23:18:23 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <1f7befae0602101027s64e292a4p9e42dd77eea3b00d@mail.gmail.com> References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> <1f7befae0602101027s64e292a4p9e42dd77eea3b00d@mail.gmail.com> Message-ID: <43F105AF.3000905@egenix.com> Tim Peters wrote: > [Jeremy] >>>> I added some const to several API functions that take char* but >>>> typically called by passing string literals. > > [Tim] >>> If he had _stuck_ to that, we wouldn't be having this discussion :-) >>> (that is, nobody passes string literals to >>> PyArg_ParseTupleAndKeywords's kws argument). > > [Jeremy] >> They are passing arrays of string literals. In my mind, that was a >> nearly equivalent use case. I believe the C++ compiler complains >> about passing an array of string literals to char**. 
> > It's the consequences: nobody complains about tacking "const" on to a > former honest-to-God "char *" argument that was in fact not modified, > because that's not only helpful for C++ programmers, it's _harmless_ > for all programmers. For example, nobody could sanely object (and > nobody did :-)) to adding const to the attribute-name argument in > PyObject_SetAttrString(). Sticking to that creates no new problems > for anyone, so that's as far as I ever went. Well, it broke my C extensions... I now have this in my code:

    /* The keyword array changed to const char* in Python 2.5 */
    #if PY_VERSION_HEX >= 0x02050000
    # define Py_KEYWORDS_STRING_TYPE const char
    #else
    # define Py_KEYWORDS_STRING_TYPE char
    #endif
    ...
    static Py_KEYWORDS_STRING_TYPE *kwslist[] = {"yada", NULL};
    ...
    if (!PyArg_ParseTupleAndKeywords(args,kws,format,kwslist,&a1))
        goto onError;

The crux is that code which should be portable across Python versions won't work otherwise: you either get Python 2.5 xor Python 2.x (for x < 5) compatibility. Not too happy about it, but then compared to the ssize_t changes and the relative imports PEP, this one is an easy one to handle. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 13 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From fdrake at acm.org Mon Feb 13 23:29:11 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 13 Feb 2006 17:29:11 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: Message-ID: <200602131729.12333.fdrake@acm.org> On Monday 13 February 2006 15:40, Guido van Rossum wrote: > Shouldn't docs.python.org be removed?
It seems to add more confusion > than anything, especially since most links on python.org continue to > point to python.org/doc/. docs.python.org was created specifically to make searching the most recent "stable" version of the docs easier (using Google's site: modifier, no less). I don't know what the link count statistics say (other than what you mention), and don't know which gets hit more often, but I still think it's a reasonable approach. I've been switching links to point to docs.python.org whenever I find an older link that points to www.python.org/doc/current/; other parts of the doc/ area from the site didn't move, and perhaps that's a problem that should be addressed. > (Now that I work for Google I realize more than ever before the > importance of keeping URLs stable; PageRank(tm) numbers don't get > transferred as quickly as contents. I have this worry too in the > context of the python.org redesign; 301 permanent redirect is *not* > going to help PageRank of the new page.) Maybe I'm just not getting why that's relevant. -Fred -- Fred L. Drake, Jr. From fredrik at pythonware.com Mon Feb 13 23:45:18 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 13 Feb 2006 23:45:18 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available References: <200602131729.12333.fdrake@acm.org> Message-ID: Fred L. Drake, Jr. wrote: > docs.python.org was created specifically to make searching the most recent > "stable" version of the docs easier (using Google's site: modifier, no less). > I don't know what the link count statistics say (other than what you > mention), and don't know which gets hit more often I've been looking into page stats for the AltPyDotOrgCms activity; from what I can tell, it's evenly distributed (~55% on www.python.org/doc, 45% on docs.python.org) From mal at egenix.com Tue Feb 14 00:03:35 2006 From: mal at egenix.com (M.-A.
Lemburg) Date: Tue, 14 Feb 2006 00:03:35 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> Message-ID: <43F11047.705@egenix.com> Phillip J. Eby wrote: >>>> Why not just have the constructor be: >>>> >>>> bytes(initializer [,encoding]) >>>> >>>> Where initializer must be either an iterable of suitable integers, or a >>>> unicode/string object. If the latter (i.e., it's a basestring), the >>>> encoding argument would then be required. Then, there's no need for >>>> special codec support for the bytes type, since you call bytes on the >> thing >>>> to be encoded. And of course, no need for a 'b' literal. >>> It'd be cruel and unusual punishment though to have to write >>> >>> bytes("abc", "Latin-1") >>> >>> I propose that the default encoding (for basestring instances) ought >>> to be "ascii" just like everywhere else. (Meaning, it should really be >>> the system default encoding, which defaults to "ascii" and is >>> intentionally hard to change.) >> We're talking about Py3k here: "abc" will be a Unicode string, >> so why restrict the conversion to 7 bits when you can have 8 bits >> without any conversion problems ? > > Actually, I thought we were talking about adding bytes() in 2.5. Then we'd need to make the "ascii" encoding assumption again, just like Guido proposed. > However, now that you've brought this up, it actually makes perfect sense > to just use latin-1 as the effective encoding for both strings and > unicode. In Python 2.x, strings are byte strings by definition, so it's > only in 3.0 that an encoding would be required. And again, latin1 is a > reasonable, roundtrippable default encoding. It is. 
However, it's not a reasonable assumption of the default encoding since there are many encodings out there that special case the characters 0x80-0xFF, hence the choice of using ASCII as default encoding in Python. The conversion from Unicode to bytes is different in this respect, since you are converting from a "bigger" type to a "smaller" one. Choosing latin-1 as default for this conversion would give you all 8 bits, instead of just 7 bits that ASCII provides. > So, it sounds like making the encoding default to latin-1 would be a > reasonably safe approach in both 2.x and 3.x. Reasonable for bytes(): yes. In general: no. >> While we're at it: I'd suggest that we remove the auto-conversion >>from bytes to Unicode in Py3k and the default encoding along with >> it. In Py3k the standard lib will have to be Unicode compatible >> anyway and string parser markers like "s#" will have to go away >> as well, so there's not much need for this anymore. > > I thought all this was already in the plan for 3.0, but maybe I assume too > much. :) Wouldn't want to wait for Py4D :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 13 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From guido at python.org Tue Feb 14 00:10:50 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 15:10:50 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <43F10035.8080207@egenix.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> Message-ID: On 2/13/06, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > It'd be cruel and unusual punishment though to have to write > > > > bytes("abc", "Latin-1") > > > > I propose that the default encoding (for basestring instances) ought > > to be "ascii" just like everywhere else. (Meaning, it should really be > > the system default encoding, which defaults to "ascii" and is > > intentionally hard to change.) > > We're talking about Py3k here: "abc" will be a Unicode string, > so why restrict the conversion to 7 bits when you can have 8 bits > without any conversion problems ? As Phillip guessed, I was indeed thinking about introducing bytes() sooner than that, perhaps even in 2.5 (though I don't want anything rushed). Even in Py3k though, the encoding issue stands -- what if the file encoding is Unicode? Then using Latin-1 to encode bytes by default might not be what the user expected. Or what if the file encoding is something totally different? (Cyrillic, Greek, Japanese, Klingon.) Any default but ASCII isn't going to work as expected. ASCII isn't going to work as expected either, but it will complain loudly (by throwing a UnicodeError) whenever you try it, rather than causing subtle bugs later. > While we're at it: I'd suggest that we remove the auto-conversion > from bytes to Unicode in Py3k and the default encoding along with > it. I'm not sure which auto-conversion you're talking about, since there is no bytes type yet. If you're talking about the auto-conversion from str to unicode: the bytes type should not be assumed to have *any* properties that the current str type has, and that includes auto-conversion.
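[For reference: this "no properties, no auto-conversion" position is exactly what Python 3 eventually shipped, years after this thread. A quick check in modern Python terms -- hypothetical from the thread's point of view, since the bytes type did not exist yet:]

```python
# Python 3 bytes and str never auto-convert: mixing the two types fails
# loudly instead of guessing an encoding.
try:
    b"abc" + "def"          # no implicit decode of the str operand
    raise AssertionError("should not be reached")
except TypeError as exc:
    print("concatenation fails:", exc)

# Equality across the types is simply False, not an implicit decode.
print(b"abc" == "abc")      # False
```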
> In Py3k the standard lib will have to be Unicode compatible > anyway and string parser markers like "s#" will have to go away > as well, so there's not much need for this anymore. > > (Maybe a bit radical, but I guess that's what Py3k is meant for.) Right. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 14 00:15:23 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 15:15:23 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> Message-ID: On 2/13/06, Phillip J. Eby wrote: > Actually, I thought we were talking about adding bytes() in 2.5. I was. > However, now that you've brought this up, it actually makes perfect sense > to just use latin-1 as the effective encoding for both strings and > unicode. In Python 2.x, strings are byte strings by definition, so it's > only in 3.0 that an encoding would be required. And again, latin1 is a > reasonable, roundtrippable default encoding. > > So, it sounds like making the encoding default to latin-1 would be a > reasonably safe approach in both 2.x and 3.x. I disagree. IMO the same reasons why we don't do this now for the conversion between str and unicode stands for bytes. > >While we're at it: I'd suggest that we remove the auto-conversion > >from bytes to Unicode in Py3k and the default encoding along with > >it. In Py3k the standard lib will have to be Unicode compatible > >anyway and string parser markers like "s#" will have to go away > >as well, so there's not much need for this anymore. I don't know yet what the C API will look like in 3.0. 
But it may well have to support auto-conversion from Unicode to char* using some system default encoding (e.g. the Windows default code page?) in order to be able to conveniently wrap OS APIs that use char* instead of some sort of Unicode (and each OS has its own way of interpreting char* as Unicode -- I believe Apple uses UTF-8?). > I thought all this was already in the plan for 3.0, but maybe I assume too > much. :) In Py3k, I can see two reasonable approaches to conversion between strings (Unicode) and bytes: always require an explicit encoding, or assume ASCII. Anything else is asking for trouble IMO. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Tue Feb 14 00:17:07 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 13 Feb 2006 18:17:07 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F11047.705@egenix.com> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> At 12:03 AM 2/14/2006 +0100, M.-A. Lemburg wrote: >The conversion from Unicode to bytes is different in this >respect, since you are converting from a "bigger" type to >a "smaller" one. Choosing latin-1 as default for this >conversion would give you all 8 bits, instead of just 7 >bits that ASCII provides. I was just pointing out that since byte strings are bytes by definition, then simply putting those bytes in a bytes() object doesn't alter the existing encoding. So, using latin-1 when converting a string to bytes actually seems like the One Obvious Way to do it.
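[The property Phillip is relying on here is easy to verify in any Python with distinct byte and text types: latin-1 maps byte value i to code point i for every one of the 256 possible values, so converting through it never alters the underlying bytes. A sketch in modern Python 3 terms:]

```python
# latin-1 is the identity map between byte values and the first 256
# Unicode code points, so a bytes -> text -> bytes round trip through it
# changes nothing.
data = bytes(range(256))
text = data.decode("latin-1")
assert [ord(ch) for ch in text] == list(range(256))
assert text.encode("latin-1") == data
print("latin-1 round-trips all 256 byte values")
```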
I'm so accustomed to being wary of encoding issues that the idea doesn't *feel* right at first - I keep going, "but you can't know what encoding those bytes are". Then I go, Duh, that's the point. If you convert str->bytes, there's no conversion and no interpretation - neither the str nor the bytes object knows its encoding, and that's okay. So str(bytes_object) (in 2.x) should also just turn it back to a normal bytestring. In fact, the 'encoding' argument seems useless in the case of str objects, and it seems it should default to latin-1 for unicode objects. The only use I see for having an encoding for a 'str' would be to allow confirming that the input string in fact is valid for that encoding. So, "bytes(some_str,'ascii')" would be an assertion that some_str must be valid ASCII. > > So, it sounds like making the encoding default to latin-1 would be a > > reasonably safe approach in both 2.x and 3.x. > >Reasonable for bytes(): yes. In general: no. Right, I was only talking about bytes(). For 3.0, the type formerly known as "str" won't exist, so only the Unicode part will be relevant then. From guido at python.org Tue Feb 14 00:23:45 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 15:23:45 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> Message-ID: On 2/13/06, Phillip J. Eby wrote: > At 12:03 AM 2/14/2006 +0100, M.-A. Lemburg wrote: > >The conversion from Unicode to bytes is different in this > >respect, since you are converting from a "bigger" type to > >a "smaller" one. 
Choosing latin-1 as default for this > >conversion would give you all 8 bits, instead of just 7 > >bits that ASCII provides. > > I was just pointing out that since byte strings are bytes by definition, > then simply putting those bytes in a bytes() object doesn't alter the > existing encoding. So, using latin-1 when converting a string to bytes > actually seems like the One Obvious Way to do it. This actually makes some sense -- bytes(s) where isinstance(s, str) should just copy the data, since we can't know what encoding the user believes it is in anyway. (With the exception of string literals, where it makes sense to assume that the user believes it is in the same encoding as the source code -- but I believe non-ASCII characters in string literals are disallowed anyway, or at least known to cause undefined results in that case.) > I'm so accustomed to being wary of encoding issues that the idea doesn't > *feel* right at first - I keep going, "but you can't know what encoding > those bytes are". Then I go, Duh, that's the point. If you convert > str->bytes, there's no conversion and no interpretation - neither the str > nor the bytes object knows its encoding, and that's okay. So > str(bytes_object) (in 2.x) should also just turn it back to a normal > bytestring. You've got me convinced. Scrap my previous responses in this thread. > In fact, the 'encoding' argument seems useless in the case of str objects, Right. > and it seems it should default to latin-1 for unicode objects. But here I disagree. > The only > use I see for having an encoding for a 'str' would be to allow confirming > that the input string in fact is valid for that encoding. So, > "bytes(some_str,'ascii')" would be an assertion that some_str must be valid > ASCII. We already have ways to assert that a string is ASCII. > For 3.0, the type formerly known as "str" won't exist, so only the Unicode > part will be relevant then. And I think then the encoding should be required or default to ASCII.
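[This "required or default to ASCII" rule is essentially what Python 3 adopted: constructing bytes from text requires naming an encoding, and ASCII refuses non-ASCII input loudly rather than guessing. A sketch, again hypothetical relative to the thread, which predates the final design:]

```python
# bytes(text) without an encoding is a TypeError in Python 3...
try:
    bytes("abc")
    raise AssertionError("should not be reached")
except TypeError as exc:
    print("rejected:", exc)

# ...with an explicit encoding it works...
assert bytes("abc", "ascii") == b"abc"

# ...and ASCII complains loudly about anything outside its 7-bit range.
try:
    bytes("abc\xf0", "ascii")
    raise AssertionError("should not be reached")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```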
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fuzzyman at voidspace.org.uk Tue Feb 14 00:40:16 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 13 Feb 2006 23:40:16 +0000 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> Message-ID: <43F118E0.6090704@voidspace.org.uk> Phillip J. Eby wrote: [snip..] > > In fact, the 'encoding' argument seems useless in the case of str objects, > and it seems it should default to latin-1 for unicode objects. The only > -1 for having an implicit encode that behaves differently to other implicit encodes/decodes that happen in Python. Life is confusing enough already. Michael Foord From guido at python.org Tue Feb 14 00:44:27 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 15:44:27 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F118E0.6090704@voidspace.org.uk> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> Message-ID: On 2/13/06, Michael Foord wrote: > Phillip J. Eby wrote: > [snip..] > > > > In fact, the 'encoding' argument seems useless in the case of str objects, > > and it seems it should default to latin-1 for unicode objects. 
The only > > > -1 for having an implicit encode that behaves differently to other > implicit encodes/decodes that happen in Python. Life is confusing enough > already. But adding an encoding doesn't help. The str.encode() method always assumes that the string itself is ASCII-encoded, and that's not good enough: >>> "abc".encode("latin-1") 'abc' >>> "abc".decode("latin-1") u'abc' >>> "abc\xf0".decode("latin-1") u'abc\xf0' >>> "abc\xf0".encode("latin-1") Traceback (most recent call last): File "", line 1, in ? UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 3: ordinal not in range(128) >>> The right way to look at this is, as Phillip says, to consider conversion between str and bytes as not an encoding but a data type change *only*. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Tue Feb 14 00:50:40 2006 From: barry at python.org (Barry Warsaw) Date: Mon, 13 Feb 2006 18:50:40 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> Message-ID: <1139874640.16016.24.camel@geddy.wooz.org> On Mon, 2006-02-13 at 15:44 -0800, Guido van Rossum wrote: > The right way to look at this is, as Phillip says, to consider > conversion between str and bytes as not an encoding but a data type > change *only*. That sounds right to me too. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060213/21a03edc/attachment.pgp From fuzzyman at voidspace.org.uk Tue Feb 14 00:53:16 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 13 Feb 2006 23:53:16 +0000 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> Message-ID: <43F11BEC.8050902@voidspace.org.uk> Guido van Rossum wrote: > On 2/13/06, Michael Foord wrote: > >> Phillip J. Eby wrote: >> [snip..] >> >>> In fact, the 'encoding' argument seems useless in the case of str objects, >>> and it seems it should default to latin-1 for unicode objects. The only >>> >>> >> -1 for having an implicit encode that behaves differently to other >> implicit encodes/decodes that happen in Python. Life is confusing enough >> already. >> > > But adding an encoding doesn't help. The str.encode() method always > assumes that the string itself is ASCII-encoded, and that's not good > enough: > > Sorry - I meant for the unicode to bytes case. A default encoding that behaves differently to the current implicit encodes/decodes would be confusing IMHO. I agree that string to bytes shouldn't change the value of the bytes. The least confusing description of a non-unicode string is 'byte-string'. Michael Foord >>>> "abc".encode("latin-1") >>>> > 'abc' > >>>> "abc".decode("latin-1") >>>> > u'abc' > >>>> "abc\xf0".decode("latin-1") >>>> > u'abc\xf0' > >>>> "abc\xf0".encode("latin-1") >>>> > Traceback (most recent call last): > File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position > 3: ordinal not in range(128) > > > The right way to look at this is, as Phillip says, to consider > conversion between str and bytes as not an encoding but a data type > change *only*. > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > > From aleaxit at gmail.com Tue Feb 14 00:53:31 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 13 Feb 2006 15:53:31 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: Message-ID: On 2/13/06, Guido van Rossum wrote: ... > I don't like to add a built-in index() at this point; mostly because > of Occam's razor (we haven't found a need). I thought you had agreed, back when I had said that __index__ should also be made easily available to implementors of Python-coded classes implementing sequences, more elegantly than by demanding that they code x.__index__() [I can't think offhand of any other special-named method that you HAVE to call directly -- there's always some syntax or functionality in the standard library to call it more elegantly on your behalf]. This doesn't necessarily argue that index should be in the built-ins module, of course, but I thought there was a sentiment towards having it in either the operator or math modules. Alex From guido at python.org Tue Feb 14 01:04:26 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 16:04:26 -0800 Subject: [Python-Dev] bdist_* to stdlib? Message-ID: In private email, Phillip Eby suggested to add these things to the 2.5 standard library: bdist_deb, bdist_msi, and friends He explained them as follows: """ bdist_deb makes .deb files (packages for Debian-based Linux distros, like Ubuntu). bdist_msi makes .msi installers for Windows (it's by Martin v. Loewis).
Marc Lemburg proposed on the distutils-sig that these and various other implemented bdist_* formats (other than bdist_egg) be included in the next Python release, and there was no opposition there that I recall. """ I guess bdist_egg should also be added if we support setuptools (not setuplib as I mistakenly called it previously)? (I'm still a bit unclear on the various concepts here, not having made a distribution of anything in a very long time...) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 14 01:07:56 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 16:07:56 -0800 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: Message-ID: Sorry, you're right. operator.index() sounds fine. --Guido On 2/13/06, Alex Martelli wrote: > On 2/13/06, Guido van Rossum wrote: > ... > > I don't like to add a built-in index() at this point; mostly because > > of Occam's razor (we haven't found a need). > > I thought you had agreed, back when I had said that __index__ should > also be made easily available to implementors of Python-coded classes > implementing sequences, more elegantly than by demanding that they > code x.__index__() [I can't think offhand of any other special-named > method that you HAVE to call directly -- there's always some syntax or > functionality in the standard library to call it more elegantly on > your behalf]. This doesn't neessarily argue that index should be in > the built-ins module, of course, but I thought there was a sentiment > towards having it in either the operator or math modules. > > > Alex > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Tue Feb 14 01:09:57 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 13 Feb 2006 19:09:57 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? 
[ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> At 03:23 PM 2/13/2006 -0800, Guido van Rossum wrote: >On 2/13/06, Phillip J. Eby wrote: > > The only > > use I see for having an encoding for a 'str' would be to allow confirming > > that the input string in fact is valid for that encoding. So, > > "bytes(some_str,'ascii')" would be an assertion that some_str must be valid > > ASCII. > >We already have ways to assert that a string is ASCII. I didn't mean that it was the only purpose. In Python 2.x, practical code has to sometimes deal with "string-like" objects. That is, code that takes either strings or unicode. If such code calls bytes(), it's going to want to include an encoding so that unicode conversions won't fail. But silently ignoring the encoding argument in that case isn't a good idea. Ergo, I propose to permit the encoding to be specified when passing in a (2.x) str object, to allow code that handles both str and unicode to be "str-stable" in 2.x. I'm fine with rejecting an encoding argument if the initializer is not a str or unicode; I just don't want the call signature to vary based on a runtime distinction between str and unicode. And, I don't want the encoding argument to be silently ignored when you pass in a string. If I assert that I'm encoding ASCII (or utf-8 or whatever), then the string should be required to be valid. If I don't pass in an encoding, then I'm good to go. (This is orthogonal to the issue of what encoding is used as a default for conversions from the unicode type, btw.) 
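[Phillip's proposed behaviour can be sketched as follows, written in modern Python with bytes standing in for the 2.x str type and str standing in for unicode. The function name and all details here are hypothetical -- this illustrates the signature being argued for, not any shipped API:]

```python
def bytes_(initializer, encoding=None):
    """Hypothetical sketch of the proposed bytes() constructor."""
    if isinstance(initializer, str):
        # The 2.x unicode case: an explicit encoding is required.
        if encoding is None:
            raise TypeError("encoding required for text input")
        return initializer.encode(encoding)
    if isinstance(initializer, bytes):
        # The 2.x str case: bytes pass through untouched; an encoding,
        # if given, merely asserts that the input is valid in it.
        if encoding is not None:
            initializer.decode(encoding)  # raises if the claim is false
        return bytes(initializer)
    # Otherwise: an iterable of suitable integers.
    return bytes(bytearray(initializer))

assert bytes_("abc", "ascii") == b"abc"
assert bytes_(b"\xf0") == b"\xf0"          # no interpretation applied
assert bytes_([97, 98, 99]) == b"abc"
```

Note how the call signature stays the same for both input types, which is the "str-stable" property being asked for; only the meaning of the encoding argument differs (conversion vs. validation).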
> > For 3.0, the type formerly known as "str" won't exist, so only the Unicode > > part will be relevant then. >And I think then the encoding should be required or default to ASCII. The reason I'm arguing for latin-1 is symmetry in 2.x versions only. (In 3.x, there's no str vs. unicode, and thus nothing to be symmetrical.) So, if you invoke bytes() without an encoding on a 2.x basestring, you should get the same result. Latin-1 produces "the same result" when viewed in terms of the resulting byte string. If we don't go with latin-1, I'd argue for requiring an encoding for unicode objects in 2.x, because that seems like the only reasonable way to break the symmetry between str and unicode, even though it forces "str-stable" code to specify an encoding. The key is that at least *one* of the signatures needs to be stable in meaning across both str and unicode in 2.x in order to allow unicode-safe, str-stable code to be written. (Again, for 3.x, this issue doesn't come into play because there's only one string type to worry about; what the default is or whether there's a default is therefore entirely up to you.) From guido at python.org Tue Feb 14 01:09:32 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 16:09:32 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F11BEC.8050902@voidspace.org.uk> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: On 2/13/06, Michael Foord wrote: > Sorry - I meant for the unicode to bytes case. A default encoding that > behaves differently to the current implicit encodes/decodes would be > confusing IMHO. And I am in agreement with you there (I think only Phillip argued otherwise).
> I agree that string to bytes shouldn't change the value of the bytes. It's a deal then. Can the owner of PEP 332 update the PEP to record these decisions? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python-dev at zesty.ca Tue Feb 14 01:16:10 2006 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Mon, 13 Feb 2006 18:16:10 -0600 (CST) Subject: [Python-Dev] Missing PyCon 2006 Message-ID: Hi folks. I had been planning to attend PyCon this year and was really looking forward to it, but i need to cancel. I am sorry that i won't be getting to see you all in a couple of weeks. If you know anyone who hasn't yet registered but wants to go, please contact me -- we can transfer my registration. Thanks, and sorry for using python-dev for this. -- ?!ng From guido at python.org Tue Feb 14 01:29:27 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 16:29:27 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> Message-ID: On 2/13/06, Phillip J. Eby wrote: > I didn't mean that it was the only purpose. In Python 2.x, practical code > has to sometimes deal with "string-like" objects. That is, code that takes > either strings or unicode. If such code calls bytes(), it's going to want > to include an encoding so that unicode conversions won't fail. That sounds like a rather hypothetical example. Have you thought it through? 
Presumably code that accepts both str and unicode either doesn't care about encodings, but simply returns objects of the same type as the arguments -- and then it's unlikely to want to convert the arguments to bytes; or it *does* care about encodings, and then it probably already has to special-case str vs. unicode because it has to control how str objects are interpreted. > But > silently ignoring the encoding argument in that case isn't a good idea. > > Ergo, I propose to permit the encoding to be specified when passing in a > (2.x) str object, to allow code that handles both str and unicode to be > "str-stable" in 2.x. Again, have you thought this through? What would bytes("abc\xf0", "latin-1") *mean*? Take the string "abc\xf0", interpret it as being encoded in XXX, and then encode from XXX to Latin-1. But what's XXX? As I showed in a previous post, "abc\xf0".encode("latin-1") *fails* because the source for the encoding is assumed to be ASCII. I think we can make this work only when the string in fact only contains ASCII and the encoding maps ASCII to itself (which most encodings do -- but e.g. EBCDIC does not). But I'm not sure how useful that is. > I'm fine with rejecting an encoding argument if the initializer is not a > str or unicode; I just don't want the call signature to vary based on a > runtime distinction between str and unicode. I'm still not sure that this will actually help anyone. > And, I don't want the > encoding argument to be silently ignored when you pass in a string. Agreed. > If I > assert that I'm encoding ASCII (or utf-8 or whatever), then the string > should be required to be valid. Defined how? That the string is already in that encoding? > If I don't pass in an encoding, then I'm > good to go. > > (This is orthogonal to the issue of what encoding is used as a default for > conversions from the unicode type, btw.) Right. The issues are completely different! 
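The hidden decode step behind Guido's example can be spelled out. In 2.x, str.encode('latin-1') first decoded the byte string with the default ASCII codec and then re-encoded it; the sketch below reproduces that two-step sequence with Python 3 types, with b"abc\xf0" standing in for the 2.x str "abc\xf0":

```python
data = b"abc\xf0"          # stands in for the 2.x byte string "abc\xf0"

# 2.x "abc\xf0".encode("latin-1") implicitly did decode-then-encode,
# and the implicit ASCII decode is the part that fails:
blew_up = False
try:
    data.decode("ascii").encode("latin-1")
except UnicodeDecodeError:
    blew_up = True
assert blew_up

# Pure-ASCII input survives the round trip, as noted above:
assert b"abc".decode("ascii").encode("latin-1") == b"abc"
```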
> > > For 3.0, the type formerly known as "str" won't exist, so only the Unicode > > > part will be relevant then. > > > >And I think then the encoding should be required or default to ASCII. > > The reason I'm arguing for latin-1 is symmetry in 2.x versions only. (In > 3.x, there's no str vs. unicode, and thus nothing to be symmetrical.) So, > if you invoke bytes() without an encoding on a 2.x basestring, you should > get the same result. Latin-1 produces "the same result" when viewed in > terms of the resulting byte string. Only if you assume the str object is encoded in Latin-1. Your argument for symmetry would be a lot stronger if we used Latin-1 for the conversion between str and Unicode. But we don't. I like the other interpretation (which I thought was yours too?) much better: str <--> bytes conversions don't use encodings but simply change the type without changing the bytes; conversion between either and unicode works exactly the same, and requires an encoding unless all the characters involved are pure ASCII. > If we don't go with latin-1, I'd argue for requiring an encoding for > unicode objects in 2.x, because that seems like the only reasonable way to > break the symmetry between str and unicode, even though it forces > "str-stable" code to specify an encoding. The key is that at least *one* > of the signatures needs to be stable in meaning across both str and unicode > in 2.x in order to allow unicode-safe, str-stable code to be written. Using ASCII as the default encoding has the same property -- it can remain stable across the 2.x / 3.0 boundary. > (Again, for 3.x, this issue doesn't come into play because there's only one > string type to worry about; what the default is or whether there's a > default is therefore entirely up to you.) A nice-to-have property would be that it might be possible to write code that today deals with Unicode and str, but in 3.0 will deal with Unicode and bytes instead.
But I'm not sure how likely that is since bytes objects won't have most methods that str and Unicode objects have (like lower(), find(), etc.). There's one property that bytes, str and unicode all share: type(x[0]) == type(x), at least as long as len(x) >= 1. This is perhaps the ultimate test for string-ness. Or should b[0] be an int, if b is a bytes object? That would change things dramatically. There's also the consideration for APIs that, informally, accept either a string or a sequence of objects. Many of these exist, and they are probably all being converted to support unicode as well as str (if it makes sense at all). Should a bytes object be considered as a sequence of things, or as a single thing, from the POV of these types of APIs? Should we try to standardize how code tests for the difference? (Currently all sorts of shortcuts are being taken, from isinstance(x, (list, tuple)) to isinstance(x, basestring).) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From foom at fuhm.net Tue Feb 14 01:49:55 2006 From: foom at fuhm.net (James Y Knight) Date: Mon, 13 Feb 2006 19:49:55 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: On Feb 13, 2006, at 7:09 PM, Guido van Rossum wrote: > On 2/13/06, Michael Foord wrote: >> Sorry - I meant for the unicode to bytes case. A default encoding >> that >> behaves differently to the current to implicit encodes/decodes >> would be >> confusing IMHO. > > And I am in agreement with you there (I think only Phillip argued > otherwise). > >> I agree that string to bytes shouldn't change the value of the bytes. > > It's a deal then. 
> > Can the owner of PEP 332 update the PEP to record these decisions? So, in python2.X, you have: - bytes("\x80"), you get a bytestring with a single byte of value 0x80 (when no encoding is specified, and the object is a str, it doesn't try to encode it at all). - bytes("\x80", encoding="latin-1"), you get an error, because encoding "\x80" into latin-1 implicitly decodes it into a unicode object first, via the system-wide default: ascii. - bytes(u"\x80"), you get an error, because the default encoding for a unicode string is ascii. - bytes(u"\x80", encoding="latin-1"), you get a bytestring with a single byte of value 0x80. In py3k, when the str object is eliminated, then what do you have? Perhaps - bytes("\x80"), you get an error, encoding is required. There is no such thing as "default encoding" anymore, as there's no str object. - bytes("\x80", encoding="latin-1"), you get a bytestring with a single byte of value 0x80. James From guido at python.org Tue Feb 14 02:11:42 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 17:11:42 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: On 2/13/06, James Y Knight wrote: > So, in python2.X, you have: > - bytes("\x80"), you get a bytestring with a single byte of value > 0x80 (when no encoding is specified, and the object is a str, it > doesn't try to encode it at all). > - bytes("\x80", encoding="latin-1"), you get an error, because > encoding "\x80" into latin-1 implicitly decodes it into a unicode > object first, via the system-wide default: ascii. > - bytes(u"\x80"), you get an error, because the default encoding for > a unicode string is ascii. 
> - bytes(u"\x80", encoding="latin-1"), you get a bytestring with a > single byte of value 0x80. Yes to all. > In py3k, when the str object is eliminated, then what do you have? > Perhaps > - bytes("\x80"), you get an error, encoding is required. There is no > such thing as "default encoding" anymore, as there's no str object. > - bytes("\x80", encoding="latin-1"), you get a bytestring with a > single byte of value 0x80. Yes to both again. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Tue Feb 14 02:49:21 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 13 Feb 2006 20:49:21 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <200602131729.12333.fdrake@acm.org> References: <200602131729.12333.fdrake@acm.org> Message-ID: On 2/13/06, Fred L. Drake, Jr. wrote: > On Monday 13 February 2006 15:40, Guido van Rossum wrote: > > Shouldn't docs.python.org be removed? It seems to add more confusion > > than anything, especially since most links on python.org continue to > > point to python.org/doc/. > > docs.python.org was created specifically to make searching the most recent > "stable" version of the docs easier (using Google's site: modifier, no less). > I don't know what the link count statistics say (other than what you > mention), and don't know which gets hit more often, but I still think it's a > reasonable approach. Why not do a query like this? http://www.google.com/search?q=site%3Apython.org/doc/current%20urllib&hl=en Jeremy From nas at arctrix.com Tue Feb 14 03:52:40 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 14 Feb 2006 02:52:40 +0000 (UTC) Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
References: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: Guido van Rossum wrote: >> In py3k, when the str object is eliminated, then what do you have? >> Perhaps >> - bytes("\x80"), you get an error, encoding is required. There is no >> such thing as "default encoding" anymore, as there's no str object. >> - bytes("\x80", encoding="latin-1"), you get a bytestring with a >> single byte of value 0x80. > > Yes to both again. I haven't been following this discussion about bytes() real closely but I don't think that bytes() should do the encoding. We already have a way to spell that: "\x80".encode('latin-1') Also, I think it would be useful to introduce byte array literals at the same time as the bytes object. That would allow people to use byte arrays without having to get involved with all the silly string encoding confusion. Neil From fdrake at acm.org Tue Feb 14 04:29:21 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 13 Feb 2006 22:29:21 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: Message-ID: <200602132229.21609.fdrake@acm.org> On Monday 13 February 2006 21:52, Neil Schemenauer wrote: > Also, I think it would be useful to introduce byte array literals at > the same time as the bytes object. That would allow people to use > byte arrays without having to get involved with all the silly string > encoding confusion. bytes([0, 1, 2, 3]) -Fred -- Fred L. Drake, Jr. From guido at python.org Tue Feb 14 05:07:49 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 13 Feb 2006 20:07:49 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: On 2/13/06, Neil Schemenauer wrote: > Guido van Rossum wrote: > >> In py3k, when the str object is eliminated, then what do you have? > >> Perhaps > >> - bytes("\x80"), you get an error, encoding is required. There is no > >> such thing as "default encoding" anymore, as there's no str object. > >> - bytes("\x80", encoding="latin-1"), you get a bytestring with a > >> single byte of value 0x80. > > > > Yes to both again. > > I haven't been following this discussion about bytes() real closely > but I don't think that bytes() should do the encoding. We already > have a way to spell that: > > "\x80".encode('latin-1') But in 2.5 we can't change that to return a bytes object without creating HUGE incompatibilities. In general I've come to appreciate that there are two ways of converting an object of type A to an object of type B: ask an A instance to convert itself to a B, or ask the type B to create a new instance from an A. Depending on what A and B are, both APIs make sense; sometimes reasons of decoupling require that A can't know about B, in which case you have to use the latter approach; sometimes B can't know about A, in which case you have to use the former. Even when A == B we sometimes support both APIs: to create a new list from a list a, you can write a[:] or list(a); to create a new dict from a dict d, you can write d.copy() or dict(d). An advantage of the latter API is that there's no confusion about the resulting type -- dict(d) is definitely a dict, and list(a) is definitely a list. Not so for d.copy() or a[:] -- if the input type is another mapping or sequence, it'll probably return an object of that same type. Again, it depends on the application which is better.
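Both conversion styles survive in today's stdlib; collections.OrderedDict (used here purely as an illustration, it is not part of this thread) shows the type-ambiguity point directly:

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2)])

# "Ask the A instance to convert itself": the result type follows the input.
assert type(od.copy()) is OrderedDict

# "Ask type B to create a new instance from an A": the result type is certain.
assert type(dict(od)) is dict
assert type(list("ab")) is list    # list(a) is definitely a list, whatever a is
```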
I think that bytes(s, <encoding>) is fine, especially for expressing a new type, since it is unambiguous about the result type, and has no backwards compatibility issues. > Also, I think it would be useful to introduce byte array literals at > the same time as the bytes object. That would allow people to use > byte arrays without having to get involved with all the silly string > encoding confusion. You missed the part where I said that introducing the bytes type *without* a literal seems to be a good first step. A new type, even built-in, is much less drastic than a new literal (which requires lexer and parser support in addition to everything else). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Tue Feb 14 05:59:03 2006 From: barry at python.org (Barry Warsaw) Date: Mon, 13 Feb 2006 23:59:03 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> Message-ID: <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> On Feb 13, 2006, at 7:29 PM, Guido van Rossum wrote: > There's one property that bytes, str and unicode all share: type(x[0]) > == type(x), at least as long as len(x) >= 1. This is perhaps the > ultimate test for string-ness. But not perfect, since of course other containers can contain objects of their own type too. But it leads to an interesting issue... > Or should b[0] be an int, if b is a bytes object? That would change > things dramatically. This makes me think I want an unsigned byte type, which b[0] would return.
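For what it's worth, Python 3 as eventually released answered the b[0] question with an int, which behaves like the unsigned byte value Barry asks for (checked on a modern interpreter, not something decided in this thread):

```python
b = bytes([0x61, 0x62, 0xff])
assert b[0] == 0x61
assert type(b[0]) is int          # indexing yields an unsigned byte value...
assert 0 <= b[2] < 256
assert b[:1] == b"a"              # ...while slicing still yields bytes,
assert type(b[:1]) is type(b)     # so type(x[0]) == type(x) no longer holds here
```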
In another thread I think someone mentioned something about fixed width integral types, such that you could have an object that was guaranteed to be 8-bits wide, 16-bits wide, etc. Maybe you also want signed and unsigned versions of each. This may seem like YAGNI to many people, but as I've been working on a tightly embedded/extended application for the last few years, I've definitely had occasions where I wish I could more closely and more directly model my C values as Python objects (without using the standard workarounds or writing my own C extension types). But anyway, without hyper-generalizing, it's still worth asking whether a bytes type is just a container of byte objects, where the contained objects would be distinct, fixed 8-bit unsigned integral types. > There's also the consideration for APIs that, informally, accept > either a string or a sequence of objects. Many of these exist, and > they are probably all being converted to support unicode as well as > str (if it makes sense at all). Should a bytes object be considered as > a sequence of things, or as a single thing, from the POV of these > types of APIs? Should we try to standardize how code tests for the > difference? (Currently all sorts of shortcuts are being taken, from > isinstance(x, (list, tuple)) to isinstance(x, basestring).) I think bytes objects are very much like string objects today -- they're the photons of Python since they can act like either sequences or scalars, depending on the context. For example, we have code that needs to deal with situations where an API can return either a scalar or a sequence of those scalars. So we have a utility function like this:

    def thingiter(obj):
        try:
            it = iter(obj)
        except TypeError:
            yield obj
        else:
            for item in it:
                yield item

Maybe there's a better way to do this, but the most obvious problem is that (for our use cases), this fails for strings because in this context we want strings to act like scalars.
So we add a little test just before the "try:" like "if isinstance(obj, basestring): yield obj". But that's yucky. I don't know what the solution is -- if there /is/ a solution short of special case tests like above, but I think the key observation is that sometimes you want your string to act like a sequence and sometimes you want it to act like a scalar. I suspect bytes objects will be the same way. -Barry From pje at telecommunity.com Tue Feb 14 06:20:56 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 14 Feb 2006 00:20:56 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: >On 2/13/06, Phillip J. Eby wrote: > > I didn't mean that it was the only purpose. In Python 2.x, practical code > > has to sometimes deal with "string-like" objects. That is, code that takes > > either strings or unicode. If such code calls bytes(), it's going to want > > to include an encoding so that unicode conversions won't fail. > >That sounds like a rather hypothetical example. Have you thought it >through? Presumably code that accepts both str and unicode either >doesn't care about encodings, but simply returns objects of the same >type as the arguments -- and then it's unlikely to want to convert the >arguments to bytes; or it *does* care about encodings, and then it >probably already has to special-case str vs. unicode because it has to >control how str objects are interpreted. Actually, it's the other way around. 
Code that wants to output uninterpreted bytes right now and accepts either strings or Unicode has to special-case *unicode* -- not str, because str is the only "bytes type" we currently have. This creates an interesting issue in WSGI for Jython, which of course only has one (unicode-based) string type now. Since there's no bytes type in Python in general, the only solution we could come up with was to treat such strings as latin-1: http://www.python.org/peps/pep-0333.html#unicode-issues This is why I'm biased towards latin-1 encoding of unicode to bytes; it's "the same thing" as an uninterpreted string of bytes. I think the difference in our viewpoints is that you're still thinking "string" thoughts, whereas I'm thinking "byte" thoughts. Bytes are just bytes; they don't *have* an encoding. So, if you think of "converting a string to bytes" as meaning "create an array of numerals corresponding to the characters in the string", then this leads to a uniform result whether the characters are in a str or a unicode object. In other words, to me, bytes(str_or_unicode) should be treated as: bytes(map(ord, str_or_unicode)) In other words, without an encoding, bytes() should simply treat str and unicode objects *as if they were a sequence of integers*, and produce an error when an integer is out of range. This is a logical and consistent interpretation in the absence of an encoding, because in that case you don't care about the encoding - it's just raw data. If, however, you include an encoding, then you're stating that you want to encode the *meaning* of the string, not merely its integer values. >What would bytes("abc\xf0", "latin-1") *mean*? Take the string >"abc\xf0", interpret it as being encoded in XXX, and then encode from >XXX to Latin-1. But what's XXX? As I showed in a previous post, >"abc\xf0".encode("latin-1") *fails* because the source for the >encoding is assumed to be ASCII. I'm saying that XXX would be the same encoding as you specified. 
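The bytes(map(ord, str_or_unicode)) reading can be tried verbatim with Python 3's bytes constructor, which accepts an iterable of integers and raises on values outside 0-255 (a sketch of the proposed semantics, not the 2.x behavior):

```python
s = "abc\xf0"                        # every code point is < 256
assert bytes(map(ord, s)) == b"abc\xf0"

out_of_range = False
try:
    bytes(map(ord, "abc\u20ac"))     # ord("\u20ac") == 8364, too big for a byte
except ValueError:
    out_of_range = True
assert out_of_range
```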
i.e., including an encoding means you are encoding the *meaning* of the string. However, I believe I mainly proposed this as an alternative to having bytes(str_or_unicode) work like bytes(map(ord,str_or_unicode)), which I think is probably a saner default. >Your argument for symmetry would be a lot stronger if we used Latin-1 >for the conversion between str and Unicode. But we don't. But that's because we're dealing with its meaning *as a string*, not merely as ordinals in a sequence of bytes. > I like the >other interpretation (which I thought was yours too?) much better: str ><--> bytes conversions don't use encodings by simply change the type >without changing the bytes; I like it better too. The part you didn't like was where MAL and I believe this should be extended to Unicode characters in the 0-255 range also. :) >There's one property that bytes, str and unicode all share: type(x[0]) >== type(x), at least as long as len(x) >= 1. This is perhaps the >ultimate test for string-ness. > >Or should b[0] be an int, if b is a bytes object? That would change >things dramatically. +1 for it being an int. Heck, I'd want to at least consider the possibility of introducing a character type (chr?) in Python 3.0, and getting rid of the "iterating a string yields strings" characteristic. I've found it to be a bit of a pain when dealing with heterogeneous nested sequences that contain strings. >There's also the consideration for APIs that, informally, accept >either a string or a sequence of objects. Many of these exist, and >they are probably all being converted to support unicode as well as >str (if it makes sense at all). Should a bytes object be considered as >a sequence of things, or as a single thing, from the POV of these >types of APIs? Should we try to standardize how code tests for the >difference? (Currently all sorts of shortcuts are being taken, from >isinstance(x, (list, tuple)) to isinstance(x, basestring).) 
I'm inclined to think of certain features at least in terms of the buffer interface, but that's not something that's really exposed at the Python level. From martin at v.loewis.de Tue Feb 14 07:30:16 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Feb 2006 07:30:16 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> Message-ID: <43F178F8.60506@v.loewis.de> Jeremy Hylton wrote: > The compiler in question is gcc and the warning can be turned off with > -Wno-write-strings. I think we'd be better off leaving that option > on, though. This warning will help me find places where I'm passing a > string literal to a function that does not take a const char*. That's > valuable, not insensate. Hmm. I'd say this depends on what your reaction to the warning is. If you sprinkle const_casts in the code, nothing is gained. Perhaps there is some value in finding functions which ought to expect const char*. For that, occasional checks should be sufficient; I cannot see a point in having code permanently pass with that option. In particular not if you are interfacing with C libraries. Regards, Martin From martin at v.loewis.de Tue Feb 14 07:47:13 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Feb 2006 07:47:13 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F10035.8080207@egenix.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> Message-ID: <43F17CF1.1060902@v.loewis.de> M.-A. Lemburg wrote: > We're talking about Py3k here: "abc" will be a Unicode string, > so why restrict the conversion to 7 bits when you can have 8 bits > without any conversion problems ? YAGNI. 
If you have a need for byte string in source code, it will typically be "random" bytes, which can be nicely used through bytes([0x73, 0x9f, 0x44, 0xd2, 0xfb, 0x49, 0xa3, 0x14, 0x8b, 0xee]) For larger blocks, people should use base64.string_to_bytes (which can become a synonym for base64.decodestring in Py3k). If you have bytes that are meaningful text for some application (say, a wire protocol), it is typically ASCII-Text. No protocol I know of uses non-ASCII characters for protocol information. Of course, you need a way to get .encode output as bytes somehow, both in 2.5, and in Py3k. I suggest writing bytes(s.encode(encoding)) In 2.5, bytes() can be constructed from strings, and will do a conversion; in Py3k, .encode will already return a string, so this will be a no-op. Regards, Martin From martin at v.loewis.de Tue Feb 14 07:52:13 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Feb 2006 07:52:13 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> Message-ID: <43F17E1D.8030905@v.loewis.de> Phillip J. Eby wrote: > I was just pointing out that since byte strings are bytes by definition, > then simply putting those bytes in a bytes() object doesn't alter the > existing encoding. So, using latin-1 when converting a string to bytes > actually seems like the One Obvious Way to do it. This is a misconception. In Python 2.x, the type str already *is* a bytes type. So if S is an instance of 2.x str, bytes(S) does not need to do any conversion.
You don't need to assume it is latin-1: it's already bytes. > In fact, the 'encoding' argument seems useless in the case of str objects, > and it seems it should default to latin-1 for unicode objects. I agree with the former, but not with the latter. There shouldn't be a conversion of Unicode objects to bytes at all. If you want bytes from a Unicode string U, write bytes(U.encode(encoding)) Regards, Martin From martin at v.loewis.de Tue Feb 14 07:58:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Feb 2006 07:58:01 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: <43F17F79.9090407@v.loewis.de> Guido van Rossum wrote: >>In py3k, when the str object is eliminated, then what do you have? >>Perhaps >>- bytes("\x80"), you get an error, encoding is required. There is no >>such thing as "default encoding" anymore, as there's no str object. >>- bytes("\x80", encoding="latin-1"), you get a bytestring with a >>single byte of value 0x80. > > > Yes to both again. Please reconsider, and don't give bytes() an encoding= argument. It doesn't need one. In Python 3, people should write "\x80".encode("latin-1") if they absolutely want to, although they better write bytes([0x80]) Now, the first form isn't valid in 2.5, but bytes(u"\x80".encode("latin-1")) could work in all versions. Regards, Martin From rhamph at gmail.com Tue Feb 14 08:04:32 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Feb 2006 00:04:32 -0700 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <43F17CF1.1060902@v.loewis.de> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F17CF1.1060902@v.loewis.de> Message-ID: On 2/13/06, "Martin v. Löwis" wrote: > M.-A. Lemburg wrote: > > We're talking about Py3k here: "abc" will be a Unicode string, > > so why restrict the conversion to 7 bits when you can have 8 bits > > without any conversion problems ? > > YAGNI. If you have a need for byte string in source code, it will > typically be "random" bytes, which can be nicely used through > > bytes([0x73, 0x9f, 0x44, 0xd2, 0xfb, 0x49, 0xa3, 0x14, 0x8b, 0xee]) > > For larger blocks, people should use base64.string_to_bytes (which > can become a synonym for base64.decodestring in Py3k). > > If you have bytes that are meaningful text for some application > (say, a wire protocol), it is typically ASCII-Text. No protocol > I know of uses non-ASCII characters for protocol information. What would that imply for repr()? To support eval(repr(x)) it would have to produce whatever format the source code includes to begin with. If I understand correctly there are three main candidates:

1. Direct copying to str in 2.x, pretending it's latin-1 in unicode in 3.x
2. Direct copying to str/unicode if it's only ascii values, switching to a list of hex literals if there's any non-ascii values
3. b"foo" literal with ascii for all ascii characters (other than \ and "), \xFF for individual characters that aren't ascii

Given the choice I prefer the third option, with the second option as my runner up. The first option just screams "silent errors" to me.
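Adam's third candidate is the form the b'' repr eventually took in Python 3, and it satisfies the eval(repr(x)) requirement (checked on a modern interpreter):

```python
import ast

b = bytes([0x66, 0x6f, 0x6f, 0xff])    # 'f', 'o', 'o', then a non-ASCII byte
r = repr(b)
assert r == "b'foo\\xff'"              # ASCII bytes shown literally, others as \xFF
assert ast.literal_eval(r) == b        # repr round-trips back to the same value
```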
-- Adam Olsen, aka Rhamphoryncus From martin at v.loewis.de Tue Feb 14 08:04:50 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Feb 2006 08:04:50 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43F105AF.3000905@egenix.com> References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> <1f7befae0602101027s64e292a4p9e42dd77eea3b00d@mail.gmail.com> <43F105AF.3000905@egenix.com> Message-ID: <43F18112.5050608@v.loewis.de> M.-A. Lemburg wrote: >>It's the consequences: nobody complains about tacking "const" on to a >>former honest-to-God "char *" argument that was in fact not modified, >>because that's not only helpful for C++ programmers, it's _harmless_ >>for all programmers. For example, nobody could sanely object (and >>nobody did :-)) to adding const to the attribute-name argument in >>PyObject_SetAttrString(). Sticking to that creates no new problems >>for anyone, so that's as far as I ever went. > > > Well, it broke my C extensions... I now have this in my code: > > /* The keyword array changed to const char* in Python 2.5 */ > #if PY_VERSION_HEX >= 0x02050000 > # define Py_KEYWORDS_STRING_TYPE const char > #else > # define Py_KEYWORDS_STRING_TYPE char > #endif > ... > static Py_KEYWORDS_STRING_TYPE *kwslist[] = {"yada", NULL}; > ... You did not read Tim's message carefully enough. He wasn't talking about PyArg_ParseTupleAndKeywords *at all*. He only talked about changing char* arguments to const char*, e.g. in PyObject_SetAttrString. Did that break your C extensions also? Regards, Martin From foom at fuhm.net Tue Feb 14 08:09:55 2006 From: foom at fuhm.net (James Y Knight) Date: Tue, 14 Feb 2006 02:09:55 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> References: <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> Message-ID: <92EFDF92-03A4-4987-9320-95E7F708AF16@fuhm.net> On Feb 14, 2006, at 12:20 AM, Phillip J. Eby wrote: > bytes(map(ord, str_or_unicode)) > > In other words, without an encoding, bytes() should simply treat > str and > unicode objects *as if they were a sequence of integers*, and > produce an > error when an integer is out of range. This is a logical and > consistent > interpretation in the absence of an encoding, because in that case you > don't care about the encoding - it's just raw data. If you're talking about "raw data", then make bytes(unicodestring) produce what buffer(unicodestring) currently does -- something completely and utterly worthless. :) [it depends on how you compiled python and what endianness your system has.] There really is no case where you don't care about the encoding...there is always a specific desired output encoding, and you have to think about what encoding that is. The argument that latin-1 is a sensible default just because you can convert to latin-1 by chopping off the upper 3 bytes of a unicode character's ordinal position is not convincing; you're still doing an encoding operation, it just happens to be computationally easy. That Jython programs have to pretend that unicode strings are an appropriate way to store bytes, and thus often have to do fake "latin-1" conversions which are really no such thing, doesn't make a convincing argument either. 
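A sketch of the "computationally easy" point, in present-day Python 3 syntax (an editorial aside, not part of the original message): for code points below 256, latin-1 encoding is exactly "keep the low byte" — but it is still an encoding operation, and outside that range the cheap chop silently corrupts data where a real encode refuses.

```python
# For code points < 256, latin-1 really is "keep the low byte":
s = "caf\u00e9"
assert s.encode("latin-1") == bytes(ord(c) for c in s)

# Outside that range the chop loses information silently,
# while an honest encode raises instead:
try:
    "\u20ac".encode("latin-1")   # EURO SIGN, code point 0x20AC
except UnicodeEncodeError:
    pass                         # no latin-1 byte exists for it
```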
Using unicode strings to store bytes read from or written to a socket is really just broken. Actually having any default encoding at all is IMO a poor idea, but as python has one at the moment (ascii), might as well keep using it for consistency until it's eliminated (sys.setdefaultencoding ('undefined') is my friend.) James From martin at v.loewis.de Tue Feb 14 08:11:50 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Feb 2006 08:11:50 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: Message-ID: <43F182B6.8020409@v.loewis.de> Guido van Rossum wrote: > In private email, Phillip Eby suggested to add these things to the > 2.5. standard library: > > bdist_deb, bdist_msi, and friends [...] > I guess bdist_egg should also be added if we support setuptools (not > setuplib as I mistakenly called it previously)? I'm in favour of that (and not only because I wrote bdist_msi :-). I think distutils should support all native package formats we can get code for. I'm actually opposed to bdist_egg, from a conceptual point of view. I think it is wrong if Python creates its own packaging format (just as it was wrong that Java created jar files - but they are without deployment procedures even today). The burden should be on developer's side, for creating packages for the various systems, not on the users side, when each software comes with its own deployment infrastructure. OTOH, users are fond of eggs, for reasons that I haven't yet understood. >From a release management point of view, I would still like to make another bdist_msi release before contributing it to Python. Regards, Martin From martin at v.loewis.de Tue Feb 14 08:14:57 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 14 Feb 2006 08:14:57 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F17CF1.1060902@v.loewis.de> Message-ID: <43F18371.9040306@v.loewis.de> Adam Olsen wrote: > What would that imply for repr()? To support eval(repr(x)) I don't think eval(repr(x)) needs to be supported for the bytes type. However, if that is desirable, it should return something like bytes([1,2,3]) Regards, Martin From thomas at xs4all.net Tue Feb 14 08:19:46 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Tue, 14 Feb 2006 08:19:46 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: Message-ID: <20060214071946.GV10226@xs4all.nl> On Mon, Feb 13, 2006 at 04:04:26PM -0800, Guido van Rossum wrote: > In private email, Phillip Eby suggested to add these things to the > 2.5. standard library: > > bdist_deb, bdist_msi, and friends FWIW, I've been using a patched distutils with bdist_deb, and it's worked fine for the most part. The only issue I had was with a setuptools package (rather than distutils), which I'm sure can be worked out. (Not that I'm particularly convinced setuptools is the right approach for a .deb, but I haven't really seen the point of setuptools anyway ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas at xs4all.net Tue Feb 14 09:09:22 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Tue, 14 Feb 2006 09:09:22 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> Message-ID: <20060214080921.GW10226@xs4all.nl> On Mon, Feb 13, 2006 at 03:44:27PM -0800, Guido van Rossum wrote: > But adding an encoding doesn't help. The str.encode() method always > assumes that the string itself is ASCII-encoded, and that's not good > enough: > >>> "abc".encode("latin-1") > 'abc' > >>> "abc".decode("latin-1") > u'abc' > >>> "abc\xf0".decode("latin-1") > u'abc\xf0' > >>> "abc\xf0".encode("latin-1") > Traceback (most recent call last): > File "", line 1, in ? > UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position > 3: ordinal not in range(128) These comments disturb me. I never really understood why (byte) strings grew the 'encode' method, since 8-bit strings *are already encoded*, by their very nature. I mean, I understand it's useful because Python does non-unicode encodings like 'hex', but I don't really understand *why*. The benefits don't seem to outweigh the cost (but that's hindsight.) Directly encoding a (byte) string into a unicode encoding is mostly useless, as you've shown. The only use-case I can think of is translating ASCII in, for instance, EBCDIC. Encoding anything into an ASCII superset is a no-op, unless the system encoding isn't 'ascii' (and that's pretty rare, and not something a Python programmer should depend on.) On the other hand, the fact that (byte) strings have an 'encode' method creates a lot of confusion in unicode-newbies, and causes programs to break only when input is non-ASCII. And non-ASCII input just happens too often and too unpredictably in 'real-world' code, and not enough in European programmers' tests ;P Unicode objects and strings are not the same thing. 
We shouldn't treat them as the same thing. They share an interface (like lists and tuples do), and if you only use that interface, treating them as the same kind object is mostly ok. They actually share *less* of an interface than lists and tuples, though, as comparing strings to unicode objects can raise an exception, whereas comparing lists to tuples is not expected to. For anything less trivial than indexing, slicing and most of the string methods, and anything what so ever involving non-ASCII (or, rather, non-system-encoding), unicode objects and strings *must* be treated separately. For instance, there is no correct way to do: s.split("\x80") unless you know the type of 's'. If it's unicode, you want u"\x80" instead of "\x80". If it's not unicode, splitting "\x80" may not even be sensible, but you wouldn't know from looking at the code -- maybe it expects a specific encoding (or encoding family), maybe not. As soon as you deal with unicode, you need to really understand the concept, and too many programmers don't. And it's very hard to tell from someone's comments whether they fail to understand or just get some of the terminology wrong; that's why Guido's comments about 'encoding a byte string' and 'what if the file encoding is Unicode' scare me. The unicode/string mixup almost makes me wish Python was statically typed. So please, please, please don't make the mistake of 'doing something' with the 'encoding' argument to 'bytes(s, encoding)' when 's' is a (byte) string. It wouldn't actually be usable except for the same things as 'str.encode': to convert from ASCII to non-ASCII-supersets, or to convert to non-unicode encodings (such as 'hex'.) You can achieve those two by doing, e.g., 'bytes(s.encode('hex'))' if you really want to. Ignoring the encoding (rather than raising an exception) would also allow code to be trivially portable between Python 2.x and Py3K, when "" is actually a unicode object. 
Not that I'm happy with ignoring anything, but not ignoring would be bigger crime here. Oh, and while on the subject, I'm not convinced going all-unicode in Py3K is a good idea either, but maybe I should save that discussion for PyCon. I'm not thinking "why do we need unicode" anymore (which I did two years ago ;) but I *am* thinking it'll be a big step for 90% of the programmers if they have to grasp unicode and encodings to be able to even do 'raw_input()' sensibly. I know I spend an inordinate amount of time trying to explain the basics on #python on irc.freenode.net already. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From nnorwitz at gmail.com Tue Feb 14 09:09:36 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 14 Feb 2006 00:09:36 -0800 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <200602131552.44424.fdrake@acm.org> References: <200602131552.44424.fdrake@acm.org> Message-ID: On 2/13/06, Fred L. Drake, Jr. wrote: > On Monday 13 February 2006 10:03, Georg Brandl wrote: > > The above docs are from August 2005 while docs.python.org/dev is current. > > Shouldn't the old docs be removed? > > I'm afraid I've generally been too busy to chime in much on this topic, but > I've spent a bit of time thinking about it, and would like to keep on top of > the issue still. Fred, While you are here, are you planning to do the doc releases for 2.5? You are tentatively listed in PEP 356. (Technically it says TBD with a ? next to your name.) > The automatically-maintained version of the development docs is certainly > preferrable to the manually-maintained-by-me version, and I've updated the > link from www.python.org/doc/ to refer to that version for now. However, I > do have some concerns about how this is all structured still. I think this was the quick hack I did. I hope there are many concerns. :-) For example, if the doc build fails, ... 
Hmmm, this probably isn't a problem. The doc won't be updated, but will still be the last good version. So if I send mail when the doc doesn't build, then it might not be so bad. Will have to test this. I still need to switch over the failure mails to go to python-checkins. There are too many right now though. Unless people don't mind getting several messages about refleaks every day? Anyone? > What I would also like to see is to have an automatically-updated version for > each of the maintainer versions of Python, as well as the development trunk. > That would mean two versions at this point (2.4.x, 2.5.x); only one of those > is currently handled automatically. That shouldn't be a problem. See http://docs.python.org/dev/2.4/ n From mal at egenix.com Tue Feb 14 09:09:56 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 14 Feb 2006 09:09:56 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43F18112.5050608@v.loewis.de> References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> <1f7befae0602101027s64e292a4p9e42dd77eea3b00d@mail.gmail.com> <43F105AF.3000905@egenix.com> <43F18112.5050608@v.loewis.de> Message-ID: <43F19054.9080708@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: >>> It's the consequences: nobody complains about tacking "const" on to a >>> former honest-to-God "char *" argument that was in fact not modified, >>> because that's not only helpful for C++ programmers, it's _harmless_ >>> for all programmers. For example, nobody could sanely object (and >>> nobody did :-)) to adding const to the attribute-name argument in >>> PyObject_SetAttrString(). Sticking to that creates no new problems >>> for anyone, so that's as far as I ever went. >> >> Well, it broke my C extensions... 
I now have this in my code: >> >> /* The keyword array changed to const char* in Python 2.5 */ >> #if PY_VERSION_HEX >= 0x02050000 >> # define Py_KEYWORDS_STRING_TYPE const char >> #else >> # define Py_KEYWORDS_STRING_TYPE char >> #endif >> ... >> static Py_KEYWORDS_STRING_TYPE *kwslist[] = {"yada", NULL}; >> ... > > You did not read Tim's message carefully enough. He wasn't talking > about PyArg_ParseTupleAndKeywords *at all*. He only talked about > changing char* arguments to const char*, e.g. in > PyObject_SetAttrString. Did that break your C extensions also? I did read Tim's post: sorry for phrasing the reply the way I did. I was referring to his statement "nobody complains about tacking "const" on to a former honest-to-God "char *" argument that was in fact not modified". Also: it's not me complaining, it's the compilers ! -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 14 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From fuzzyman at voidspace.org.uk Tue Feb 14 10:29:37 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Tue, 14 Feb 2006 09:29:37 +0000 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: <43F1A301.2060609@voidspace.org.uk> Guido van Rossum wrote: > [snip..] > >>In py3k, when the str object is eliminated, then what do you have? >>Perhaps >>- bytes("\x80"), you get an error, encoding is required. 
There is no >>such thing as "default encoding" anymore, as there's no str object. >>- bytes("\x80", encoding="latin-1"), you get a bytestring with a >>single byte of value 0x80. >> >> > >Yes to both again. > > > *Slightly* related question. Sorry for the tangent. In Python 3K, when the string data-type has gone, what will ``open(filename).read()`` return ? Will the object returned have a ``decode`` method, to coerce to a unicode string ? Also, what datatype will ``u'some string'.encode('ascii')`` return ? I assume that when the ``bytes`` datatype is implemented, we will be able to do ``open(filename, 'wb').write(bytes(somedata))`` ? Hmmm... I probably ought to read the bytes PEP and the Py3k one... Just curious... All the best, Michael Foord >-- >--Guido van Rossum (home page: http://www.python.org/~guido/) > > > From greg.ewing at canterbury.ac.nz Tue Feb 14 11:52:59 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 14 Feb 2006 23:52:59 +1300 Subject: [Python-Dev] nice() In-Reply-To: <004f01c630c0$f051e1f0$5f2c4fca@csmith> References: <004f01c630c0$f051e1f0$5f2c4fca@csmith> Message-ID: <43F1B68B.5010604@canterbury.ac.nz> Smith wrote: > computing the bin boundaries for a histogram > where bins are a width of 0.1: > >>>>for i in range(20): > ... if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)): > ... print i,repr(i*.1),repr(i/10.),i*.1,i/10. I don't see how that has any relevance to the way bin boundaries would be used in practice, which is to say something like i = int(value / 0.1) bin[i] += 1 # modulo appropriate range checks which doesn't require comparing floats for equality at all. > For, say, garden variety numbers that aren't full of garbage digits > resulting from fp computation, the boundaries computed as 0.1*i are\ > not going to agree with such simple numbers as 1.4 and 0.7. Because the arithmetic is binary rather than decimal. But even using decimal, you get the same sort of problems using a bin width of 1.0/3.0. 
The solution is to use an algorithm that isn't sensitive to those problems, then it doesn't matter what base your arithmetic is done in. > I understand that the above really is just a patch over the problem, > but I'm wondering if it moves the problem far enough away that most > users wouldn't have to worry about it. No, it doesn't. The problems are not conveniently grouped together in some place you can get away from; they're scattered all over the place where you can stumble upon one at any time. > So perhaps this brings us back to the original comment that "fp issues > are a learning opportunity." They are. The question I have is "how > soon do they need to run into them?" Is decreasing the likelihood that > they will see the problem (but not eliminate it) a good thing for the > python community or not? I don't think you're doing anyone any favours by trying to protect them from having to know about these things, because they *need* to know about them if they're not to write algorithms that seem to work fine on tests but mysteriously start producing garbage when run on real data, possibly without it even being obvious that it is garbage. Greg From greg.ewing at canterbury.ac.nz Tue Feb 14 11:59:04 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 14 Feb 2006 23:59:04 +1300 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> Message-ID: <43F1B7F8.5050807@canterbury.ac.nz> Guido van Rossum wrote: > I also wonder if having a b"..." literal would just add more confusion > -- bytes are not characters, but b"..." makes it appear as if they > are. I'm inclined to agree. Bytes objects are more likely to be used for things which are *not* characters -- if they're characters, they would be better kept in strings or char arrays. +1 on any eventual bytes literal looking completely different from a string literal. 
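(Historical footnote, postdating this thread: Python 3 did in the end adopt the b"..." spelling, and the literal and the integer-sequence constructor Martin favoured denote exactly the same values:)

```python
# The b"..." literal and bytes-from-integers are two spellings
# of one value:
assert b"\x73\x9f\x44" == bytes([0x73, 0x9f, 0x44])
assert b"abc" == bytes([97, 98, 99])
```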
Greg From greg.ewing at canterbury.ac.nz Tue Feb 14 12:25:03 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 00:25:03 +1300 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> Message-ID: <43F1BE0F.8090406@canterbury.ac.nz> Guido van Rossum wrote: > There's also the consideration for APIs that, informally, accept > either a string or a sequence of objects. My preference these days is not to design APIs that way. It's never necessary and it avoids a lot of problems. Greg From greg.ewing at canterbury.ac.nz Tue Feb 14 12:35:17 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 00:35:17 +1300 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> Message-ID: <43F1C075.3060207@canterbury.ac.nz> Barry Warsaw wrote: > This makes me think I want an unsigned byte type, which b[0] would > return. Come to think of it, this is something I don't remember seeing discussed. I've been thinking that bytes[i] would return an integer, but is the intention that it would return another bytes object? 
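(The bytes type that eventually shipped in Python 3 answered this question both ways: indexing returns an integer, slicing returns another bytes object:)

```python
b = b"hello"
assert b[0] == 104            # indexing yields an int
assert b[0:1] == b"h"         # slicing yields bytes
assert isinstance(b[0], int)
assert isinstance(b[0:1], bytes)
```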
Greg From ncoghlan at gmail.com Tue Feb 14 12:53:04 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2006 21:53:04 +1000 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: <43F1C4A0.9060700@gmail.com> Guido van Rossum wrote: > In general I've come to appreciate that there are two ways of > converting an object of type A to an object of type B: ask an A > instance to convert itself to a B, or ask the type B to create a new > instance from an A. And the difference between the two isn't even always that clear cut. Sometimes you'll ask type B to create a new instance from an A, and then while you're not looking type B cheats and goes and asks the A instance to do it instead ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Tue Feb 14 13:08:47 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2006 22:08:47 +1000 Subject: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation In-Reply-To: References: <43EAF696.5070101@ieee.org> <20060209232734.GH10226@xs4all.nl> <43EC8AF8.2000506@gmail.com> <054578D1-518D-4566-A15A-114B3BB85790@verio.net> Message-ID: <43F1C84F.4040809@gmail.com> Guido van Rossum wrote: > On 2/10/06, Mark Russell wrote: >> On 10 Feb 2006, at 12:45, Nick Coghlan wrote: >> >> An alternative would be to call it "__discrete__", as that is the key >> >> characteristic of an indexing type - it consists of a sequence of discrete >> >> values that can be isomorphically mapped to the integers. >> Another alternative: __as_ordinal__. 
Wikipedia describes ordinals as >> "numbers used to denote the position in an ordered sequence" which seems a >> pretty precise description of the intended result. The "as_" prefix also >> captures the idea that this should be a lossless conversion. > > Aren't ordinals generally assumed to be non-negative? The numbers used > as slice or sequence indices can be negative! The other problem with 'ordinal' as a name is that the term already has a meaning in Python (what else would 'ord' be short for?). I liked index from the start, but I thought we should put at least a bit of effort into seeing if we could come up with anything better. I don't really see any way that either 'discrete' or 'ordinal' can be said to qualify as better :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From rhamph at gmail.com Tue Feb 14 13:47:39 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Feb 2006 05:47:39 -0700 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F18371.9040306@v.loewis.de> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F17CF1.1060902@v.loewis.de> <43F18371.9040306@v.loewis.de> Message-ID: On 2/14/06, "Martin v. L?wis" wrote: > Adam Olsen wrote: > > What would that imply for repr()? To support eval(repr(x)) > > I don't think eval(repr(x)) needs to be supported for the bytes > type. However, if that is desirable, it should return something > like > > bytes([1,2,3]) I'm starting to wonder, do we really need anything fancy? Wouldn't it be sufficient to have a way to compactly store 8-bit integers? 
In 2.x we could convert unicode like this: bytes(ord(c) for c in u"It's...".encode('utf-8')) u"It's...".byteencode('utf-8') # Shortcut for above In 3.0 it changes to: "It's...".encode('utf-8') u"It's...".byteencode('utf-8') # Same as above, kept for compatibility Passing a str or unicode directly to bytes() would be an error. repr(bytes(...)) would produce bytes([1,2,3]). Probably need a __bytes__() method that print can call, or even better a __print__(file) method[0]. The write() methods would of course have to support bytes objects. I realize it would be odd for the interactive interpret to print them as a list of ints by default: >>> u"It's...".byteencode('utf-8') [73, 116, 39, 115, 46, 46, 46] But maybe it's time we stopped hiding the real nature of bytes from users? [0] By this I mean calling objects recursively and telling them what file to print to, rather than getting a temporary string from them and printing that. I always wondered why you could do that from C extensions but not from Python code. 
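(For comparison, an editorial note on what Python 3 eventually did: the bytes type keeps the integer view for indexing and iteration, while repr() uses the literal form rather than a list of ints:)

```python
b = "It's...".encode("utf-8")
assert list(b) == [73, 116, 39, 115, 46, 46, 46]  # the integer view
assert b[0] == 73                                  # indexing gives an int
assert repr(b) == 'b"It\'s..."'                    # repr uses the literal form
```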
-- Adam Olsen, aka Rhamphoryncus From Jack.Jansen at cwi.nl Tue Feb 14 13:59:31 2006 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Tue, 14 Feb 2006 13:59:31 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43F105AF.3000905@egenix.com> References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> <1f7befae0602101027s64e292a4p9e42dd77eea3b00d@mail.gmail.com> <43F105AF.3000905@egenix.com> Message-ID: <957600FC-22C3-4E61-A67A-4910A01ADA5D@cwi.nl> Thanks to all for a rather insightful discussion, it's always fun to learn that after 28 years of C programming the language still has little corners that I know absolutely nothing about:-) Practically speaking, though, I've adopted MAL's solution for the time being: > /* The keyword array changed to const char* in Python 2.5 */ > #if PY_VERSION_HEX >= 0x02050000 > # define Py_KEYWORDS_STRING_TYPE const char > #else > # define Py_KEYWORDS_STRING_TYPE char > #endif > ... > static Py_KEYWORDS_STRING_TYPE *kwslist[] = {"yada", NULL}; > ... > if (!PyArg_ParseTupleAndKeywords(args,kws,format,kwslist,&a1)) > goto onError; At least this appears to work... -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From jeremy at alum.mit.edu Tue Feb 14 14:01:10 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 14 Feb 2006 08:01:10 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43F19054.9080708@egenix.com> References: <1f7befae0602100954n4a746dffm209e64a5367ed38@mail.gmail.com> <1f7befae0602101027s64e292a4p9e42dd77eea3b00d@mail.gmail.com> <43F105AF.3000905@egenix.com> <43F18112.5050608@v.loewis.de> <43F19054.9080708@egenix.com> Message-ID: On 2/14/06, M.-A. Lemburg wrote: > Martin v. L?wis wrote: > > M.-A. 
Lemburg wrote: > >>> It's the consequences: nobody complains about tacking "const" on to a > >>> former honest-to-God "char *" argument that was in fact not modified, > >>> because that's not only helpful for C++ programmers, it's _harmless_ > >>> for all programmers. For example, nobody could sanely object (and > >>> nobody did :-)) to adding const to the attribute-name argument in > >>> PyObject_SetAttrString(). Sticking to that creates no new problems > >>> for anyone, so that's as far as I ever went. > >> > >> Well, it broke my C extensions... I now have this in my code: > >> > >> /* The keyword array changed to const char* in Python 2.5 */ > >> #if PY_VERSION_HEX >= 0x02050000 > >> # define Py_KEYWORDS_STRING_TYPE const char > >> #else > >> # define Py_KEYWORDS_STRING_TYPE char > >> #endif > >> ... > >> static Py_KEYWORDS_STRING_TYPE *kwslist[] = {"yada", NULL}; > >> ... > > > > You did not read Tim's message carefully enough. He wasn't talking > > about PyArg_ParseTupleAndKeywords *at all*. He only talked about > > changing char* arguments to const char*, e.g. in > > PyObject_SetAttrString. Did that break your C extensions also? > > I did read Tim's post: sorry for phrasing the reply the way I did. > > I was referring to his statement "nobody complains about tacking "const" > on to a former honest-to-God "char *" argument that was in fact not > modified". > > Also: it's not me complaining, it's the compilers ! Tim was talking about adding const to a char* not adding const to a char** (note the two stars). The subsequent discussion has been about the different way those are handled in C and C++ and a general agreement that the "const char**" has been a bother for people. Jeremy From mwh at python.net Tue Feb 14 14:03:39 2006 From: mwh at python.net (Michael Hudson) Date: Tue, 14 Feb 2006 13:03:39 +0000 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <43F1BE0F.8090406@canterbury.ac.nz> (Greg Ewing's message of "Wed, 15 Feb 2006 00:25:03 +1300") References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <43F1BE0F.8090406@canterbury.ac.nz> Message-ID: <2moe1aay8k.fsf@starship.python.net> Greg Ewing writes: > Guido van Rossum wrote: > >> There's also the consideration for APIs that, informally, accept >> either a string or a sequence of objects. > > My preference these days is not to design APIs that > way. It's never necessary and it avoids a lot of > problems. Oh yes. Cheers, mwh -- ZAPHOD: Listen three eyes, don't try to outweird me, I get stranger things than you free with my breakfast cereal. -- The Hitch-Hikers Guide to the Galaxy, Episode 7 From jeremy at alum.mit.edu Tue Feb 14 14:05:32 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 14 Feb 2006 08:05:32 -0500 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: <43F178F8.60506@v.loewis.de> References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> <43F178F8.60506@v.loewis.de> Message-ID: On 2/14/06, "Martin v. L?wis" wrote: > Jeremy Hylton wrote: > > The compiler in question is gcc and the warning can be turned off with > > -Wno-write-strings. I think we'd be better off leaving that option > > on, though. This warning will help me find places where I'm passing a > > string literal to a function that does not take a const char*. That's > > valuable, not insensate. > > Hmm. I'd say this depends on what your reaction to the warning is. > If you sprinkle const_casts in the code, nothing is gained. Except for the Python APIs, we would declare the function as taking a const char* if took a const char*. 
If the function legitimately takes a char*, then you have to change
the code to avoid a segfault.

> Perhaps there is some value in finding functions which ought to expect
> const char*. For that, occasional checks should be sufficient; I cannot
> see a point in having code permanently pass with that option. In
> particular not if you are interfacing with C libraries.

I don't understand what you mean by "occasional checks" or
"permanently pass". The compiler flags are always the same.

Jeremy

From barry at python.org Tue Feb 14 14:32:42 2006
From: barry at python.org (Barry Warsaw)
Date: Tue, 14 Feb 2006 08:32:42 -0500
Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: <43F1C075.3060207@canterbury.ac.nz>
References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> <43F1C075.3060207@canterbury.ac.nz>
Message-ID: <20E03E3D-CDDC-49E7-BF14-A6070FFC09C1@python.org>

On Feb 14, 2006, at 6:35 AM, Greg Ewing wrote:
> Barry Warsaw wrote:
>
>> This makes me think I want an unsigned byte type, which b[0] would
>> return.
>
> Come to think of it, this is something I don't
> remember seeing discussed. I've been thinking
> that bytes[i] would return an integer, but is
> the intention that it would return another bytes
> object?

A related question: what would bytes([104, 101, 108, 108, 111, 8004])
return? An exception hopefully. I also think you'd want
bytes([x for x in some_bytes_object]) to return an object equal to
the original.
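A concrete sketch of the semantics Barry is asking about. For illustration only, this is written against present-day Python 3, where a built-in bytes type answering exactly these questions now exists; none of it existed when this thread was written:

```python
# Illustrative only: present-day Python 3 semantics for the questions above.

b = bytes([104, 101, 108, 108, 111])  # a sequence of small ints is accepted
assert b == b"hello"

assert b[0] == 104                    # indexing yields an integer,
assert isinstance(b[0], int)          # not another bytes object

try:
    bytes([104, 101, 108, 108, 111, 8004])  # 8004 does not fit in a byte
except ValueError:
    pass                              # out-of-range values raise, as hoped
else:
    raise AssertionError("expected ValueError")

# Round-tripping through a sequence of ints reproduces an equal object.
assert bytes(x for x in b) == b
```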
-Barry

From foom at fuhm.net Tue Feb 14 17:08:30 2006
From: foom at fuhm.net (James Y Knight)
Date: Tue, 14 Feb 2006 11:08:30 -0500
Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: <43F17E1D.8030905@v.loewis.de>
References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F17E1D.8030905@v.loewis.de>
Message-ID: <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net>

On Feb 14, 2006, at 1:52 AM, Martin v. Löwis wrote:
> Phillip J. Eby wrote:
>> I was just pointing out that since byte strings are bytes by
>> definition, then simply putting those bytes in a bytes() object
>> doesn't alter the existing encoding. So, using latin-1 when
>> converting a string to bytes actually seems like the One Obvious
>> Way to do it.
>
> This is a misconception. In Python 2.x, the type str already *is* a
> bytes type. So if S is an instance of 2.x str, bytes(S) does not need
> to do any conversion. You don't need to assume it is latin-1: it's
> already bytes.
>
>> In fact, the 'encoding' argument seems useless in the case of str
>> objects, and it seems it should default to latin-1 for unicode
>> objects.
>
> I agree with the former, but not with the latter. There shouldn't be a
> conversion of Unicode objects to bytes at all. If you want bytes from
> a Unicode string U, write
>
> bytes(U.encode(encoding))

I like it, it makes sense. Unicode strings are simply not allowed as
arguments to the byte constructor. Thinking about it, why would it be
otherwise? And if you're mixing str-strings and unicode-strings, that
means the str-strings you're sometimes giving are actually not byte
strings, but character strings anyhow, so you should be encoding
those too.
bytes(s_or_U.encode('utf-8')) is a perfectly good spelling.

Kill the encoding argument, and you're left with:

Python2.X:
- bytes(bytes_object) -> copy constructor
- bytes(str_object) -> copy the bytes from the str to the bytes object
- bytes(sequence_of_ints) -> make bytes with the values of the ints,
  error on overflow

Python3.X removes str, and most APIs that did return str return bytes
instead. Now all you have is:
- bytes(bytes_object) -> copy constructor
- bytes(sequence_of_ints) -> make bytes with the values of the ints,
  error on overflow

Nice and simple.

James

From pje at telecommunity.com Tue Feb 14 17:25:01 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 14 Feb 2006 11:25:01 -0500
Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net>
References: <43F17E1D.8030905@v.loewis.de> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F17E1D.8030905@v.loewis.de>
Message-ID: <5.1.1.6.0.20060214111701.02121758@mail.telecommunity.com>

At 11:08 AM 2/14/2006 -0500, James Y Knight wrote:
>On Feb 14, 2006, at 1:52 AM, Martin v. Löwis wrote:
>>Phillip J. Eby wrote:
>>>I was just pointing out that since byte strings are bytes by
>>>definition, then simply putting those bytes in a bytes() object
>>>doesn't alter the existing encoding. So, using latin-1 when
>>>converting a string to bytes actually seems like the One Obvious
>>>Way to do it.
>>
>>This is a misconception. In Python 2.x, the type str already *is* a
>>bytes type. So if S is an instance of 2.x str, bytes(S) does not need
>>to do any conversion. You don't need to assume it is latin-1: it's
>>already bytes.
>>
>>>In fact, the 'encoding' argument seems useless in the case of str
>>>objects, and it seems it should default to latin-1 for unicode
>>>objects.
>>
>>I agree with the former, but not with the latter. There shouldn't be a
>>conversion of Unicode objects to bytes at all. If you want bytes from
>>a Unicode string U, write
>>
>> bytes(U.encode(encoding))
>
>I like it, it makes sense. Unicode strings are simply not allowed as
>arguments to the byte constructor. Thinking about it, why would it be
>otherwise? And if you're mixing str-strings and unicode-strings, that
>means the str-strings you're sometimes giving are actually not byte
>strings, but character strings anyhow, so you should be encoding
>those too. bytes(s_or_U.encode('utf-8')) is a perfectly good spelling.

Actually, I think you mean:

    if isinstance(s_or_U, str):
        s_or_U = s_or_U.decode('utf-8')

    b = bytes(s_or_U.encode('utf-8'))

Or maybe:

    if isinstance(s_or_U, unicode):
        s_or_U = s_or_U.encode('utf-8')

    b = bytes(s_or_U)

Which is why I proposed that the boilerplate logic get moved *into*
the bytes constructor. I think this use case is going to be common
in today's Python, but in truth I'm not as sure what bytes() will
get used *for* in today's Python. I'm probably overprojecting
based on the need to use str objects now, but bytes aren't going to
be a replacement for str for a good while anyway.

>Kill the encoding argument, and you're left with:
>
>Python2.X:
>- bytes(bytes_object) -> copy constructor
>- bytes(str_object) -> copy the bytes from the str to the bytes object
>- bytes(sequence_of_ints) -> make bytes with the values of the ints,
>error on overflow
>
>Python3.X removes str, and most APIs that did return str return bytes
>instead. Now all you have is:
>- bytes(bytes_object) -> copy constructor
>- bytes(sequence_of_ints) -> make bytes with the values of the ints,
>error on overflow
>
>Nice and simple.
I could certainly live with that approach, and it certainly rules out all the "when does the encoding argument apply and when should it be an error to pass it" questions. :) From mal at egenix.com Tue Feb 14 17:47:39 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 14 Feb 2006 17:47:39 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F17E1D.8030905@v.loewis.de> <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net> Message-ID: <43F209AB.5020706@egenix.com> James Y Knight wrote: > Kill the encoding argument, and you're left with: > > Python2.X: > - bytes(bytes_object) -> copy constructor > - bytes(str_object) -> copy the bytes from the str to the bytes object > - bytes(sequence_of_ints) -> make bytes with the values of the ints, > error on overflow > > Python3.X removes str, and most APIs that did return str return bytes > instead. Now all you have is: > - bytes(bytes_object) -> copy constructor > - bytes(sequence_of_ints) -> make bytes with the values of the ints, > error on overflow > > Nice and simple. Albeit, too simple. The above approach would basically remove the possibility to easily create bytes() from literals in Py3k, since literals in Py3k create Unicode objects, e.g. bytes("123") would not work in Py3k. It's hard to imagine how you'd provide a decent upgrade path for bytes() if you introduce the above semantics in Py2.x. People would start writing bytes("123") in Py2.x and expect it to also work in Py3k, which it wouldn't. To prevent this, you'd have to outrule bytes() construction from strings altogether, which doesn't look like a viable option either. 
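As it turned out, the tension Lemburg describes was resolved by doing both of the things under discussion: a b"..." literal was added (in Python 2.6, ahead of 3.0), and constructing bytes from a character string without an encoding was made an error, while an explicit encoding argument was kept rather than killed. A sketch, using present-day Python 3 purely for illustration:

```python
# Present-day Python 3, shown only to illustrate how the upgrade-path
# problem discussed above was eventually resolved.

assert b"123" == bytes([0x31, 0x32, 0x33])  # the literal form

try:
    bytes("123")          # a character string with no encoding is an error...
except TypeError:
    pass
else:
    raise AssertionError("expected TypeError")

# ...but, contrary to the "kill the encoding argument" proposal, an
# explicit encoding argument was kept:
assert bytes("123", "ascii") == b"123"
assert bytes("123".encode("ascii")) == b"123"  # the encode() spelling also works
```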
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 14 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From alan.gauld at freenet.co.uk Mon Feb 13 01:35:16 2006 From: alan.gauld at freenet.co.uk (Alan Gauld) Date: Mon, 13 Feb 2006 00:35:16 -0000 Subject: [Python-Dev] [Tutor] nice() References: <038701c63004$733603c0$132c4fca@csmith> <00a001c6302b$82d51f10$0b01a8c0@xp> <20060212161410.5F02.JCARLSON@uci.edu> Message-ID: <00a401c63035$5cad50a0$0b01a8c0@xp> >> However I do dislike the name nice() - there is already a nice() in the >> os module with a fairly well understood function. But I'm sure some > Presumably it would be located somewhere like the math module. For sure, but let's avoid as many name clashes as we can. Python is very good at managing namespaces but there are still a lot of folks who favour the from x import * mode of working. Alan G. From jcarlson at uci.edu Tue Feb 14 18:28:54 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 14 Feb 2006 09:28:54 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net> References: <43F17E1D.8030905@v.loewis.de> <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net> Message-ID: <20060214092222.5F36.JCARLSON@uci.edu> James Y Knight wrote: > I like it, it makes sense. Unicode strings are simply not allowed as > arguments to the byte constructor. Thinking about it, why would it be > otherwise? 
And if you're mixing str-strings and unicode-strings, that
> means the str-strings you're sometimes giving are actually not byte
> strings, but character strings anyhow, so you should be encoding
> those too. bytes(s_or_U.encode('utf-8')) is a perfectly good spelling.

I also like the removal of the encoding...

> Kill the encoding argument, and you're left with:
>
> Python2.X:
> - bytes(bytes_object) -> copy constructor
> - bytes(str_object) -> copy the bytes from the str to the bytes object
> - bytes(sequence_of_ints) -> make bytes with the values of the ints,
> error on overflow
>
> Python3.X removes str, and most APIs that did return str return bytes
> instead. Now all you have is:
> - bytes(bytes_object) -> copy constructor
> - bytes(sequence_of_ints) -> make bytes with the values of the ints,
> error on overflow

What's great is that this already works:

>>> import array
>>> array.array('b', [1,2,3])
array('b', [1, 2, 3])
>>> array.array('b', "hello")
array('b', [104, 101, 108, 108, 111])
>>> array.array('b', u"hello")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: array initializer must be list or string
>>> array.array('b', [150])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: signed char is greater than maximum
>>> array.array('B', [150])
array('B', [150])
>>> array.array('B', [350])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: unsigned byte integer is greater than maximum

And out of the deal we can get both signed and unsigned ints.

Re: Adam Olsen
> I'm starting to wonder, do we really need anything fancy? Wouldn't it
> be sufficient to have a way to compactly store 8-bit integers?

It already exists. It could just use another interface. The buffer
interface offers any array the ability to return strings. That may
have to change to return bytes objects in Py3k.
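Josiah's closing point, that the buffer interface lets such arrays hand their contents to another type, can be sketched as well. This uses present-day Python, where array objects interoperate with bytes via the buffer protocol and the 2.x tostring() method is spelled tobytes():

```python
import array

a = array.array('B', [104, 101, 108, 108, 111])

# The buffer protocol lets bytes() consume the array's raw storage
# directly -- the "another interface" role described above.
assert bytes(a) == b"hello"
assert a.tobytes() == b"hello"   # tostring() in 2.x; renamed tobytes()

# And the signed/unsigned distinction mentioned above:
try:
    array.array('B', [350])      # unsigned byte: 0..255 only
except OverflowError:
    pass
else:
    raise AssertionError("expected OverflowError")
```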
- Josiah From crutcher at gmail.com Tue Feb 14 18:48:59 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Tue, 14 Feb 2006 09:48:59 -0800 Subject: [Python-Dev] [Tutor] nice() In-Reply-To: <00a401c63035$5cad50a0$0b01a8c0@xp> References: <038701c63004$733603c0$132c4fca@csmith> <00a001c6302b$82d51f10$0b01a8c0@xp> <20060212161410.5F02.JCARLSON@uci.edu> <00a401c63035$5cad50a0$0b01a8c0@xp> Message-ID: On 2/12/06, Alan Gauld wrote: > >> However I do dislike the name nice() - there is already a nice() in the > >> os module with a fairly well understood function. But I'm sure some > > > Presumably it would be located somewhere like the math module. > > For sure, but let's avoid as many name clashes as we can. > Python is very good at managing namespaces but there are still a > lot of folks who favour the > > from x import * > > mode of working. Yes, and there are people who insist on drinking and driving, that doesn't mean cars should be designed with that as a motivating assumption. There are just too many places where you are going to get name clashes, where something which is _obvious_ in one context will have a different ( and _obvious_ ) meaning in another. Lets just keep the namespaces clean, and not worry about inter-module conflicts. -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From mal at egenix.com Tue Feb 14 18:58:11 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 14 Feb 2006 18:58:11 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> Message-ID: <43F21A33.2070504@egenix.com> Guido van Rossum wrote: > On 2/13/06, M.-A. 
Lemburg wrote: >> Guido van Rossum wrote: >>> It'd be cruel and unusual punishment though to have to write >>> >>> bytes("abc", "Latin-1") >>> >>> I propose that the default encoding (for basestring instances) ought >>> to be "ascii" just like everywhere else. (Meaning, it should really be >>> the system default encoding, which defaults to "ascii" and is >>> intentionally hard to change.) >> We're talking about Py3k here: "abc" will be a Unicode string, >> so why restrict the conversion to 7 bits when you can have 8 bits >> without any conversion problems ? > > As Phillip guessed, I was indeed thinking about introducing bytes() > sooner than that, perhaps even in 2.5 (though I don't want anything > rushed). Hmm, that is probably going to be too early. As the thread shows there are lots of things to take into account, esp. since if you plan to introduce byte() in 2.x, the upgrade path to 3.x would have to be carefully planned. Otherwise, we end up introducing a feature which is meant to prepare for 3.x and then we end up causing breakage when the move is finally implemented. > Even in Py3k though, the encoding issue stands -- what if the file > encoding is Unicode? Then using Latin-1 to encode bytes by default > might not by what the user expected. Or what if the file encoding is > something totally different? (Cyrillic, Greek, Japanese, Klingon.) > Anything default but ASCII isn't going to work as expected. ASCII > isn't going to work as expected either, but it will complain loudly > (by throwing a UnicodeError) whenever you try it, rather than causing > subtle bugs later. I think there's a misunderstanding here: in Py3k, all "string" literals will be converted from the source code encoding to Unicode. There are no ambiguities - a Klingon character will still map to the same ordinal used to create the byte content regardless of whether the source file is encoded in UTF-8, UTF-16 or some Klingon charset (are there any ?). 
Furthermore, by restricting to ASCII you'd also outrule hex escapes which seem to be the natural choice for presenting binary data in literals - the Unicode representation would then only be an implementation detail of the way Python treats "string" literals and a user would certainly expect to find e.g. \x88 in the bytes object if she writes bytes('\x88'). But maybe you have something different in mind... I'm talking about ways to create bytes() in Py3k using "string" literals. >> While we're at it: I'd suggest that we remove the auto-conversion >> from bytes to Unicode in Py3k and the default encoding along with >> it. > > I'm not sure which auto-conversion you're talking about, since there > is no bytes type yet. If you're talking about the auto-conversion from > str to unicode: the bytes type should not be assumed to have *any* > properties that the current str type has, and that includes > auto-conversion. I was talking about the automatic conversion of 8-bit strings to Unicode - which was a key feature to make the introduction of Unicode less painful, but will no longer be necessary in Py3k. >> In Py3k the standard lib will have to be Unicode compatible >> anyway and string parser markers like "s#" will have to go away >> as well, so there's not much need for this anymore. >> >> (Maybe a bit radical, but I guess that's what Py3k is meant for.) > > Right. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 14 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From michael.walter at gmail.com Tue Feb 14 19:15:37 2006 From: michael.walter at gmail.com (Michael Walter) Date: Tue, 14 Feb 2006 19:15:37 +0100 Subject: [Python-Dev] [Tutor] nice() In-Reply-To: References: <038701c63004$733603c0$132c4fca@csmith> <00a001c6302b$82d51f10$0b01a8c0@xp> <20060212161410.5F02.JCARLSON@uci.edu> <00a401c63035$5cad50a0$0b01a8c0@xp> Message-ID: <877e9a170602141015w22880f38n64fcde7a019f9a63@mail.gmail.com> It doesn't seem to me that math.nice has an obvious meaning. Regards, Michael On 2/14/06, Crutcher Dunnavant wrote: > On 2/12/06, Alan Gauld wrote: > > >> However I do dislike the name nice() - there is already a nice() in the > > >> os module with a fairly well understood function. But I'm sure some > > > > > Presumably it would be located somewhere like the math module. > > > > For sure, but let's avoid as many name clashes as we can. > > Python is very good at managing namespaces but there are still a > > lot of folks who favour the > > > > from x import * > > > > mode of working. > > Yes, and there are people who insist on drinking and driving, that > doesn't mean cars should be designed with that as a motivating > assumption. There are just too many places where you are going to get > name clashes, where something which is _obvious_ in one context will > have a different ( and _obvious_ ) meaning in another. Lets just keep > the namespaces clean, and not worry about inter-module conflicts. > > -- > Crutcher Dunnavant > littlelanguages.com > monket.samedi-studios.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com > From foom at fuhm.net Tue Feb 14 19:35:44 2006 From: foom at fuhm.net (James Y Knight) Date: Tue, 14 Feb 2006 13:35:44 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? 
[ Was:Re: release plan for 2.5 ?]
In-Reply-To: <43F209AB.5020706@egenix.com>
References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F17E1D.8030905@v.loewis.de> <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net> <43F209AB.5020706@egenix.com>
Message-ID: <8AD0979A-C095-45A2-B7BA-466C617E28D0@fuhm.net>

On Feb 14, 2006, at 11:47 AM, M.-A. Lemburg wrote:
> The above approach would basically remove the possibility to easily
> create bytes() from literals in Py3k, since literals in Py3k create
> Unicode objects, e.g. bytes("123") would not work in Py3k.

That is true. And I think that is correct. There should be b"string"
syntax.

> It's hard to imagine how you'd provide a decent upgrade path
> for bytes() if you introduce the above semantics in Py2.x.
>
> People would start writing bytes("123") in Py2.x and expect
> it to also work in Py3k, which it wouldn't.

Agreed, it won't work.

> To prevent this, you'd have to outrule bytes() construction
> from strings altogether, which doesn't look like a viable
> option either.

I don't think you have to do that, you just have to provide b"string".

I'd like to point out that the previous proposal had the same issue:

On Feb 13, 2006, at 8:11 PM, Guido van Rossum wrote:
> On 2/13/06, James Y Knight wrote:
>> In py3k, when the str object is eliminated, then what do you have?
>> Perhaps
>> - bytes("\x80"), you get an error, encoding is required. There is no
>> such thing as "default encoding" anymore, as there's no str object.
>> - bytes("\x80", encoding="latin-1"), you get a bytestring with a
>> single byte of value 0x80.
>>
>
> Yes to both again.
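Both spellings contrasted here survived into today's Python 3, so the point can be checked directly. A sketch, for illustration only; it also touches Lemburg's earlier concern about hex escapes, since an escape in a character string maps to the same ordinal under latin-1:

```python
# Present-day Python 3, for illustration: both forms under discussion exist.

assert b"\x80" == bytes([0x80])  # b"..." literal: hex escapes denote raw bytes

# The explicit-encoding constructor form also works:
assert bytes("\x80", encoding="latin-1") == b"\x80"

# A hex escape in a character string maps to the same ordinal, so latin-1
# round-trips it byte-for-byte:
assert "\x88".encode("latin-1") == bytes([0x88])
```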
James

From foom at fuhm.net Tue Feb 14 19:36:26 2006
From: foom at fuhm.net (James Y Knight)
Date: Tue, 14 Feb 2006 13:36:26 -0500
Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: <5.1.1.6.0.20060214111701.02121758@mail.telecommunity.com>
References: <43F17E1D.8030905@v.loewis.de> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F17E1D.8030905@v.loewis.de> <5.1.1.6.0.20060214111701.02121758@mail.telecommunity.com>
Message-ID:

On Feb 14, 2006, at 11:25 AM, Phillip J. Eby wrote:
> At 11:08 AM 2/14/2006 -0500, James Y Knight wrote:
>> I like it, it makes sense. Unicode strings are simply not allowed as
>> arguments to the byte constructor. Thinking about it, why would it be
>> otherwise? And if you're mixing str-strings and unicode-strings, that
>> means the str-strings you're sometimes giving are actually not byte
>> strings, but character strings anyhow, so you should be encoding
>> those too. bytes(s_or_U.encode('utf-8')) is a perfectly good
>> spelling.
>
> Actually, I think you mean:
>
>     if isinstance(s_or_U, str):
>         s_or_U = s_or_U.decode('utf-8')
>
>     b = bytes(s_or_U.encode('utf-8'))
>
> Or maybe:
>
>     if isinstance(s_or_U, unicode):
>         s_or_U = s_or_U.encode('utf-8')
>
>     b = bytes(s_or_U)
>
> Which is why I proposed that the boilerplate logic get moved *into*
> the bytes constructor. I think this use case is going to be common
> in today's Python, but in truth I'm not as sure what bytes() will
> get used *for* in today's Python. I'm probably overprojecting
> based on the need to use str objects now, but bytes aren't going to
> be a replacement for str for a good while anyway.

I most certainly *did not* mean that.
If you are mixing together str and unicode instances, the str instances _must be_ in the default encoding (ascii). Otherwise, you are bound for failure anyhow, e.g. ''.join(['\x95', u'1']). Str is used for two things right now: 1) a byte string. 2) a unicode string restricted to 7bit ASCII. These two uses are separate and you cannot mix them without causing disaster. You've created an interface which can take either a utf8 byte-string, or unicode character string. But that's wrong and can only cause problems. It should take either an encoded bytestring, or a unicode character string. Not both. If it takes a unicode character string, there are two ways of spelling that in current python: a "str" object with only ASCII in it, or a "unicode" object with arbitrary characters in it. bytes(s_or_U.encode('utf-8')) works correctly with both. James From guido at python.org Tue Feb 14 20:07:09 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 11:07:09 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F1A301.2060609@voidspace.org.uk> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> <43F1A301.2060609@voidspace.org.uk> Message-ID: On 2/14/06, Fuzzyman wrote: > In Python 3K, when the string data-type has gone, Technically it won't be gone; str will mean what it already means in Jython and IronPython (for which CPython uses unicode in 2.x). > what will > ``open(filename).read()`` return ? Since you didn't specify an open mode, it'll open it as a text file using some default encoding (or perhaps it can guess the encoding from file metadata -- this is all OS specific). So it'll return a string. If you open the file in binary mode, however, read() will return a bytes object. 
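The behaviour Guido describes here is, as it happens, how things eventually settled: a single open() whose mode string selects the result type, with bytes growing a decode() method. A sketch in present-day Python 3, for illustration (the temporary file path is incidental):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()  # a throwaway file for the demonstration
os.close(fd)
try:
    with open(path, "w", encoding="utf-8") as f:   # text mode
        f.write("hello")

    with open(path, "r", encoding="utf-8") as f:   # text mode: read() -> str
        assert isinstance(f.read(), str)

    with open(path, "rb") as f:                    # binary mode: read() -> bytes
        data = f.read()
    assert isinstance(data, bytes)
    assert data.decode("utf-8") == "hello"         # bytes has a decode() method
finally:
    os.unlink(path)
```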
I'm currently considering whether we should have a single open() function which returns different types of objects depending on a string parameter's value, or whether it makes more sense to have different functions, e.g. open() for text files and openbinary() for binary files. I believe Fredrik Lundh wants open() to use binary mode and opentext() for text files, but that seems backwards -- surely text files are more commonly used, and surely the most common operation should have the shorter name -- call it the Huffman Principle. > Will the object returned have a > ``decode`` method, to coerce to a unicode string ? No, the object returned will *be* a (unicode) string. But a bytes object (returned by a binary open operation) will have a decode() method. > Also, what datatype will ``u'some string'.encode('ascii')`` return ? It will be a syntax error (u"..." will be illegal). The str.encode() method will return a bytes object (if the design goes as planned -- none of this is set in stone yet). > I assume that when the ``bytes`` datatype is implemented, we will be > able to do ``open(filename, 'wb').write(bytes(somedata))`` ? Hmmm... I > probably ought to read the bytes PEP and the Py3k one... Sort of (except perhaps we'd be using openbinary(filename, 'w")). Perhaps write(somedata) should automatically coerce the data to bytes? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 14 20:16:32 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 11:16:32 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <43F182B6.8020409@v.loewis.de> References: <43F182B6.8020409@v.loewis.de> Message-ID: On 2/13/06, "Martin v. L?wis" wrote: > I'm actually opposed to bdist_egg, from a conceptual point of view. > I think it is wrong if Python creates its own packaging format > (just as it was wrong that Java created jar files - but they are > without deployment procedures even today). 
I think Jars are a lower-level thing than what we're talking about here; they're no different than shared libraries, and for an architecture that has its own bytecode and toolchain it only makes sense to invent its own cross-platform shared library format (especially given the "deploy anywhere" slogan). > The burden should be > on developer's side, for creating packages for the various systems, > not on the users side, when each software comes with its own > deployment infrastructure. Well, just like Java, if you have pure Python code, why should a developer have to duplicate the busy-work of creating distributions for different platforms? (Especially since there are so many different target platforms -- RPM, .deb, Windows, MSI, Mac, fink, and what have you -- I'm no expert but ISTM there are too many!) > OTOH, users are fond of eggs, for reasons that I haven't yet > understood. I'm neutral on them; to be honest I don't even understand the difference between eggs and setuptools yet. :-) I imagine that users don't particularly care about eggs, but do care about the ease of use of the tools around them, i.e. ez_setup. > From a release management point of view, I would still like to > make another bdist_msi release before contributing it to Python. Please go ahead. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nas at arctrix.com Tue Feb 14 20:31:07 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 14 Feb 2006 12:31:07 -0700 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> Message-ID: <20060214193107.GA25293@mems-exchange.org> On Mon, Feb 13, 2006 at 08:07:49PM -0800, Guido van Rossum wrote: > On 2/13/06, Neil Schemenauer wrote: > > "\x80".encode('latin-1') > > But in 2.5 we can't change that to return a bytes object without > creating HUGE incompatibilities. People could spell it bytes(s.encode('latin-1')) in order to make it work in 2.X. That spelling would provide a way of ensuring the type of the return value. > You missed the part where I said that introducing the bytes type > *without* a literal seems to be a good first step. A new type, even > built-in, is much less drastic than a new literal (which requires > lexer and parser support in addition to everything else). Are you concerned about the implementation effort? If so, I don't think that's justified since adding a new string prefix should be pretty straightforward (relative to rest of the effort involved). Are you comfortable with the proposed syntax? Neil From pje at telecommunity.com Tue Feb 14 20:53:37 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 14 Feb 2006 14:53:37 -0500 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: <43F182B6.8020409@v.loewis.de> <43F182B6.8020409@v.loewis.de> Message-ID: <5.1.1.6.0.20060214142842.020e99b0@mail.telecommunity.com> (Disclaimer: I'm not currently promoting the addition of bdist_egg or any egg-specific features for the 2.5 timeframe, but neither am I opposed. This message is just to clarify a few points and questions under discussion, not to advocate a particular outcome. If you read this and think you see arguments for *doing* anything, you're projecting your own conclusions where there is only analysis.) At 11:16 AM 2/14/2006 -0800, Guido van Rossum wrote: >On 2/13/06, "Martin v. 
Löwis" wrote: > > I'm actually opposed to bdist_egg, from a conceptual point of view. > > I think it is wrong if Python creates its own packaging format > > (just as it was wrong that Java created jar files - but they are > > without deployment procedures even today). > >I think Jars are a lower-level thing than what we're talking about >here; they're no different than shared libraries, and for an >architecture that has its own bytecode and toolchain it only makes >sense to invent its own cross-platform shared library format >(especially given the "deploy anywhere" slogan). Java, however, layers many things atop jars, including resources (files, images, messages, etc.) and metadata (manifests, deployment descriptors, etc.). Eggs are the same. To think that jars or eggs are a "packaging format" is a conceptual error if by "packaging format" you're equating them with .rpm, .deb, .msi, etc. It is merely a convenient side benefit that .jar files and .egg files are convenient transport mechanisms for what's inside them - the jar or egg. Jars and eggs are conceptual entities independent of the distribution format, and in the case of eggs there are two other formats (.egg directory and .egg-info tags) that can be used to express the conceptual entity. > > The burden should be > > on developer's side, for creating packages for the various systems, > > not on the users side, when each software comes with its own > > deployment infrastructure. > >Well, just like Java, if you have pure Python code, why should a >developer have to duplicate the busy-work of creating distributions >for different platforms? (Especially since there are so many different >target platforms -- RPM, .deb, Windows, MSI, Mac, fink, and what have >you -- I'm no expert but ISTM there are too many!) Indeed. Placing the burden on the developer's side simply means that it doesn't happen until volunteers pick it up, which happens slowly and only for "popular enough" packages.
Which means that as a practical matter, developers cannot release packages that depend on other packages without committing to some small set of target platforms and packaging systems -- the situation that setuptools was created to help change. > > OTOH, users are fond of eggs, for reasons that I haven't yet > > understood. > >I'm neutral on them; to be honest I don't even understand the >difference between eggs and setuptools yet. :-) Eggs are a way of associating metadata and resources with installed Python packages. ".egg" is a zip or directory file layout that is one implementation of this concept. Setuptools is a set of distutils enhancements that make it easier to build, test, distribute and deploy eggs, including the pkg_resources module (egg runtime support) and the easy_install package manager. > I imagine that users >don't particularly care about eggs, but do care about the ease of use >of the tools around them, i.e. ez_setup. And developers of course also care about not having to create those myriad installation formats, for platforms they may not even have. :) They also care about being able to specify dependencies reliably, which rules out entire classes of support issues and debugging. It actually makes reuse of Python packages practical *without* unnecessarily tying the result to just one of the myriad platforms that Python runs on. Some developers also like the plugin features, the ability to easily get data from their package directories, etc. (Setuptools also offers a lot of creature comforts that the distutils doesn't, and some of those conveniences depend on eggs, but others do not.) From just at letterror.com Tue Feb 14 21:35:50 2006 From: just at letterror.com (Just van Rossum) Date: Tue, 14 Feb 2006 21:35:50 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: Message-ID: Guido van Rossum wrote: > > what will > > ``open(filename).read()`` return ? 
> > Since you didn't specify an open mode, it'll open it as a text file > using some default encoding (or perhaps it can guess the encoding from > file metadata -- this is all OS specific). So it'll return a string. > > If you open the file in binary mode, however, read() will return a > bytes object. I'm currently considering whether we should have a > single open() function which returns different types of objects > depending on a string parameter's value, or whether it makes more > sense to have different functions, e.g. open() for text files and > openbinary() for binary files. I believe Fredrik Lundh wants open() to > use binary mode and opentext() for text files, but that seems > backwards -- surely text files are more commonly used, and surely the > most common operation should have the shorter name -- call it the > Huffman Principle. +1 for two functions. My choice would be open() for binary and opentext() for text. I don't find that backwards at all: the text function is going to be more different from the current open() function than the binary function would be since in many ways the str type is closer to bytes than to unicode. Maybe it's even better to use opentext() AND openbinary(), and deprecate plain open(). We could even introduce them at the same time as bytes() (and leave the open() deprecation for 3.0). Just From crutcher at gmail.com Tue Feb 14 22:41:21 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Tue, 14 Feb 2006 13:41:21 -0800 Subject: [Python-Dev] [Tutor] nice() In-Reply-To: <877e9a170602141015w22880f38n64fcde7a019f9a63@mail.gmail.com> References: <038701c63004$733603c0$132c4fca@csmith> <00a001c6302b$82d51f10$0b01a8c0@xp> <20060212161410.5F02.JCARLSON@uci.edu> <00a401c63035$5cad50a0$0b01a8c0@xp> <877e9a170602141015w22880f38n64fcde7a019f9a63@mail.gmail.com> Message-ID: On 2/14/06, Michael Walter wrote: > It doesn't seem to me that math.nice has an obvious meaning. I don't disagree, I think math.nice is a terrible name.
I was objecting to the desire to try to come up with interesting, different names in every module namespace. > Regards, > Michael > > On 2/14/06, Crutcher Dunnavant wrote: > > On 2/12/06, Alan Gauld wrote: > > > >> However I do dislike the name nice() - there is already a nice() in the > > > >> os module with a fairly well understood function. But I'm sure some > > > > > > > Presumably it would be located somewhere like the math module. > > > > > > For sure, but let's avoid as many name clashes as we can. > > > Python is very good at managing namespaces but there are still a > > > lot of folks who favour the > > > > > > from x import * > > > > > > mode of working. > > > > Yes, and there are people who insist on drinking and driving, that > > doesn't mean cars should be designed with that as a motivating > > assumption. There are just too many places where you are going to get > > name clashes, where something which is _obvious_ in one context will > > have a different ( and _obvious_ ) meaning in another. Lets just keep > > the namespaces clean, and not worry about inter-module conflicts. > > > > -- > > Crutcher Dunnavant > > littlelanguages.com > > monket.samedi-studios.com > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com > > > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From thomas at xs4all.net Tue Feb 14 22:46:08 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Tue, 14 Feb 2006 22:46:08 +0100 Subject: [Python-Dev] bdist_* to stdlib? 
In-Reply-To: References: <43F182B6.8020409@v.loewis.de> Message-ID: <20060214214608.GA6027@xs4all.nl> On Tue, Feb 14, 2006 at 11:16:32AM -0800, Guido van Rossum wrote: > Well, just like Java, if you have pure Python code, why should a > developer have to duplicate the busy-work of creating distributions > for different platforms? (Especially since there are so many different > target platforms -- RPM, .deb, Windows, MSI, Mac, fink, and what have > you -- I'm no expert but ISTM there are too many!) Actually, that's where distutils and bdist_* comes in. Mr. Random Developer writes a regular distutils setup.py, and I can install the latest, not-quite-in-apt version by doing 'setup.py bdist_deb' and installing the resulting .deb. Very convenient for both parties ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From unknown_kev_cat at hotmail.com Tue Feb 14 23:05:08 2006 From: unknown_kev_cat at hotmail.com (Joe Smith) Date: Tue, 14 Feb 2006 17:05:08 -0500 Subject: [Python-Dev] bdist_* to stdlib? References: Message-ID: "Guido van Rossum" wrote in message news:ca471dc20602131604v12a4d70eq9d41b5ce543f3264 at mail.gmail.com... > In private email, Phillip Eby suggested to add these things to the > 2.5. standard library: > > bdist_deb, bdist_msi, and friends > > He explained them as follows: > > """ > bdist_deb makes .deb files (packages for Debian-based Linux distros, like > Ubuntu). bdist_msi makes .msi installers for Windows (it's by Martin v. > Loewis). Marc Lemburg proposed on the distutils-sig that these and > various > other implemented bdist_* formats (other than bdist_egg) be included in > the > next Python release, and there was no opposition there that I recall. > """ > I don't like the idea of bdist_deb very much. The idea behind the debian packaging system is that unlike with RPM and Windows, package management should be clean. 
Windows and RPM are known for major dependency problems, letting packages damage each other, having packages that do not uninstall cleanly (i.e. packages that leave junk all over the place) and generally messing the system up quite badly over time, so that the OS is usually removed and re-installed periodically. The Debian-style system attempts to overcome these deficiencies, and generally does a decent job with it. The problem is that this can really only work if packages are well maintained, and adhere to a set of policies that help to further mitigate these problems. Even with all of that, packages from one Debian-based distribution may well cause problems with a different one. For that reason it is quite rare to see .debs distributed by parties other than those directly involved with a Debian-based distribution, and even then they are normally targeted specifically at one distribution. Making it easy to generate .debs of python modules will likely result in a noticeable increase in the number of .debs that do not target a specific distribution and/or do not follow the policies of that distribution. So basically what I am saying is that such a system has a pretty good chance of resulting in debs that mess up users' systems, and that is not good. I'm not saying don't do it, but if it would be included in the standard library, proceed with caution! From aleaxit at gmail.com Tue Feb 14 23:37:59 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 14 Feb 2006 14:37:59 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: Message-ID: On 2/14/06, Just van Rossum wrote: ... > Maybe it's even better to use opentext() AND openbinary(), and deprecate > plain open(). We could even introduce them at the same time as bytes() > (and leave the open() deprecation for 3.0). What about shorter names, such as 'text' instead of 'opentext' and 'data' instead of 'openbinary'?
By eschewing the 'open' prefix we might make it easy to eventually migrate off it. Maybe text and data could be two subclasses of file, with file remaining initially as it is (and perhaps becoming an abstract-only baseclass at the time 'open' is deprecated). In real life, people do all the time use 'open' inappropriately (on non-text files on Windows): one of the most frequent tasks on python-help has to do with diagnosing that this is what happened and suggest the addition of an explicit 'rb' or 'wb' argument. This unending chore, in particular, makes me very wary of forever keeping open to mean "open this _text_ file". Alex From barry at python.org Tue Feb 14 23:48:57 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 14 Feb 2006 17:48:57 -0500 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: Message-ID: <1139957337.13758.13.camel@geddy.wooz.org> On Tue, 2006-02-14 at 14:37 -0800, Alex Martelli wrote: > What about shorter names, such as 'text' instead of 'opentext' and > 'data' instead of 'openbinary'? By eschewing the 'open' prefix we > might make it easy to eventually migrate off it. Maybe text and data > could be two subclasses of file, with file remaining initially as it > is (and perhaps becoming an abstract-only baseclass at the time 'open' > is deprecated). I was actually thinking about static methods file.text() and file.data() which seem nicely self descriptive, if a little bit longer. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060214/0bf23082/attachment-0001.pgp From guido at python.org Tue Feb 14 23:51:20 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 14:51:20 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: Message-ID: On 2/14/06, Just van Rossum wrote: > Guido van Rossum wrote: > > [...] surely text files are more commonly used, and surely the > > most common operation should have the shorter name -- call it the > > Huffman Principle. > > +1 for two functions. > > My choice would be open() for binary and opentext() for text. I don't > find that backwards at all: the text function is going to be more > different from the current open() function then the binary function > would be since in many ways the str type is closer to bytes than to > unicode. It's still backwards because the current open function defaults to text on Windows (the only platform where it matters any more). > Maybe it's even better to use opentext() AND openbinary(), and deprecate > plain open(). We could even introduce them at the same time as bytes() > (and leave the open() deprecation for 3.0). And then, on 2/14/06, Alex Martelli wrote: > What about shorter names, such as 'text' instead of 'opentext' and > 'data' instead of 'openbinary'? By eschewing the 'open' prefix we > might make it easy to eventually migrate off it. Maybe text and data > could be two subclasses of file, with file remaining initially as it > is (and perhaps becoming an abstract-only baseclass at the time 'open' > is deprecated). Plain 'text' and 'data' don't convey the fact that we're talking about opening I/O objects here. If you want, we could say textfile() and datafile(). (I'm fine with data instead of binary.) But somehow I still like the 'open' verb. It has a long and rich tradition. 
And it also nicely conveys that it is a factory function which may return objects of different types (though similar in API) based upon either additional arguments (e.g. buffering) or the environment (e.g. encodings) or even inspection of the file being opened. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:25 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:25 -0800 Subject: [Python-Dev] bytes type discussion Message-ID: I'm about to send 6 or 8 replies to various salient messages in the PEP 332 revival thread. That's probably a sign that there's still a lot to be sorted out. In the mean time, to save you reading through all those responses, here's a summary of where I believe I stand. Let's continue the discussion in this new thread unless there are specific hairs to be split in the other thread that aren't addressed below or by later posts.

Non-controversial (or almost):
- we need a new PEP; PEP 332 won't cut it
- no b"..." literal
- bytes objects are mutable
- bytes objects are composed of ints in range(256)
- you can pass any iterable of ints to the bytes constructor, as long as they are in range(256)
- longs or anything with an __index__ method should do, too
- when you index a bytes object, you get a plain int
- repr(bytes([10, 20, 30])) == 'bytes([10, 20, 30])'

Somewhat controversial:
- it's probably too big to attempt to rush this into 2.5
- bytes("abc") == bytes(map(ord, "abc"))
- bytes("\x80\xff") == bytes(map(ord, "\x80\xff")) == bytes([128, 255])

Very controversial:
- bytes("abc", "encoding") == bytes("abc") # ignores the "encoding" argument
- bytes(u"abc") == bytes("abc") # for ASCII at least
- bytes(u"\x80\xff") raises UnicodeError
- bytes(u"\x80\xff", "latin-1") == bytes("\x80\xff")

Martin von Loewis's alternative for the "very controversial" set is to disallow an encoding argument and (I believe) also to disallow Unicode arguments.
In 3.0 this would leave us with s.encode() as the only way to convert a string (which is always unicode) to bytes. The problem with this is that there's no code that works in both 2.x and 3.0. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:29 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:29 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F17F79.9090407@v.loewis.de> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> <43F17F79.9090407@v.loewis.de> Message-ID: On 2/13/06, "Martin v. Löwis" wrote: > Guido van Rossum wrote: > >>In py3k, when the str object is eliminated, then what do you have? > >>Perhaps > >>- bytes("\x80"), you get an error, encoding is required. There is no > >>such thing as "default encoding" anymore, as there's no str object. > >>- bytes("\x80", encoding="latin-1"), you get a bytestring with a > >>single byte of value 0x80. > > > > Yes to both again. > > Please reconsider, and don't give bytes() an encoding= argument. > It doesn't need one. In Python 3, people should write > > "\x80".encode("latin-1") > > if they absolutely want to, although they better write > > bytes([0x80]) > > Now, the first form isn't valid in 2.5, but > > bytes(u"\x80".encode("latin-1")) > > could work in all versions. In 3.0, I agree that .encode() should return a bytes object. I'd almost be convinced that in 2.x bytes() doesn't need an encoding argument, except it will require excessive copying. bytes(u.encode("utf8")) will certainly use 2*len(u) bytes space (plus a constant); bytes(u, "utf8") only needs len(u) bytes. In 3.0, bytes(s.encode(xxx)) would also create an extra copy, since the bytes type is mutable (we all agree on that, don't we?).
I think that's a good enough argument for 2.x. We could keep the extended API as an alternative form in 3.x, or automatically translate calls to bytes(x, y) into x.encode(y). BTW I think we'll need a new PEP instead of PEP 332. The latter has almost no details relevant to this discussion, and it seems to treat bytes as a near-synonym for str in 2.x. That's not the way this discussion is going, it seems. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:33 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:33 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <20060214080921.GW10226@xs4all.nl> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <20060214080921.GW10226@xs4all.nl> Message-ID: On 2/14/06, Thomas Wouters wrote: > On Mon, Feb 13, 2006 at 03:44:27PM -0800, Guido van Rossum wrote: > > > But adding an encoding doesn't help. The str.encode() method always > > assumes that the string itself is ASCII-encoded, and that's not good > > enough: > > > >>> "abc".encode("latin-1") > > 'abc' > > >>> "abc".decode("latin-1") > > u'abc' > > >>> "abc\xf0".decode("latin-1") > > u'abc\xf0' > > >>> "abc\xf0".encode("latin-1") > > Traceback (most recent call last): > > File "", line 1, in ? > > UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position > > 3: ordinal not in range(128) (Note that I've since been convinced that bytes(s) where type(s) == str should just return a bytes object containing the same bytes as s, regardless of encoding. So basically you're preaching to the choir now. The only remaining question is what if anything to do with an encoding argument when the first argument is of type str...)
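The 2.x trap shown in the traceback above came from str.encode() implicitly ASCII-decoding its receiver first. Python 3, as eventually shipped, removed that trap by making encode() purely text-to-bytes and decode() purely bytes-to-text; a quick sketch of the shipped behavior (an editorial illustration, not part of the original thread):

```python
# Python 3 semantics as eventually shipped: encode() maps text (str) to
# bytes, decode() maps bytes to text, with no implicit ASCII step between.
s = "abc\xf0"                        # text containing U+00F0

encoded = s.encode("latin-1")        # explicit text -> bytes
assert encoded == b"abc\xf0"
assert isinstance(encoded, bytes)

decoded = encoded.decode("latin-1")  # explicit bytes -> text
assert decoded == s

# bytes objects simply have no encode() method, so the 2.x failure mode
# ("abc\xf0".encode("latin-1") raising a *decode* error) cannot occur.
assert not hasattr(b"abc", "encode")
```

Note the asymmetry this buys: each direction of conversion has exactly one spelling, which is the resolution the thread is groping toward.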
> These comments disturb me. I never really understood why (byte) strings grew > the 'encode' method, since 8-bit strings *are already encoded*, by their > very nature. I mean, I understand it's useful because Python does > non-unicode encodings like 'hex', but I don't really understand *why*. The > benefits don't seem to outweigh the cost (but that's hindsight.) It may also have something to do with Jython compatibility (which has str and unicode being the same thing) or 3.0 future-proofing. > Directly encoding a (byte) string into a unicode encoding is mostly useless, > as you've shown. The only use-case I can think of is translating ASCII into, > for instance, EBCDIC. Encoding anything into an ASCII superset is a no-op, > unless the system encoding isn't 'ascii' (and that's pretty rare, and not > something a Python programmer should depend on.) On the other hand, the fact > that (byte) strings have an 'encode' method creates a lot of confusion in > unicode-newbies, and causes programs to break only when input is non-ASCII. > And non-ASCII input just happens too often and too unpredictably in > 'real-world' code, and not enough in European programmers' tests ;P Oh, there are lots of ways that non-ASCII input can break code, you don't have to invoke encode() on str objects to get that effect. :/ > Unicode objects and strings are not the same thing. We shouldn't treat them > as the same thing. Well in 3.0 they *will* be the same thing, and in Jython they already are. > They share an interface (like lists and tuples do), and > if you only use that interface, treating them as the same kind of object is > mostly ok. They actually share *less* of an interface than lists and tuples, > though, as comparing strings to unicode objects can raise an exception, > whereas comparing lists to tuples is not expected to. No, it causes silent surprises since [1,2,3] != (1,2,3).
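Guido's "silent surprises" point is easy to check, and it is also exactly how the eventual str/bytes split behaves in Python 3: cross-type equality quietly answers False rather than raising (an editorial sketch, not part of the thread):

```python
# Same elements, different container types: the comparison doesn't raise,
# it just silently answers "not equal".
assert [1, 2, 3] != (1, 2, 3)

# Python 3's str/bytes split behaves the same way: equality across the
# text/bytes divide is silently False (running with the -b flag turns
# this into a BytesWarning instead).
assert b"abc" != "abc"
assert not (b"abc" == "abc")
```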
> For anything less > trivial than indexing, slicing and most of the string methods, and anything > what so ever involving non-ASCII (or, rather, non-system-encoding), unicode > objects and strings *must* be treated separately. For instance, there is no > correct way to do: > > s.split("\x80") > > unless you know the type of 's'. If it's unicode, you want u"\x80" instead > of "\x80". If it's not unicode, splitting "\x80" may not even be sensible, > but you wouldn't know from looking at the code -- maybe it expects a > specific encoding (or encoding family), maybe not. As soon as you deal with > unicode, you need to really understand the concept, and too many programmers > don't. And it's very hard to tell from someone's comments whether they fail > to understand or just get some of the terminology wrong; that's why Guido's > comments about 'encoding a byte string' and 'what if the file encoding is > Unicode' scare me. The unicode/string mixup almost makes me wish Python > was statically typed. I'm mostly trying to reflect various broken mental models that users may have. Believe me, my own confusion is nothing compared to the confusion that occurs in less gifted users. :-) The only use case for mixing ASCII and Unicode that I *wanted* to work right was the mixing of pure ASCII strings (typically literals) with Unicode data. And that works. Where things unfortunately fall flat is when you start reading data from files or interactive input and it gives you some encoded str object instead of a Unicode object. Our mistake was that we didn't foresee this clearly enough. Perhaps open(filename).read(), where the file contains non-ASCII bytes, should have been changed to either return a Unicode string (if an encoding can somehow be guessed), or raise an exception, rather than returning an str object in some unknown (and usually unknowable) encoding. I hope to fix that in 3.0 too, BTW. 
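The fix Guido hopes for here is essentially what Python 3 shipped: open() decodes for you in text mode (with an explicit or platform-default encoding) and hands back raw bytes only when you ask for binary mode. A sketch of that shipped behavior, using a throwaway temp file (an editorial illustration, not part of the original message):

```python
import os
import tempfile

# Write non-ASCII text through the text layer with an explicit encoding.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\xe9")  # "café"

# Text mode decodes on read: you get str, never a mystery-encoded buffer.
with open(path, encoding="utf-8") as f:
    assert f.read() == "caf\xe9"

# Binary mode skips decoding entirely: you get the raw UTF-8 bytes.
with open(path, "rb") as f:
    assert f.read() == b"caf\xc3\xa9"
```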
> So please, please, please don't make the mistake of 'doing something' with > the 'encoding' argument to 'bytes(s, encoding)' when 's' is a (byte) string. > It wouldn't actually be usable except for the same things as 'str.encode': > to convert from ASCII to non-ASCII-supersets, or to convert to non-unicode > encodings (such as 'hex'.) You can achieve those two by doing, e.g., > 'bytes(s.encode('hex'))' if you really want to. Ignoring the encoding > (rather than raising an exception) would also allow code to be trivially > portable between Python 2.x and Py3K, when "" is actually a unicode object. > > Not that I'm happy with ignoring anything, but not ignoring would be bigger > crime here. I'm beginning to see that this is a pretty reasonable interpretation. > Oh, and while on the subject, I'm not convinced going all-unicode in Py3K is > a good idea either, but maybe I should save that discussion for PyCon. I'm > not thinking "why do we need unicode" anymore (which I did two years ago ;) > but I *am* thinking it'll be a big step for 90% of the programmers if they > have to grasp unicode and encodings to be able to even do 'raw_input()' > sensibly. I know I spend an inordinate amount of time trying to explain the > basics on #python on irc.freenode.net already. I'm actually hoping that by having all strings be Unicode we'd *reduce* the amount of confusion. The key (see above where I admitted this as our biggest Unicode mistake) is to make sure that the encoding/decoding is built into all I/O operations. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:37 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:37 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <20060214193107.GA25293@mems-exchange.org> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> <20060214193107.GA25293@mems-exchange.org> Message-ID: On 2/14/06, Neil Schemenauer wrote: > People could spell it bytes(s.encode('latin-1')) in order to make it > work in 2.X. That spelling would provide a way of ensuring the type > of the return value. At the cost of an extra copying step. [Guido] > > You missed the part where I said that introducing the bytes type > > *without* a literal seems to be a good first step. A new type, even > > built-in, is much less drastic than a new literal (which requires > > lexer and parser support in addition to everything else). > > Are you concerned about the implementation effort? If so, I don't > think that's justified since adding a new string prefix should be > pretty straightforward (relative to the rest of the effort involved). Not so much the implementation but also the documentation, updating 3rd party Python preprocessors, etc. > Are you comfortable with the proposed syntax? Not entirely, since I don't know what b"abc€def" would mean (where € is a Unicode Euro character typed in whatever source encoding was used). Instead of b"abc" (only ASCII) you could write bytes("abc"). Instead of b"\xf0\xff\xee" you could write bytes([0xf0, 0xff, 0xee]). The key disconnect for me is that if bytes are not characters, we shouldn't use a literal notation that resembles the literal notation for characters. And there's growing consensus that a bytes type should be considered as an array of (8-bit unsigned) ints. Also, bytes objects are (in my mind anyway) mutable. We have no other literal notation for mutable objects. What would the following code print?

for i in range(2):
    b = b"abc"
    print b
    b[0] = ord("A")

Would the second output line print abc or Abc?
I guess the only answer that makes sense is that it should print abc both times; but that means that b"abc" must be internally implemented by creating a new bytes object each time. Perhaps the implementation effort isn't so minimal after all... (PS why is there a reply-to in your email that excludes you from the list of recipients but includes me?) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:36 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:36 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F17CF1.1060902@v.loewis.de> <43F18371.9040306@v.loewis.de> Message-ID: On 2/14/06, Adam Olsen wrote: > I'm starting to wonder, do we really need anything fancy? Wouldn't it > be sufficient to have a way to compactly store 8-bit integers? > > In 2.x we could convert unicode like this: > bytes(ord(c) for c in u"It's...".encode('utf-8')) Yuck. > u"It's...".byteencode('utf-8') # Shortcut for above Yuck**2. I'd like to avoid adding new APIs to existing types to return bytes instead of str. (It's okay to change existing APIs to *accept* bytes as an alternative to str though.) > In 3.0 it changes to: > "It's...".encode('utf-8') > u"It's...".byteencode('utf-8') # Same as above, kept for compatibility No. 3.0 won't have "backward compatibility" features. That's the whole point of 3.0. > Passing a str or unicode directly to bytes() would be an error. > repr(bytes(...)) would produce bytes([1,2,3]). I'm fine with that. > Probably need a __bytes__() method that print can call, or even better > a __print__(file) method[0]. The write() methods would of course have > to support bytes objects. Right on the latter.
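For comparison with the repr() proposal above: the bytes type that finally shipped in Python 3 kept the "write() accepts bytes" part, but chose a b'...' literal repr rather than bytes([...]). A sketch of the shipped behavior (an editorial aside, not part of the thread):

```python
import io

# Construction from an iterable of ints survived into Python 3...
b = bytes([10, 20, 30])
assert list(b) == [10, 20, 30]

# ...but the repr became a b'...' literal, not bytes([10, 20, 30]).
assert repr(b) == r"b'\n\x14\x1e'"

# Binary file objects accept bytes directly, as proposed above.
buf = io.BytesIO()
assert buf.write(b) == 3
assert buf.getvalue() == b"\n\x14\x1e"
```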
> I realize it would be odd for the interactive interpreter to print them > as a list of ints by default: > >>> u"It's...".byteencode('utf-8') > [73, 116, 39, 115, 46, 46, 46] No. This prints the repr() which should include the type. bytes([73, 116, 39, 115, 46, 46, 46]) is the right thing to print here. > But maybe it's time we stopped hiding the real nature of bytes from users? That's the whole point. > [0] By this I mean calling objects recursively and telling them what > file to print to, rather than getting a temporary string from them and > printing that. I always wondered why you could do that from C > extensions but not from Python code. I want to keep the Python-level API small. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:41 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:41 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> References: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> Message-ID: On 2/13/06, Barry Warsaw wrote: > This makes me think I want an unsigned byte type, which b[0] would > return. In another thread I think someone mentioned something about > fixed width integral types, such that you could have an object that > was guaranteed to be 8-bits wide, 16-bits wide, etc. Maybe you also > want signed and unsigned versions of each.
This may seem like YAGNI > to many people, but as I've been working on a tightly embedded/ > extended application for the last few years, I've definitely had > occasions where I wish I could more closely and more directly model > my C values as Python objects (without using the standard workarounds > or writing my own C extension types). So I'm taking it that the specific properties you want to model are the overflow behavior, right? N-bit unsigned is defined as arithmetic mod 2**N; N-bit signed is a bit more tricky to define but similar. These never overflow but instead just throw away bits in an exactly specified manner (2's complement arithmetic). While I personally am comfortable with writing (x+y) & 0xFFFF (for 16-bit unsigned), I can see that someone who spends a lot of time doing arithmetic in this field might want specialized types. But I'm not sure that that's what the Numeric folks want -- I believe they're more interested in saving space, not in the mod 2**N properties. So (here I'm to some extent guessing) they have different array types whose elements are ints or floats of various widths; I'm guessing they also have scalars of those widths for consistency or to guide the creation of new arrays from scalars. I wouldn't be surprised if, rather than requiring N-bit 2's complement, they would prefer more flexible control over overflow -- e.g. ignore, warn, error, turn into NaN, etc. > But anyway, without hyper-generalizing, it's still worth asking > whether a bytes type is just a container of byte objects, where the > contained objects would be distinct, fixed 8-bit unsigned integral > types. There's certainly a point to treating bytes as ints; I don't know if it's more compelling than treating them as unit bytes. But if we decide that the bytes type contains ints, b[0] should return a plain int (whose value necessarily is in range(0, 256)), not some new unsigned-8-bit type.
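The (x+y) & 0xFFFF idiom mentioned above is easy to bottle up in helpers; a minimal sketch (hypothetical helper names) of 16-bit mod-2**N unsigned arithmetic and the trickier two's-complement signed reinterpretation:

```python
MASK16 = 0xFFFF

def add16(x, y):
    # 16-bit unsigned addition: never overflows, just drops the high bits
    return (x + y) & MASK16

def as_signed16(x):
    # Reinterpret a 16-bit pattern as a two's-complement signed value:
    # patterns with the sign bit set map to the negative range.
    x &= MASK16
    return x - 0x10000 if x & 0x8000 else x

print(add16(0xFFFF, 1))        # 0      (wraps around)
print(as_signed16(0xFFFF))     # -1
print(as_signed16(0x7FFF))     # 32767
```

This is the "throw away bits in an exactly specified manner" behavior described in the message; a specialized type would simply apply the mask inside its arithmetic methods.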
And creating a bytes object from a list of ints should accept any input values as long as their __index__ value is in that same range. I.e. bytes([1, 2L]) should be the same as bytes([1L, 2]); and bytes([-1]) should raise a ValueError. > > There's also the consideration for APIs that, informally, accept > > either a string or a sequence of objects. Many of these exist, and > > they are probably all being converted to support unicode as well as > > str (if it makes sense at all). Should a bytes object be considered as > > a sequence of things, or as a single thing, from the POV of these > > types of APIs? Should we try to standardize how code tests for the > > difference? (Currently all sorts of shortcuts are being taken, from > > isinstance(x, (list, tuple)) to isinstance(x, basestring).) > > I think bytes objects are very much like string objects today -- > they're the photons of Python since they can act like either > sequences or scalars, depending on the context. For example, we have > code that needs to deal with situations where an API can return > either a scalar or a sequence of those scalars. So we have a utility > function like this: > > def thingiter(obj): > try: > it = iter(obj) > except TypeError: > yield obj > else: > for item in it: > yield item > > Maybe there's a better way to do this, but the most obvious problem > is that (for our use cases), this fails for strings because in this > context we want strings to act like scalars. So we add a little test > just before the "try:" like "if isinstance(obj, basestring): yield > obj". But that's yucky. > > I don't know what the solution is -- if there /is/ a solution short > of special case tests like above, but I think the key observation is > that sometimes you want your string to act like a sequence and > sometimes you want it to act like a scalar. I suspect bytes objects > will be the same way. 
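Barry's thingiter, with the isinstance guard he describes folded in, can be written out in full; a sketch in modern Python, where str and bytes both get the scalar treatment (basestring no longer exists in 3.x):

```python
def thingiter(obj):
    # Strings (and bytes) are iterable, but in this context we want
    # them treated as scalars, not as sequences of characters.
    if isinstance(obj, (str, bytes)):
        yield obj
        return
    try:
        it = iter(obj)
    except TypeError:
        yield obj           # true scalar: not iterable at all
    else:
        for item in it:
            yield item      # genuine sequence: yield its items

print(list(thingiter("abc")))    # ['abc']  -- scalar, not 'a', 'b', 'c'
print(list(thingiter([1, 2])))   # [1, 2]
print(list(thingiter(42)))       # [42]
```

The special case is exactly the "yucky" part being discussed: the string types are the only ones that must be exempted from the duck-typed iteration test.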
I agree it's icky, and I'd rather not design APIs like that -- but I can't help it that others continue to want to use that idiom. I also agree that most likely we'll want to treat bytes the same as strings here. But no basestring (bytes are mutable and don't behave like sequences of characters). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:38 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:38 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F17CF1.1060902@v.loewis.de> Message-ID: On 2/13/06, Adam Olsen wrote: > What would that imply for repr()? To support eval(repr(x)) it would > have to produce whatever format the source code includes to begin > with. I'm not sure that's a requirement. (I do think that in 2.x, str(bytes(s)) == s should hold as long as type(s) == str.) > If I understand correctly there's three main candidates: > 1. Direct copying to str in 2.x, pretending it's latin-1 in unicode in 3.x I'm not sure what you mean, but I'm guessing you're thinking that the repr() of a bytes object created from bytes('abc\xf0') would be bytes('abc\xf0') under this rule. What's so bad about that? > 2. Direct copying to str/unicode if it's only ascii values, switching > to a list of hex literals if there's any non-ascii values That works for me too. But why hex literals? As MvL stated, a list of decimals would be just as useful. > 3. b"foo" literal with ascii for all ascii characters (other than \ > and "), \xFF for individual characters that aren't ascii > > Given the choice I prefer the third option, with the second option as > my runner up. The first option just screams "silent errors" to me. 
The 3rd is out of the running for many reasons. I'm not sure I understand your "silent errors" fear; can you elaborate? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:47 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:47 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> Message-ID: On 2/13/06, Phillip J. Eby wrote: > At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: > >On 2/13/06, Phillip J. Eby wrote: > > > I didn't mean that it was the only purpose. In Python 2.x, practical code > > > has to sometimes deal with "string-like" objects. That is, code that takes > > > either strings or unicode. If such code calls bytes(), it's going to want > > > to include an encoding so that unicode conversions won't fail. > > > >That sounds like a rather hypothetical example. Have you thought it > >through? Presumably code that accepts both str and unicode either > >doesn't care about encodings, but simply returns objects of the same > >type as the arguments -- and then it's unlikely to want to convert the > >arguments to bytes; or it *does* care about encodings, and then it > >probably already has to special-case str vs. unicode because it has to > >control how str objects are interpreted. > > Actually, it's the other way around. 
Code that wants to output > uninterpreted bytes right now and accepts either strings or Unicode has to > special-case *unicode* -- not str, because str is the only "bytes type" we > currently have. But this is assuming that the str input is indeed uninterpreted bytes. That may be a tacit assumption or agreement but it may be wrong. Also, there are many ways to interpret "uninterpreted bytes" -- is it an image, a sound file, or UTF-8 text? In 2 out of those 3, passing unicode is more likely a bug than anything else (except in Jython). > This creates an interesting issue in WSGI for Jython, which of course only > has one (unicode-based) string type now. Since there's no bytes type in > Python in general, the only solution we could come up with was to treat > such strings as latin-1: I believe that's the general convention in Jython, as it matches the default (albeit deprecated) conversion between bytes and characters in Java itself. > http://www.python.org/peps/pep-0333.html#unicode-issues > > This is why I'm biased towards latin-1 encoding of unicode to bytes; it's > "the same thing" as an uninterpreted string of bytes. But in CPython this is not how this is generally done. > I think the difference in our viewpoints is that you're still thinking > "string" thoughts, whereas I'm thinking "byte" thoughts. Bytes are just > bytes; they don't *have* an encoding. I think when one side of the equation is Unicode, in CPython, I can be forgiven for thinking string thoughts, since Unicode is never used to carry binary bytes in CPython. You may have to craft some kind of different rule for Jython; it doesn't have a default encoding used when str meets unicode. > So, if you think of "converting a string to bytes" as meaning "create an > array of numerals corresponding to the characters in the string", then this > leads to a uniform result whether the characters are in a str or a unicode > object. 
In other words, to me, bytes(str_or_unicode) should be treated as: > > bytes(map(ord, str_or_unicode)) > > In other words, without an encoding, bytes() should simply treat str and > unicode objects *as if they were a sequence of integers*, and produce an > error when an integer is out of range. This is a logical and consistent > interpretation in the absence of an encoding, because in that case you > don't care about the encoding - it's just raw data. I see your point (now that you mentioned Jython). But I still don't think that this is a good default for CPython. > If, however, you include an encoding, then you're stating that you want to > encode the *meaning* of the string, not merely its integer values. Note that in Python 3000 we won't be using str/unicode to carry integer values around, since we will have the bytes type. So there, it makes sense to think of the conversion to always involve an encoding, possibly a default one. (And I think the default might more usefully be UTF-8 then.) > >What would bytes("abc\xf0", "latin-1") *mean*? Take the string > >"abc\xf0", interpret it as being encoded in XXX, and then encode from > >XXX to Latin-1. But what's XXX? As I showed in a previous post, > >"abc\xf0".encode("latin-1") *fails* because the source for the > >encoding is assumed to be ASCII. > > I'm saying that XXX would be the same encoding as you specified. i.e., > including an encoding means you are encoding the *meaning* of the string. That would be the same as ignoring the encoding argument when the input is str in CPython 2.x, right? I believe we started out saying we didn't want to ignore the encoding. Perhaps we need to reconsider that, given the Jython requirement? 
Then code that converts str to bytes and needs to be portable between Jython and CPython could write b = bytes(s, "latin-1") > However, I believe I mainly proposed this as an alternative to having > bytes(str_or_unicode) work like bytes(map(ord,str_or_unicode)), which I > think is probably a saner default. Sorry, i still don't buy that. > >Your argument for symmetry would be a lot stronger if we used Latin-1 > >for the conversion between str and Unicode. But we don't. > > But that's because we're dealing with its meaning *as a string*, not merely > as ordinals in a sequence of bytes. Well, *sometimes* the user *meant* it as a string, and *sometimes* she *didn't*. But we can't tell. I think it's safer to force her to be explicit. > > I like the > >other interpretation (which I thought was yours too?) much better: str > ><--> bytes conversions don't use encodings by simply change the type > >without changing the bytes; > > I like it better too. The part you didn't like was where MAL and I believe > this should be extended to Unicode characters in the 0-255 range also. :) I still don't. > >There's one property that bytes, str and unicode all share: type(x[0]) > >== type(x), at least as long as len(x) >= 1. This is perhaps the > >ultimate test for string-ness. > > > >Or should b[0] be an int, if b is a bytes object? That would change > >things dramatically. > > +1 for it being an int. Heck, I'd want to at least consider the > possibility of introducing a character type (chr?) in Python 3.0, and > getting rid of the "iterating a string yields strings" > characteristic. I've found it to be a bit of a pain when dealing with > heterogeneous nested sequences that contain strings. Can you give an example of that pain? What would a chr type behave like? Would it be a tiny int or a tiny string or something else again? Would you write its literals as 'c'? 
This would be a huge change and it needs more thought going into it than "sometimes it can be a bit of a pain", since I'm sure that's also true with the char-as-tiny-int interpretation. > >There's also the consideration for APIs that, informally, accept > >either a string or a sequence of objects. Many of these exist, and > >they are probably all being converted to support unicode as well as > >str (if it makes sense at all). Should a bytes object be considered as > >a sequence of things, or as a single thing, from the POV of these > >types of APIs? Should we try to standardize how code tests for the > >difference? (Currently all sorts of shortcuts are being taken, from > >isinstance(x, (list, tuple)) to isinstance(x, basestring).) > > I'm inclined to think of certain features at least in terms of the buffer > interface, but that's not something that's really exposed at the Python level. (And where it is, it's wrong. :-) If bytes support the buffer interface, we get another interesting issue -- regular expressions over bytes. Brr. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:13:44 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:13:44 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <20E03E3D-CDDC-49E7-BF14-A6070FFC09C1@python.org> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> <43F1C075.3060207@canterbury.ac.nz> <20E03E3D-CDDC-49E7-BF14-A6070FFC09C1@python.org> Message-ID: On 2/14/06, Barry Warsaw wrote: > A related question: what would bytes([104, 101, 108, 108, 111, 8004]) > return? An exception hopefully. Absolutely. 
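The constructor behavior agreed on here -- every item's __index__ value must land in range(0, 256), anything else raises an exception -- is what Python 3's bytes type ended up doing; a quick check against the shipped type:

```python
# In-range ints construct normally.
assert bytes([104, 101, 108, 108, 111]) == b"hello"

# Any value outside range(0, 256) is rejected with ValueError.
for bad in ([104, 101, 108, 108, 111, 8004], [-1], [256]):
    try:
        bytes(bad)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for %r" % (bad,))

# Anything with __index__ works, e.g. bools (True.__index__() == 1).
assert bytes([True, False]) == b"\x01\x00"
```

(Python 3 dropped the separate long type, so the bytes([1, 2L]) == bytes([1L, 2]) equivalence discussed above became moot; the __index__ rule survived.)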
> I also think you'd want bytes([x > for x in some_bytes_object]) to return an object equal to the original. You mean if types(some_bytes_object) is bytes? Yes. But that doesn't constrain the API much. Anyway, I'm now convinced that bytes should act as an array of ints, where the ints are restricted to range(0, 256) but have type int. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 00:14:07 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 15:14:07 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F21A33.2070504@egenix.com> References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F21A33.2070504@egenix.com> Message-ID: On 2/14/06, M.-A. Lemburg wrote: > Guido van Rossum wrote: > > As Phillip guessed, I was indeed thinking about introducing bytes() > > sooner than that, perhaps even in 2.5 (though I don't want anything > > rushed). > > Hmm, that is probably going to be too early. As the thread shows > there are lots of things to take into account, esp. since if you > plan to introduce bytes() in 2.x, the upgrade path to 3.x would > have to be carefully planned. Otherwise, we end up introducing > a feature which is meant to prepare for 3.x and then we end up > causing breakage when the move is finally implemented. You make a good point. Someone probably needs to write up a new PEP summarizing this discussion (or rather, consolidating the agreement that is slowly emerging, where there is agreement, and summarizing the key open questions). > > Even in Py3k though, the encoding issue stands -- what if the file > > encoding is Unicode? Then using Latin-1 to encode bytes by default > > might not by what the user expected. Or what if the file encoding is > > something totally different? 
(Cyrillic, Greek, Japanese, Klingon.) > > Anything default but ASCII isn't going to work as expected. ASCII > > isn't going to work as expected either, but it will complain loudly > > (by throwing a UnicodeError) whenever you try it, rather than causing > > subtle bugs later. > > I think there's a misunderstanding here: in Py3k, all "string" > literals will be converted from the source code encoding to > Unicode. There are no ambiguities - a Klingon character will still > map to the same ordinal used to create the byte content regardless > of whether the source file is encoded in UTF-8, UTF-16 or > some Klingon charset (are there any ?). OK, so a string (literal or otherwise) containing a Klingon character won't be acceptable to the bytes() constructor in 3.0. It shouldn't be in 2.x either then. I still think that someone who types a file in Latin-1 and enters non-ASCII Latin-1 characters in a string literal and then passes it to the bytes() constructor might expect to get bytes encoded in Latin-1, and someone who types a file in UTF-8 and enters non-ASCII Unicode characters might expect to get UTF-8-encoded bytes. Since they can't both get what they want, we should disallow both, and only allow ASCII. > Furthermore, by restricting to ASCII you'd also outrule hex escapes > which seem to be the natural choice for presenting binary data in > literals - the Unicode representation would then only be an > implementation detail of the way Python treats "string" literals > and a user would certainly expect to find e.g. \x88 in the bytes object > if she writes bytes('\x88'). I guess we'll just have to disappoint her. Too bad for the person who wrote bytes("\x12\x34\x56\x78\x9a\xbc\xde\xf0") -- they'll have to write bytes([0x12,0x34,0x56,0x78,0x9a,0xbc,0xde,0xf0]). Not so bad IMO and certainly easier than a *mixture* of hex and ASCII like '\xabc\xdef'. > But maybe you have something different in mind...
I'm talking > about ways to create bytes() in Py3k using "string" literals. I'm not sure that's going to be common practice except for ASCII characters used in network protocols. > >> While we're at it: I'd suggest that we remove the auto-conversion > >> from bytes to Unicode in Py3k and the default encoding along with > >> it. > > > > I'm not sure which auto-conversion you're talking about, since there > > is no bytes type yet. If you're talking about the auto-conversion from > > str to unicode: the bytes type should not be assumed to have *any* > > properties that the current str type has, and that includes > > auto-conversion. > > I was talking about the automatic conversion of 8-bit strings to > Unicode - which was a key feature to make the introduction of > Unicode less painful, but will no longer be necessary in Py3k. OK. The bytes type certainly won't have this property. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Wed Feb 15 00:14:29 2006 From: bob at redivi.com (Bob Ippolito) Date: Tue, 14 Feb 2006 15:14:29 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: Message-ID: <571D2F15-F2D6-42DB-9AC2-575C45922243@redivi.com> On Feb 14, 2006, at 2:05 PM, Joe Smith wrote: > > "Guido van Rossum" wrote in message > news:ca471dc20602131604v12a4d70eq9d41b5ce543f3264 at mail.gmail.com... >> In private email, Phillip Eby suggested to add these things to the >> 2.5 standard library: >> >> bdist_deb, bdist_msi, and friends >> >> He explained them as follows: >> >> """ >> bdist_deb makes .deb files (packages for Debian-based Linux >> distros, like >> Ubuntu). bdist_msi makes .msi installers for Windows (it's by >> Martin v. >> Loewis). Marc Lemburg proposed on the distutils-sig that these and >> various >> other implemented bdist_* formats (other than bdist_egg) be >> included in >> the >> next Python release, and there was no opposition there that I recall.
>> """ >> > I don't like the idea of bdist_deb very much. > The idea behind the debian packaging system is that unlike with RPM > and > Windows, package management should be clean. > > Windows and RPM are known for major dependency problems, letting > packages > damage each other, having packages that do not uninstall cleanly (i.e. > packages that leave junk all over the place) and generally messing > the system > up quite badly over time, so that the OS is usually removed and > re-installed periodically. This is one problem that eggs go a LONG way towards solving, especially for platforms such as Windows and OS X that do not ship with an intelligent package management solution. The way that eggs are built more or less guarantees that they remain consistent, because it temporarily replaces file/open/etc and some other functions with sanity checks to make sure that the installation layout is self-contained** and thus compatible with eggs. It's not a real chroot, of course, but it's good enough for all practical purposes. The only things that easy_install overwrites** in the context of eggs are other eggs with an identical filename (version, platform, etc.), unless explicitly asked to do otherwise (e.g. remove some existing older version). Uninstallation is of course similarly clean, because it just nukes one directory or .egg file, and/or an associated .pth file. ** The exception is scripts. Scripts go wherever --install-scripts= point to, and AFAIK there is no means to ensure that the scripts from one egg do not interfere with the scripts for another egg or anything else on the PATH. I'm also not sure what the uninstallation story with scripts is.
-bob From thomas at xs4all.net Wed Feb 15 00:15:37 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 00:15:37 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <1139957337.13758.13.camel@geddy.wooz.org> References: <1139957337.13758.13.camel@geddy.wooz.org> Message-ID: <20060214231537.GB6027@xs4all.nl> On Tue, Feb 14, 2006 at 05:48:57PM -0500, Barry Warsaw wrote: > On Tue, 2006-02-14 at 14:37 -0800, Alex Martelli wrote: > > > What about shorter names, such as 'text' instead of 'opentext' and > > 'data' instead of 'openbinary'? By eschewing the 'open' prefix we > > might make it easy to eventually migrate off it. Maybe text and data > > could be two subclasses of file, with file remaining initially as it > > is (and perhaps becoming an abstract-only baseclass at the time 'open' > > is deprecated). > > I was actually thinking about static methods file.text() and file.data() > which seem nicely self descriptive, if a little bit longer. Make them classmethods though, like dict.fromkeys. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas at xs4all.net Wed Feb 15 00:25:58 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 00:25:58 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: Message-ID: <20060214232558.GC6027@xs4all.nl> On Tue, Feb 14, 2006 at 05:05:08PM -0500, Joe Smith wrote: > I don't like the idea of bdist_deb very much. > The idea behind the debian packaging system is that unlike with RPM and > Windows, package management should be clean. The idea behind RPM is also that package management should be clean. Debian packages, on average, do a better job, and 'dpkg' deals a bit more flexibly with overwritten files and such, but it's not that big a difference. > The Debian style system attempts to overcome these deficiencies, and > generally does a decent job with it. 
The problem is that this can really > only work if packages are well maintained, and adhere to a set of policies > that help to further mitigate these problems. Making it easy to generate > .debs of python modules will likely result in a noticable increase in the > number of .debs that do not target a specific distribution and/or do not > follow the policies of that distribution. That sounds like "oh no, what if the user presses the wrong button". Users can already mess up the system if they do the wrong thing. Distutils offers a simple, generic way of saying 'install this' while letting distutils figure out most of the details. bdist_deb can then put it all in debian-specific locations, in the debian-preferred way, while registering all the files so they get deleted properly on deinstall. Things get more complicated when you have pre-/post-install/remove scripts, but those are pretty rare for the average Python packages, and since they would (in the Python package) most likely run from setup.py, those would break at bdist-time, not deb-install-time. It's not easier for bdist-deb created .deb's to break things than it is for arbitrary developer-built .deb's to do so, and it's quite a bit easier for 'setup.py install' to break things. At least a .deb can be easily removed. And the alternative to bdist_deb is in many cases 'setup.py install'. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From bob at redivi.com Wed Feb 15 00:35:14 2006 From: bob at redivi.com (Bob Ippolito) Date: Tue, 14 Feb 2006 15:35:14 -0800 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: Message-ID: <792E2907-AC7F-4607-9E37-C06EAA599E6A@redivi.com> On Feb 14, 2006, at 3:13 PM, Guido van Rossum wrote: > I'm about to send 6 or 8 replies to various salient messages in the > PEP 332 revival thread. That's probably a sign that there's still a > lot to be sorted out. 
In the mean time, to save you reading through > all those responses, here's a summary of where I believe I stand. > Let's continue the discussion in this new thread unless there are > specific hairs to be split in the other thread that aren't addressed > below or by later posts. > > Non-controversial (or almost): > > - we need a new PEP; PEP 332 won't cut it > > - no b"..." literal > > - bytes objects are mutable > > - bytes objects are composed of ints in range(256) > > - you can pass any iterable of ints to the bytes constructor, as long > as they are in range(256) Sounds like array.array('B'). Will the bytes object support the buffer interface? Will it accept objects supporting the buffer interface in the constructor (or a class method)? If so, will it be a copy or a view? Current array.array behavior says copy. > - longs or anything with an __index__ method should do, too > > - when you index a bytes object, you get a plain int When slicing a bytes object, do you get another bytes object or a list? If it's a bytes object, is it a copy or a view? Current array.array behavior says copy. > - repr(bytes([10, 20, 30])) == 'bytes([10, 20, 30])' > > Somewhat controversial: > > - it's probably too big to attempt to rush this into 2.5 > > - bytes("abc") == bytes(map(ord, "abc")) > > - bytes("\x80\xff") == bytes(map(ord, "\x80\xff")) == bytes([128, > 256]) It would be VERY controversial if ord('\xff') == 256 ;) > Very controversial: > > - bytes("abc", "encoding") == bytes("abc") # ignores the "encoding" > argument > > - bytes(u"abc") == bytes("abc") # for ASCII at least > > - bytes(u"\x80\xff") raises UnicodeError > > - bytes(u"\x80\xff", "latin-1") == bytes("\x80\xff") > > Martin von Loewis's alternative for the "very controversial" set is to > disallow an encoding argument and (I believe) also to disallow Unicode > arguments. In 3.0 this would leave us with s.encode() as the > only way to convert a string (which is always unicode) to bytes.
The > problem with this is that there's no code that works in both 2.x and > 3.0. Given a base64 or hex string, how do you get a bytes object out of it? Currently str.decode('base64') and str.decode('hex') are good solutions to this... but you get a str object back. -bob From nas at arctrix.com Wed Feb 15 00:38:33 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 14 Feb 2006 16:38:33 -0700 Subject: [Python-Dev] byte literals unnecessary [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: References: <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> <20060214193107.GA25293@mems-exchange.org> Message-ID: <20060214233832.GA26510@mems-exchange.org> On Tue, Feb 14, 2006 at 03:13:37PM -0800, Guido van Rossum wrote: > Also, bytes objects are (in my mind anyway) mutable. We have no other > literal notation for mutable objects. What would the following code > print? > > for i in range(2): > b = b"abc" > print b > b[0] = ord("A") > > Would the second output line print abc or Abc? > > I guess the only answer that makes sense is that it should print abc > both times; but that means that b"abc" must be internally implemented > by creating a new bytes object each time. Perhaps the implementation > effort isn't so minimal after all... I agree. I was thinking that bytes() would be immutable and therefore very similar to the current str object. You've convinced me that a literal representation is not needed. Thanks for clarifying your position. > (PS why is there a reply-to in your email the excludes you from the > list of recipients but includes me?) Maybe you should ask your coworkers. :-) I think gmail is trying to do something intelligent with the Mail-Followup-To header. Neil From pje at telecommunity.com Wed Feb 15 00:44:13 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 14 Feb 2006 18:44:13 -0500 Subject: [Python-Dev] bdist_* to stdlib? 
In-Reply-To: <571D2F15-F2D6-42DB-9AC2-575C45922243@redivi.com> References: Message-ID: <5.1.1.6.0.20060214184119.0366cbd0@mail.telecommunity.com> At 03:14 PM 2/14/2006 -0800, Bob Ippolito wrote: >I'm also not sure what the uninstallation story >with scripts is. The scripts have enough breadcrumbs in them that you can figure out what egg they go with. More precisely, an egg contains enough information for you to search PATH for its scripts and verify that they still refer to the egg before removing them. This is of course fragile if you put the scripts in some random location not on your PATH. Anyway, actual *implementation* of uninstallation features isn't going to be until the 0.7 development cycle. From barry at python.org Wed Feb 15 00:51:48 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 14 Feb 2006 18:51:48 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> Message-ID: <1139961108.13757.32.camel@geddy.wooz.org> On Tue, 2006-02-14 at 15:13 -0800, Guido van Rossum wrote: > So I'm taking that the specific properties you want to model are the > overflow behavior, right? N-bit unsigned is defined as arithmethic mod > 2**N; N-bit signed is a bit more tricky to define but similar. These > never overflow but instead just throw away bits in an exactly > specified manner (2's complement arithmetic). That would be my use case, yep. > While I personally am comfortable with writing (x+y) & 0xFFFF (for > 16-bit unsigned), I can see that someone who spends a lot of time > doing arithmetic in this field might want specialized types. 
I'd put it in the "annoying, although there exists a workaround that might confound newbies" category. Which means it's definitely not urgent enough to address for 2.5 -- if ever -- especially given your current stance on bytes(bunch_of_ints)[0]. The two are of course separate issues, but thinking about one led to the other. > But I'm not sure that that's what the Numeric folks want -- I believe > they're more interested in saving space, not in the mod 2**N > properties. Could be. I don't care about space savings. And I definitely have no clue what the Numeric folks want. ;) > There's certainly a point to treating bytes as ints; I don't know if > it's more compelling than to treating them as unit bytes. But if we > decide that the bytes type contains ints, b[0] should return a plain > int (whose value necessarily is in range(0, 256)), not some new > unsigned-8-bit type. And creating a bytes object from a list of ints > should accept any input values as long as their __index__ value is in > that same range. > > I.e. bytes([1, 2L]) should be the same as bytes([1L, 2]); and > bytes([-1]) should raise a ValueError. That seems fine to me. > I agree it's icky, and I'd rather not design APIs like that -- but I > can't help it that others continue to want to use that idiom. I also > agree that most likely we'll want to treat bytes the same as strings > here. But no basestring (bytes are mutable and don't behave like > sequences of characters). That's interesting. So bytes really behave a lot more like some weird string/list hybrid then? It makes some sense. You read 801 bytes from a binary file, twiddle bytes 223 and 741 and then write those bytes back out to a different binary file. If we don't inherit from basestring, what I'm worried about is that for those who do continue to use the idiom described previously, we'll have to extend our isinstance() to include both basestring and bytes. Which definitely gets ickier.
But if bytes are mutable, as makes sense, then it also makes sense that they don't inherit from basestring. BTW, using that idiom is a bit of a hedge against such an API (which you may not control). It allows us to say "okay, at /this/ point I don't know whether I have a scalar or a sequence, but from this point forward, I know I have something I can safely iterate over." I wonder if it makes sense to add a more fundamental abstract base class that can be used as a marker for "photonic behavior". I don't know what that class would be called, but you'd then have a hierarchy like this:

photonic
    basestring
        str
        unicode
    bytes

OTOH, it seems like a lot to add for a specialized (and some would say dubious) use case. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060214/9acabe4a/attachment.pgp From guido at python.org Wed Feb 15 01:13:07 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 16:13:07 -0800 Subject: [Python-Dev] byte literals unnecessary [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <20060214233832.GA26510@mems-exchange.org> References: <43F118E0.6090704@voidspace.org.uk> <43F11BEC.8050902@voidspace.org.uk> <20060214193107.GA25293@mems-exchange.org> <20060214233832.GA26510@mems-exchange.org> Message-ID: On 2/14/06, Neil Schemenauer wrote: > Maybe you should ask your coworkers. :-) I think gmail is trying to > do something intelligent with the Mail-Followup-To header. But you're the only person for whom it does that. Do you have a funny gmail setting?
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 01:17:11 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 16:17:11 -0800 Subject: [Python-Dev] bytes type discussion In-Reply-To: <792E2907-AC7F-4607-9E37-C06EAA599E6A@redivi.com> References: <792E2907-AC7F-4607-9E37-C06EAA599E6A@redivi.com> Message-ID: On 2/14/06, Bob Ippolito wrote: > On Feb 14, 2006, at 3:13 PM, Guido van Rossum wrote: > > - we need a new PEP; PEP 332 won't cut it > > > > - no b"..." literal > > > > - bytes objects are mutable > > > > - bytes objects are composed of ints in range(256) > > > > - you can pass any iterable of ints to the bytes constructor, as long > > as they are in range(256) > > Sounds like array.array('B'). Sure. > Will the bytes object support the buffer interface? Do you want them to? I suppose they should *not* support the *text* part of that API. > Will it accept > objects supporting the buffer interface in the constructor (or a > class method)? If so, will it be a copy or a view? Current > array.array behavior says copy. bytes() should always copy -- thanks for asking. > > - longs or anything with an __index__ method should do, too > > > > - when you index a bytes object, you get a plain int > > When slicing a bytes object, do you get another bytes object or a > list? If it's a bytes object, is it a copy or a view? Current > array.array behavior says copy. Another bytes object which is a copy. (Why would you even think about views here? They are evil.) > > - repr(bytes([10, 20, 30])) == 'bytes([10, 20, 30])' > > > > Somewhat controversial: > > > > - it's probably too big to attempt to rush this into 2.5 > > > > - bytes("abc") == bytes(map(ord, "abc")) > > > > - bytes("\x80\xff") == bytes(map(ord, "\x80\xff")) == bytes([128, > > 256]) > > It would be VERY controversial if ord('\xff') == 256 ;) Oops.
:-) > > Very controversial: > > > > - bytes("abc", "encoding") == bytes("abc") # ignores the "encoding" > > argument > > > > - bytes(u"abc") == bytes("abc") # for ASCII at least > > > > - bytes(u"\x80\xff") raises UnicodeError > > > > - bytes(u"\x80\xff", "latin-1") == bytes("\x80\xff") > > > > Martin von Loewis's alternative for the "very controversial" set is to > > disallow an encoding argument and (I believe) also to disallow Unicode > > arguments. In 3.0 this would leave us with s.encode() as the > > only way to convert a string (which is always unicode) to bytes. The > > problem with this is that there's no code that works in both 2.x and > > 3.0. > > Given a base64 or hex string, how do you get a bytes object out of > it? Currently str.decode('base64') and str.decode('hex') are good > solutions to this... but you get a str object back. I don't know -- you can propose an API you like here. base64 is as likely to encode text as binary data, so I don't think it's wrong for those things to return strings. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at xs4all.net Wed Feb 15 01:24:46 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 01:24:46 +0100 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: Message-ID: <20060215002446.GD6027@xs4all.nl> On Tue, Feb 14, 2006 at 03:13:25PM -0800, Guido van Rossum wrote: > Martin von Loewis's alternative for the "very controversial" set is to > disallow an encoding argument and (I believe) also to disallow Unicode > arguments. In 3.0 this would leave us with s.encode() as the > only way to convert a string (which is always unicode) to bytes. The > problem with this is that there's no code that works in both 2.x and > 3.0. Unless you only ever create (byte)strings by doing s.encode(), and only send them to code that is either byte/string-agnostic or -aware. Oh, and don't use indexing, only slicing (length-1 if you have to.) 
I guess it depends on how much code will accept a bytes-string where currently a string is the norm (and a unicode object is default-encoded.) I'm still worried that all this is quite a big leap. Very few people understand the intricacies of unicode encodings. (Almost everyone understands unicode, except they don't know it yet; it's the encodings that are the problem.) By forcing everything to be unicode without a uniform encoding-detection scheme, we're forcing every programmer who opens a file or reads from the network to think about encodings. This will be a pretty big step for newbie programmers. And it's not just that. The encoding of network streams or files may be entirely unknown beforehand, and depend on the content: a content-encoding, an HTML tag. Will bytes-strings get string methods for easy searching of content descriptors? Will the 're' module accept bytes-strings? What would the literals you want to search for look like? Do I really do 'if bytes("Content-Type:") in data:' and such? Should data perhaps get read using the opentext() equivalent of 'decode('ascii', 'replace')' and then parsed the 'normal' way? What about data gotten from an extension? And never mind what the 'right way' for that is; what will *programmers* do? The 'right way' often escapes them. It may well be that I'm thinking too conservatively, too stuck in the old ways, but I think we're being too hasty in dismissing the ol' string. Don't get me wrong, I really like the idea of as much of Python doing unicode as possible, and the idea of a mutable bytes type sounds good to me too. I just don't like the wide gap between the troublesome-to-get unicode object and the unreadable-repr, weird-indexing, hard-to-work-with bytes-string. I don't think adding something in between is going to work (we basically have that now, the normal string), so I suggest the bytes-string becomes a bit more 'string' and a bit less 'sequence of bytes'.
Perhaps in the form of: - A bytes type that repr()'s to something readable - A way to write byte literals that doesn't bleed the eyes, and isn't so fragile in the face of source-encoding (all the suggestions so far have you explicitly re-stating the source-encoding at each bytes("".encode())) If you have to wonder why that's fragile, just think about a recoding editor. Alternatively, get a short way to say 'encode in source-encoding' (I can't think of anything better than b"..." for the above two... Except... hmm... didn't `` become available in Py3k? Too little visual distinction?) - A way to manipulate the bytes as character-strings. Pattern matching, splitting, finding, slicing, etc. Quite like current strings. - Disallowing any interaction between bytes and real (meaning 'unicode') strings. Not "oh, let's assume ascii or the default encoding", either. If the user wants to explicitly decode using 'ascii', that's their choice, but they should consciously make it. - Mutable or immutable, I don't know. I fear that if the bytes type was easy enough to handle and mutable, and the normal (unicode) strings were immutable, people may end up using bytes all the time. In fact, they may do that anyway; I'm sure Python will grow entire subcults that prefer doing 'string("\xa1Python!")' where 'string' is 'bytes(arg.encode("iso-8859-1"))' Bytes should be easy enough to manipulate 'as strings' to do the basic tasks, but not easy enough to encourage people to forget about that whole annoying 'encoding' business and just use them instead (which is basically what we have now.) On the other hand, if people don't want to deal with that whole encoding business, we should allow them to -- consciously. We can offer a variety of hints and tips on how to figure out the encoding of something, but we can't do the thinking for them (trust me, I've tried.) When a file's encoding is specified in file metadata, that's great, really great.
When a network connection is handled by a library that knows how to deal with the content (*cough*Twisted*cough*) and can decode it for you, that's really great too. But we're not there yet, not by a long shot. And explaining encodings to an ADHD-infested teenager high on adrenalin and creative inspiration who just wants to connect to an IRC server to make his bot say "Hi!", well, that's hard. I'd rather they don't go and do PHP instead. Doing it right is hard, but it's even harder to do it all right the first time, and Python never really worried about that ;P -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin at v.loewis.de Wed Feb 15 01:25:15 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 01:25:15 +0100 Subject: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification In-Reply-To: References: <43ECC84C.9020002@v.loewis.de> <43ECE71D.2050402@v.loewis.de> <43F178F8.60506@v.loewis.de> Message-ID: <43F274EB.3030900@v.loewis.de> Jeremy Hylton wrote: >>Perhaps there is some value in finding functions which ought to expect >>const char*. For that, occasional checks should be sufficient; I cannot >>see a point in having code permanently pass with that option. In >>particular not if you are interfacing with C libraries. > > > I don't understand what you mean: I'm not sure what you mean by > "occasional checks" or "permanently pass". The compiler flags are > always the same. I'm objecting to the "this warning should never occur" rule. If the warning is turned on in a regular build, then clearly it is desirable to make it go away in all cases, and add work-arounds to make it go away if necessary. This is bad, because it means you add work-arounds to code where really no work-around is necessary (e.g. because it is *known* that some function won't modify the storage behind a char*, even though it doesn't take a const char*).
So it is appropriate that the warning generates many false positives. Therefore, it should be a manual interaction to turn this warning on, inspect all the messages, and fix those that need correction, then turn the warning off again. Regards, Martin From tjreedy at udel.edu Wed Feb 15 01:32:12 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 14 Feb 2006 19:32:12 -0500 Subject: [Python-Dev] nice() References: <004f01c630c0$f051e1f0$5f2c4fca@csmith> <43F1B68B.5010604@canterbury.ac.nz> Message-ID: "Greg Ewing" wrote in message news:43F1B68B.5010604 at canterbury.ac.nz... > I don't think you're doing anyone any favours by trying to protect > them from having to know about these things, because they *need* to > know about them if they're not to write algorithms that seem to > work fine on tests but mysteriously start producing garbage when > run on real data, I agree. Here was my 'kick-in-the-butt' lesson (from 20+ years ago): the 'simplified for computation' formula for standard deviation, found in too many statistics books without a warning as to its danger, and specialized for three data points, is sqrt( ((a*a+b*b+c*c)-(a+b+c)**2/3.0) /2.0). After 1000s of ok calculations, the data were something like a,b,c = 10005,10006,10007. The correct answer is 1.0 but with numbers rounded to 7 digits, the computed answer is sqrt(-.5) == CRASH. I was aware that subtraction lost precision but not how rounding could make a theoretically guaranteed non-negative difference negative. Of course, Python floats being C doubles makes such glitches much rarer. Not exposing C floats is a major newbie (and journeyman) protection feature. Terry Jan Reedy From jimjjewett at gmail.com Wed Feb 15 01:39:54 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 Feb 2006 19:39:54 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
Message-ID: On 2/14/06, Neil Schemenauer wrote: > People could spell it bytes(s.encode('latin-1')) in order to make it > work in 2.X. Guido wrote: > At the cost of an extra copying step. That sounds like an implementation issue. If it is important enough to matter, then why not just add some smarts to the bytes constructor? If the argument is a str, and the constructor owns the only reference, then go ahead and use the argument's own underlying array; the string itself will be deallocated when (or before) the constructor returns, so no one else can use it expecting an immutable. -jJ From python at rcn.com Wed Feb 15 01:41:07 2006 From: python at rcn.com (Raymond Hettinger) Date: Tue, 14 Feb 2006 19:41:07 -0500 Subject: [Python-Dev] bytes type discussion References: Message-ID: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> [Guido van Rossum] > Somewhat controversial: > > - bytes("abc") == bytes(map(ord, "abc")) At first glance, this seems obvious and necessary, so if it's somewhat controversial, then I'm missing something. What's the issue? Raymond From martin at v.loewis.de Wed Feb 15 01:45:39 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 01:45:39 +0100 Subject: [Python-Dev] bytes type discussion In-Reply-To: <792E2907-AC7F-4607-9E37-C06EAA599E6A@redivi.com> References: <792E2907-AC7F-4607-9E37-C06EAA599E6A@redivi.com> Message-ID: <43F279B3.5080104@v.loewis.de> Bob Ippolito wrote: >>Martin von Loewis's alternative for the "very controversial" set is to >>disallow an encoding argument and (I believe) also to disallow Unicode >>arguments. In 3.0 this would leave us with s.encode() as the >>only way to convert a string (which is always unicode) to bytes. The >>problem with this is that there's no code that works in both 2.x and >>3.0. > > > Given a base64 or hex string, how do you get a bytes object out of > it? Currently str.decode('base64') and str.decode('hex') are good > solutions to this... 
but you get a str object back. If s is a base64 string, bytes(s.decode("base64")) should work. In 2.x, it returns a str, which is then copied into bytes; in 3.x, .decode("base64") returns a byte string already (*), for which an extra copy is made. I would prefer to see base64.decodestring to return bytes, though - perhaps even in 2.x already. Regards, Martin (*) Interestingly enough, the "base64" encoding will work reversed in terms of types, compared to all other encodings. Where .encode returns bytes normally, it will return a string for base64, and vice versa (assuming the bytes type has .decode/.encode methods). From greg.ewing at canterbury.ac.nz Wed Feb 15 01:51:03 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 13:51:03 +1300 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <20060214214608.GA6027@xs4all.nl> References: <43F182B6.8020409@v.loewis.de> <20060214214608.GA6027@xs4all.nl> Message-ID: <43F27AF7.1080706@canterbury.ac.nz> Thomas Wouters wrote: > Actually, that's where distutils and bdist_* comes in. Mr. Random Developer > writes a regular distutils setup.py, and I can install the latest, > not-quite-in-apt version by doing 'setup.py bdist_deb' and installing the > resulting .deb. Why not just do 'setup.py install' directly? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From bob at redivi.com Wed Feb 15 01:56:00 2006 From: bob at redivi.com (Bob Ippolito) Date: Tue, 14 Feb 2006 16:56:00 -0800 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <792E2907-AC7F-4607-9E37-C06EAA599E6A@redivi.com> Message-ID: On Feb 14, 2006, at 4:17 PM, Guido van Rossum wrote: > On 2/14/06, Bob Ippolito wrote: >> On Feb 14, 2006, at 3:13 PM, Guido van Rossum wrote: >>> - we need a new PEP; PEP 332 won't cut it >>> >>> - no b"..." literal >>> >>> - bytes objects are mutable >>> >>> - bytes objects are composed of ints in range(256) >>> >>> - you can pass any iterable of ints to the bytes constructor, as >>> long >>> as they are in range(256) >> >> Sounds like array.array('B'). > > Sure. > >> Will the bytes object support the buffer interface? > > Do you want them to? > > I suppose they should *not* support the *text* part of that API. I would imagine that it'd be convenient for integrating with existing extensions... e.g. initializing an array or Numeric array with one. >> Will it accept >> objects supporting the buffer interface in the constructor (or a >> class method)? If so, will it be a copy or a view? Current >> array.array behavior says copy. > > bytes() should always copy -- thanks for asking. I only really ask because it's worth fully specifying these things. Copy seems a lot more sensible given the rest of the interpreter and stdlib (e.g. buffer(x) seems to always return a read-only buffer). >>> - longs or anything with an __index__ method should do, too >>> >>> - when you index a bytes object, you get a plain int >> >> When slicing a bytes object, do you get another bytes object or a >> list? If its a bytes object, is it a copy or a view? Current >> array.array behavior says copy. > > Another bytes object which is a copy. > > (Why would you even think about views here? They are evil.) I mention views because that's what numpy/Numeric/numarray/etc. 
do... It's certainly convenient at times to have that functionality, for example, to work with only the alpha channel in an RGBA image. Probably too magical for the bytes type. >>> import numpy >>> image = numpy.array(list('RGBARGBARGBA')) >>> alpha = image[3::4] >>> alpha array([A, A, A], dtype=(string,1)) >>> alpha[:] = 'X' >>> image array([R, G, B, X, R, G, B, X, R, G, B, X], dtype=(string,1)) >>> Very controversial: >>> >>> - bytes("abc", "encoding") == bytes("abc") # ignores the "encoding" >>> argument >>> >>> - bytes(u"abc") == bytes("abc") # for ASCII at least >>> >>> - bytes(u"\x80\xff") raises UnicodeError >>> >>> - bytes(u"\x80\xff", "latin-1") == bytes("\x80\xff") >>> >>> Martin von Loewis's alternative for the "very controversial" set >>> is to >>> disallow an encoding argument and (I believe) also to disallow >>> Unicode >>> arguments. In 3.0 this would leave us with s.encode() >>> as the >>> only way to convert a string (which is always unicode) to bytes. The >>> problem with this is that there's no code that works in both 2.x and >>> 3.0. >> >> Given a base64 or hex string, how do you get a bytes object out of >> it? Currently str.decode('base64') and str.decode('hex') are good >> solutions to this... but you get a str object back. > > I don't know -- you can propose an API you like here. base64 is as > likely to encode text as binary data, so I don't think it's wrong for > those things to return strings. That's kinda true I guess -- but you'd still need an encoding in py3k to turn base64 -> text. A lot of the current codecs infrastructure doesn't make sense in py3k -- for example, the 'zlib' encoding, which is really a bytes transform, or 'unicode_escape' which is a text transform. I suppose there aren't too many different ways you'd want to encode or decode data to binary (beyond the text codecs), they should probably just live in a module -- something like the binascii we have now. 
I do find the codecs infrastructure to be convenient at times (maybe too convenient), but since you're not interested in adding functions to existing types then a module seems like the best approach. -bob From thomas at xs4all.net Wed Feb 15 02:00:10 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 02:00:10 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <43F27AF7.1080706@canterbury.ac.nz> References: <43F182B6.8020409@v.loewis.de> <20060214214608.GA6027@xs4all.nl> <43F27AF7.1080706@canterbury.ac.nz> Message-ID: <20060215010010.GE6027@xs4all.nl> On Wed, Feb 15, 2006 at 01:51:03PM +1300, Greg Ewing wrote: > Thomas Wouters wrote: > > Actually, that's where distutils and bdist_* comes in. Mr. Random Developer > > writes a regular distutils setup.py, and I can install the latest, > > not-quite-in-apt version by doing 'setup.py bdist_deb' and installing the > > resulting .deb. > Why not just do 'setup.py install' directly? Because that *does* overwrite files the package system might not want overwritten, and the resulting install is not listed in the packaging system, not taken into account on upgrades, etc. I don't want to keep track of a separate list of distutils-installed packages; that's what I use APT for. If I wanted to keep manually massaging my system after each install or upgrade, I'd be using Gentoo or FreeBSD ;) (I should point out that CPAN and CPANPLUS on FreeBSD do this slightly better; they register packages installed through CPAN (or actually the build/install part of it, MakefileMaker or whatever it's called) with the FreeBSD packaging database. I don't know what distutils does on FreeBSD, but that packaging database is just a bunch of files in appropriately named directories in /var/db/pkg...) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
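Returning to the bytes-type design thread above: the semantics Guido specifies there — construction from any iterable of ints in range(256), plain-int indexing, copy-on-slice, ValueError for out-of-range values — are essentially what Python 3's bytearray type eventually implemented. A minimal sketch against a modern Python 3 interpreter, offered purely as illustration; nothing below existed at the time of this thread:

```python
# Sketch of the bytes semantics discussed in this thread, as they can be
# tried today with Python 3's bytearray (the mutable byte container that
# eventually grew out of this design discussion):

b = bytearray([10, 20, 30])   # constructor takes any iterable of ints in range(256)
assert b[0] == 10             # indexing returns a plain int, not a length-1 object

s = b[1:]                     # slicing returns a new mutable object...
s[0] = 99
assert b[1] == 20             # ...which is a copy, not a view

b[0] = ord("A")               # mutation in place works
assert b == bytearray([65, 20, 30])

try:
    bytearray([-1])           # out-of-range values are rejected
except ValueError:
    print("out of range")     # prints: out of range
```

The one point the thread left open — a mutable literal — was resolved by giving the b"..." literal only to the immutable bytes type; bytearray has no literal form.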
From greg.ewing at canterbury.ac.nz Wed Feb 15 02:00:21 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 14:00:21 +1300 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: Message-ID: <43F27D25.7070208@canterbury.ac.nz> Joe Smith wrote: > Windows and RPM are known for major dependency problems, letting packages > damage each other, having packages that do not uninstall cleanly (i.e. > packages that leave junk all over the place) and generally messing the system > up quite badly over time, so that the OS is usually removed and > re-installed periodically.) I'm disappointed that the various Linux distributions still don't seem to have caught onto the very simple idea of *not* scattering files all over the place when installing something. MacOSX seems to be the only system so far that has got this right -- organising the system so that everything related to a given application or library can be kept under a single directory, clearly labelled with a version number. I haven't looked closely into eggs yet, but if they allow Python packages to be managed this way, and do it cross-platform, that's a very good reason to prefer using eggs over a platform-specific package format. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 15 02:06:17 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 14:06:17 +1300 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: Message-ID: <43F27E89.8050207@canterbury.ac.nz> Alex Martelli wrote: > What about shorter names, such as 'text' instead of 'opentext' and > 'data' instead of 'openbinary'?
Because those words are just names for pieces of data, with nothing to connect them with files or the act of opening a file. I think the association of "open" with "file" is established strongly enough in programmers' brains that dropping it now would just lead to unnecessary confusion. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Wed Feb 15 02:11:24 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 15 Feb 2006 02:11:24 +0100 Subject: [Python-Dev] bytes type discussion In-Reply-To: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> References: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> Message-ID: <43F27FBC.1000209@v.loewis.de> Raymond Hettinger wrote: >>- bytes("abc") == bytes(map(ord, "abc")) > > > At first glance, this seems obvious and necessary, so if it's somewhat > controversial, then I'm missing something. What's the issue? There is an "implicit Latin-1" assumption in that code. Suppose you do # -*- coding: koi-8r -*- print bytes("????? ??? ??????") in Python 2.x, then this means something (*). In Python 3, it gives you an exception, as the ordinals of this are suddenly above 256. Or, perhaps worse, the code # -*- coding: utf-8 -*- print bytes("Martin v. L?wis") will work in 2.x and 3.x, but produce different numbers (**). 
Regards, Martin (*) [231, 215, 201, 196, 207, 32, 215, 193, 206, 32, 242, 207, 211, 211, 213, 205] (**) In 2.x, this will give [77, 97, 114, 116, 105, 110, 32, 118, 46, 32, 76, 195, 182, 119, 105, 115] whereas in 3.x, it will give [77, 97, 114, 116, 105, 110, 32, 118, 46, 32, 76, 246, 119, 105, 115] From guido at python.org Wed Feb 15 02:15:03 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 14 Feb 2006 17:15:03 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: Message-ID: On 2/14/06, Jim Jewett wrote: > On 2/14/06, Neil Schemenauer wrote: > > People could spell it bytes(s.encode('latin-1')) in order to make it > > work in 2.X. > > Guido wrote: > > At the cost of an extra copying step. > > That sounds like an implementation issue. If it is important > enough to matter, then why not just add some smarts to the > bytes constructor? Short answer: you can't. > If the argument is a str, and the constructor owns the only > reference, then go ahead and use the argument's own > underlying array; the string itself will be deallocated when > (or before) the constructor returns, so no one else can use > it expecting an immutable. Hard to explain, but the VM usually keeps an extra reference on the stack so the refcount is never 1. But you can't rely on that so assuming that it's safe to reuse the storage if it's >1. Also, since the str's underlying array is allocated inline with the str header, this requires str and bytes to have the same object layout. But since bytes are mutable, they can't. Summary: you don't understand the implementation well enough to suggest these kinds of things. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From trentm at ActiveState.com Wed Feb 15 02:22:14 2006 From: trentm at ActiveState.com (Trent Mick) Date: Tue, 14 Feb 2006 17:22:14 -0800 Subject: [Python-Dev] bdist_* to stdlib?
In-Reply-To: <43F27D25.7070208@canterbury.ac.nz> References: <43F27D25.7070208@canterbury.ac.nz> Message-ID: <20060215012214.GA31050@activestate.com> [Greg Ewing wrote] > MacOSX seems to be the only system so far that has got > this right -- organising the system so that everything > related to a given application or library can be kept > under a single directory, clearly labelled with a > version number. ActivePython and MacPython have to install stuff to: /usr/local/bin/... /Library/Frameworks/Python.framework/... /Applications/MacPython-2.4/... # just MacPython does this /Library/Documentation/Help/... # Symlink needed here to have a hope of registration with # Apple's (crappy) help viewer system to work. Also, a receipt of the installation ends up here: /Library/Receipts/$package_name/... though Apple does not provide tools for uninstallation using those receipts. Mac OS X's installation tech ain't no panacea. If one is just distributing a single .app, then it is okay. If one is just distributing a library with no UI (graphical or otherwise) for the user, then it is okay. And "okay" here still means a pretty poor installation experience for the user: open DMG, don't run the app from here, drag it to your Applications folder, then eject this window/disk, then run it from /Applications, etc. Trent -- Trent Mick TrentM at ActiveState.com From greg.ewing at canterbury.ac.nz Wed Feb 15 02:30:29 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 14:30:29 +1300 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F118E0.6090704@voidspace.org.uk> <20060214080921.GW10226@xs4all.nl> Message-ID: <43F28435.4070008@canterbury.ac.nz> Guido van Rossum wrote: > The only remaining question is what if anything to do with an > encoding argument when the first argument is of type str...) From what you said earlier about str in 2.x being interpretable as a unicode string which contains only ascii, it seems to me that if you say bytes(s, encoding) where s is a str, then by the presence of the encoding argument you're saying that you want s to be treated as unicode and encoded using the specified encoding. So the result should be the same as bytes(u, encoding) where u is a unicode string containing the same code points as s. This implies that it should be an error if s contains non-ascii characters. This interpretation would satisfy the requirement for a single call signature covering both unicode and str-used-as-ascii-characters, while providing a different call signature (without encoding) for str-used-as-bytes. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From bob at redivi.com Wed Feb 15 02:24:06 2006 From: bob at redivi.com (Bob Ippolito) Date: Tue, 14 Feb 2006 17:24:06 -0800 Subject: [Python-Dev] bdist_* to stdlib?
In-Reply-To: <43F27D25.7070208@canterbury.ac.nz> References: <43F27D25.7070208@canterbury.ac.nz> Message-ID: <3310CF02-B340-4DC2-98AD-341A0C60FC96@redivi.com> On Feb 14, 2006, at 5:00 PM, Greg Ewing wrote: > Joe Smith wrote: > >> Windows and RPM are known for major dependency problems, letting >> packages >> damage each other, having packages that do not uninstall cleanly >> (i.e. >> packages that leave junk all over the place) and generally messing >> the system >> up quite badly over time, so that the OS is usually removed and >> re-installed periodically.) > > I'm disappointed that the various Linux distributions > still don't seem to have caught onto the very simple > idea of *not* scattering files all over the place when > installing something. > > MacOSX seems to be the only system so far that has got > this right -- organising the system so that everything > related to a given application or library can be kept > under a single directory, clearly labelled with a > version number. > > I haven't looked closely into eggs yet, but if they allow > Python packages to be managed this way, and do it cross- > platform, that's a very good reason to prefer using eggs > over a platform-specific package format. It should also be mentioned that eggs and platform-specific package formats are absolutely not mutually exclusive. You could use apt/rpm/ports/etc. to fetch/build/install eggs too. There are very few reasons not to use eggs -- in theory anyway, the implementation isn't finished yet. The only things that really need to change are the packages like Twisted, numpy, or SciPy that don't have a distutils-based main setup.py... Technically, since egg is just a specification, they could even implement it themselves without the help of setuptools (though that seems like a bad approach). -bob From thomas at xs4all.net Wed Feb 15 02:35:03 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 02:35:03 +0100 Subject: [Python-Dev] bdist_* to stdlib?
In-Reply-To: <43F27D25.7070208@canterbury.ac.nz> References: <43F27D25.7070208@canterbury.ac.nz> Message-ID: <20060215013503.GF6027@xs4all.nl> On Wed, Feb 15, 2006 at 02:00:21PM +1300, Greg Ewing wrote: > Joe Smith wrote: > > > Windows and RPM are known for major dependency problems, letting packages > > damage each other, having packages that do not uninstall cleanly (i.e. > > packages that leave junk all over the place) and generally messing the system > > up quite badly over time, so that the OS is usually removed and > > re-installed periodically.) > > I'm disappointed that the various Linux distributions > still don't seem to have caught onto the very simple > idea of *not* scattering files all over the place when > installing something. Well, as an end user, I honestly don't care. I install stuff through apt, it installs the dependencies for me, does basic configuration where applicable (often asking for user-input once, then remembering the settings) and allows me to deinstall when I'm tired of a package. As long as apt handles it, I couldn't care less whether it's installed in separate directories, large bzip2 archives with suitable playmates from mixed ethnicity to improve social contact, or spread out across every 17th byte of a logical volume. As a programmer, I also don't care. I tell distutils which modules/packages, data files and scripts to install, and it does the rest. And that's why I like my Python packages to become .deb's through bdist_deb :) You-think-too-much'ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From greg.ewing at canterbury.ac.nz Wed Feb 15 02:59:24 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 14:59:24 +1300 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> Message-ID: <43F28AFC.7050807@canterbury.ac.nz> Guido van Rossum wrote: > On 2/13/06, Phillip J. Eby wrote: > >>At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: >> >>>On 2/13/06, Phillip J. Eby wrote: >>> >>>What would bytes("abc\xf0", "latin-1") *mean*? >> >>I'm saying that XXX would be the same encoding as you specified. i.e., >>including an encoding means you are encoding the *meaning* of the string. No, this is wrong. As I understand it, the encoding argument to bytes() is meant to specify how to *encode* characters into the bytes object. If you want to be able to specify how to *decode* a str argument as well, you'd need a third argument. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From bob at redivi.com Wed Feb 15 03:18:48 2006 From: bob at redivi.com (Bob Ippolito) Date: Tue, 14 Feb 2006 18:18:48 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <20060215012214.GA31050@activestate.com> References: <43F27D25.7070208@canterbury.ac.nz> <20060215012214.GA31050@activestate.com> Message-ID: On Feb 14, 2006, at 5:22 PM, Trent Mick wrote: > [Greg Ewing wrote] >> MacOSX seems to be the only system so far that has got >> this right -- organising the system so that everything >> related to a given application or library can be kept >> under a single directory, clearly labelled with a >> version number. 
> > ActivePython and MacPython have to install stuff to: > > /usr/local/bin/... The /usr/local/bin links are superfluous.. people should really be putting sys.prefix/bin on their path, cause that's where distutils scripts get installed to. > /Library/Frameworks/Python.framework/... > /Applications/MacPython-2.4/... # just MacPython does this ActivePython doesn't install app bundles for IDLE or anything? > /Library/Documentation/Help/... > # Symlink needed here to have a hope of registration with > # Apple's (crappy) help viewer system to work. It is pretty bad.. probably even worth punting on this step. > > Also, a receipt of the installation ends up here: > > /Library/Receipts/$package_name/... > > though Apple does not provide tools for uninstallation using those > receipts. That stuff is really behind the scenes stuff that's wholly managed by Installer.app and is pretty much irrelevant. > Mac OS X's installation tech ain't no panacea. If one is just > distributing a single .app, then it is okay. If one is just > distributing > a library with no UI (graphical or otherwise) for the user, then it is > okay. And "okay" here still means a pretty poor installation > experience > for the user: open DMG, don't run the app from here, drag it to your > Applications folder, then eject this window/disk, then run it from > /Applications, etc. Single apps are better than OK. Download them by whatever means you want, put them wherever you want, and run them. You can run any well- behaved application from a DMG (or a CD, or a USB key, or any other readable media). Libraries are not so great, as you've said. However, only developers should have to install libraries. Good applications are shipped with all of the libraries they need embedded in the application bundle. Dynamic linkage should only really happen internally, and to vendor supplied libraries. 
-bob From greg.ewing at canterbury.ac.nz Wed Feb 15 04:03:09 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 16:03:09 +1300 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <20060215012214.GA31050@activestate.com> References: <43F27D25.7070208@canterbury.ac.nz> <20060215012214.GA31050@activestate.com> Message-ID: <43F299ED.9060402@canterbury.ac.nz> Trent Mick wrote: > ActivePython and MacPython have to install stuff to: > > /usr/local/bin/... > /Library/Frameworks/Python.framework/... > /Applications/MacPython-2.4/... # just MacPython does this It's not perfect, but it's still a lot better than the situation on any other unix I've seen so far. It's a bit more complicated with something like Python, which is really several things - a library, an application, and some unix programs (the latter of which don't really fit into the MacOSX structure). At least all of the myriad library and header files go together under a single easily-identified directory, if you know where to look for it. > /Library/Documentation/Help/... > # Symlink needed here to have a hope of registration with > # Apple's (crappy) help viewer system to work. I didn't know about that one. It never even occurred to me that Python might *have* Apple Help Viewer files. I use Firefox to view all my Python documentation. :-) > Also, a receipt of the installation ends up here: > > /Library/Receipts/$package_name/... > > though Apple does not provide tools for uninstallation using those > receipts. And I hope they don't! I'd rather see progress towards a system where you don't *need* a special tool to uninstall something. It should be as simple and obvious as dragging a file or folder to the trash. > open DMG, don't run the app from here, drag it to your > Applications folder, then eject this window/disk, then run it from > /Applications, A decently-designed application should be runnable from anywhere, including a dmg, if the user wants to do that. 
If an app refuses to run from a dmg, I consider that a bug in the application. Likewise, the user should be able to put it anywhere on the HD, not just the Applications folder. Also I consider the need for a dmg in the first place to be a bug in the Web. :-) (You should be able to just directly download the .app file.) This sort of thing is still not quite as smooth as it was under Classic MacOS, but I'm hopeful of improvement. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 15 04:19:19 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 16:19:19 +1300 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <20060215013503.GF6027@xs4all.nl> References: <43F27D25.7070208@canterbury.ac.nz> <20060215013503.GF6027@xs4all.nl> Message-ID: <43F29DB7.7090409@canterbury.ac.nz> Thomas Wouters wrote: > Well, as an end user, I honestly don't care. > As a programmer, I also don't care. Perhaps I've been burned once too often by someone's oh-so-clever installer script screwing up and leaving me to wade through an impenetrable pile of makefiles, shell scripts and m4 macros trying to figure out what went wrong and what I can possibly do to fix it, but I've become a deep believer in keeping things simple. Common sense suggests that a system which keeps everything related to a package, and only to that package, in one directory, has got to be more robust than one which scatters files far and wide and then relies on some elaborate bookkeeping system to try to make sure things don't step on each other's toes. When everything goes right, I don't care either. But things go wrong often enough to make me care about unnecessary complexity in the tools I use. 
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 15 04:34:23 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 16:34:23 +1300 Subject: [Python-Dev] bytes type discussion In-Reply-To: <20060215002446.GD6027@xs4all.nl> References: <20060215002446.GD6027@xs4all.nl> Message-ID: <43F2A13F.4030604@canterbury.ac.nz> Thomas Wouters wrote: > > The encoding of network streams or files may be > entirely unknown beforehand, and depend on the content: a content-encoding, > a HTML tag. Will bytes-strings get string methods for easy > searching of content descriptors? Seems to me this is a case where you want to be able to change encodings in the middle of reading the stream. You start off reading the data as ascii, and once you've figured out the encoding, you switch to that and carry on reading. Are there any plans to make it possible to change the encoding of a text file object on the fly like this? If that would be awkward, maybe file objects themselves shouldn't be where the decoding occurs, but decoders should be separate objects that wrap byte streams. Under that model, opentext(filename, encoding) would be a factory function that did something like codecs.streamdecoder(encoding, openbinary(filename)) Having codecs be stream filters might be a good idea anyway, since then you could use them to wrap anything that can be treated as a stream of bytes (sockets, some custom object in your program, etc.), you could create pipelines of encoders and decoders, etc. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From oliphant.travis at ieee.org Wed Feb 15 04:39:49 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Tue, 14 Feb 2006 20:39:49 -0700 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: Message-ID: Guido van Rossum wrote: > I'm about to send 6 or 8 replies to various salient messages in the > PEP 332 revival thread. That's probably a sign that there's still a > lot to be sorted out. In the mean time, to save you reading through > all those responses, here's a summary of where I believe I stand. > Let's continue the discussion in this new thread unless there are > specific hairs to be split in the other thread that aren't addressed > below or by later posts. I hope bytes objects will be pickle-able? If so, and they support the buffer protocol, then many NumPy users will be very happy. -Travis From rrr at ronadam.com Wed Feb 15 04:45:26 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 14 Feb 2006 21:45:26 -0600 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F28AFC.7050807@canterbury.ac.nz> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> <43F28AFC.7050807@canterbury.ac.nz> Message-ID: <43F2A3D6.3060703@ronadam.com> Greg Ewing wrote: > Guido van Rossum wrote: >> On 2/13/06, Phillip J. Eby wrote: >> >>> At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote: >>> >>>> On 2/13/06, Phillip J. Eby wrote: >>>> >>>> What would bytes("abc\xf0", "latin-1") *mean*? >>> I'm saying that XXX would be the same encoding as you specified. 
i.e., >>> including an encoding means you are encoding the *meaning* of the string. > > No, this is wrong. As I understand it, the encoding > argument to bytes() is meant to specify how to *encode* > characters into the bytes object. If you want to be able > to specify how to *decode* a str argument as well, you'd > need a third argument. I'm not sure I understand why this would be needed? But maybe it's still too early to pin anything down. My first impression and thoughts were: (and seems incorrect now) bytes(object) -> byte sequence of the object's value Basically a "memory dump" of the object's value. And so... object(bytes) -> copy of original object This would reproduce a copy of the original object as long as the from and to object are the same type with no encoding needed. If they are different then you would get garbage, or an error. But that would be a programming error and not a language issue. It would be up to the programmer to not do that. Of course this is one of those easier to say than do concepts I'm sure. And I was thinking a bytes argument of more than one item would indicate a byte sequence. bytes(1,2,3) -> bytes([1,2,3]) Where any values above 255 would give an error, but it seems an explicit list is preferred. And that's fine because it creates a way for bytes to know how to handle everything else. (I think) bytes([1,2,3]) -> bytes([1,2,3]) Which is fine... so ??? b = bytes(0L) -> bytes([0,0,0,0]) long(b) -> 0L convert it back to 0L And ... b = bytes([0L]) -> bytes([0]) # a single byte int(b) -> 0 convert it back to 0 long(b) -> 0L It's up to the programmer to know if it's safe. Working with raw data always means the programmer needs to be aware of what's going on. But would it be any different with strings? You wouldn't ever want to encode one type's bytes into a different type directly. It would be better to just encode it back to the original type, then use *its* encoding method to change it. so...
b = bytes(s) -> bytes( raw sequence of bytes ) Whether or not you get a single byte per char or multiple bytes per character would depend on the string's encoding. s = str(bytes, encoding) -> original string You need to specify it here, because there is more than one string encoding. To avoid encodings entirely we would need a type for each encoding. (which isn't really avoiding anything) And it's the "raw data so programmer needs to be aware" situation again. Don't decode to something other than what it is. If someone needs automatic encoding/decoding, then they probably should write a class to do what they want. Something roughly like...

class bytekeeper(object):
    b = None
    t = None
    e = None
    def __init__(self, obj, enc='bytes'):  # or whatever encoding
        self.e = enc
        self.t = type(obj)
        self.b = bytes(obj)
    def decode(self):
        ...

Would we be able to subclass bytes? class bytekeeper(bytes): ? ... Ok.. enough rambling... I wonder how much of this is way out in left field. ;) cheers, Ronald Adam From oliphant.travis at ieee.org Wed Feb 15 04:41:19 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Tue, 14 Feb 2006 20:41:19 -0700 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods Message-ID: After some revisions, PEP 357 is ready for more comments. Please voice any concerns. -Travis -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep-0357.txt Url: http://mail.python.org/pipermail/python-dev/attachments/20060214/c4033bd8/attachment.txt From rhamph at gmail.com Wed Feb 15 05:14:45 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Feb 2006 21:14:45 -0700 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: Message-ID: On 2/14/06, Just van Rossum wrote: > +1 for two functions. > > My choice would be open() for binary and opentext() for text.
I don't find that backwards at all: the text function is going to be more > different from the current open() function than the binary function > would be since in many ways the str type is closer to bytes than to > unicode. > > Maybe it's even better to use opentext() AND openbinary(), and deprecate > plain open(). We could even introduce them at the same time as bytes() > (and leave the open() deprecation for 3.0). Thus providing us with a transition period, even with warnings on use of the old function. I think coming up with a way to transition that doesn't silently break code and doesn't leave us with permanent ugly names is the hardest challenge here. +1 on opentext(), openbinary() -1 on silently changing open() in a way that results in breakage -- Adam Olsen, aka Rhamphoryncus From fdrake at acm.org Wed Feb 15 05:23:45 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 14 Feb 2006 23:23:45 -0500 Subject: [Python-Dev] bytes type discussion In-Reply-To: <43F2A13F.4030604@canterbury.ac.nz> References: <20060215002446.GD6027@xs4all.nl> <43F2A13F.4030604@canterbury.ac.nz> Message-ID: <200602142323.45930.fdrake@acm.org> On Tuesday 14 February 2006 22:34, Greg Ewing wrote: > Seems to me this is a case where you want to be able > to change encodings in the middle of reading the stream. > You start off reading the data as ascii, and once you've > figured out the encoding, you switch to that and carry > on reading. Not quite. The proper response in this case is often to re-start decoding with the correct encoding, since some of the data extracted so far may have been decoded incorrectly. A very carefully constructed application may be able to go back and re-decode any data saved from the stream with the previous encoding, but that seems like it would be pretty fragile in practice. There may be cases where switching encoding on the fly makes sense, but I'm not aware of any actual examples of where that approach would be required. -Fred -- Fred L. Drake, Jr.
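Fred's restart-based approach can be sketched in a few lines of modern Python. This is a minimal illustration only: the helper name and the `coding=` declaration format are invented for the example, not anything from the thread.

```python
def read_with_declared_encoding(data: bytes) -> str:
    # Decode a small prefix permissively, just to look for a declared
    # encoding (the "coding=" marker is a made-up convention here).
    prefix = data[:256].decode("ascii", errors="replace")
    encoding = "utf-8"  # fallback assumption when nothing is declared
    if "coding=" in prefix:
        encoding = prefix.split("coding=", 1)[1].split()[0]
    # Re-start decoding from byte 0 with the real encoding, rather than
    # switching encodings mid-stream -- Fred's point above.
    return data.decode(encoding)
```

This is exactly how XML and HTML parsers typically handle in-band encoding declarations: sniff, then re-decode the whole document, so nothing read during the sniffing phase is kept in possibly-misdecoded form.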
From fdrake at acm.org Wed Feb 15 05:40:53 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 14 Feb 2006 23:40:53 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <200602131552.44424.fdrake@acm.org> Message-ID: <200602142340.53360.fdrake@acm.org> On Tuesday 14 February 2006 03:09, Neal Norwitz wrote: > While you are here, are you planning to do the doc releases for 2.5? > You are tentatively listed in PEP 356. (Technically it says TBD with > a ? next to your name.) Releases generally aren't a problem, since they're heavily automated and scheduled well in advance. I'm glad to continue helping with that, especially since that seems to be about all I can get to sometimes. > I think this was the quick hack I did. I hope there are many > concerns. :-) For example, if the doc build fails, ... Hmmm, this > probably isn't a problem. The doc won't be updated, but will still be > the last good version. So if I send mail when the doc doesn't build, > then it might not be so bad. Seems reasonable to me. > I still need to > switch over the failure mails to go to python-checkins. There are too > many right now though. Unless people don't mind getting several > messages about refleaks every day? Anyone? Documentation build errors should probably be separated from leak detection reports. I don't know what it would take to get them separated. > That shouldn't be a problem. See http://docs.python.org/dev/2.4/ Works for me! Thanks for putting the effort into this. The general question of where the development docs should show up remains. There are a number of options: 1. www.python.org/dev/doc/, where I'd put them at one point 2. www.python.org/doc/..., which is reasonable, but new 3. docs.python.org/dev/, which seems reasonable, but docs.python.org proponents may not like 4. 
www.python.org/dev/doc/ for trunk documentation, and docs.python.org/ and/or www.python.org/doc/current/ for maintenance updates That last one has a certain appeal. It would allow corrections to go online quicker, so people using python.org or a mirror would get updates quickly (an advantage of delivering docs over the net!), and I wouldn't get so many repeat reports of commonly-noticed typos. The released versions would still be available via www.python.org/doc/x.y.z/. My own inclination is that if we continue to use docs.python.org, it should contain only one copy of the documentation, and that should be for the most recent "stable" release (though perhaps an updated version of the documentation). I'm not really on either side of the fence about whether docs.python.org is the "right thing" to do; the idea came out of the folks interested in advocacy. -Fred -- Fred L. Drake, Jr. From rhamph at gmail.com Wed Feb 15 05:41:02 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Feb 2006 21:41:02 -0700 Subject: [Python-Dev] bytes type discussion In-Reply-To: <43F27FBC.1000209@v.loewis.de> References: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> <43F27FBC.1000209@v.loewis.de> Message-ID: On 2/14/06, "Martin v. Löwis" wrote: > Raymond Hettinger wrote: > >>- bytes("abc") == bytes(map(ord, "abc")) > > > > > > At first glance, this seems obvious and necessary, so if it's somewhat > > controversial, then I'm missing something. What's the issue? > > There is an "implicit Latin-1" assumption in that code. Suppose > you do > > # -*- coding: koi8-r -*- > print bytes("????? ??? ??????") > > in Python 2.x, then this means something (*). In Python 3, it gives > you an exception, as the ordinals of this are suddenly above 256. > > Or, perhaps worse, the code > > # -*- coding: utf-8 -*- > print bytes("Martin v. Löwis") > > will work in 2.x and 3.x, but produce different numbers (**). My assumption is these would become errors in 3.x.
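For what it's worth, the rule Python 3 eventually adopted matches Adam's assumption: calling bytes() on a text string without an explicit encoding is a TypeError, which removes the source-encoding ambiguity Martin describes. A quick illustration of the shipped Python 3 semantics (not the 2.x behaviour under discussion):

```python
# bytes(str) with no encoding is rejected outright in Python 3
try:
    bytes("Martin v. L\u00f6wis")
except TypeError:
    pass  # "string argument without an encoding"

# With an explicit encoding, the byte values are unambiguous
assert bytes("Martin v. L\u00f6wis", "utf-8") == b"Martin v. L\xc3\xb6wis"
assert bytes("Martin v. L\u00f6wis", "latin-1") == b"Martin v. L\xf6wis"
```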
bytes(str) is only needed so you can do bytes(u"abc".encode('utf-8')) and have it work in 2.x and 3.x. (I wonder if maybe they should be an error in 2.x as well. Source encoding is for unicode literals, not str literals.) -- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Wed Feb 15 06:02:32 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Feb 2006 22:02:32 -0700 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F17CF1.1060902@v.loewis.de> <43F18371.9040306@v.loewis.de> Message-ID: On 2/14/06, Guido van Rossum wrote: > On 2/14/06, Adam Olsen wrote: > > In 3.0 it changes to: > > "It's...".encode('utf-8') > > u"It's...".byteencode('utf-8') # Same as above, kept for compatibility > > No. 3.0 won't have "backward compatibility" features. That's the whole > point of 3.0. Conceded. > > I realize it would be odd for the interactive interpreter to print them > > as a list of ints by default: > > >>> u"It's...".byteencode('utf-8') > > [73, 116, 39, 115, 46, 46, 46] > > No. This prints the repr() which should include the type. bytes([73, > 116, 39, 115, 46, 46, 46]) is the right thing to print here. Typo, sorry :) -- Adam Olsen, aka Rhamphoryncus From nnorwitz at gmail.com Wed Feb 15 06:04:48 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 14 Feb 2006 21:04:48 -0800 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <200602142340.53360.fdrake@acm.org> References: <200602131552.44424.fdrake@acm.org> <200602142340.53360.fdrake@acm.org> Message-ID: On 2/14/06, Fred L. Drake, Jr. wrote: > > Releases generally aren't a problem, since they're heavily automated and > scheduled well in advance. I'm glad to continue helping with that, > especially since that seems to be about all I can get to sometimes.
Great, I updated the PEP. > Documentation build errors should probably be separated from leak detection > reports. I don't know what it would take to get them separated. Yup, they already are AFAICT. I will activate the 2.4 doc builds to send failures to python-checkins unless someone has a better idea. These should be very rare. The destination is controlled by FAILURE_MAILTO in Misc/build.sh. > The general question of where the development docs should show up remains. [4 options sliced] Agreed, I don't have a strong opinion either. There should definitely only be one place to look though. That should make things easier. What do others think? > My own inclination is that if we continue to use docs.python.org, it should > contain only one copy of the documentation, and that should be for the most > recent "stable" release (though perhaps an updated version of the > documentation). +1 n From rhamph at gmail.com Wed Feb 15 06:11:49 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Feb 2006 22:11:49 -0700 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <43ed7605.487813468@news.gmane.org> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <43F10035.8080207@egenix.com> <43F17CF1.1060902@v.loewis.de> Message-ID: On 2/14/06, Guido van Rossum wrote: > On 2/13/06, Adam Olsen wrote: > > If I understand correctly there's three main candidates: > > 1. Direct copying to str in 2.x, pretending it's latin-1 in unicode in 3.x > > I'm not sure what you mean, but I'm guessing you're thinking that the > repr() of a bytes object created from bytes('abc\xf0') would be > > bytes('abc\xf0') > > under this rule. What's so bad about that? See below. > > 2. Direct copying to str/unicode if it's only ascii values, switching > > to a list of hex literals if there's any non-ascii values > > That works for me too. But why hex literals? 
As MvL stated, a list of > decimals would be just as useful. PEBKAC. Yeah, decimals are simpler and shorter even. > > 3. b"foo" literal with ascii for all ascii characters (other than \ > > and "), \xFF for individual characters that aren't ascii > > > > Given the choice I prefer the third option, with the second option as > > my runner up. The first option just screams "silent errors" to me. > > The 3rd is out of the running for many reasons. > > I'm not sure I understand your "silent errors" fear; can you elaborate? I think it's that someone will create a unicode object with real latin-1 characters and it'll get passed through without errors, the code assuming it's 8bit-as-latin-1. If they had put other unicode characters in they would have gotten an exception instead. However, at this point all the posts on latin-1 encoding/decoding have become so muddled in my mind that I don't know what they're suggesting. I think I'll wait for the pep to clear that up. -- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Wed Feb 15 06:20:16 2006 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Feb 2006 22:20:16 -0700 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11BEC.8050902@voidspace.org.uk> <20060214193107.GA25293@mems-exchange.org> Message-ID: On 2/14/06, Guido van Rossum wrote: > Not entirely, since I don't know what b"abcdef" would mean > (where is a Unicode Euro character typed in whatever source > encoding was used). SyntaxError I would hope. Ascii and hex escapes only please. :) Although I'm not arguing for or against byte literals. They do make for a much terser form, but they're not strictly necessary. 
-- Adam Olsen, aka Rhamphoryncus From nnorwitz at gmail.com Wed Feb 15 06:24:57 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 14 Feb 2006 21:24:57 -0800 Subject: [Python-Dev] 2.5 release schedule Message-ID: I was hoping to get a lot more feedback about PEP 356 and the 2.5 release schedule. http://www.python.org/peps/pep-0356.html I updated the schedule it is now: alpha 1: May 6, 2006 [planned] alpha 2: June 3, 2006 [planned] alpha 3: July 1, 2006 [planned] beta 1: July 29, 2006 [planned] beta 2: August 26, 2006 [planned] rc 1: September 16, 2006 [planned] final: September 30, 2006 [planned] What do people think about that? There are still a lot of features we want to add. Is this ok with everyone? Do you think it's realistic? We still need a release manager. No one has heard from Anthony. If he isn't interested is someone else interested in trying their hand at it? There are many changes necessary in PEP 101 because since the last release both python and pydotorg have transitioned from CVS to SVN. Creosote also moved. n From janssen at parc.com Wed Feb 15 06:32:09 2006 From: janssen at parc.com (Bill Janssen) Date: Tue, 14 Feb 2006 21:32:09 PST Subject: [Python-Dev] how to upload new MacPython web page? Message-ID: <06Feb14.213215pst."58633"@synergy1.parc.xerox.com> We (the pythonmac-sig mailing list) seem to have converged (almost -- still talking about the logo) on a new download page for MacPython, to replace the page currently at http://www.python.org/download/download_mac.html. The strawman can be seen at http://bill.janssen.org/mac/new-macpython-page.html. How do I get the bits changed on python.org (when we're finished)? 
Bill From brett at python.org Wed Feb 15 06:49:17 2006 From: brett at python.org (Brett Cannon) Date: Tue, 14 Feb 2006 21:49:17 -0800 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: References: Message-ID: On 2/14/06, Neal Norwitz wrote: > I was hoping to get a lot more feedback about PEP 356 and the 2.5 > release schedule. > > http://www.python.org/peps/pep-0356.html > > I updated the schedule it is now: > > alpha 1: May 6, 2006 [planned] > alpha 2: June 3, 2006 [planned] > alpha 3: July 1, 2006 [planned] > beta 1: July 29, 2006 [planned] > beta 2: August 26, 2006 [planned] > rc 1: September 16, 2006 [planned] > final: September 30, 2006 [planned] > > What do people think about that? There are still a lot of features we > want to add. Is this ok with everyone? Do you think it's realistic? > Speaking as one of the people who has a PEP to implement, I am okay with it. -Brett From nnorwitz at gmail.com Wed Feb 15 06:58:46 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 14 Feb 2006 21:58:46 -0800 Subject: [Python-Dev] 2.5 PEP Message-ID: Attached is the 2.5 release PEP 356. It's also available from: http://www.python.org/peps/pep-0356.html Does anyone have any comments? Is this good or bad? Feel free to send to me comments. We need to ensure that PEPs 308, 328, and 343 are implemented. We have possible volunteers for 308 and 343, but not 328. Brett is doing 352 and Martin is doing 353. We also need to resolve a bunch of other implementation details about providing the C AST to Python, bdist_* issues and a few more possible stdlib modules. Don't be shy, tell the world what you think about these. Can someone go through PEP 4 and 11 and determine what work needs to be done? The more we distribute the work, the easier it will be on everyone. You don't really want to listen to me whine any more do you? 
;-) Thank you, n -------------- next part -------------- PEP: 356 Title: Python 2.5 Release Schedule Version: $Revision: 42375 $ Author: Neal Norwitz, GvR Status: Draft Type: Informational Created: 07-Feb-2006 Python-Version: 2.5 Post-History: Abstract This document describes the development and release schedule for Python 2.5. The schedule primarily concerns itself with PEP-sized items. Small features may be added up to and including the first beta release. Bugs may be fixed until the final release. There will be at least two alpha releases, two beta releases, and one release candidate. The release date is planned 30 September 2006. Release Manager TBD (Anthony Baxter?) Martin von Loewis is building the Windows installers, Fred Drake the doc packages, and TBD (Sean Reifschneider?) the RPMs. Release Schedule alpha 1: May 6, 2006 [planned] alpha 2: June 3, 2006 [planned] alpha 3: July 1, 2006 [planned] beta 1: July 29, 2006 [planned] beta 2: August 26, 2006 [planned] rc 1: September 16, 2006 [planned] final: September 30, 2006 [planned] Completed features for 2.5 PEP 309: Partial Function Application PEP 314: Metadata for Python Software Packages v1.1 (should PEP 314 be marked final?) PEP 341: Unified try-except/try-finally to try-except-finally PEP 342: Coroutines via Enhanced Generators - AST-based compiler - Add support for reading shadow passwords (http://python.org/sf/579435) - any()/all() builtin truth functions - new hashlib module add support for SHA-224, -256, -384, and -512 (replaces old md5 and sha modules) - new cProfile module suitable for profiling long running applications with minimal overhead Planned features for 2.5 PEP 308: Conditional Expressions (Someone volunteered on python-dev, is there progress?) PEP 328: Absolute/Relative Imports (Needs volunteer, mail python-dev if interested) PEP 343: The "with" Statement (nn: I have a possible volunteer.) Note there are two separate implementation parts: interpreter changes and python code for utilities. 
PEP 352: Required Superclass for Exceptions (Brett Cannon is expected to implement this.) PEP 353: Using ssize_t as the index type MvL expects this to be complete in March. Access to C AST from Python Add bdist_msi to the distutils package. (MvL wants one more independent release first.) Add bdist_deb to the distutils package? (see http://mail.python.org/pipermail/python-dev/2006-February/060926.html) Add bdist_egg to the distutils package??? Add setuptools to the standard library. Add wsgiref to the standard library. (GvR: I have a bunch more that could/would/should be added. -- Still true?) Deferred until 2.6: - None Open issues This PEP needs to be updated and release managers confirmed. - Review PEP 4: Deprecate and/or remove the modules - Review PEP 11: Remove support for platforms as described Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil End: From greg.ewing at canterbury.ac.nz Wed Feb 15 07:44:12 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 19:44:12 +1300 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F2A3D6.3060703@ronadam.com> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> <43F28AFC.7050807@canterbury.ac.nz> <43F2A3D6.3060703@ronadam.com> Message-ID: <43F2CDBC.8090305@canterbury.ac.nz> Ron Adam wrote: > My first impression and thoughts were: (and seems incorrect now) > > bytes(object) -> byte sequence of objects value > > Basically a "memory dump" of objects value. As I understand the current intentions, this is correct. 
The bytes constructor would have two different signatures: (1) bytes(seq) --> interprets seq as a sequence of integers in the range 0..255, exception otherwise (2a) bytes(str, encoding) --> encodes the characters of (2b) bytes(unicode, encoding) the string using the specified encoding In (2a) the string would be interpreted as containing ascii characters, with an exception otherwise. In 3.0, (2a) will disappear leaving only (1) and (2b). > And I was thinking a bytes argument of more than one item would indicate > a byte sequence. > > bytes(1,2,3) -> bytes([1,2,3]) But then you have to test the argument in the one-argument case and try to guess whether it should be interpreted as a sequence or an integer. Best to avoid having to do that. > Which is fine... so ??? > > b = bytes(0L) -> bytes([0,0,0,0]) No, bytes(0L) --> TypeError because 0L doesn't implement the iterator protocol or the buffer interface. I suppose long integers might be enhanced to support the buffer interface in 3.0, but that doesn't seem like a good idea, because the bytes you got that way would depend on the internal representation of long integers. In particular, bytes(0x12345678L) via the buffer interface would most likely *not* give you bytes[0x12, 0x34, 0x56, 0x78]). Maybe types should grow a __bytes__ method? Greg From greg.ewing at canterbury.ac.nz Wed Feb 15 07:44:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 19:44:20 +1300 Subject: [Python-Dev] bytes type discussion In-Reply-To: <200602142323.45930.fdrake@acm.org> References: <20060215002446.GD6027@xs4all.nl> <43F2A13F.4030604@canterbury.ac.nz> <200602142323.45930.fdrake@acm.org> Message-ID: <43F2CDC4.4060700@canterbury.ac.nz> Fred L. Drake, Jr. wrote: > The proper response in this case is often to re-start decoding > with the correct encoding, since some of the data extracted so far may have > been decoded incorrectly. 
If the protocol has been sensibly designed, that shouldn't happen, since everything up to the coding marker should be ascii (or some other protocol-defined initial coding). For protocols that are not sensibly designed (or if you're just trying to guess) what you suggest may be needed. But it would be good to have a nicer way of going about it for when the protocol is sensible. Greg From fdrake at acm.org Wed Feb 15 08:12:37 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 15 Feb 2006 02:12:37 -0500 Subject: [Python-Dev] bytes type discussion In-Reply-To: <43F2CDC4.4060700@canterbury.ac.nz> References: <200602142323.45930.fdrake@acm.org> <43F2CDC4.4060700@canterbury.ac.nz> Message-ID: <200602150212.37625.fdrake@acm.org> On Wednesday 15 February 2006 01:44, Greg Ewing wrote: > If the protocol has been sensibly designed, that shouldn't > happen, since everything up to the coding marker should > be ascii (or some other protocol-defined initial coding). Indeed. > For protocols that are not sensibly designed (or if you're > just trying to guess) what you suggest may be needed. But > it would be good to have a nicer way of going about it > for when the protocol is sensible. I agree in principle, but the example of using an HTML tag as a source of document encoding information isn't sensible. Unfortunately, it's still part of the HTML specification. :-( I'm not opposing a way to do a sensible thing, but wanted to note that it wasn't going to be right for all cases, with such an example having been mentioned already (though the issues with it had not been fully spelled out). -Fred -- Fred L. Drake, Jr. 
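The "sensible protocol" case Greg describes can be sketched as an ASCII scan of a prefix followed by a full re-decode of the raw bytes; the function name and the 200-byte window below are illustrative choices, not anything from the thread:

```python
# Minimal sketch of the "restart decoding" dance: read raw bytes, decode
# a prefix as ASCII to find a coding declaration, then re-decode the
# whole buffer with the declared encoding.
import re

def decode_with_declared_encoding(raw, default="ascii"):
    # The declaration itself is assumed to be pure ASCII, as in a
    # "sensibly designed" protocol; non-ASCII bytes in the prefix are
    # replaced rather than raising, since we only need the marker.
    prefix = raw[:200].decode("ascii", errors="replace")
    m = re.search(r"coding[:=]\s*([-\w.]+)", prefix)
    encoding = m.group(1) if m else default
    return raw.decode(encoding)

raw = "# coding: utf-8\nname = 'café'\n".encode("utf-8")
print(decode_with_declared_encoding(raw))
```

For Fred's HTML case the marker can appear after non-ASCII bytes have already been consumed, which is exactly when re-decoding from scratch stops being optional.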
From martin at v.loewis.de Wed Feb 15 09:03:49 2006 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 15 Feb 2006 09:03:49 +0100 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> <43F27FBC.1000209@v.loewis.de> Message-ID: <43F2E065.3080807@v.loewis.de> Adam Olsen wrote: > My assumption is these would become errors in 3.x. bytes(str) is only > needed so you can do bytes(u"abc".encode('utf-8')) and have it work in > 2.x and 3.x. I think the proposal for bytes(seq) to mean bytes(map(ord, seq)) was meant to be valid for both 2.x and 3.x, on the grounds that you should be able to write byte string constants in the same way in all versions. > (I wonder if maybe they should be an error in 2.x as well. Source > encoding is for unicode literals, not str literals.) Source encoding applies to the entire source code, including (byte) string literals, comments, identifiers, and keywords. IOW, if you declare your source encoding is utf-8, the keyword "print" must be represented with the bytes that represent the Unicode letters for "p","r","i","n", and "t" in UTF-8. Regards, Martin From martin at v.loewis.de Wed Feb 15 09:14:37 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 09:14:37 +0100 Subject: [Python-Dev] bytes type discussion In-Reply-To: <43F2CDC4.4060700@canterbury.ac.nz> References: <20060215002446.GD6027@xs4all.nl> <43F2A13F.4030604@canterbury.ac.nz> <200602142323.45930.fdrake@acm.org> <43F2CDC4.4060700@canterbury.ac.nz> Message-ID: <43F2E2ED.9050305@v.loewis.de> Greg Ewing wrote: > If the protocol has been sensibly designed, that shouldn't > happen, since everything up to the coding marker should > be ascii (or some other protocol-defined initial coding). XML, for one protocol, requires you to restart over. The initial sequence could be UTF-16, or it could be EBCDIC. 
You read a few bytes (up to four), then know which of these it is. Then you start over, reading further if it looks like an ASCII superset, to find out the real encoding. You normally then start over, although switching at that point could also work. > For protocols that are not sensibly designed (or if you're > just trying to guess) what you suggest may be needed. But > it would be good to have a nicer way of going about it > for when the protocol is sensible. There might be buffering of decoded strings already, (ie. beyond the point to which you have read), so you would need to unbuffer these, and reinterpret them. To support that, you really need to buffer both the original bytes, and the decoded ones, since the encoding might not roundtrip. Regards, Martin From martin at v.loewis.de Wed Feb 15 09:19:33 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 09:19:33 +0100 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: References: Message-ID: <43F2E415.2000205@v.loewis.de> Neal Norwitz wrote: > What do people think about that? There are still a lot of features we > want to add. Is this ok with everyone? Do you think it's realistic? My view on schedules is that they need to exist, whether they are followed or not. So having one is orders of magnitude better than having none. This specific one "looks right" also. > We still need a release manager. No one has heard from Anthony. If > he isn't interested is someone else interested in trying their hand at > it? He might be on vacation, no need to worry yet. If he doesn't want to do it, I would. Regards, Martin From alain.poirier at net-ng.com Wed Feb 15 09:22:43 2006 From: alain.poirier at net-ng.com (Alain Poirier) Date: Wed, 15 Feb 2006 09:22:43 +0100 Subject: [Python-Dev] 2.5 PEP In-Reply-To: References: Message-ID: <200602150922.43810.alain.poirier@net-ng.com> Hi, 2 questions: - is (c)ElementTree still planned for inclusion ? 
- isn't the current implementation of itertools.tee (cache of previous generated values) incompatible with the new possibility to feed a generator (PEP 342) ? Regards Neal Norwitz wrote: > Attached is the 2.5 release PEP 356. It's also available from: > http://www.python.org/peps/pep-0356.html > > Does anyone have any comments? Is this good or bad? Feel free to > send to me comments. > > We need to ensure that PEPs 308, 328, and 343 are implemented. We > have possible volunteers for 308 and 343, but not 328. Brett is doing > 352 and Martin is doing 353. > > We also need to resolve a bunch of other implementation details about > providing the C AST to Python, bdist_* issues and a few more possible > stdlib modules. Don't be shy, tell the world what you think about > these. > > Can someone go through PEP 4 and 11 and determine what work needs to be > done? > > The more we distribute the work, the easier it will be on everyone. > You don't really want to listen to me whine any more do you? ;-) > > Thank you, From ncoghlan at gmail.com Wed Feb 15 09:33:54 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2006 18:33:54 +1000 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <571D2F15-F2D6-42DB-9AC2-575C45922243@redivi.com> References: <571D2F15-F2D6-42DB-9AC2-575C45922243@redivi.com> Message-ID: <43F2E772.5010001@gmail.com> Bob Ippolito wrote: > ** The exception is scripts. Scripts go wherever --install-scripts= > point to, and AFAIK there is no means to ensure that the scripts from > one egg do not interfere with the scripts for another egg or anything > else on the PATH. I'm also not sure what the uninstallation story > with scripts is. Hopefully PEP 338 will go some way towards fixing that - in Python 2.5, the '-m' switch should be able to run modules inside eggs as scripts, reducing the need to install them directly into the filesystem. Cheers, Nick.
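The itertools.tee question above is easier to see with the caching behaviour spelled out; a small sketch (written with the later next() builtin for brevity):

```python
import itertools

def numbers():
    for i in range(3):
        yield i

a, b = itertools.tee(numbers())

# tee pulls each value from the underlying generator exactly once and
# caches it, so the slower copy replays from the cache rather than
# re-running the generator. A value fed in via PEP 342 send() would
# therefore only influence the single shared pull, which is the
# mismatch being asked about.
assert next(a) == 0
assert next(a) == 1
assert list(b) == [0, 1, 2]  # b replays the cached 0 and 1, then pulls 2
assert list(a) == [2]        # a had already consumed 0 and 1
```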
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From brett at python.org Wed Feb 15 09:34:35 2006 From: brett at python.org (Brett Cannon) Date: Wed, 15 Feb 2006 00:34:35 -0800 Subject: [Python-Dev] C AST to Python discussion Message-ID: As per Neal's prodding email, here is a thread to discuss where we want to go with the C AST to Python stuff and what I think are the core issues at the moment. First issue is the ast-objects branch. Work is being done on it, but it still leaks some references (Neal or Martin can correct me if I am wrong). We really should choose either this branch or the current solution before really diving into coding stuff for exposing the AST so as to not waste too much time. Basically the issues are that the current solution will require using a serialization form to go from C to Python and back again. The PyObjects solution in the branch won't need this. One protects us from ending up with an unusable AST since the seralization can keep the original AST around and if the version passed back in from Python code is junk it can be tossed and the original version used. The PyObjects branch most likely won't have this since the actual AST will most likely be passed to Python code. But there is performance issues with all of this seralization compared to a simple Pyobject pointer into Pythonland. Jeremy supports the serialization option. I am personally indifferent while leaning towards the serialization. Then there is the API. First we need to decide if AST modification is allowed or not. It has been argued on my blog by someone (see http://sayspy.blogspot.com/2006/02/possibilities-of-ast.html for the entry on this whole topic which highly mirrors this email) that Guido won't okay AST transformations since it can lead to control flow changes behind the scenes. 
I say that is fine as long as knowing that AST transformations are occurring are sufficiently obvious. I say allow transformations. Once that is settled, I see three places for possible access to the AST. One is the command line like -m. Totally obvious to the user as long as they are not just working off of the .pyc files. Next is something like sys.ast_transformations that is a list of functions that are passed in the AST (and return a new version if modifications are allowed). This could allow chaining of AST transformations by feeding the next function with another one. Next is per-object AST access. This could get expensive since if we don't keep a copy of the AST with the code objects (which we probably shouldn't since that is wasted memory if the AST is not used a lot) we will need to read the code a second time to get the AST regenerated. I personally think we should choose an initial global access API to the AST as a starting API. I like the sys.ast_transformations idea since it is simple and gives enough access that whether read-only or read-write is allowed something like PyChecker can get the access it needs. It also allows for simple Python scripts that can install the desired functions and then compile or check the passed-in files. Obviously write accesss would be needed for optimization stuff (such as if the peepholer was rewritten in Python and used by default), but we can also expose this later if we want. In terms of 2.5, I think we really need to settle on the fate of the ast-objects branch. If we can get the very basic API for exposing the AST to Python code in 2.5 that would be great, but I don't view that as critical as choosing on the final AST implementation style since wasting work on a version that will disappear would just plain suck. It would be great to resolve this before the PyCon sprints since a good chunk of the AST-caring folk will be there for at least part of the time. 
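The sys.ast_transformations registry proposed above can be sketched with the ast module that later grew out of this work; the registry name and the toy constant folder below are hypothetical, and no such hook was ever added to sys:

```python
# Hindsight sketch of a chained AST-transformation registry, using the
# modern ast module. "ast_transformations" is a hypothetical stand-in
# for the proposed sys.ast_transformations list.
import ast

ast_transformations = []

class ConstantFolder(ast.NodeTransformer):
    """Toy transformation: fold 1 + 2 style constant additions."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            return ast.copy_location(
                ast.Constant(node.left.value + node.right.value), node)
        return node

ast_transformations.append(lambda tree: ConstantFolder().visit(tree))

def compile_with_transforms(source):
    tree = ast.parse(source)
    for transform in ast_transformations:
        tree = transform(tree)  # each function feeds the next, chained
    ast.fix_missing_locations(tree)
    return compile(tree, "<string>", "exec")

ns = {}
exec(compile_with_transforms("x = 1 + 2"), ns)
assert ns["x"] == 3
```

Chaining works as described: each registered function receives the tree the previous one returned, so read-only checkers and rewriting optimizers can share the same hook.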
-Brett From rhamph at gmail.com Wed Feb 15 09:39:10 2006 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 15 Feb 2006 01:39:10 -0700 Subject: [Python-Dev] bytes type discussion In-Reply-To: <43F2E065.3080807@v.loewis.de> References: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> <43F27FBC.1000209@v.loewis.de> <43F2E065.3080807@v.loewis.de> Message-ID: On 2/15/06, "Martin v. L?wis" wrote: > Adam Olsen wrote: > > (I wonder if maybe they should be an error in 2.x as well. Source > > encoding is for unicode literals, not str literals.) > > Source encoding applies to the entire source code, including (byte) > string literals, comments, identifiers, and keywords. IOW, if you > declare your source encoding is utf-8, the keyword "print" must > be represented with the bytes that represent the Unicode letters > for "p","r","i","n", and "t" in UTF-8. Although it does apply to the entire source file, I think this is more for convenience (try telling an editor that only a single line is Shift_JIS!) than to allow 8-bit (or 16-bit?!) str literals. Indeed, you could have arbitrary 8-bit str literals long before the source encoding was added. Keywords and identifiers continue to be limited to ascii characters (even if they make a roundtrip through other encodings), and comments continue to be ignored. Source encoding exists so that you can write u"123" with the encoding stated once at the top of the file, rather than "123".decode('utf-8') with the encoding repeated everywhere. Making it an error to have 8-bit str literals in 2.x would help educate the user that they will change behavior in 3.0 and not be 8-bit str literals anymore. -- Adam Olsen, aka Rhamphoryncus From stephen at xemacs.org Wed Feb 15 09:45:23 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 15 Feb 2006 17:45:23 +0900 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <43F209AB.5020706@egenix.com> (M.'s message of "Tue, 14 Feb 2006 17:47:39 +0100") References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <43F17E1D.8030905@v.loewis.de> <436018FF-702F-4BB0-95D7-4A596A4B0216@fuhm.net> <43F209AB.5020706@egenix.com> Message-ID: <87lkwdypr0.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "M" == "M.-A. Lemburg" writes: M> James Y Knight wrote: >> Nice and simple. M> Albeit, too simple. M> The above approach would basically remove the possibility to M> easily create bytes() from literals in Py3k, since literals in M> Py3k create Unicode objects, e.g. bytes("123") would not work M> in Py3k. No, it just rules out a builtin easy way to create bytes() from literals. But who needs to do that? codec writers and people implementing wire protocols with bytes() that look like character strings but aren't. OK, so this makes life hard on codec writers. But those implementing wire protocols can use existing codecs, presumably 'ascii' will do 99% of the time: def make_wire_token (unicode_string, encoding='ascii'): return bytes(unicode_string.encode(encoding)) Everybody else is just asking for trouble by using bytes() for character strings. It would really be desirable to have "string" be a Unicode literal in Py3k, and u"string" a syntax error. M> To prevent [people from learning to write "bytes('string')" in M> 2.x and expecting that to work in Py3k], you'd have to outrule M> bytes() construction from strings altogether, which doesn't M> look like a viable option either. Why not? Either bytes() are the same as strings, in which case why change the name? or they're not, in which case we ask people to jump through the required hoops to create them. 
Maybe I'm missing some huge use case, of course, but it looks to me like the use cases are pretty specialized, and are likely to involve explicit coding anyway. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From ncoghlan at gmail.com Wed Feb 15 09:48:27 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2006 18:48:27 +1000 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods In-Reply-To: References: Message-ID: <43F2EADB.7060906@gmail.com> Travis E. Oliphant wrote: > 3) A new C-API function PyNumber_Index will be added with signature > > Py_ssize_t PyNumber_index (PyObject *obj) > There's a typo in the function name here. Other than that, the PEP looks pretty much fine to me. About the only other quibble is that it could arguably do with a link to the thread where we discussed (and discarded) 'discrete' and 'ordinal' as alternative names (you mention the discussion, but don't give a reference). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From t-meyer at ihug.co.nz Wed Feb 15 09:48:43 2006 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Wed, 15 Feb 2006 21:48:43 +1300 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: References: Message-ID: > We still need a release manager. No one has heard from Anthony. It is the peak of the summer down here. Perhaps he is lucky enough to be enjoying it away from computers for a while? =Tony.Meyer From just at letterror.com Wed Feb 15 09:51:44 2006 From: just at letterror.com (Just van Rossum) Date: Wed, 15 Feb 2006 09:51:44 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: Message-ID: Guido van Rossum wrote: > If bytes support the buffer interface, we get another interesting > issue -- regular expressions over bytes. Brr. We already have that: >>> import re, array >>> re.search('\2', array.array('B', [1, 2, 3, 4])).group() array('B', [2]) >>> Not sure whether to blame array or re, though... Just From greg.ewing at canterbury.ac.nz Wed Feb 15 09:43:57 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Feb 2006 21:43:57 +1300 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: References: Message-ID: <43F2E9CD.7000102@canterbury.ac.nz> Brett Cannon wrote: > One protects us from ending up with an unusable AST since > the seralization can keep the original AST around and if the version > passed back in from Python code is junk it can be tossed and the > original version used. I don't understand why this is an issue. If Python code produces junk and tries to use it as an AST, then it's buggy and deserves what it gets. All the AST compiler should be responsible for is to try not to crash the interpreter under those conditions. But that's true whatever method is used for passing ASTs from Python to the compiler. The PyObjects branch most likely won't have > this since the actual AST will most likely be passed to Python code. > But there is performance issues with all of this seralization compared > to a simple Pyobject pointer into Pythonland. Jeremy supports the > serialization option. I am personally indifferent while leaning > towards the serialization. > > Then there is the API. First we need to decide if AST modification is > allowed or not. It has been argued on my blog by someone (see > http://sayspy.blogspot.com/2006/02/possibilities-of-ast.html for the > entry on this whole topic which highly mirrors this email) that Guido > won't okay AST transformations since it can lead to control flow > changes behind the scenes. 
I say that is fine as long as knowing that > AST transformations are occurring are sufficiently obvious. I say > allow transformations. > > Once that is settled, I see three places for possible access to the > AST. One is the command line like -m. Totally obvious to the user as > long as they are not just working off of the .pyc files. Next is > something like sys.ast_transformations that is a list of functions > that are passed in the AST (and return a new version if modifications > are allowed). This could allow chaining of AST transformations by > feeding the next function with another one. Next is per-object AST > access. This could get expensive since if we don't keep a copy of the > AST with the code objects (which we probably shouldn't since that is > wasted memory if the AST is not used a lot) we will need to read the > code a second time to get the AST regenerated. > > I personally think we should choose an initial global access API to > the AST as a starting API. I like the sys.ast_transformations idea > since it is simple and gives enough access that whether read-only or > read-write is allowed something like PyChecker can get the access it > needs. It also allows for simple Python scripts that can install the > desired functions and then compile or check the passed-in files. > Obviously write accesss would be needed for optimization stuff (such > as if the peepholer was rewritten in Python and used by default), but > we can also expose this later if we want. > > In terms of 2.5, I think we really need to settle on the fate of the > ast-objects branch. If we can get the very basic API for exposing the > AST to Python code in 2.5 that would be great, but I don't view that > as critical as choosing on the final AST implementation style since > wasting work on a version that will disappear would just plain suck. 
> It would be great to resolve this before the PyCon sprints since a > good chunk of the AST-caring folk will be there for at least part of > the time. > > -Brett > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/greg.ewing%40canterbury.ac.nz From ncoghlan at gmail.com Wed Feb 15 10:01:21 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2006 19:01:21 +1000 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <792E2907-AC7F-4607-9E37-C06EAA599E6A@redivi.com> Message-ID: <43F2EDE1.5070201@gmail.com> Bob Ippolito wrote: > On Feb 14, 2006, at 4:17 PM, Guido van Rossum wrote: >> (Why would you even think about views here? They are evil.) > > I mention views because that's what numpy/Numeric/numarray/etc. > do... It's certainly convenient at times to have that functionality, > for example, to work with only the alpha channel in an RGBA image. > Probably too magical for the bytes type. The key difference between numpy arrays and normal sequences is that the length of a sequence can change, but the shape of a numpy array is essentially fixed. So view behaviour can be reserved for a dimensioned array type (if the numpy folks ever find the time to finish writing their PEP. . .) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From thomas at xs4all.net Wed Feb 15 10:22:30 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 10:22:30 +0100 Subject: [Python-Dev] how to upload new MacPython web page? 
In-Reply-To: <06Feb14.213215pst."58633"@synergy1.parc.xerox.com> References: <06Feb14.213215pst."58633"@synergy1.parc.xerox.com> Message-ID: <20060215092229.GG6027@xs4all.nl> On Tue, Feb 14, 2006 at 09:32:09PM -0800, Bill Janssen wrote: > We (the pythonmac-sig mailing list) seem to have converged (almost -- > still talking about the logo) on a new download page for MacPython, to > replace the page currently at > http://www.python.org/download/download_mac.html. The strawman can be > seen at http://bill.janssen.org/mac/new-macpython-page.html. > > How do I get the bits changed on python.org (when we're finished)? pydotorg at python.org is probably the right email address (although most of them are on here as well.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From ncoghlan at gmail.com Wed Feb 15 10:28:36 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2006 19:28:36 +1000 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <43F2E9CD.7000102@canterbury.ac.nz> References: <43F2E9CD.7000102@canterbury.ac.nz> Message-ID: <43F2F444.4010604@gmail.com> Greg Ewing wrote: > Brett Cannon wrote: >> One protects us from ending up with an unusable AST since >> the seralization can keep the original AST around and if the version >> passed back in from Python code is junk it can be tossed and the >> original version used. > > I don't understand why this is an issue. If Python code > produces junk and tries to use it as an AST, then it's > buggy and deserves what it gets. All the AST compiler > should be responsible for is to try not to crash the > interpreter under those conditions. But that's true > whatever method is used for passing ASTs from Python > to the compiler. I'd prefer the AST node be real Python objects. 
The arena approach seems to be working reasonably well, but I still don't see a good reason for using a specialised memory allocation scheme when it really isn't necessary and we have a perfectly good memory management system for PyObjects. On the 'unusable AST' front, if AST transformation code creates illegal output, then the main thing is to raise an exception complaining about what's wrong with it. I believe that may need a change to the compiler whether the modified AST was serialised or not. In terms of reverting back to the untransformed AST if the transformation fails, then that option is up to the code doing the transformation. Instead of serialising all the time (even for cases where the AST is just being inspected instead of transformed), we can either let the AST objects support the copy/deepcopy protocol, or else provide a method to clone a tree before trying to transform it. A unified representation means we only have one API to learn, that is accessible from both Python and C. It also eliminates any need to either implement features twice (once in Python and once in C) or else let the Python and C APIs diverge to the point where what you can do with one differs from what you can do with the other. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From thomas at xs4all.net Wed Feb 15 10:52:27 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 10:52:27 +0100 Subject: [Python-Dev] 2.5 PEP In-Reply-To: References: Message-ID: <20060215095227.GH6027@xs4all.nl> On Tue, Feb 14, 2006 at 09:58:46PM -0800, Neal Norwitz wrote: > We need to ensure that PEPs 308, 328, and 343 are implemented. We > have possible volunteers for 308 and 343, but not 328. Brett is doing > 352 and Martin is doing 353. I can volunteer for 328 if no one else wants it, I've messed with the import mechanism before (and besides, it's fun.)
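For reference, PEP 328 is the absolute/explicit-relative import change. A sketch of the syntax it adds, exercised against a throwaway package built in a temp directory (the pkg/util.py layout and the ANSWER name are invented for the demo):

```python
import os
import sys
import tempfile
import textwrap
import importlib

# Build a tiny throwaway package so the PEP 328 forms have something to import.
tmp = tempfile.mkdtemp()
pkgdir = os.path.join(tmp, 'pkg')
os.makedirs(pkgdir)
open(os.path.join(pkgdir, '__init__.py'), 'w').close()
with open(os.path.join(pkgdir, 'util.py'), 'w') as f:
    f.write('ANSWER = 42\n')
with open(os.path.join(pkgdir, 'main.py'), 'w') as f:
    f.write(textwrap.dedent('''\
        from __future__ import absolute_import  # PEP 328 transition switch
        from . import util                      # explicit relative import
        from .util import ANSWER                # relative "from" import
        import string                           # absolute: always the stdlib
    '''))

sys.path.insert(0, tmp)
main = importlib.import_module('pkg.main')
print(main.ANSWER)  # 42
```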
I've also written an unfinished 308 implementation to get myself acquainted with the AST code more. 'Unfinished' means that it works completely, except for some cases of ambiguous syntax. I can fix that in a few days if the deadline nears and there's no working patch. (Naively adding if/else expressions broke list comprehensions with an 'if' clause, and fixing that broke list comprehensions with 'for x in lambda:0, lambda:1', and fixing that broke list comprehensions altogether... I added "clean up Grammar file" to the PyCon core sprint topics for that reason. I guess 308 wasn't as much a trainer implementation as people thought ;) The syntax part of 328 is probably easier (but the rest isn't.) > Access to C AST from Python If this still needs work when I finish grokking the AST code and the PyObj branch of it, I can help. I should have more than enough spare time to finish these things before alpha 1. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas at xs4all.net Wed Feb 15 11:03:09 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 11:03:09 +0100 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <43F2F444.4010604@gmail.com> References: <43F2E9CD.7000102@canterbury.ac.nz> <43F2F444.4010604@gmail.com> Message-ID: <20060215100309.GI6027@xs4all.nl> On Wed, Feb 15, 2006 at 07:28:36PM +1000, Nick Coghlan wrote: > On the 'unusable AST' front, if AST transformation code creates illegal > output, then the main thing is to raise an exception complaining about > what's wrong with it. I believe that may need a change to the compiler > whether the modified AST was serialised or not. I would personally prefer the AST validation to be a separate part of the compiler. 
It means the one or the other can be out of sync, but it also means it can be accessed directly (validating AST before sending it to the compiler) and the compiler (or CFG generator, or something between AST and CFG) can decide not to validate internally generated AST for non-debug builds, for instance. I like both those reasons. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From stephen at xemacs.org Wed Feb 15 11:06:21 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 15 Feb 2006 19:06:21 +0900 Subject: [Python-Dev] bytes type discussion In-Reply-To: <200602142323.45930.fdrake@acm.org> (Fred L. Drake, Jr.'s message of "Tue, 14 Feb 2006 23:23:45 -0500") References: <20060215002446.GD6027@xs4all.nl> <43F2A13F.4030604@canterbury.ac.nz> <200602142323.45930.fdrake@acm.org> Message-ID: <878xsdym02.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Fred" == Fred L Drake, writes: Fred> On Tuesday 14 February 2006 22:34, Greg Ewing wrote: >> Seems to me this is a case where you want to be able to change >> encodings in the middle of reading the stream. You start off >> reading the data as ascii, and once you've figured out the >> encoding, you switch to that and carry on reading. Fred> Not quite. The proper response in this case is often to Fred> re-start decoding with the correct encoding, since some of Fred> the data extracted so far may have been decoded incorrectly. Fred> A very carefully constructed application may be able to go Fred> back and re-decode any data saved from the stream with the Fred> previous encoding, but that seems like it would be pretty Fred> fragile in practice. I believe GNU Emacs is currently doing this. AIUI, they save annotations where the codec is known to be non-invertible (eg, two charset-changing escape sequences in a row). I do think this is fragile, and a robust application really should buffer everything it's not sure of decoding correctly. 
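Fred's "re-start decoding with the correct encoding" strategy is straightforward when the raw bytes can be buffered: sniff the declaration in a byte-safe prefix, then decode the whole buffer, so nothing decoded under the provisional encoding survives. A minimal sketch — the coding-marker syntax here is just the PEP 263 style, used as an example:

```python
def decode_with_sniff(data, default='ascii'):
    """Decode bytes whose encoding may be declared in the data itself.

    Buffer the raw bytes, sniff a declaration in the prefix, then decode
    the *whole* buffer with the detected encoding, so nothing decoded
    under the provisional encoding is kept.
    """
    prefix = data[:200].decode('latin-1')  # 1 byte -> 1 char, never fails
    encoding = default
    for line in prefix.splitlines()[:2]:   # PEP 263: first two lines only
        if 'coding:' in line:
            encoding = line.split('coding:')[1].strip().split()[0]
            break
    return data.decode(encoding)

raw = '# coding: utf-8\nname = "\u00e9l\u00e8ve"\n'.encode('utf-8')
print(decode_with_sniff(raw).splitlines()[1])  # name = "élève"
```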
Fred> There may be cases where switching encoding on the fly makes Fred> sense, but I'm not aware of any actual examples of where Fred> that approach would be required. This is exactly what ISO 2022 formalizes: switching encodings on the fly. mboxes of Japanese mail often contain random and unsignaled encoding changes. A terminal emulator may need to switch when logging in to a remote system. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From ncoghlan at gmail.com Wed Feb 15 11:09:33 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2006 20:09:33 +1000 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: Message-ID: <43F2FDDD.3030200@gmail.com> Guido van Rossum wrote: > But somehow I still like the 'open' verb. It has a long and rich > tradition. And it also nicely conveys that it is a factory function > which may return objects of different types (though similar in API) > based upon either additional arguments (e.g. buffering) or the > environment (e.g. encodings) or even inspection of the file being > opened. If we went with longer names, a slight variation on the opentext/openbinary idea would be to use opentext and opendata. That is, "give me something that looks like a text file (it contains characters)", or "give me something that looks like a data file (it contains bytes)". "opentext" would map to "codecs.open" (that is, accepting an encoding argument) "opendata" would map to the standard "open", but with the 'b' in the mode string added automatically. 
So the mode choices common to both would be: 'r'/'w'/'a' - read/write/append (default 'r') ''/'+' - update (IOError if file does not already exist) (default '') opentext would allow the additional option: ''/'U' - universal newlines (default '') Neither of them would accept a 'b' in the mode string. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From tim at pollenation.net Wed Feb 15 11:11:49 2006 From: tim at pollenation.net (Tim Parkin) Date: Wed, 15 Feb 2006 10:11:49 +0000 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: Message-ID: <43F2FE65.5040308@pollenation.net> Guido van Rossum wrote: > (Now that I work for Google I realize more than ever before the > importance of keeping URLs stable; PageRank(tm) numbers don't get > transferred as quickly as contents. I have this worry too in the > context of the python.org redesign; 301 permanent redirect is *not* > going to help PageRank of the new page.) Hi Guido, Could you expand on why 301 redirects won't help with the transfer of page rank (if you're allowed)? We've done exactly this on many sites and the pagerank (or more relevantly the search rankings on specific terms) has transferred almost overnight. The bigger pagerank updates (both algorithm changes and overhauls in approach) seem to only happen every few months and these also seem to take notice of 301 redirects (they generally clear up any supplemental results). The addition of the docs.python.org was also intended (I thought) to be used in the google customised search (the google page you go to when you search from python.org). I'm not sure if that got lost in implementation but the idea was that the google box would have a radio button for docs.python.org.
I agree that docs.python.org should only be the current documentation however what about the large number of people who use 2.3 as standard? perhaps docs23.python.org makes sense. In terms of pagerank for the different versions of the docs, would it make sense to 'hide' the older versions of the docs with a noindex so that general google searches will only return the current docs? +1 on docs.python.org only containing current (with the caveat that there be an equivalent for users of specific versions, e.g. 2.3 users) Tim Parkin p.s. All my knowledge of how google works is gained through personal research so the terminology, techniques and results may be completely wrong (and also may vary from time to time) - however they do reflect direct experience. p.p.s regarding 'site:', 'allinurl:' and other google modifiers; It would seem a good idea to create a single page that helped site users make such searches without having to learn how the modifiers work. It maybe should be noted that you can also add a 'temporary redirects' (302's) which is taken by google to mean "leave the original search results in place". This has also worked for us (old urls remain the same as far as google is concerned). From rrr at ronadam.com Wed Feb 15 11:24:38 2006 From: rrr at ronadam.com (Ron Adam) Date: Wed, 15 Feb 2006 04:24:38 -0600 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: <43F2CDBC.8090305@canterbury.ac.nz> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> <43F28AFC.7050807@canterbury.ac.nz> <43F2A3D6.3060703@ronadam.com> <43F2CDBC.8090305@canterbury.ac.nz> Message-ID: <43F30166.1070902@ronadam.com> Greg Ewing wrote: > Ron Adam wrote: > >> My first impression and thoughts were: (and seems incorrect now) >> >> bytes(object) -> byte sequence of objects value >> >> Basically a "memory dump" of objects value. > > As I understand the current intentions, this is correct. > The bytes constructor would have two different signatures: > > (1) bytes(seq) --> interprets seq as a sequence of > integers in the range 0..255, > exception otherwise > > (2a) bytes(str, encoding) --> encodes the characters of > (2b) bytes(unicode, encoding) the string using the specified > encoding > > In (2a) the string would be interpreted as containing > ascii characters, with an exception otherwise. In 3.0, > (2a) will disappear leaving only (1) and (2b). I was presuming it would be done in C code and it will just need a pointer to the first byte, memchr(), and then read n bytes directly into a new memory range via memcpy(). But I don't know if that's possible with Pythons object model. (My C skills are a bit rusty as well) However, if it's done with a Python iterator and then each item is translated to bytes in a sequence, (much slower), an encoding will need to be known for it to work correctly. Unfortunately Unicode strings don't set an attribute to indicate it's own encoding. So bytes() can't just do encoding = s.encoding to find out, it would need to be specified in this case. 
And that should give you a byte object that is equivalent to the bytes in memory, providing Python doesn't compress data internally to save space. (?, I don't think it does) I'd prefer the first version *if possible* because of the performance. >> And I was thinking a bytes argument of more than one item would indicate >> a byte sequence. >> >> bytes(1,2,3) -> bytes([1,2,3]) > > But then you have to test the argument in the one-argument > case and try to guess whether it should be interpreted as > a sequence or an integer. Best to avoid having to do that. Yes, I agree. >> Which is fine... so ??? >> >> b = bytes(0L) -> bytes([0,0,0,0]) > > No, bytes(0L) --> TypeError because 0L doesn't implement > the iterator protocol or the buffer interface. It wouldn't need it if it was a direct C memory copy. > I suppose long integers might be enhanced to support the > buffer interface in 3.0, but that doesn't seem like a good > idea, because the bytes you got that way would depend on > the internal representation of long integers. In particular, Since some longs will be of different length, yes a bytes(0L) could give differing results on different platforms, but it will always give the same result on the platform it is run on. I actually think this is a plus and not a problem. If you are using Python to implement a byte interface you need to *know* it is different, not have it hidden. 
bytesize = len(bytes(0L)) # find how long a long is Cheers, Ronald Adam From ncoghlan at gmail.com Wed Feb 15 11:29:45 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2006 20:29:45 +1000 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <20060215100309.GI6027@xs4all.nl> References: <43F2E9CD.7000102@canterbury.ac.nz> <43F2F444.4010604@gmail.com> <20060215100309.GI6027@xs4all.nl> Message-ID: <43F30299.3090708@gmail.com> Thomas Wouters wrote: > On Wed, Feb 15, 2006 at 07:28:36PM +1000, Nick Coghlan wrote: > >> On the 'unusable AST' front, if AST transformation code creates illegal >> output, then the main thing is to raise an exception complaining about >> what's wrong with it. I believe that may need a change to the compiler >> whether the modified AST was serialised or not. > > I would personally prefer the AST validation to be a separate part of the > compiler. It means the one or the other can be out of sync, but it also > means it can be accessed directly (validating AST before sending it to the > compiler) and the compiler (or CFG generator, or something between AST and > CFG) can decide not to validate internally generated AST for non-debug > builds, for instance. > > I like both those reasons. Aye, I was thinking much the same thing. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From thomas at xs4all.net Wed Feb 15 11:37:46 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 11:37:46 +0100 Subject: [Python-Dev] Generalizing *args and **kwargs Message-ID: <20060215103746.GJ6027@xs4all.nl> I've been thinking about generalization of the *args/**kwargs syntax for quite a while, and even though I'm pretty sure Guido (and many people) will consider it overgeneralization, I am finally going to suggest it. 
This whole idea is not something dear to my heart, although I obviously would like to see it happen. If the general vote is 'no', I'll write a small PEP or add it to PEP 13 and be done with it. The grand total of the generalization would be something like this: Allow 'unpacking' of arbitrary iterables in sequences: >>> iterable = (1, 2) >>> ['a', 'b', *iterable, 'c'] ['a', 'b', 1, 2, 'c'] >>> ('a', 'b', *iterable, 'c') ('a', 'b', 1, 2, 'c') Possibly also allow 'unpacking' in list comprehensions and genexps: >>> [ *subseq for subseq in [(1, 2), (3, 4)] ] [1, 2, 3, 4] (You can already do this by adding an extra 'for' loop inside the LC) Allow 'unpacking' of mapping types (anything supporting 'items' or 'iteritems') in dictionaries: >>> args = {'verbose': 1} >>> defaults = {'verbose': 0} >>> {**defaults, **args, 'fixedopt': 1} {'verbose': 1, 'fixedopt': 1} Allow 'packing' in assignment, stuffing left-over items in a list. >>> a, b, *rest = range(5) >>> a, b, rest (0, 1, [2, 3, 4]) >>> a, b, *rest = range(2) (0, 1, []) (A list because you can't always take the type of the RHS and it's the right Python type for 'an arbitrary length homogeneous sequence'.) While generalizing that, it may also make sense to allow: >>> def spam(*args, **kwargs): ... return args, kwargs ... >>> args = (1, 2); kwargs = {'eggs': 'no'} >>> spam(*args, 3) ((1, 2, 3), {}) >>> spam(*args, 3, **kwargs, spam='extra', eggs='yes') ((1, 2, 3), {'spam': 'extra', 'eggs': 'yes'}) (In spite of the fact that both are already possible by fiddling args/kwargs beforehand or doing '*(args + (3,))'.) Maybe it also makes sense on the defining side, particularly for keyword arguments to indicate 'keyword-only arguments'. Maybe with a '**' without a name attached: >>> def spam(pos1, pos2, **, kwarg1=.., kwarg2=..) But I dunno yet. Although I've made it look like I have a working implementation, I haven't. 
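(For the record, most of the generalization proposed above did eventually land: starred assignment targets via PEP 3132 in 3.0, unpacking in displays and calls via PEP 448 in 3.5, and keyword-only arguments via a bare * rather than ** through PEP 3102. A sketch of the surviving syntax:)

```python
# PEP 448: unpacking in list/tuple displays
iterable = (1, 2)
assert ['a', 'b', *iterable, 'c'] == ['a', 'b', 1, 2, 'c']
assert ('a', 'b', *iterable, 'c') == ('a', 'b', 1, 2, 'c')

# PEP 448: mapping unpacking in dict displays; later keys win
args = {'verbose': 1}
defaults = {'verbose': 0}
assert {**defaults, **args, 'fixedopt': 1} == {'verbose': 1, 'fixedopt': 1}

# PEP 3132: starred assignment target -- always a list, as proposed here
a, b, *rest = range(5)
assert (a, b, rest) == (0, 1, [2, 3, 4])
a, b, *rest = range(2)
assert rest == []

# PEP 448: positional arguments may follow an iterable unpacking in a call
def spam(*args, **kwargs):
    return args, kwargs
assert spam(*(1, 2), 3) == ((1, 2, 3), {})

# PEP 3102: keyword-only arguments after a bare *
def eggs(pos1, *, kwarg1=None):
    return pos1, kwarg1
assert eggs(1, kwarg1=2) == (1, 2)
```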
I know exactly how to do it, though, except for the AST part ;) Once I figure out how to properly work with the AST code I'll probably write this patch whether it's a definite 'no' or not, just to see if I can. I wouldn't mind if people gave their opinion, though. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From rhamph at gmail.com Wed Feb 15 12:08:46 2006 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 15 Feb 2006 04:08:46 -0700 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F30166.1070902@ronadam.com> References: <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> <43F28AFC.7050807@canterbury.ac.nz> <43F2A3D6.3060703@ronadam.com> <43F2CDBC.8090305@canterbury.ac.nz> <43F30166.1070902@ronadam.com> Message-ID: On 2/15/06, Ron Adam wrote: > Greg Ewing wrote: > > Ron Adam wrote: > >> b = bytes(0L) -> bytes([0,0,0,0]) > > > > No, bytes(0L) --> TypeError because 0L doesn't implement > > the iterator protocol or the buffer interface. > > It wouldn't need it if it was a direct C memory copy. > > > I suppose long integers might be enhanced to support the > > buffer interface in 3.0, but that doesn't seem like a good > > idea, because the bytes you got that way would depend on > > the internal representation of long integers. In particular, > > Since some longs will be of different length, yes a bytes(0L) could give > differing results on different platforms, but it will always give the > same result on the platform it is run on. I actually think this is a > plus and not a problem. If you are using Python to implement a byte > interface you need to *know* it is different, not have it hidden. > > bytesize = len(bytes(0L)) # find how long a long is I believe you're confusing a C long with a Python long. 
A Python long is implemented as an array and has variable size. In any case we already have the struct module: >>> import struct >>> struct.calcsize('l') 4 -- Adam Olsen, aka Rhamphoryncus From simon at arrowtheory.com Wed Feb 15 23:07:34 2006 From: simon at arrowtheory.com (Simon Burton) Date: Wed, 15 Feb 2006 22:07:34 +0000 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: References: Message-ID: <20060215220734.45db375a.simon@arrowtheory.com> On Wed, 15 Feb 2006 00:34:35 -0800 Brett Cannon wrote: > As per Neal's prodding email, here is a thread to discuss where we > want to go with the C AST to Python stuff and what I think are the > core issues at the moment. > > First issue is the ast-objects branch. Work is being done on it, but > it still leaks some references (Neal or Martin can correct me if I am > wrong). I've been doing the heavy lifting on ast-objects the last few weeks. Today it finally passed the python test suite. The last thing to do is the addition of XDECREF's, so yes, it is leaking a lot of references. I won't make it to PyCon (it's a long way for me to come), but gee I've left all the fun stuff for you to do ! :) Even if AST transforms are not allowed, I see it as the strongest form of code reflection, and long over-due in python. Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 61 02 6249 6940 http://arrowtheory.com From smiles at worksmail.net Wed Feb 15 12:54:44 2006 From: smiles at worksmail.net (Smith) Date: Wed, 15 Feb 2006 05:54:44 -0600 Subject: [Python-Dev] nice() References: Message-ID: <000e01c63226$fc342660$7c2c4fca@csmith> I am reluctantly posting here since this is of less intense interest than other things being discussed right now, but this is related to the areclose proposal that was discussed here recently. 
The following discussion ends with things that python-dev might want to consider in terms of adding a function that allows something other than the default 12- and 17-digit precision representations of numbers that str() and repr() give. Such a function (like nice(), perhaps named trim()?) would provide a way to convert fp numbers that are being used in comparisons into a precision that reflects the user's preference. Everyone knows that fp numbers must be compared with caution, but there is a void in the relative-error department for exercising such caution, thus the proposal for something like 'areclose'. The problem with areclose(), however, is that it only solves one part of the problem that needs to be solved if two fp's *are* going to be compared: if you are going to check if a < b you would need to do something like not areclose(a,b) and a < b With something like trim() (a.k.a nice()) you could do trim(a) < trim(b) to get the comparison to 12-digit default precision or arbitrary precision with optional arguments, e.g. to 3 digits of precision: trim(a,3) < trim(b,3) From a search on the documentation, I don't see that the name trim() is taken yet. OK, comments responding to Greg follow. | From: Greg Ewing greg.ewing at canterbury.ac.nz | Smith wrote: | || computing the bin boundaries for a histogram || where bins are a width of 0.1: || || >>> for i in range(20): || ... if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)): || ... print i,repr(i*.1),repr(i/10.),i*.1,i/10. | | I don't see how that has any relevance to the way bin boundaries | would be used in practice, which is to say something like | | i = int(value / 0.1) | bin[i] += 1 # modulo appropriate range checks This is just masking the issue by converting numbers to integers.
The fact remains that two mathematically equal numbers can have two different internal representations with one being slightly larger than the exact integer value and one smaller: >>> a=(23*.1)*10;a 23.000000000000004 >>> b=2.3/.1;b 22.999999999999996 >>> int(a/.1),int(b/.1) (230, 229) Part of the answer in this context is to use round() rather than int so you are getting to the closest integer. || For, say, garden variety numbers that aren't full of garbage digits || resulting from fp computation, the boundaries computed as 0.1*i are || not going to agree with such simple numbers as 1.4 and 0.7. | | Because the arithmetic is binary rather than decimal. But even using | decimal, you get the same sort of problems using a bin width of | 1.0/3.0. The solution is to use an algorithm that isn't sensitive | to those problems, then it doesn't matter what base your arithmetic | is done in. Agreed.
| | I don't think you're doing anyone any favours by trying to protect | them from having to know about these things, because they *need* to | know about them if they're not to write algorithms that seem to | work fine on tests but mysteriously start producing garbage when | run on real data, possibly without it even being obvious that it is | garbage. Mostly I agree, but if you go to the extreme then why don't we just drop floating point comparisons altogether and force the programmer to convert everything to integers and make their own bias evident (like converting to int rather than nearest int). Or we drop the fp comparison operators and introduce fp comparison functions that require the use of tolerance terms to again make the assumptions transparent: def lt(x, y, rel_err = 1e-5, abs_err = 1e-8): return not areclose(x,y,rel_err,abs_err) and int(x-y)<=0 print lt(a,b,0,1e-10) --> False (they are equal to that tolerance) print lt(a,b,0,1e-20) --> True (a is less than b at that tolerance) The fact is, we make things easier and let the programmer shoot themselves in the foot if they want to by providing things like fp comparisons and even functions like sum that do dumb-sums (though Raymond Hettinger's Python Recipe at ASPN provides a smart-sum). I think the biggest argument for something like nice() is that it fills the void for a simple way to round numbers to a relative error rather than an absolute error. round() handles absolute error--it rounds to a given precision. str() rounds to the 12th digit and repr() to the 17th digit. There is nothing else except build-your-own solutions to rounding to an arbitrary significant figure. nice() would fill that niche and provide the default 12 significant digit solution. I agree that making all float comparisions default to 12-digit precision would not be smart. That would be throwing away 5 digits that someone might really want. 
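areclose() is referred to throughout this thread but never spelled out here; one common formulation (essentially what PEP 485 later standardised as math.isclose) would be:

```python
def areclose(x, y, rel_err=1e-5, abs_err=1e-8):
    """True if x and y agree to within a relative OR absolute tolerance.

    rel_err guards ordinary comparisons; abs_err keeps comparisons
    against values near zero from demanding impossible relative accuracy.
    """
    return abs(x - y) <= max(rel_err * max(abs(x), abs(y)), abs_err)

a = (23 * .1) * 10   # 23.000000000000004
b = 2.3 / .1         # 22.999999999999996
print(areclose(a, b))            # True: equal at the default tolerances
print(areclose(a, b, 0, 1e-20))  # False: distinguishable at 1e-20
```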
Providing a simple way to specify the desired significance is something that is needed, especially since fp issues are such a thorny issue. The user that explicitly uses nice(x) References: Message-ID: <43F31C36.4080109@voidspace.org.uk> Adam Olsen wrote: > On 2/14/06, Just van Rossum wrote: > >> +1 for two functions. >> >> My choice would be open() for binary and opentext() for text. I don't >> find that backwards at all: the text function is going to be more >> different from the current open() function then the binary function >> would be since in many ways the str type is closer to bytes than to >> unicode. >> >> Maybe it's even better to use opentext() AND openbinary(), and deprecate >> plain open(). We could even introduce them at the same time as bytes() >> (and leave the open() deprecation for 3.0). >> > > Thus providing us with a transition period, even with warnings on use > of the old function. > [snip..] I personally like the move towards all unicode strings, basically any text where you don't know the encoding used is 'random binary data'. This works fine, so long as you are in control of the text source. *However*, it leaves the following problem : The current situation (treating byte-sequences as text and assuming they are an ascii-superset encoded text-string) *works* (albeit with many breakages), simply because this assumption is usually correct. Forcing the programmer to be aware of encodings, also pushes the same requirement onto the user (who is often the source of the text in question). Currently you can read a text file and process it - making sure that any changes/requirements only use ascii characters. It therefore doesn't matter what 8 bit ascii-superset encoding is used in the original. If you force the programmer to specify the encoding in order to read the file, they would have to pass that requirement onto their user. Their user is even less likely to be encoding aware than the programmer. 
What this means, is that for simple programs where the programmer doesn't want to have to worry about encoding, or can't force the user to be aware, they will read in the file as bytes. Modules will quickly and inevitably be created implementing all the 'string methods' for bytes. New programmers will gravitate to these and the old mess will continue, but with a more awkward hybrid than before. (String manipulations of byte sequences will no longer be a core part of the language - and so be harder to use.) Not sure what we can do to obviate this of course... but is this change actually going to improve the situation or make it worse ? All the best, Michael Foord From raymond.hettinger at verizon.net Wed Feb 15 13:45:32 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 15 Feb 2006 07:45:32 -0500 Subject: [Python-Dev] nice() References: <000e01c63226$fc342660$7c2c4fca@csmith> Message-ID: <001401c6322d$b5c017a0$b83efea9@RaymondLaptop1> [Smith] > The following discussion ends with things that python-dev might want to > consider in terms of adding a function that allows something other than the > default 12- and 17-digit precision representations of numbers that str() and > repr() give. Such a function (like nice(), perhaps named trim()?) would > provide a way to convert fp numbers that are being used in comparisons into a > precision that reflects the user's preference. -1 See posts by Greg, Terry, and myself which recommend against trim(), nice(), or other variants. For the purpose of precision sensitive comparisons, these constructs are unfit for their intended purpose -- they are error-prone and do not belong in Python. They may have some legitimate uses, but those tend to be dominated by the existing round() function.
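One concrete way the round()-style trim comparisons are error-prone: two floats one ulp apart can straddle a decimal rounding boundary, so "trimming" both sides declares them decisively different. A sketch (math.nextafter needs Python 3.9+):

```python
import math

a = 2.675                    # stored as 2.67499999999999982...
b = math.nextafter(a, 3.0)   # the very next representable double

# One ulp apart, yet rounding to 2 places lands them on opposite sides
# of the .675 boundary -- a "trimmed" comparison sees a gap of 0.01.
print(round(a, 2), round(b, 2))
assert round(a, 2) != round(b, 2)
assert b - a < 1e-15   # but the two values are essentially equal
```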
If anything, then some variant of is_close() can go in the math module. BUT, the justification should not be for newbies to ignore issues with floating-point equality comparisons. The justification would have to be that folks with some numerical sophistication have a recurring need for the function (with sophistication meaning that they know how to come up with relative and absolute tolerances that make their application succeed over the full domain of possible inputs). Raymond ---- relevant posts from Greg and Terry ---- [Greg Ewing] >> I don't think you're doing anyone any favours by trying to protect >> them from having to know about these things, because they *need* to >> know about them if they're not to write algorithms that seem to >> work fine on tests but mysteriously start producing garbage when >> run on real data, [Terry Reedy] > I agree. Here was my 'kick-in-the-butt' lesson (from 20+ years ago): the > 'simplified for computation' formula for standard deviation, found in too > many statistics books without a warning as to its danger, and specialized > for three data points, is sqrt( ((a*a+b*b+c*c)-(a+b+c)**2/3.0) /2.0). > After 1000s of ok calculations, the data were something like a,b,c = > 10005,10006,10007. The correct answer is 1.0 but with numbers rounded to 7 > digits, the computed answer is sqrt(-.5) == CRASH. I was aware that > subtraction lost precision but not how rounding could make a theoretically > guaranteed non-negative difference negative. > > Of course, Python floats being C doubles makes such glitches much rarer. > Not exposing C floats is a major newbie (and journeyman) protection > feature. 
[Greg Ewing] > I don't think you're doing anyone any favours by trying to protect > them from having to know about these things, because they *need* to > know about them if they're not to write algorithms that seem to > work fine on tests but mysteriously start producing garbage when > run on real data, I recommend rejecting trim(), nice(), areclose(), and all variants. Greg, Terry, and myself have > > OK, comments responding to Greg follow. > > > | From: Greg Ewing greg.ewing at canterbury.ac.nz > | Smith wrote: > | > || computing the bin boundaries for a histogram > || where bins are a width of 0.1: > || > ||||| for i in range(20): > || ... if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)): > || ... print i,repr(i*.1),repr(i/10.),i*.1,i/10. > | > | I don't see how that has any relevance to the way bin boundaries > | would be used in practice, which is to say something like > | > | i = int(value / 0.1) > | bin[i] += 1 # modulo appropriate range checks > > This is just masking the issue by converting numbers to integers. The fact > remains that two mathematically equal numbers can have two different internal > representations with one being slightly larger than the exact integer value > and one smaller: > >>>> a=(23*.1)*10;a > 23.000000000000004 >>>> b=2.3/.1;b > 22.999999999999996 >>>> int(a/.1),int(b/.1) > (230, 229) > > Part of the answer in this context is to use round() rather than int so you > are getting to the closest integer. > > > || For, say, garden variety numbers that aren't full of garbage digits > || resulting from fp computation, the boundaries computed as 0.1*i are\ > || not going to agree with such simple numbers as 1.4 and 0.7. > | > | Because the arithmetic is binary rather than decimal. But even using > | decimal, you get the same sort of problems using a bin width of > | 1.0/3.0. The solution is to use an algorithm that isn't sensitive > | to those problems, then it doesn't matter what base your arithmetic > | is done in. > > Agreed. 
> > | > || I understand that the above really is just a patch over the problem, > || but I'm wondering if it moves the problem far enough away that most > || users wouldn't have to worry about it. > | > | No, it doesn't. The problems are not conveniently grouped together > | in some place you can get away from; they're scattered all over the > | place where you can stumble upon one at any time. > | > > Yes, even a simple computation of the wrong type can lead to unexpected > results. I agree. > > || So perhaps this brings us back to the original comment that "fp > || issues are a learning opportunity." They are. The question I have is > || "how > || soon do they need to run into them?" Is decreasing the likelihood > || that they will see the problem (but not eliminate it) a good thing > || for the python community or not? > | > | I don't think you're doing anyone any favours by trying to protect > | them from having to know about these things, because they *need* to > | know about them if they're not to write algorithms that seem to > | work fine on tests but mysteriously start producing garbage when > | run on real data, possibly without it even being obvious that it is > | garbage. > > Mostly I agree, but if you go to the extreme then why don't we just drop > floating point comparisons altogether and force the programmer to convert > everything to integers and make their own bias evident (like converting to int > rather than nearest int). 
Or we drop the fp comparison operators and introduce > fp comparison functions that require the use of tolerance terms to again make > the assumptions transparent: > > def lt(x, y, rel_err = 1e-5, abs_err = 1e-8): > return not areclose(x,y,rel_err,abs_err) and int(x-y)<=0 > print lt(a,b,0,1e-10) --> False (they are equal to that tolerance) > print lt(a,b,0,1e-20) --> True (a is less than b at that tolerance) > > The fact is, we make things easier and let the programmer shoot themselves in > the foot if they want to by providing things like fp comparisons and even > functions like sum that do dumb-sums (though Raymond Hettinger's Python Recipe > at ASPN provides a smart-sum). > > I think the biggest argument for something like nice() is that it fills the > void for a simple way to round numbers to a relative error rather than an > absolute error. round() handles absolute error--it rounds to a given > precision. str() rounds to the 12th digit and repr() to the 17th digit. There > is nothing else except build-your-own solutions to rounding to an arbitrary > significant figure. nice() would fill that niche and provide the default 12 > significant digit solution. > > I agree that making all float comparisions default to 12-digit precision would > not be smart. That would be throwing away 5 digits that someone might really > want. Providing a simple way to specify the desired significance is something > that is needed, especially since fp issues are such a thorny issue. The user > that explicitly uses nice(x) getting a result that they expect, e.g. > > nice(2.3/.1)==nice((23*.1)*10) > > and also getting a subtle reminder that their result is only true at the > default (12th digit) precision level. 
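A minimal sketch of the nice() being argued for here -- rounding to a number of *significant* digits rather than to an absolute decimal place, as round() does -- can be built on %g formatting; the name and the 12-digit default simply follow the proposal in this thread:

```python
def nice(x, digits=12):
    """Round x to `digits` significant digits (contrast round(), which
    rounds to an absolute decimal position)."""
    return float('%.*g' % (digits, x))

a = (23 * .1) * 10
b = 2.3 / .1
assert a != b                        # equal mathematically, unequal as doubles
assert nice(a) == nice(b) == 23.0    # equal at str()'s 12 significant digits
assert nice(a, 17) != nice(b, 17)    # at repr()'s 17 digits they still differ
```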
> > /c > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/python%40rcn.com From gustavo at niemeyer.net Wed Feb 15 13:35:52 2006 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Wed, 15 Feb 2006 10:35:52 -0200 Subject: [Python-Dev] Generalizing *args and **kwargs In-Reply-To: <20060215103746.GJ6027@xs4all.nl> References: <20060215103746.GJ6027@xs4all.nl> Message-ID: <20060215123552.GA8946@localhost.localdomain> > I've been thinking about generalization of the *args/**kwargs syntax for > quite a while, and even though I'm pretty sure Guido (and many people) will > consider it overgeneralization, I am finally going to suggest it. This whole > idea is not something dear to my heart, although I obviously would like to > see it happen. If the general vote is 'no', I'll write a small PEP or add it > to PEP 13 and be done with it. A PEP would be great, even if not accepted. At least we'll have it discussed in a single place and avoid rediscussing it every time someone figures out it's a nice idea. Have a look for the subject "Extending tuple unpacking" in the mailing list for a recent discussion on the topic. -- Gustavo Niemeyer http://niemeyer.net From lists at janc.be Wed Feb 15 13:49:04 2006 From: lists at janc.be (Jan Claeys) Date: Wed, 15 Feb 2006 13:49:04 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <43F27D25.7070208@canterbury.ac.nz> References: <43F27D25.7070208@canterbury.ac.nz> Message-ID: <1140007745.13739.7.camel@localhost.localdomain> On Wed, 15-02-2006 at 14:00 +1300, Greg Ewing wrote: > I'm disappointed that the various Linux distributions > still don't seem to have caught onto the very simple > idea of *not* scattering files all over the place when > installing something.
> > MacOSX seems to be the only system so far that has got > this right -- organising the system so that everything > related to a given application or library can be kept > under a single directory, clearly labelled with a > version number. Those directories might be mounted on entirely different hardware (even over a network), often with different characteristics (access speed, writeability, etc.). -- Jan Claeys From tim at pollenation.net Wed Feb 15 16:14:41 2006 From: tim at pollenation.net (Tim Parkin) Date: Wed, 15 Feb 2006 15:14:41 +0000 Subject: [Python-Dev] how to upload new MacPython web page? In-Reply-To: <20060215092229.GG6027@xs4all.nl> References: <06Feb14.213215pst."58633"@synergy1.parc.xerox.com> <20060215092229.GG6027@xs4all.nl> Message-ID: <43F34561.2070209@pollenation.net> Thomas Wouters wrote: >On Tue, Feb 14, 2006 at 09:32:09PM -0800, Bill Janssen wrote: > > >>We (the pythonmac-sig mailing list) seem to have converged (almost -- >>still talking about the logo) on a new download page for MacPython, to >>replace the page currently at >>http://www.python.org/download/download_mac.html. The strawman can be >>seen at http://bill.janssen.org/mac/new-macpython-page.html. >> >>How do I get the bits changed on python.org (when we're finished)? >> >> > >pydotorg at python.org is probably the right email address (although most of >them are on here as well.) > > > I'm happy to upload the pages when you're ready. Tim From jeremy at alum.mit.edu Wed Feb 15 16:29:38 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 15 Feb 2006 10:29:38 -0500 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <20060215220734.45db375a.simon@arrowtheory.com> References: <20060215220734.45db375a.simon@arrowtheory.com> Message-ID: I am still -1 on the ast-objects branch. It adds a lot of boilerplate code and it makes complicated what is now simple.
I'll see if I can get a rough cut of the marshal code ready today, so there will be a complete implementation of my original plan. I also think we should keep the transformation api simple. If we provide an extension module, along the lines of the parser module, users can write transformations with that module. They can also write their own wrapper script that runs a script after applying transformations. I agree that the question of saved bytecode files still needs to be resolved. I'm not sure that extending the bytecode format to record modifications is enough, since you also have a filename problem: How do you manage two versions of a module, one compiled with transformation and one compiled without? How about we arrange for some open space time at PyCon to discuss? Unfortunately, the compiler talk isn't until the last day and I can't stay for sprints. It would be better to have the talk, then the open space, then the sprint. Jeremy On 2/15/06, Simon Burton wrote: > On Wed, 15 Feb 2006 00:34:35 -0800 > Brett Cannon wrote: > > > As per Neal's prodding email, here is a thread to discuss where we > > want to go with the C AST to Python stuff and what I think are the > > core issues at the moment. > > > > First issue is the ast-objects branch. Work is being done on it, but > > it still leaks some references (Neal or Martin can correct me if I am > > wrong). > > I've been doing the heavy lifting on ast-objects the last few weeks. > Today it finally passed the python test suite. The last thing to do is > the addition of XDECREF's, so yes, it is leaking a lot of references. > > I won't make it to PyCon (it's a long way for me to come), but gee I've left > all the fun stuff for you to do ! > :) > > Even if AST transforms are not allowed, I see it as the strongest form of > code reflection, and long over-due in python. > > Simon. > > > -- > Simon Burton, B.Sc. > Licensed PO Box 8066 > ANU Canberra 2601 > Australia > Ph. 
61 02 6249 6940 > http://arrowtheory.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From aahz at pythoncraft.com Wed Feb 15 16:47:08 2006 From: aahz at pythoncraft.com (Aahz) Date: Wed, 15 Feb 2006 07:47:08 -0800 Subject: [Python-Dev] 2.5 PEP In-Reply-To: <20060215095227.GH6027@xs4all.nl> References: <20060215095227.GH6027@xs4all.nl> Message-ID: <20060215154708.GA8059@panix.com> On Wed, Feb 15, 2006, Thomas Wouters wrote: > > I can volunteer for 328 if no one else wants it, I've messed with the import > mechanism before (and besides, it's fun.) I've also written an unfinished > 308 implementation to get myself acquainted with the AST code more. > 'Unfinished' means that it works completely, except for some cases of > ambiguous syntax. I can fix that in a few days if the deadline nears and > there's no working patch. If you want to also take over the PEP328 editing, please be my guest. I keep making time for it that gets overridden by other things. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From foom at fuhm.net Wed Feb 15 17:48:18 2006 From: foom at fuhm.net (James Y Knight) Date: Wed, 15 Feb 2006 11:48:18 -0500 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F31C36.4080109@voidspace.org.uk> References: <43F31C36.4080109@voidspace.org.uk> Message-ID: <3201938A-3FE3-435E-A84C-8ECD16F3AF70@fuhm.net> On Feb 15, 2006, at 7:19 AM, Fuzzyman wrote: > [snip..] > > I personally like the move towards all unicode strings, basically > any text where you don't know the encoding used is 'random binary > data'. This works fine, so long as you are in control of the text > source. 
*However*, it leaves the following problem: > > The current situation (treating byte-sequences as text and assuming > they are an ascii-superset encoded text-string) *works* (albeit > with many breakages), simply because this assumption is usually > correct. > > Forcing the programmer to be aware of encodings, also pushes the > same requirement onto the user (who is often the source of the text > in question). > > Currently you can read a text file and process it - making sure > that any changes/requirements only use ascii characters. It > therefore doesn't matter what 8 bit ascii-superset encoding is used > in the original. If you force the programmer to specify the > encoding in order to read the file, they would have to pass that > requirement onto their user. Their user is even less likely to be > encoding aware than the programmer. Or the programmer can just use "iso-8859-1" and call it done. That will get you the same "I don't care" behavior as now. James From smiles at worksmail.net Wed Feb 15 15:29:01 2006 From: smiles at worksmail.net (Smith) Date: Wed, 15 Feb 2006 08:29:01 -0600 Subject: [Python-Dev] math.areclose ...? References: <00dd01c63142$3dd61280$892c4fca@csmith> <43F23815.4030307@acm.org> Message-ID: <004d01c63251$4ea87340$452c4fca@csmith> A problem that I pointed out with the proposed areclose() function is that it has an fp comparison within it. If such a function is to have greater utility, it should allow the user to specify how significant to consider the computed error. A natural extension of being able to tell if two fp numbers are close is to make a more general comparison. For that purpose, a proposed fpcmp function is appended. From that, fp boolean comparison operators (le, gt, ...) are easily constructed. Python allows fp comparison. This is a significant source of surprises and learning experiences. Are any of these proposals of interest for providing tools to more intelligently make the fp comparisons?
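[Editorial note: Smith's question was eventually answered in Python 3.5, where PEP 485 added math.isclose() along the lines Raymond sketched earlier in this thread ("some variant of is_close() can go in the math module"). A minimal pure-Python rendering of its symmetric test, for comparison with the proposal that follows:]

```python
import math

def is_close(x, y, rel_tol=1e-9, abs_tol=0.0):
    # Symmetric relative test with an absolute floor -- the shape that
    # math.isclose() (PEP 485, Python 3.5) later standardized.
    return abs(x - y) <= max(rel_tol * max(abs(x), abs(y)), abs_tol)

a = (23 * .1) * 10
b = 2.3 / .1
assert a != b                        # plain fp equality fails
assert is_close(a, b)                # tolerance-based comparison succeeds
assert is_close(a, b) == math.isclose(a, b)
```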
### #new proposal for the areclose() function
def areclose(x, y, atol=1e-8, rtol=1e-5, prec=12):
    """Return True if |x-y| is no greater than atol, or no greater than
    rtol times the absolute value of the larger of x and y; otherwise
    False. The comparison is made by computing a difference that should
    be 0 if the two numbers satisfy either condition; prec controls the
    precision of the value that is obtained, e.g. 8.3__e-17 is obtained
    for (2.1-2)-.1. But rounding to the 12th digit (the default
    precision) the value of 0.0 is returned, indicating that for that
    precision there is no (significant) error."""
    diff = abs(x - y)
    return round(diff - atol, prec) <= 0 or \
           round(diff - rtol*max(abs(x), abs(y)), prec) <= 0

#fp cmp
def fpcmp(x, y, atol=1e-8, rtol=1e-5, prec=12):
    """Return 0 if x and y are close in the absolute or relative sense.
    If not, then return -1 if x < y or +1 if x > y. Note: prec controls
    how many digits of the error are retained when checking for
    closeness."""
    if areclose(x, y, atol, rtol, prec):
        return 0
    else:
        return cmp(x, y)

# fp comparison functions
def lt(x, y, atol=1e-8, rtol=1e-5, prec=12):
    return fpcmp(x, y, atol, rtol, prec) == -1

def le(x, y, atol=1e-8, rtol=1e-5, prec=12):
    return fpcmp(x, y, atol, rtol, prec) in (-1, 0)

def eq(x, y, atol=1e-8, rtol=1e-5, prec=12):
    return fpcmp(x, y, atol, rtol, prec) == 0

def gt(x, y, atol=1e-8, rtol=1e-5, prec=12):
    return fpcmp(x, y, atol, rtol, prec) == 1

def ge(x, y, atol=1e-8, rtol=1e-5, prec=12):
    return fpcmp(x, y, atol, rtol, prec) in (0, 1)

def ne(x, y, atol=1e-8, rtol=1e-5, prec=12):
    return fpcmp(x, y, atol, rtol, prec) != 0
### From guido at python.org Wed Feb 15 18:17:44 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 09:17:44 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F2FDDD.3030200@gmail.com> References: <43F2FDDD.3030200@gmail.com> Message-ID: On 2/15/06, Nick Coghlan wrote: > If we went with longer names, a slight variation on the opentext/openbinary > idea would be to use opentext and opendata.
After some thinking I don't like opendata any more -- often data is text, so the term is wrong. openbinary is fine but long. So how about openbytes? This clearly links the resulting object with the bytes type, which is mutually reassuring. Regarding open vs. opentext, I'm still not sure. I don't want to generalize from the openbytes precedent to openstr or openunicode (especially since the former is wrong in 2.x and the latter is wrong in 3.0). I'm tempted to hold out for open() since it's most compatible. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 18:25:59 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 09:25:59 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F31C36.4080109@voidspace.org.uk> References: <43F31C36.4080109@voidspace.org.uk> Message-ID: On 2/15/06, Fuzzyman wrote: > Forcing the programmer to be aware of encodings, also pushes the same > requirement onto the user (who is often the source of the text in question). The programmer shouldn't have to be aware of encodings most of the time -- it's the job of the I/O library to determine the end user's (as opposed to the language's) default encoding dynamically and act accordingly. Users who use non-ASCII characters without informing the OS of their encoding are in a world of pain, *unless* they use the OS default encoding (which may vary per locale). If the OS can figure out the default encoding, so can the Python I/O library. Many apps won't have to go beyond this at all. Note that I don't want to use this OS/user default encoding as the default encoding between bytes and strings; once you are reading bytes you are writing "grown-up" code and you will have to be explicit. It's only the I/O library that should automatically encode on write and decode on read. > Currently you can read a text file and process it - making sure that any
It therefore doesn't matter > what 8 bit ascii-superset encoding is used in the original. If you force the > programmer to specify the encoding in order to read the file, they would > have to pass that requirement onto their user. Their user is even less > likely to be encoding aware than the programmer. I disagree -- the user most likely has set or received a default encoding when they first got the computer, and that's all they are using. If other tools (notepad, wordpad, emacs, vi etc.) can figure out the encoding, so can Python's I/O library. > What this means, is that for simple programs where the programmer doesn't > want to have to worry about encoding, or can't force the user to be aware, > they will read in the file as bytes. Of course not! > Modules will quickly and inevitably be > created implementing all the 'string methods' for bytes. New programmers > will gravitate to these and the old mess will continue, but with a more > awkward hybrid than before. (String manipulations of byte sequences will no > longer be a core part of the language - and so be harder to use.) This seems an unlikely development if we do the conversions in the I/O library. > Not sure what we can do to obviate this of course... but is this change > actually going to improve the situation or make it worse ? I'm not worried about this scenario. "What if all the programmers in the world suddenly became dumb?" -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Wed Feb 15 18:29:14 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 15 Feb 2006 18:29:14 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <43F2FDDD.3030200@gmail.com> Message-ID: <43F364EA.3010507@egenix.com> Guido van Rossum wrote: > On 2/15/06, Nick Coghlan wrote: >> If we went with longer names, a slight variation on the opentext/openbinary >> idea would be to use opentext and opendata. 
> > After some thinking I don't like opendata any more -- often data is > text, so the term is wrong. openbinary is fine but long. So how about > openbytes? This clearly links the resulting object with the bytes > type, which is mutually reassuring. > > Regarding open vs. opentext, I'm still not sure. I don't want to > generalize from the openbytes precedent to openstr or openunicode > (especially since the former is wrong in 2.x and the latter is wrong > in 3.0). I'm tempting to hold out for open() since it's most > compatible. Maybe a weird idea, but why not use static methods on the bytes and str type objects for this ?! E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0 renamed to str.openfile()) After all, you are in a certain way constructing object of the given types - only that the input to these constructors happen to be files in the file system. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 15 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From barry at python.org Wed Feb 15 18:51:49 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 12:51:49 -0500 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <43F2FDDD.3030200@gmail.com> Message-ID: <1140025909.13758.43.camel@geddy.wooz.org> On Wed, 2006-02-15 at 09:17 -0800, Guido van Rossum wrote: > Regarding open vs. opentext, I'm still not sure. I don't want to > generalize from the openbytes precedent to openstr or openunicode > (especially since the former is wrong in 2.x and the latter is wrong > in 3.0). I'm tempting to hold out for open() since it's most > compatible. 
If we go with two functions, I'd much rather hang them off of the file type object than add two new builtins. I really do think file.bytes() and file.text() (a.k.a. open.bytes() and open.text()) is better than opentext() or openbytes(). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/d9fe7709/attachment.pgp From barry at python.org Wed Feb 15 18:53:43 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 12:53:43 -0500 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F364EA.3010507@egenix.com> References: <43F2FDDD.3030200@gmail.com> <43F364EA.3010507@egenix.com> Message-ID: <1140026023.13781.45.camel@geddy.wooz.org> On Wed, 2006-02-15 at 18:29 +0100, M.-A. Lemburg wrote: > Maybe a weird idea, but why not use static methods on the > bytes and str type objects for this ?! > > E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0 > renamed to str.openfile()) That's also not a bad idea, but I'd leave off one or the other of the redundant "open" and "file" parts. E.g. bytes.open() and unicode.open() seem fine to me (we all know what 'open' means, right? :). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/1762e892/attachment.pgp From amk at amk.ca Wed Feb 15 19:57:40 2006 From: amk at amk.ca (A.M.
Kuchling) Date: Wed, 15 Feb 2006 13:57:40 -0500 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: References: <20060215220734.45db375a.simon@arrowtheory.com> Message-ID: <20060215185740.GA11670@rogue.amk.ca> On Wed, Feb 15, 2006 at 10:29:38AM -0500, Jeremy Hylton wrote: > Unfortunately, the compiler talk isn't until the last day and I can't > stay for sprints. It would be better to have the talk, then the open > space, then the sprint. If you mean "Implementation of the Python Bytecode Compiler", that's on Saturday at 10:50, so you have a whole day in which to fit an open space event. Unfortunately there are already a lot of open space events on that day, and the next open slot is at 3:15PM. But if you don't need a room to talk in, I'm sure you can find a comfortable place for 5 or 6 people to chat. --amk From barry at python.org Wed Feb 15 19:02:13 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 13:02:13 -0500 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: References: Message-ID: <1140026533.14811.2.camel@geddy.wooz.org> On Wed, 2006-02-15 at 00:34 -0800, Brett Cannon wrote: > I personally think we should choose an initial global access API to > the AST as a starting API. I like the sys.ast_transformations idea > since it is simple and gives enough access that whether read-only or > read-write is allowed something like PyChecker can get the access it > needs. I haven't been following the AST stuff closely enough, but I'm not crazy about putting access to this in the sys module. It seems like it clutters that up with a name that will be rarely used by the average Python programmer. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/bdc6a7c1/attachment.pgp From mal at egenix.com Wed Feb 15 19:02:58 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 15 Feb 2006 19:02:58 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <1140026023.13781.45.camel@geddy.wooz.org> References: <43F2FDDD.3030200@gmail.com> <43F364EA.3010507@egenix.com> <1140026023.13781.45.camel@geddy.wooz.org> Message-ID: <43F36CD2.5090704@egenix.com> Barry Warsaw wrote: > On Wed, 2006-02-15 at 18:29 +0100, M.-A. Lemburg wrote: > >> Maybe a weird idea, but why not use static methods on the >> bytes and str type objects for this ?! >> >> E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0 >> renamed to str.openfile()) > > That's also not a bad idea, but I'd leave off one or the other of the > redudant "open" and "file" parts. E.g. bytes.open() and unicode.open() > seem fine to me (we all know what 'open' means, right? :). Thinking about it, I like your idea better (file.bytes() and file.text()). Anyway, as long as we don't start adding openthis() and openthat() I guess I'm happy ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 15 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From barry at python.org Wed Feb 15 19:06:44 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 13:06:44 -0500 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: References: Message-ID: <1140026804.14812.9.camel@geddy.wooz.org> On Tue, 2006-02-14 at 21:24 -0800, Neal Norwitz wrote: > We still need a release manager. No one has heard from Anthony. If > he isn't interested is someone else interested in trying their hand at > it? There are many changes necessary in PEP 101 because since the > last release both python and pydotorg have transitioned from CVS to > SVN. Creosote also moved. I would definitely like to see a PEP 101 update as part of the 2.5 RM's responsibilities, and I think it could be done while spinning the first alpha release. I know others have volunteered, but in a pinch I'd be happy to dust off my RM hat and help out too. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/53fb74f2/attachment-0001.pgp From barry at python.org Wed Feb 15 19:07:51 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 13:07:51 -0500 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F36CD2.5090704@egenix.com> References: <43F2FDDD.3030200@gmail.com> <43F364EA.3010507@egenix.com> <1140026023.13781.45.camel@geddy.wooz.org> <43F36CD2.5090704@egenix.com> Message-ID: <1140026871.14818.11.camel@geddy.wooz.org> On Wed, 2006-02-15 at 19:02 +0100, M.-A. Lemburg wrote: > Anyway, as long as we don't start adding openthis() and openthat() > I guess I'm happy ;-) Me too! :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/7f59464d/attachment.pgp From guido at python.org Wed Feb 15 19:29:30 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 10:29:30 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F36CD2.5090704@egenix.com> References: <43F2FDDD.3030200@gmail.com> <43F364EA.3010507@egenix.com> <1140026023.13781.45.camel@geddy.wooz.org> <43F36CD2.5090704@egenix.com> Message-ID: On 2/15/06, M.-A. Lemburg wrote: > Barry Warsaw wrote: > > On Wed, 2006-02-15 at 18:29 +0100, M.-A. Lemburg wrote: > > > >> Maybe a weird idea, but why not use static methods on the > >> bytes and str type objects for this ?! > >> > >> E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0 > >> renamed to str.openfile()) > > > > That's also not a bad idea, but I'd leave off one or the other of the > > redudant "open" and "file" parts. E.g. bytes.open() and unicode.open() > > seem fine to me (we all know what 'open' means, right? :). > > Thinking about it, I like your idea better (file.bytes() > and file.text()). This is better than making it a static/class method on file (which has the problem that it might return something that's not a file at all -- file is a particular stream implementation, there may be others) but I don't like the tight coupling it creates between a data type and an I/O library. I still think that having global (i.e. built-in) factory functions for creating various stream types makes the most sense. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jimjjewett at gmail.com Wed Feb 15 19:38:41 2006 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 15 Feb 2006 13:38:41 -0500 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: References: Message-ID: On 2/14/06, Neil Schemenauer wrote: > People could spell it bytes(s.encode('latin-1')) Guido wrote: > At the cost of an extra copying step. I asked: > ... why not just add some smarts to the bytes constructor? Guido wrote: > ... the VM usually keeps an extra reference > on the stack so the refcount is never 1. But > you can't rely on that I did miss this, but _PyString_Resize seems to work around it, and I'm not sure that the bytes object can't be just as intimate. Even if that is insurmountable, bytes objects could recognize two states -- one normal, and one for "I'm delegating to a string, and have to copy to my own buffer before I actually mutate anything." Then a new bytes object would still need its own header, but the data copying could often be avoided. But back to the possibility of not creating even a new object header... > the str's underlying array is allocated inline > with the str header, this require str and > bytes to have the same object layout. But > since bytes are mutable, they can't. Looking at the arraymodule, the only extra fields in an array are weakrefs, description (which will no longer be needed) and tracking for the indirection. There are even a few extra bytes leftover that could be used to indicate that ob_item was redirected later, the way tables do with small_table. -jJ From janssen at parc.com Wed Feb 15 19:59:44 2006 From: janssen at parc.com (Bill Janssen) Date: Wed, 15 Feb 2006 10:59:44 PST Subject: [Python-Dev] str object going in Py3K In-Reply-To: Your message of "Wed, 15 Feb 2006 09:51:49 PST." <1140025909.13758.43.camel@geddy.wooz.org> Message-ID: <06Feb15.105950pst."58633"@synergy1.parc.xerox.com> > If we go with two functions, I'd much rather hang them off of the file > type object then add two new builtins. I really do think file.bytes() > and file.text() (a.k.a. open.bytes() and open.text()) is better than > opentext() or openbytes(). +1. 
The default behavior of the current open() in opening files as text is particularly grating. This would make things much clearer. Bill From jason.orendorff at gmail.com Wed Feb 15 20:01:37 2006 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Wed, 15 Feb 2006 14:01:37 -0500 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] Message-ID: Instead of byte literals, how about a classmethod bytes.from_hex(), which works like this: # two equivalent things expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a') expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, 131, 79, 229, 201, 46, 106]) It's just a nicety; the former fits my brain a little better. This would work fine both in 2.5 and in 3.0. I thought about unicode.encode('hex'), but obviously it will continue to return a str in 2.x, not bytes. Also the pseudo-encodings ('hex', 'rot13', 'zip', 'uu', etc.) generally scare me. And now that bytes and text are going to be two very different types, they're even weirder than before. Consider: text.encode('utf-8') ==> bytes text.encode('rot13') ==> text bytes.encode('zip') ==> bytes bytes.encode('uu') ==> text (?) This state of affairs seems kind of crazy to me. Actually users trying to figure out Unicode would probably be better served if bytes.encode() and text.decode() did not exist. -j
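Jason's proposal is easy to prototype on top of the stdlib's binascii module. A minimal sketch — the Bytes class below is a hypothetical list-of-ints stand-in for the proposed mutable bytes type, which doesn't exist yet, and the whitespace handling is an extra convenience, not part of the proposal:

```python
import binascii

class Bytes(list):
    """Hypothetical stand-in for the proposed bytes type: a list of ints."""

    @classmethod
    def from_hex(cls, s):
        # Drop any whitespace so grouped input like '5c 53 50 24' also
        # works, then let binascii pair up and decode the hex digits.
        # bytearray() gives us an iterable of ints on 2.x and 3.x alike.
        return cls(bytearray(binascii.unhexlify(''.join(s.split()))))

# The two equivalent spellings from the example above:
a = Bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
b = Bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227,
           131, 79, 229, 201, 46, 106])
assert a == b
```

An odd number of hex digits raises binascii.Error, which is roughly the validation the real classmethod would need anyway.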
From martin at v.loewis.de Wed Feb 15 20:04:07 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 20:04:07 +0100 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> <43F27FBC.1000209@v.loewis.de> <43F2E065.3080807@v.loewis.de> Message-ID: <43F37B27.8070008@v.loewis.de> Adam Olsen wrote: > Making it an error to have 8-bit str literals in 2.x would help > educate the user that they will change behavior in 3.0 and not be > 8-bit str literals anymore. You would like to ban string literals from the language? Remember: all string literals are currently 8-bit (byte) strings. Regards, Martin From guido at python.org Wed Feb 15 20:16:51 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 11:16:51 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: References: Message-ID: On 2/15/06, Jason Orendorff wrote: > Instead of byte literals, how about a classmethod bytes.from_hex(), which > works like this: > > # two equivalent things > expected_md5_hash = > bytes.from_hex('5c535024cac5199153e3834fe5c92e6a') > expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, > 131, 79, 229, 201, 46, 106]) > > It's just a nicety; the former fits my brain a little better. This would > work fine both in 2.5 and in 3.0. Yes, this looks nice. > I thought about unicode.encode('hex'), but obviously it will continue to > return a str in 2.x, not bytes. Also the pseudo-encodings ('hex', 'rot13', > 'zip', 'uu', etc.) generally scare me. And now that bytes and text are > going to be two very different types, they're even weirder than before. > Consider: > > text.encode('utf-8') ==> bytes > text.encode('rot13') ==> text > bytes.encode('zip') ==> bytes > bytes.encode('uu') ==> text (?)
> > This state of affairs seems kind of crazy to me. > > Actually users trying to figure out Unicode would probably be better served > if bytes.encode() and text.decode() did not exist. Yeah, the pseudogeneralizations seem to be a mistake -- they are almost universally frowned upon. I'll happily send them to their grave in Py3k. It would be better if the signature of text.encode() always returned a bytes object. But why deny the bytes object a decode() method if text objects have an encode() method? I'd say there are two "symmetric" API flavors possible (t and b are text and bytes objects, respectively, where text is a string type, either str or unicode; enc is an encoding name): - b.decode(enc) -> t; t.encode(enc) -> b - b = bytes(t, enc); t = text(b, enc) I'm not sure why one flavor would be preferred over the other, although having both would probably be a mistake. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From trentm at ActiveState.com Wed Feb 15 20:18:56 2006 From: trentm at ActiveState.com (Trent Mick) Date: Wed, 15 Feb 2006 11:18:56 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: <43F27D25.7070208@canterbury.ac.nz> <20060215012214.GA31050@activestate.com> Message-ID: <20060215191856.GA7705@activestate.com> [Bob Ippolito wrote] >... > > /Library/Frameworks/Python.framework/... > > /Applications/MacPython-2.4/... # just MacPython does this > > ActivePython doesn't install app bundles for IDLE or anything? It does, but puts them under here instead: /Library/Frameworks/Python.framework/Versions/X.Y/Resources/ >... > >Also, a receipt of the installation ends up here: > > > > /Library/Receipts/$package_name/... > > > >though Apple does not provide tools for uninstallation using those > >receipts. > > That stuff is really behind the scenes stuff that's wholly managed by > Installer.app and is pretty much irrelevant. Sure. > Single apps are better than OK.
Download them by whatever means you > want, put them wherever you want, and run them. You can run any well- > behaved application from a DMG (or a CD, or a USB key, or any other > readable media). For naive or new-to-mac users it is a confusing process to get the .app bundle to an appropriate place and then start running it. Why else have various app distributors out there come up with myriad slick background images for their DMG's trying to instruct users what to do with the icons in the mounted DMG's Finder window? On Windows you download an MSI (it ends up in your browser downloads folder), it starts the installation, and the end of the installation it starts the app for you. The app is nicely in Program Files. No need to eject something. No need to find somewhere to drag the icon. I'll grant that having the whole thing in one bundle is cool/handy/cute. ...anyway this is getting seriously OT for python-dev. :) Trent -- Trent Mick TrentM at ActiveState.com From theller at python.net Wed Feb 15 20:21:03 2006 From: theller at python.net (Thomas Heller) Date: Wed, 15 Feb 2006 20:21:03 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: References: Message-ID: Jason Orendorff wrote: > Instead of byte literals, how about a classmethod bytes.from_hex(), which > works like this: > > # two equivalent things > expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a') I hope this will also be equivalent: > expected_md5_hash = bytes.from_hex('5c 53 50 24 ca c5 19 91 53 e3 83 4f e5 c9 2e 6a') Thomas From jcarlson at uci.edu Wed Feb 15 20:25:01 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 15 Feb 2006 11:25:01 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <43F30166.1070902@ronadam.com> References: <43F2CDBC.8090305@canterbury.ac.nz> <43F30166.1070902@ronadam.com> Message-ID: <20060215111844.5F64.JCARLSON@uci.edu> Ron Adam wrote: > Greg Ewing wrote: > > Ron Adam wrote: > >> b = bytes(0L) -> bytes([0,0,0,0]) > > > > No, bytes(0L) --> TypeError because 0L doesn't implement > > the iterator protocol or the buffer interface. > > It wouldn't need it if it was a direct C memory copy. Yes it would. Python long integers are stored as arrays of signed 16-bit short ints. See longintrepr.h from the source. - Josiah From bob at redivi.com Wed Feb 15 20:23:22 2006 From: bob at redivi.com (Bob Ippolito) Date: Wed, 15 Feb 2006 11:23:22 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <1140007745.13739.7.camel@localhost.localdomain> References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> Message-ID: On Feb 15, 2006, at 4:49 AM, Jan Claeys wrote: > Op wo, 15-02-2006 te 14:00 +1300, schreef Greg Ewing: >> I'm disappointed that the various Linux distributions >> still don't seem to have caught onto the very simple >> idea of *not* scattering files all over the place when >> installing something. >> >> MacOSX seems to be the only system so far that has got >> this right -- organising the system so that everything >> related to a given application or library can be kept >> under a single directory, clearly labelled with a >> version number. > > Those directories might be mounted on entirely different hardware > (even > over a network), often with different characteristics (access speed, > writeability, etc.). Huh? What does that have to do with anything? I've never seen a system where /usr/include, /usr/lib, /usr/bin, etc. are not all on the same mount. It's not really any different with OS X either. 
-bob From martin at v.loewis.de Wed Feb 15 20:24:17 2006 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 20:24:17 +0100 Subject: [Python-Dev] 2.5 PEP In-Reply-To: <200602150922.43810.alain.poirier@net-ng.com> References: <200602150922.43810.alain.poirier@net-ng.com> Message-ID: <43F37FE1.2090201@v.loewis.de> Alain Poirier wrote: > - is (c)ElementTree still planned for inclusion ? It is included already. Regards, Martin From martin at v.loewis.de Wed Feb 15 20:26:24 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 20:26:24 +0100 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <20060215100309.GI6027@xs4all.nl> References: <43F2E9CD.7000102@canterbury.ac.nz> <43F2F444.4010604@gmail.com> <20060215100309.GI6027@xs4all.nl> Message-ID: <43F38060.2070901@v.loewis.de> Thomas Wouters wrote: > I would personally prefer the AST validation to be a separate part of the > compiler. It means the one or the other can be out of sync, but it also > means it can be accessed directly (validating AST before sending it to the > compiler) and the compiler (or CFG generator, or something between AST and > CFG) can decide not to validate internally generated AST for non-debug > builds, for instance. That's how the ast-objects branch currently works. There is a method checking that the tree actually conforms to the grammar. Regards, Martin From trentm at ActiveState.com Wed Feb 15 20:28:48 2006 From: trentm at ActiveState.com (Trent Mick) Date: Wed, 15 Feb 2006 11:28:48 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <43F299ED.9060402@canterbury.ac.nz> References: <43F27D25.7070208@canterbury.ac.nz> <20060215012214.GA31050@activestate.com> <43F299ED.9060402@canterbury.ac.nz> Message-ID: <20060215192848.GB7705@activestate.com> [Greg Ewing wrote] > It's not perfect, but it's still a lot better than the > situation on any other unix I've seen so far. 
Better than Unix, sure. But you *can* (and ActivePython does do) install everything under: /opt/$app_name/... > > open DMG, don't run the app from here, drag it to your > > Applications folder, then eject this window/disk, then run it from > > /Applications, > > A decently-designed application should be runnable from > anywhere, including a dmg, if the user wants to do that. > If an app refuses to run from a dmg, I consider that a > bug in the application. Yes, but the typical user probably *wants* to run the app from their /Applications folder (or somewhere else on their harddrive). When they start running from the mounted DMG, they can't then unmount the DMG to clean up. Actually the typical non-geek user doesn't care where they run the app from. They don't want to worry about those details. Trent -- Trent Mick TrentM at ActiveState.com From martin at v.loewis.de Wed Feb 15 20:37:17 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2006 20:37:17 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: References: Message-ID: <43F382ED.2080101@v.loewis.de> Jason Orendorff wrote: > expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a') This looks good, although it duplicates expected_md5_hash = binascii.unhexlify('5c535024cac5199153e3834fe5c92e6a') Regards, Martin From guido at python.org Wed Feb 15 20:38:47 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 11:38:47 -0800 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <43F2FE65.5040308@pollenation.net> References: <43F2FE65.5040308@pollenation.net> Message-ID: On 2/15/06, Tim Parkin wrote: > Guido van Rossum wrote: > > > (Now that I work for Google I realize more than ever before the > > importance of keeping URLs stable; PageRank(tm) numbers don't get > > transferred as quickly as contents. 
I have this worry too in the > > context of the python.org redesign; 301 permanent redirect is *not* > > going to help PageRank of the new page.) > Could you expand on why 301 redirects won't help with the transfer of > page rank (if you're allowed)? We've done exactly this on many sites and > the pagerank (or more relevantly the search rankings on specific terms) > has transferred almost overnight. The bigger pagerank updates (both > algorithm changes and overhauls in approach) seem to only happen every > few months and these also seem to take notice of 301 redirects (they > generally clear up any supplemental results). OK, perhaps I stand corrected. I don't actually know that much about PageRank! I still don't like docs.python.org, and adding more like it seems a mistake; but it's possible that this is because of a poor execution of the idea (there's no "search docs" button near the search button on the old python.org). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 20:40:49 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 11:40:49 -0800 Subject: [Python-Dev] Generalizing *args and **kwargs In-Reply-To: <20060215103746.GJ6027@xs4all.nl> References: <20060215103746.GJ6027@xs4all.nl> Message-ID: On 2/15/06, Thomas Wouters wrote: > I've been thinking about generalization of the *args/**kwargs syntax for > quite a while, and even though I'm pretty sure Guido (and many people) will > consider it overgeneralization, I am finally going to suggest it. This whole > idea is not something dear to my heart, although I obviously would like to > see it happen. If the general vote is 'no', I'll write a small PEP or add it > to PEP 13 and be done with it. Feel free to write a PEP so that at least we have a concrete proposal where all the nuts and bolts have been thought through. I'm currently not able to give much thought to any more new proposals, so don't expect me to look at it any time soon. 
Unless a miracle occurs it's off the table for 2.5 so there's no hurry. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 20:43:13 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 11:43:13 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <8245655246816499522@unknownmsgid> References: <1140025909.13758.43.camel@geddy.wooz.org> <8245655246816499522@unknownmsgid> Message-ID: On 2/15/06, Bill Janssen wrote: > The default behavior of the current open() in opening files as text is > particularly grating. Why? Are you perhaps one of those rare folks who read more binary data than text? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim at pollenation.net Wed Feb 15 21:08:32 2006 From: tim at pollenation.net (Tim Parkin) Date: Wed, 15 Feb 2006 20:08:32 +0000 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <43F2FE65.5040308@pollenation.net> Message-ID: <43F38A40.606@pollenation.net> Guido van Rossum wrote: > On 2/15/06, Tim Parkin wrote: > >>Guido van Rossum wrote: >> >>>I have this worry too in the >>>context of the python.org redesign; 301 permanent redirect is *not* >>>going to help PageRank of the new page.) >>Could you expand on why 301 redirects won't help with the transfer of >>page rank (if you're allowed)? We've done exactly this on many sites and >>the pagerank (or more relevantly the search rankings on specific terms) >>has transferred almost overnight. The bigger pagerank updates (both >>algorithm changes and overhauls in approach) seem to only happen every >>few months and these also seem to take notice of 301 redirects (they >>generally clear up any supplemental results). > > OK, perhaps I stand corrected. I don't actually know that much about PageRank! 
> No problem, I don't think that many people do and the general consensus seems to be that, although the calculations behind pagerank may be one of the core parts of the google algorithm, there are so many additional algorithms* that affect searches on a case by case and day by day basis that the value from it is almost meaningless (apart from possibly 0-2 may be a problem, 3-5 is normal, 6-9 is generally good and 10 I've not seen) * (for instance, patents on working out the value of inbound links based on their age, how many other inbound links appeared around the same time, the status of the originating site as an 'authority' site, the text contained in the inbound link and title attributes, etc., and the general relation between the inbound links and the 'theme' of the target site ['theme' == the distribution of important keywords across the site]) > I still don't like docs.python.org, and adding more like it seems a > mistake; but it's possible that this is because of a poor execution of > the idea (there's no "search docs" button near the search button on > the old python.org). I'll try and make a more functional/usable google search page on the new site. Tim Parkin p.s. I hope you didn't think I was digging for 'insider info'.. From jeremy at alum.mit.edu Wed Feb 15 21:07:01 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 15 Feb 2006 15:07:01 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <43F2FE65.5040308@pollenation.net> Message-ID: As I said in an earlier message, there's no need to have a separate domain to restrict queries to just the doc/current part of python.org. Just type "site:python.org/doc/current your query here" If there isn't any other rationale, maybe we can redirect docs.python.org back to www.python.org?
Jeremy On 2/15/06, Guido van Rossum wrote: > On 2/15/06, Tim Parkin wrote: > > Guido van Rossum wrote: > > > > > (Now that I work for Google I realize more than ever before the > > > importance of keeping URLs stable; PageRank(tm) numbers don't get > > > transferred as quickly as contents. I have this worry too in the > > > context of the python.org redesign; 301 permanent redirect is *not* > > > going to help PageRank of the new page.) > > > Could you expand on why 301 redirects won't help with the transfer of > > page rank (if you're allowed)? We've done exactly this on many sites and > > the pagerank (or more relevantly the search rankings on specific terms) > > has transferred almost overnight. The bigger pagerank updates (both > > algorithm changes and overhauls in approach) seem to only happen every > > few months and these also seem to take notice of 301 redirects (they > > generally clear up any supplemental results). > > OK, perhaps I stand corrected. I don't actually know that much about PageRank! > > I still don't like docs.python.org, and adding more like it seems a > mistake; but it's possible that this is because of a poor execution of > the idea (there's no "search docs" button near the search button on > the old python.org). 
> > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From g.brandl at gmx.net Wed Feb 15 21:13:14 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 15 Feb 2006 21:13:14 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <43F2FE65.5040308@pollenation.net> Message-ID: Jeremy Hylton wrote: > As I said in an earlier message, there's no need to have a separate > domain to restrict queries to just the doc/current part of python.org. > Just type > "site:python.org/doc/current your query here" > > If there isn't any other rationale, maybe we can redirects > docs.python.org back to www.python.org? If something like Fredrik's new doc system is adopted, it would be extremely convenient to refer someone to just docs.python.org/os.path.join without looking up how the page is actually named. Georg From guido at python.org Wed Feb 15 21:33:10 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 12:33:10 -0800 Subject: [Python-Dev] bytes type discussion In-Reply-To: <43F2CDC4.4060700@canterbury.ac.nz> References: <20060215002446.GD6027@xs4all.nl> <43F2A13F.4030604@canterbury.ac.nz> <200602142323.45930.fdrake@acm.org> <43F2CDC4.4060700@canterbury.ac.nz> Message-ID: On 2/14/06, Greg Ewing wrote: > Fred L. Drake, Jr. wrote: > > > The proper response in this case is often to re-start decoding > > with the correct encoding, since some of the data extracted so far may have > > been decoded incorrectly. > > If the protocol has been sensibly designed, that shouldn't > happen, since everything up to the coding marker should > be ascii (or some other protocol-defined initial coding). 
> > For protocols that are not sensibly designed (or if you're > just trying to guess) what you suggest may be needed. But > it would be good to have a nicer way of going about it > for when the protocol is sensible. I think that the implementation of encoding-guessing or auto-encoding-upgrade techniques should be left out of the standard library design for now. I know that XML does something like this, but fortunately we employ dedicated C code to parse XML so that particular case should be taken care of without complicating the rest of the standard I/O library. As far as searching bytes objects, that shouldn't be a problem as long as the search 'string' is also specified as a bytes object. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim at pollenation.net Wed Feb 15 21:52:49 2006 From: tim at pollenation.net (Tim Parkin) Date: Wed, 15 Feb 2006 20:52:49 +0000 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <43F2FE65.5040308@pollenation.net> Message-ID: <43F394A1.3080100@pollenation.net> Jeremy Hylton wrote: > As I said in an earlier message, there's no need to have a separate > domain to restrict queries to just the doc/current part of python.org. > Just type > "site:python.org/doc/current your query here" > > If there isn't any other rationale, maybe we can redirects > docs.python.org back to www.python.org? One possible reason, I'd like to be able to serve the docs up integrated with the new design (with a full hierarchical navigation). I had planned on leaving the docs.python.org as the raw tex2html conversion. If we got rid of the docs.python.org would we still want the www.python.org in the current style? Personally I was hoping that nearly all of the site could be in the new html structure and design for consistency and usability reasons. 
Tim Parkin From bokr at oz.net Wed Feb 15 21:53:06 2006 From: bokr at oz.net (Bengt Richter) Date: Wed, 15 Feb 2006 20:53:06 GMT Subject: [Python-Dev] bytes type discussion References: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> Message-ID: <43f3873a.885434597@news.gmane.org> On Tue, 14 Feb 2006 19:41:07 -0500, "Raymond Hettinger" wrote: >[Guido van Rossum] >> Somewhat controversial: >> >> - bytes("abc") == bytes(map(ord, "abc")) > >At first glance, this seems obvious and necessary, so if it's somewhat >controversial, then I'm missing something. What's the issue? > ord("x") gets the source encoding's ord value of "x", but if that is not unicode or latin-1, it will break when PY 3000 makes "x" unicode. This means until Py 3000 plain str string literals have to use ascii and escapes in order to preserve the meaning when "x" == u"x". But the good news is bytes(map(ord(u"x"))) works fine for any source encoding now or after PY 3000. You just have to type characters into your editor between the quotes that look on the screen like any of the first 256 unicode characters (or use ascii escapes for unshowables). The u"x" translates x into unicode according to the *character* of x, whatever the source encoding, so all you have to do is choose characters of the first 256 unicodes. This happens to be latin-1, but you can ignore that unless you are interested in the actual byte values. If they have byte meaning, escapes are clearer anyway, and they work in a unicode string (where "x".decode(source_encoding) might fail on an illegal character). The solution is to use u"x" for now or use ascii-only with escapes, and just map ord on either kind of string. This should work when u"x" becomes equivalent to "x". The unicode that comes from a current u"x" string defines a *character* sequence. If you use legal latin-1 *characters* in whatever source encoding your editor and coding cookie say, you will get the *characters* you see inside the quotes in the u"..." 
literal translated to unicode, and the first 256 characters of unicode happen to be the latin-1 set, so map ord just works. With a unicode string you don't have to think about encoding, just use ord/unichr in range(0,256). Hex escapes within unicode strings work as expected, so IMO it's pretty clean. I think I have shown this in a couple of other posts in the original thread (where I created and compiled source code in several encodings including utf-8 and compiled with coding cookies and exec'd the result). I could always have overlooked something, but I am hopeful. Regards, Bengt Richter From fredrik at pythonware.com Wed Feb 15 21:53:53 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 15 Feb 2006 21:53:53 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available References: <43F2FE65.5040308@pollenation.net> Message-ID: Georg Brandl wrote: > If something like Fredrik's new doc system is adopted, it would be extremely > convenient to refer someone to just > > docs.python.org/os.path.join > > without looking up how the page is actually named. you could of course reserve a toplevel directory for that purpose; e.g. http://python.org/lib/os.path.join or perhaps http://python.org/tag/os.path.join http://python.org/tag/print etc. From guido at python.org Wed Feb 15 21:58:32 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 12:58:32 -0800 Subject: [Python-Dev] bytes type needs a new champion Message-ID: Skip has mentioned in private email that he's not available to update PEP 332. I've therefore rejected that PEP; the current ideas are rather different so we might as well start a new PEP. Anyway, we need a new PEP author who can take the current discussion and turn it into a coherent PEP. I've tried to keep up with the current thread but it takes too much time to organize it all and I need to start focusing on the 2.5 release schedule. Any volunteers?
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Wed Feb 15 21:56:54 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 15 Feb 2006 21:56:54 +0100 Subject: [Python-Dev] 2.5 PEP References: <200602150922.43810.alain.poirier@net-ng.com> <43F37FE1.2090201@v.loewis.de> Message-ID: Martin v. Löwis wrote: > > - is (c)ElementTree still planned for inclusion ? > > It is included already. in the xml.etree package, in case someone's looking for it in the usual place. that is, import xml.etree.ElementTree as ET import xml.etree.cElementTree as ET will work in any 2.5 that has a working pyexpat. (is the xmlplus/xmlcore issue still an issue, btw?) From mal at egenix.com Wed Feb 15 22:07:02 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 15 Feb 2006 22:07:02 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: References: Message-ID: <43F397F6.4090402@egenix.com> Jason Orendorff wrote: > Instead of byte literals, how about a classmethod bytes.from_hex(), which > works like this: > > # two equivalent things > expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a') > expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, > 131, 79, 229, 201, 46, 106]) > > It's just a nicety; the former fits my brain a little better. This would > work fine both in 2.5 and in 3.0. > > I thought about unicode.encode('hex'), but obviously it will continue to > return a str in 2.x, not bytes. Also the pseudo-encodings ('hex', 'rot13', > 'zip', 'uu', etc.) generally scare me. Those are not pseudo-encodings, they are regular codecs. It's a common misunderstanding that codecs are only seen as serving the purpose of converting between Unicode and strings. The codec system is deliberately designed to be general enough to also work with many other types, e.g. 
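The helper under discussion — a math.areclose(x, y) for approximate float comparison — is easy to sketch, whatever it ends up being named. A minimal version of one plausible semantics; the name, signature, and tolerance defaults here are illustrative assumptions, not what the thread settled on:

```python
def are_close(x, y, rel_tol=1e-9, abs_tol=0.0):
    """True when x and y agree to within rel_tol of the larger magnitude,
    or to within abs_tol (the absolute floor is useful near zero)."""
    return abs(x - y) <= max(rel_tol * max(abs(x), abs(y)), abs_tol)

# The classic motivating case: exact float equality is too strict.
print(0.1 + 0.2 == 0.3)           # False
print(are_close(0.1 + 0.2, 0.3))  # True
```

Taking the max of the two magnitudes makes the test symmetric in x and y, which sidesteps one of the semantic questions raised earlier in the thread.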
it is easily possible to write a codec that convert between the hex literal sequence you have above to a list of ordinals: """ Hex string codec Converts between a list of ordinals and a two byte hex literal string. Usage: >>> codecs.encode([1,2,3], 'hexstring') '010203' >>> codecs.decode(_, 'hexstring') [1, 2, 3] (c) 2006, Marc-Andre Lemburg. """ import codecs class Codec(codecs.Codec): def encode(self, input, errors='strict'): """ Convert hex ordinal list to hex literal string. """ if not isinstance(input, list): raise TypeError('expected list of integers') return ( ''.join(['%02x' % x for x in input]), len(input)) def decode(self,input,errors='strict'): """ Convert hex literal string to hex ordinal list. """ if not isinstance(input, str): raise TypeError('expected string of hex literals') size = len(input) if not size % 2 == 0: raise TypeError('input string has uneven length') return ( [int(input[(i<<1):(i<<1)+2], 16) for i in range(size >> 1)], size) class StreamWriter(Codec,codecs.StreamWriter): pass class StreamReader(Codec,codecs.StreamReader): pass def getregentry(): return (Codec().encode,Codec().decode,StreamReader,StreamWriter) > And now that bytes and text are > going to be two very different types, they're even weirder than before. > Consider: > > text.encode('utf-8') ==> bytes > text.encode('rot13') ==> text > bytes.encode('zip') ==> bytes > bytes.encode('uu') ==> text (?) > > This state of affairs seems kind of crazy to me. Really ? It all depends on what you use the codecs for. The above usages through the .encode() and .decode() methods is not the only way you can make use of them. To get full access to the codecs, you'll have to use the codecs module. > Actually users trying to figure out Unicode would probably be better served > if bytes.encode() and text.decode() did not exist. You're missing the point: the .encode() and .decode() methods are merely interfaces to the registered codecs. 
Whether they make sense for a certain codec depends on the codec, not the methods that interface to it, and again, codecs do not only exist to convert between Unicode and strings. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 15 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From thomas at xs4all.net Wed Feb 15 22:27:13 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 22:27:13 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: Message-ID: <20060215212713.GK6027@xs4all.nl> On Wed, Feb 15, 2006 at 01:38:41PM -0500, Jim Jewett wrote: > On 2/14/06, Neil Schemenauer wrote: > > People could spell it bytes(s.encode('latin-1')) > > Guido wrote: > > At the cost of an extra copying step. > > I asked: > > ... why not just add some smarts to the bytes constructor? > > Guido wrote: > > > ... the VM usually keeps an extra reference > > on the stack so the refcount is never 1. But > > you can't rely on that > > I did miss this, but _PyString_Resize seems to > work around it, and I'm not sure that the bytes > object can't be just as intimate. No, _PyString_Resize doesn't work around it. _PyString_Resize only works if the refcount is exactly one: only the caller has a reference. And by 'caller', I mean 'the calling C function'. Besides that, the caller takes care to only use _PyString_Resize on strings it created itself. Theoretically it could 'steal' a reference from someplace else, but I haven't seen _PyString_Resize-using code do that, and it would be a recipe for disaster. -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! From janssen at parc.com Wed Feb 15 22:27:15 2006 From: janssen at parc.com (Bill Janssen) Date: Wed, 15 Feb 2006 13:27:15 PST Subject: [Python-Dev] str object going in Py3K In-Reply-To: Your message of "Wed, 15 Feb 2006 11:43:13 PST." Message-ID: <06Feb15.132716pst."58633"@synergy1.parc.xerox.com> Well, I probably am, but that's not the reason. Reading has nothing to do with it. The default mode (text) corrupts data on write on a certain platform (Windows) by inserting extra bytes in the data stream. This bug particularly exhibits itself when programs developed on Linux or Mac OS X are then run on a Windows platform. I think it's a bug to default to a mode which modifies the data stream. The default mode should be 'binary'; people interested in exploiting the obsolete Windows distinction between "text" and "binary" should have to use a mode switch (I suggest "t") to put a file stream in 'text' mode. Bill > On 2/15/06, Bill Janssen wrote: > > The default behavior of the current open() in opening files as text is > > particularly grating. > > Why? Are you perhaps one of those rare folks who read more binary data > than text? > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Feb 15 22:37:52 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 13:37:52 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <-8169137970497330069@unknownmsgid> References: <-8169137970497330069@unknownmsgid> Message-ID: On 2/15/06, Bill Janssen wrote: > Well, I probably am, but that's not the reason. Reading has nothing > to do with it. Actually if you read binary data in text mode on Windows you also get corrupt (and often truncated) data, unless you're lucky enough that the binary data contains neither ^Z (EOF) nor CRLF. > The default mode (text) corrupts data on write on a certain platform > (Windows) by inserting extra bytes in the data stream. 
This bug > particularly exhibits itself when programs developed on Linux or Mac > OS X are then run on a Windows platform. I think it's a bug to > default to a mode which modifies the data stream. The default mode > should be 'binary'; people interested in exploiting the obsolete > Windows distinction between "text" and "binary" should have to use a > mode switch (I suggest "t") to put a file stream in 'text' mode. This might have been a possibility in Python 2.x where binary reads return strings. In Python 3000 binary files will return bytes objects while text files will return strings (which are decoded from unicode using an encoding that's determined when the file is opened, taking into account system and user settings as well as possible overrides passed to open()). I expect that the APIs for reading and writing binary data will be sufficiently different from that for reading/writing text that even staunch Unix programmers won't make the mistake of using the text API for creating binary files. I realize that's not the answer you're looking for, but for backwards compatibility we can't change the default on Windows in Python 2.x, so the point is moot until 3.0 or until a new binary file API is added to 2.x. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gjc at inescporto.pt Wed Feb 15 21:55:05 2006 From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro) Date: Wed, 15 Feb 2006 20:55:05 +0000 Subject: [Python-Dev] math.areclose ...? In-Reply-To: <004d01c63251$4ea87340$452c4fca@csmith> References: <00dd01c63142$3dd61280$892c4fca@csmith> <43F23815.4030307@acm.org> <004d01c63251$4ea87340$452c4fca@csmith> Message-ID: <1140036905.8544.3.camel@localhost.localdomain> Please, I don't much care about the fine points of the function's semantics, but PLEASE rename that function to are_close. Every time I see this subject in my email client I have to think for a few seconds what the hell 'areclose' means. 
This time it's not just because of the new PEP 8, 'areclose' is really really hard to read. -- Gustavo J. A. M. Carneiro The universe is always one step beyond logic From barry at python.org Wed Feb 15 22:42:35 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 16:42:35 -0500 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: References: Message-ID: <1140039755.14818.29.camel@geddy.wooz.org> On Wed, 2006-02-15 at 14:01 -0500, Jason Orendorff wrote: > Instead of byte literals, how about a classmethod bytes.from_hex(), > which works like this: > > # two equivalent things > expected_md5_hash = > bytes.from_hex('5c535024cac5199153e3834fe5c92e6a') > expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, > 227, 131, 79, 229, 201, 46, 106]) Kind of like binascii.unhexlify() but returning a bytes object. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/a7594dd8/attachment.pgp From ncoghlan at gmail.com Wed Feb 15 22:43:24 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2006 07:43:24 +1000 Subject: [Python-Dev] Generalizing *args and **kwargs In-Reply-To: <20060215103746.GJ6027@xs4all.nl> References: <20060215103746.GJ6027@xs4all.nl> Message-ID: <43F3A07C.60801@gmail.com> Thomas Wouters wrote: > Although I've made it look like I have a working implementation, I haven't. > I know exactly how to do it, though, except for the AST part ;) Once I > figure out how to properly work with the AST code I'll probably write this > patch whether it's a definite 'no' or not, just to see if I can. I wouldn't > mind if people gave their opinion, though. 
A phase 1 for Python 2.5 that allowed keyword args to go between "*args" and "**kwds" at the call site would be nice (Guido even approved the concept already; it's just that it hasn't irritated anyone enough to actually tweak the grammar. . .) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From arekm at pld-linux.org Wed Feb 15 22:43:35 2006 From: arekm at pld-linux.org (Arkadiusz Miskiewicz) Date: Wed, 15 Feb 2006 22:43:35 +0100 Subject: [Python-Dev] how bugfixes are handled? Message-ID: Hi, How are bugfixes handled? I've posted a bug and a patch + test case for a quite common issue (see google, problem mentioned on this ml) a long time ago and nothing happened with it: http://sourceforge.net/tracker/index.php?func=detail&aid=1380952&group_id=5470&atid=305470 Is anyone reviewing fixes on a regular basis? Or are just some bugfixes reviewed + committed depending on the interest of committers? Thanks, -- Arkadiusz Miśkiewicz PLD/Linux Team http://www.t17.ds.pwr.wroc.pl/~misiek/ http://ftp.pld-linux.org/ From guido at python.org Wed Feb 15 22:48:16 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 13:48:16 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F397F6.4090402@egenix.com> References: <43F397F6.4090402@egenix.com> Message-ID: On 2/15/06, M.-A. Lemburg wrote: > Jason Orendorff wrote: > > Also the pseudo-encodings ('hex', 'rot13', > > 'zip', 'uu', etc.) generally scare me. > > Those are not pseudo-encodings, they are regular codecs. > > It's a common misunderstanding that codecs are only seen as serving > the purpose of converting between Unicode and strings. > > The codec system is deliberately designed to be general enough > to also work with many other types, e.g.
it is easily possible to > write a codec that converts between the hex literal sequence you > have above and a list of ordinals: It's fine that the codec system supports this. However it's questionable that these encodings are invoked using the standard encode() and decode() APIs; and it will be more questionable once encode() returns a bytes object. Methods that return different types depending on the value of an argument are generally a bad idea. (Hence the movement to have separate opentext and openbinary or openbytes functions.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Wed Feb 15 22:53:50 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2006 07:53:50 +1000 Subject: [Python-Dev] 2.5 PEP In-Reply-To: References: Message-ID: <43F3A2EE.8060208@gmail.com> Neal Norwitz wrote: > Attached is the 2.5 release PEP 356. It's also available from: > http://www.python.org/peps/pep-0356.html > > Does anyone have any comments? Is this good or bad? Feel free to > send me comments. > > We need to ensure that PEPs 308, 328, and 343 are implemented. We > have possible volunteers for 308 and 343, but not 328. Brett is doing > 352 and Martin is doing 353. PEP 338 is pretty much ready to go, too - just waiting on Guido's review and pronouncement on the specific API used in the latest update (his last PEP parade said he was OK with the general concept, but I only posted the PEP 302 compliant version after that). Cheers, Nick.
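[For the record, Python 3 eventually settled this along exactly the lines Guido argues above: str.encode() always returns bytes, bytes.decode() always returns str, and the content-transfer transformations moved behind the codecs module and binascii. A small illustration on a modern interpreter (none of this existed in 2006):]

```python
import binascii
import codecs

text = "Gr\u00fc\u00dfe"
data = text.encode('utf-8')            # str.encode: always bytes
assert isinstance(data, bytes)
assert data.decode('utf-8') == text    # bytes.decode: always str

# content transformations no longer ride on the string methods:
assert bytes.fromhex('5c53') == binascii.unhexlify('5c53')
packed = codecs.encode(b'payload', 'zlib_codec')   # bytes -> bytes, codecs module only
assert codecs.decode(packed, 'zlib_codec') == b'payload'
```

[bytes.fromhex() is also, incidentally, the spelling under which Jason Orendorff's bytes.from_hex() proposal survived.]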
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From barry at python.org Wed Feb 15 22:57:41 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 16:57:41 -0500 Subject: [Python-Dev] A codecs nit (was Re: bytes.from_hex()) In-Reply-To: <43F397F6.4090402@egenix.com> References: <43F397F6.4090402@egenix.com> Message-ID: <1140040661.14818.42.camel@geddy.wooz.org> On Wed, 2006-02-15 at 22:07 +0100, M.-A. Lemburg wrote: > Those are not pseudo-encodings, they are regular codecs. > > It's a common misunderstanding that codecs are only seen as serving > the purpose of converting between Unicode and strings. > > The codec system is deliberately designed to be general enough > to also work with many other types, e.g. it is easily possible to > write a codec that converts between the hex literal sequence you > have above and a list of ordinals: Slightly off-topic, but one thing that's always bothered me about the current codecs implementation is that str.encode() (and friends) implicitly treats its argument as a module, and imports it, even if the module doesn't live in the encodings package. That seems like a mistake to me (and a potential security problem if the import has side-effects). I don't know whether at the very least restricting the imports to the encodings package would make sense or would break things.

>>> import sys
>>> sys.modules['smtplib']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'smtplib'
>>> ''.encode('smtplib')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
LookupError: unknown encoding: smtplib
>>> sys.modules['smtplib']
<module 'smtplib' ...>

I can't see any reason for allowing any randomly importable module to act like an encoding. -Barry -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/b3306099/attachment-0001.pgp From guido at python.org Wed Feb 15 22:58:42 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 13:58:42 -0800 Subject: [Python-Dev] how bugfixes are handled? In-Reply-To: References: Message-ID: We're all volunteers here, and we get a large volume of bugs. Unfortunately, bugfixes are reviewed on a voluntary basis. Are you aware of the standing offer that if you review 5 bugs/patches some of the developers will pay attention to your bug/patch? On 2/15/06, Arkadiusz Miskiewicz wrote: > Hi, > > How bugfixes are handled? > > I've posted a bug and a patch + test case for a quite common issue (see > google, problem mentioned on this ml) long time ago and nothing happened > with it > http://sourceforge.net/tracker/index.php?func=detail&aid=1380952&group_id=5470&atid=305470 > > Is anyone reviewing fixes on regular basis? Or just some bugfixes are > reviewed + commited depending on interest of commiters? > > Thanks, > -- > Arkadiusz Mi?kiewicz PLD/Linux Team > http://www.t17.ds.pwr.wroc.pl/~misiek/ http://ftp.pld-linux.org/ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Wed Feb 15 23:12:20 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 15 Feb 2006 23:12:20 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available References: <43F2FE65.5040308@pollenation.net> Message-ID: Georg Brandl wrote: > If something like Fredrik's new doc system is adopted don't hold your breath, by the way. 
it's clear that the current PSF-sponsored site overhaul won't lead to anything remotely close to a best-of-breed python-powered site, and I'm beginning to think that I should spend my time on other stuff. I find it a bit sad that we'll end up with a butt-ugly static and boring python.org site when we have so much talent in the python universe, but I guess that's inevitable at this stage in Python's evolution. From martin at v.loewis.de Wed Feb 15 23:15:00 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Wed, 15 Feb 2006 23:15:00 +0100 Subject: [Python-Dev] ssize_t branch merged Message-ID: <43F3A7E4.1090505@v.loewis.de> Just in case you haven't noticed, I just merged the ssize_t branch (PEP 353). If you have any corrections to the code to make which you would consider bug fixes, just go ahead. If you are uncertain how specific problems should be resolved, feel free to ask. If you think certain API changes should be made, please discuss them here - they would need to be reflected in the PEP as well. Regards, Martin From fredrik at pythonware.com Wed Feb 15 23:28:59 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 15 Feb 2006 23:28:59 +0100 Subject: [Python-Dev] bytes type discussion References: Message-ID: Guido van Rossum wrote: > - it's probably too big to attempt to rush this into 2.5 After reading some of the discussion, and seeing some of the arguments, I'm beginning to feel that we need working code to get this right. It would be nice if we could get a bytes() type into the first alpha, so the design can get some real-world exposure in real-world apps/libs before 2.5 final.
From thomas at xs4all.net Wed Feb 15 23:39:43 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 15 Feb 2006 23:39:43 +0100 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: Message-ID: <20060215223943.GL6027@xs4all.nl> On Wed, Feb 15, 2006 at 11:28:59PM +0100, Fredrik Lundh wrote: > After reading some of the discussion, and seeing some of the arguments, > I'm beginning to feel that we need working code to get this right. > > It would be nice if we could get a bytes() type into the first alpha, so > the design can get some real-world exposure in real-world apps/libs before 2.5 final. I agree that working code would be nice, but I don't see why it should be in an alpha release. IMHO it shouldn't be in an alpha release until it at least looks good enough for the developers, and good enough to put in a PEP. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From fredrik at pythonware.com Wed Feb 15 23:51:07 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 15 Feb 2006 23:51:07 +0100 Subject: [Python-Dev] bytes type discussion References: <20060215223943.GL6027@xs4all.nl> Message-ID: Thomas Wouters wrote: > > After reading some of the discussion, and seeing some of the arguments, > > I'm beginning to feel that we need working code to get this right. > > > > It would be nice if we could get a bytes() type into the first alpha, so > > the design can get some real-world exposure in real-world apps/libs before 2.5 final. > > I agree that working code would be nice, but I don't see why it should be in > an alpha release. IMHO it shouldn't be in an alpha release until it at least > looks good enough for the developers, and good enough to put in a PEP. I'm not convinced that the PEP will be good enough without experience from using a bytes type in *real-world* (i.e. *existing*) byte-crunching applications.
if we put it in an early alpha, we can use it with real code, fix any issues that arise, and even remove it if necessary, before 2.5 final. if it goes in late, we'll be stuck with whatever the PEP says. From fuzzyman at voidspace.org.uk Wed Feb 15 23:54:09 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 15 Feb 2006 22:54:09 +0000 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <43F31C36.4080109@voidspace.org.uk> Message-ID: <43F3B111.9080107@voidspace.org.uk> Guido van Rossum wrote: > On 2/15/06, Fuzzyman wrote: > >> Forcing the programmer to be aware of encodings, also pushes the same >> requirement onto the user (who is often the source of the text in question). >> > > The programmer shouldn't have to be aware of encodings most of the > time -- it's the job of the I/O library to determine the end user's > (as opposed to the language's) default encoding dynamically and act > accordingly. Users who use non-ASCII characters without informing the > OS of their encoding are in a world of pain, *unless* they use the OS > default encoding (which may vary per locale). If the OS can figure out > the default encoding, so can the Python I/O library. Many apps won't > have to go beyond this at all. > > Note that I don't want to use this OS/user default encoding as the > default encoding between bytes and strings; once you are reading bytes > you are writing "grown-up" code and you will have to be explicit. It's > only the I/O library that should automatically encode on write and > decode on read. > > >> Currently you can read a text file and process it - making sure that any >> changes/requirements only use ascii characters. It therefore doesn't matter >> what 8 bit ascii-superset encoding is used in the original. If you force the >> programmer to specify the encoding in order to read the file, they would >> have to pass that requirement onto their user. Their user is even less >> likely to be encoding aware than the programmer.
>> > I disagree -- the user most likely has set or received a default > encoding when they first got the computer, and that's all they are > using. If other tools (notepad, wordpad, emacs, vi etc.) can figure > out the encoding, so can Python's I/O library. > > I'm intrigued by the encoding guessing techniques you envisage. I currently use a modified version of something contained within docutils. I read the file in binary and first check for a UTF8 or UTF16 BOM. Then I try to decode the text using the following encodings (in this order):

    ascii
    UTF8
    locale.nl_langinfo(locale.CODESET)
    locale.getlocale()[1]
    locale.getdefaultlocale()[1]
    ISO8859-1
    cp1252

(The encodings returned by the locale calls are only used on platforms for which they exist.) The first decode that doesn't blow up, I assume is correct. The problem I have is that I usually (for the application I have in mind anyway) then want to re-encode into a consistent encoding rather than back into the original encoding. If the encoding of the original (usually unspecified) is any arbitrary 8-bit ascii superset (as it usually is), then it will probably not blow up if decoded with any other arbitrary 8-bit encoding. This means I sometimes get junk. I'm curious whether there is anything extra I could do. This is possibly beyond the scope of this discussion (in which case I apologise), but we are discussing the techniques the I/O layer would use to 'guess' the encoding of a file opened in text mode - so maybe it's not so off topic. There is also the following cookbook recipe that uses a heuristic to guess encoding: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/163743 XML, HTML, or other text streams may also contain additional information about their encoding - which may be unreliable.
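[Michael's first-decode-that-doesn't-blow-up loop can be sketched roughly like this — a hedged sketch in modern spelling, where locale.getpreferredencoding() stands in for the nl_langinfo/getlocale calls and the name guess_decode is made up for illustration:]

```python
import codecs
import locale

CANDIDATES = ['ascii', 'utf-8', locale.getpreferredencoding(False),
              'iso8859-1', 'cp1252']

def guess_decode(data):
    # BOM check first, as in the docutils-derived code ...
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode('utf-8')
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode('utf-16')  # the utf-16 codec consumes the BOM itself
    # ... then the first decode that doesn't blow up wins.
    for enc in CANDIDATES:
        if not enc:
            continue
        try:
            return data.decode(enc)
        except (UnicodeDecodeError, LookupError):
            continue
    return data.decode('latin-1')  # last resort: a decode that cannot fail

print(guess_decode('h\u00e9llo'.encode('utf-8')))  # héllo
```

[The junk problem Michael describes is visible here too: whichever earlier candidate happens to decode without error wins, right or wrong.]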
:-) All the best, Michael Foord From tim at pollenation.net Thu Feb 16 00:02:00 2006 From: tim at pollenation.net (Tim Parkin) Date: Wed, 15 Feb 2006 23:02:00 +0000 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <43F2FE65.5040308@pollenation.net> Message-ID: <43F3B2E8.6090103@pollenation.net> Fredrik Lundh wrote: > Georg Brandl wrote: >>If something like Fredrik's new doc system is adopted > > don't hold your breath, by the way. it's clear that the current PSF-sponsored > site overhaul won't lead to anything remotely close to a best-of-breed python-powered site, and I'm beginning to think that I should spend my time on other > stuff. > > I find it a bit sad that we'll end up with a butt-ugly static and boring python.org > site when we have so much talent in the python universe, but I guess that's inevitable at this stage in Python's evolution. > > Some very large sites - and some may say some very interesting, very large sites - are delivered as static html (for some time the two biggest sites in the uk were both delivered as static html, one of which was bbc.co.uk and the other was sportinglife.com for which I used to be the main web developer. As far as I know the bbc and sporting life still both use static html for a large portion of their content). Regarding the python site, it was a conscious decision to deliver the pages as static html. This was for many reasons, of which a prominent one (but by no means the only major one) was mirroring. One of the advantages of a semantically structured website that uses css for layout and style is that, as far as design goes, you are welcome to re-style the html using css; we can also offer it as an alternate stylesheet (just as I've added a 'large font' style and a 'default font settings' style).
However, design is a subjective thing - I've spent quite a bit of time reacting to the majority of constructive feedback (probably far too much time when I should have been getting content migrated) but obviously it won't please everyone :-) As for cutting edge, it's using twisted, restructured text, nevow, clean urls, xhtml, semantic markup, css2, interfaces, adaption, eggs, the path module, moinmoin, yaml (to avoid xml), etc - just because it's generating all of the html up front rather than at runtime doesn't mean that it's not best-of-breed (although I'm not sure what best-of-breed is; I'm presuming it's some sort of accolade for excellence in python programming; something I don't think I would be qualified to judge, never mind receive). However, back to Georg's comment, we could use mod_rewrite to map: /lib/sets to: /doc/lib/module-sets.html with RewriteRule ^/lib/(.*)$ /doc/lib/module-$1.html [L,R=301] (not tested) Whether that is a good idea or not is another matter. Tim Parkin From fredrik at pythonware.com Thu Feb 16 00:11:38 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 00:11:38 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available References: <43F2FE65.5040308@pollenation.net> <43F3B2E8.6090103@pollenation.net> Message-ID: Tim Parkin wrote: > As for cutting edge, it's using twisted, restructured text, nevow, clean > urls, xhtml, semantic markup, css2, interfaces, adaption, eggs, the path > module, moinmoin, yaml (to avoid xml), that's not cutting edge, that's buzzword bingo. > something I don't think I would be qualified to judge, never mind receive). no, you're not qualified. yet, someone gave you total control over the future of python.org, and there's no way to make you give it up, despite the fact that you're over a year late and the stuff you've delivered this far is massively underwhelming. that's the problem.
From nas at arctrix.com Thu Feb 16 00:14:08 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 15 Feb 2006 23:14:08 +0000 (UTC) Subject: [Python-Dev] bytes type needs a new champion References: Message-ID: Guido van Rossum wrote: > Anyway, we need a new PEP author who can take the current > discussion and turn it into a coherent PEP. I'm not sure that I have time to be the official champion. Right now I'm spending some time to collect all the ideas presented in the email messages and put them into a draft PEP. Hopefully that will be useful. Neil From guido at python.org Thu Feb 16 00:15:48 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 15:15:48 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F3B111.9080107@voidspace.org.uk> References: <43F31C36.4080109@voidspace.org.uk> <43F3B111.9080107@voidspace.org.uk> Message-ID: On 2/15/06, Michael Foord wrote: > I'm intrigued by the encoding guessing techniques you envisage. Don't hold your breath. *I* am not very interested in guessing encodings -- I was just commenting on posts by others that mentioned difficulties caused by this approach. My position is that the standard library (with the exception of XML processing code perhaps) shouldn't be *guessing* encodings but simply using the encoding specified by the user (or the OS default) in the environment or some such place. (It is OS dependent how to retrieve this information but my hypothesis is that every OS with any kind of text support has a way to get this info -- even if it's as rudimentary as "it's always ASCII" (v7 Unix :-) or "it's always UTF-8" (I am hoping this will eventually be the answer in the distant future).)
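[Guido's no-guessing position is essentially what Python 3 shipped: text-mode open() uses the locale's configured default encoding unless the caller says otherwise. A minimal sketch on a modern interpreter — the temp-file path and contents are made up for illustration:]

```python
import locale
import os
import tempfile

enc = locale.getpreferredencoding(False)  # the user's/OS's configured default
print(enc)

path = os.path.join(tempfile.mkdtemp(), 'note.txt')
with open(path, 'w') as f:            # no guessing: written with `enc`
    f.write('plain text\n')
with open(path, encoding=enc) as f:   # reading back with the same default
    assert f.read() == 'plain text\n'
```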
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 16 00:20:16 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 15:20:16 -0800 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <20060215223943.GL6027@xs4all.nl> Message-ID: I'm actually assuming to put this off until 2.6 anyway. On 2/15/06, Fredrik Lundh wrote: > Thomas Wouters wrote: > > > > After reading some of the discussion, and seen some of the arguments, > > > I'm beginning to feel that we need working code to get this right. > > > > > > It would be nice if we could get a bytes() type into the first alpha, so > > > the design can get some real-world exposure in real-world apps/libs be- > > > fore 2.5 final. > > > > I agree that working code would be nice, but I don't see why it should be in > > an alpha release. IMHO it shouldn't be in an alpha release until it at least > > looks good enough for the developers, and good enough to put in a PEP. > > I'm not convinced that the PEP will be good enough without experience > from using a bytes type in *real-world* (i.e. *existing*) byte-crunching > applications. > > if we put it in an early alpha, we can use it with real code, fix any issues > that arises, and even remove it if necessary, before 2.5 final. if it goes in > late, we'll be stuck with whatever the PEP says. > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 16 00:21:08 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 15:21:08 -0800 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: <43F3A7E4.1090505@v.loewis.de> References: <43F3A7E4.1090505@v.loewis.de> Message-ID: Great! 
I'll mark the PEP as accepted. (Which doesn't mean you can't update it if changes are found necessary.) --Guido On 2/15/06, "Martin v. L?wis" wrote: > Just in case you haven't noticed, I just merged > the ssize_t branch (PEP 353). > > If you have any corrections to the code to make which > you would consider bug fixes, just go ahead. > > If you are uncertain how specific problems should be resolved, > feel free to ask. > > If you think certain API changes should be made, please > discuss them here - they would need to be reflected in the > PEP as well. > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Thu Feb 16 00:54:56 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 12:54:56 +1300 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F30152.3030905@ronadam.com> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> <43F28AFC.7050807@canterbury.ac.nz> <43F2A3D6.3060703@ronadam.com> <43F2CDBC.8090305@canterbury.ac.nz> <43F30152.3030905@ronadam.com> Message-ID: <43F3BF50.103@canterbury.ac.nz> Ron Adam wrote: > I was presuming it would be done in C code and it will just need a > pointer to the first byte, memchr(), and then read n bytes directly into > a new memory range via memcpy(). If the object supports the buffer interface, it can be done that way. 
But if not, it would seem to make sense to fall back on the iterator protocol. > However, if it's done with a Python iterator and then each item is > translated to bytes in a sequence (much slower), an encoding will need > to be known for it to work correctly. No, it won't. When using the bytes(x) form, encoding has nothing to do with it. It's purely a conversion from one representation of an array of 0..255 to another. When you *do* want to perform encoding, you use bytes(u, encoding) and say what encoding you want to use. > Unfortunately Unicode strings > don't set an attribute to indicate its own encoding. I think you don't understand what an encoding is. Unicode strings don't *have* an encoding, because they're not encoded! Encoding is what happens when you go from a unicode string to something else. > Since some longs will be of different length, yes a bytes(0L) could give > differing results on different platforms, It's not just a matter of length. I'm not sure of the details, but I believe longs are currently stored as an array of 16-bit chunks, of which only 15 bits are used. I'm having trouble imagining a use for low-level access to that format, other than just treating it as an opaque lump of data for turning back into a long later -- in which case why not just leave it as a long in the first place. Greg From fredrik at pythonware.com Thu Feb 16 01:09:04 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 01:09:04 +0100 Subject: [Python-Dev] bytes type discussion References: <20060215223943.GL6027@xs4all.nl> Message-ID: Guido wrote: > I'm actually assuming to put this off until 2.6 anyway. makes sense. (but will there be a 2.6? isn't it time to start hacking on 3.0?)
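[Greg's recollection about the internal format is right in spirit — CPython stored longs as arrays of 15-bit digits, with later builds using 30-bit digits — and Python 3's eventual answer to "the bytes of a long" sidesteps that layout entirely with explicit int.to_bytes()/int.from_bytes(). A quick sketch on a modern interpreter:]

```python
import sys

# The digit width is an implementation detail, exposed but not meant
# to be serialized:
print(sys.int_info.bits_per_digit)  # 15 or 30, depending on the build

# Explicit, platform-independent conversion instead of raw digit access:
n = 0x0102A0
data = n.to_bytes(3, 'big')
assert data == b'\x01\x02\xa0'
assert int.from_bytes(data, 'big') == n
```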
From greg.ewing at canterbury.ac.nz Thu Feb 16 01:03:27 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 13:03:27 +1300 Subject: [Python-Dev] nice() In-Reply-To: <000e01c63226$fc342660$7c2c4fca@csmith> References: <000e01c63226$fc342660$7c2c4fca@csmith> Message-ID: <43F3C14F.3020005@canterbury.ac.nz> Smith wrote: > The problem with areclose(), however, is that it > only solves one part of the problem that needs to be solved > if two fp's *are* going to be compared: if you are going to > check if a < b you would need to do something like > > not areclose(a,b) and a < b No, no, no. If your algorithm is well-designed, it won't matter which way the comparison goes if a and b are that close. In any case, the idea behind nice() is fundamentally doomed. IT CANNOT WORK, because the numbers it's returning are still binary, not decimal. Greg From barry at python.org Thu Feb 16 01:29:17 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2006 19:29:17 -0500 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <20060215223943.GL6027@xs4all.nl> Message-ID: <1140049757.14818.45.camel@geddy.wooz.org> On Thu, 2006-02-16 at 01:09 +0100, Fredrik Lundh wrote: > (but will there be a 2.6? isn't it time to start hacking on 3.0?) We know at least there will never be a 2.10, so I think we still have time. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060215/3a87e503/attachment.pgp From nas at arctrix.com Thu Feb 16 01:36:35 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 16 Feb 2006 00:36:35 +0000 (UTC) Subject: [Python-Dev] from __future__ import unicode_strings? Message-ID: I'm in the process of summarizing the discussion on the bytes object and an idea just occurred to me.
Imagine that I want to write code that deals with strings and I want to be maximally compatible with P3k. It would be nice if I could add: from __future__ import unicode_strings and have string literals without a 'u' prefix become unicode instances. I'm not sure how tricky the implementation would be but it seems like a useful feature. An even crazier idea is to have that import change 'str' to be an alias for 'unicode'. Neil From bokr at oz.net Thu Feb 16 01:43:24 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 16 Feb 2006 00:43:24 GMT Subject: [Python-Dev] bytes type discussion References: <20060215223943.GL6027@xs4all.nl> Message-ID: <43f3ca66.902630814@news.gmane.org> On Wed, 15 Feb 2006 15:20:16 -0800, Guido van Rossum wrote: >I'm actually assuming to put this off until 2.6 anyway. > >On 2/15/06, Fredrik Lundh wrote: >> Thomas Wouters wrote: >> >> > > After reading some of the discussion, and seen some of the arguments, >> > > I'm beginning to feel that we need working code to get this right. >> > > >> > > It would be nice if we could get a bytes() type into the first alpha, so >> > > the design can get some real-world exposure in real-world apps/libs be- >> > > fore 2.5 final. >> > >> > I agree that working code would be nice, but I don't see why it should be in >> > an alpha release. IMHO it shouldn't be in an alpha release until it at least >> > looks good enough for the developers, and good enough to put in a PEP. >> >> I'm not convinced that the PEP will be good enough without experience >> from using a bytes type in *real-world* (i.e. *existing*) byte-crunching >> applications. >> >> if we put it in an early alpha, we can use it with real code, fix any issues >> that arises, and even remove it if necessary, before 2.5 final. if it goes in >> late, we'll be stuck with whatever the PEP says. >> >> >> I could hardly keep up with reading, never mind trying some things and writing coherently, so if others had that experience, 2.6 sounds +1. 
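[Editor's note: the kind of real-world experimentation Fredrik asks for above doesn't need core changes to get started — a pure-Python stand-in is enough to exercise an API. The sketch below is hypothetical (the class name, ASCII-only string handling, and indexing semantics are just one possible reading of the thread), using array.array('B') for storage.]

```python
import array

class Bytes:
    """Toy prototype of a mutable bytes type backed by array.array('B')."""

    def __init__(self, initialiser=()):
        if isinstance(initialiser, str):
            # ASCII-only when no encoding machinery is involved,
            # per the discussion.
            initialiser = initialiser.encode("ascii")
        self._data = array.array("B", initialiser)  # validates range 0..255

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            return Bytes(self._data[index])   # slicing yields another Bytes
        return self._data[index]              # single index yields an int

    def __setitem__(self, index, value):
        self._data[index] = value             # array raises on out-of-range

    def __eq__(self, other):
        return isinstance(other, Bytes) and self._data == other._data

    def __repr__(self):
        return "Bytes(%r)" % self._data.tolist()

b = Bytes([1, 2, 3])
assert b[1] == 2                  # indexing as ints...
assert b[1:] == Bytes([2, 3])     # ...slicing as Bytes
b[0] = 255
assert len(b) == 3
```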
I agree with Fredrik that an implementation to try in real-world use cases would probably yield valuable information. As a step in that direction, could we have a sub-thread on what methods to implement for bytes? I.e., which str methods make sense, which special methods? How many methods from list make sense, given that bytes will be mutable? How much of array.array('B') should be emulated? (a prototype hack could just wrap array.array for storage). Should the type really be a subclass of int? I think that might be hard for prototyping, since builtin types as bases seem to get priority subclass bypass access from some builtin functions. At least I've had some frustrations with that. If it were a kind of int, would it be an int-string, where int(bytes([65])) would work like ord does with non-length-1? BTW bytes([1,2])[1] by analogy to str should then return bytes([2]), shouldn't it? I have a feeling a lot of str-like methods will bomb if that's not so. >>> int(bytes([1,2])) # faked ;-) Traceback (most recent call last): File "", line 1, in ? TypeError: int() expected a byte, but bytes of length 2 found I've hacked a few pieces, but I think further discussion either in this thread or maybe a bytes prototype spec thread would be fruitful. By the time a prototype spec takes shape, someone will probably have beaten me to something workable, but that's ok ;-) Then a PEP will mostly be writing and collecting rationale references etc. That's really not my favorite kind of work, frankly. But I like thinking and programming. Regards, Bengt Richter From aahz at pythoncraft.com Thu Feb 16 01:55:48 2006 From: aahz at pythoncraft.com (Aahz) Date: Wed, 15 Feb 2006 16:55:48 -0800 Subject: [Python-Dev] Off-topic: www.python.org In-Reply-To: References: <43F2FE65.5040308@pollenation.net> <43F3B2E8.6090103@pollenation.net> Message-ID: <20060216005548.GB8957@panix.com> On Thu, Feb 16, 2006, Fredrik Lundh wrote: > Tim Parkin wrote: >> >> [...] > > no, you're not qualified.
yet, someone gave you total control over the > future of python.org, and there's no way to make you give it up, despite > the fact that you're over a year late and the stuff you've delivered this > far is massively underwhelming. that's the problem. In all fairness to Tim (and despite the fact that emotionally I agree with you), the fact is that there had been essentially no forward motion on www.python.org redesign until he went to work. Even if we end up chucking out all his work in favor of something else, I'll consider the PSF's money well-spent for bringing the community energy into it. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From exarkun at divmod.com Thu Feb 16 01:56:17 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Wed, 15 Feb 2006 19:56:17 -0500 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: Message-ID: <20060216005617.6122.1311347827.divmod.quotient.50@ohm> On Thu, 16 Feb 2006 00:36:35 +0000 (UTC), Neil Schemenauer wrote: >I'm in the process of summarizing the dicussion on the bytes object >and an idea just occured to me. Imagine that I want to write code >that deals with strings and I want to be maximally compatible with >P3k. It would be nice if I could add: > > from __future__ import unicode_strings > >and have string literals without a 'u' prefix become unicode >instances. I'm not sure how tricky the implementation would be but >it seems like a useful feature. FWIW, I've considered this before, and superficially at least, it seems attractive. > >An even crazier idea is to have that import change 'str' to be >an alias for 'unicode'. That's further than I went, though :) Until there's a replacement for str, this would make it impossible to do certain things with that __future__ import in effect. 
Jean-Paul From guido at python.org Thu Feb 16 02:23:56 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 17:23:56 -0800 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: References: Message-ID: On 2/15/06, Neil Schemenauer wrote: > I'm in the process of summarizing the dicussion on the bytes object > and an idea just occured to me. Imagine that I want to write code > that deals with strings and I want to be maximally compatible with > P3k. It would be nice if I could add: > > from __future__ import unicode_strings > > and have string literals without a 'u' prefix become unicode > instances. I'm not sure how tricky the implementation would be but > it seems like a useful feature. Didn't we have a command-line option to do this? I believe it was removed because nobody could see the point. (Or am I hallucinating? After several days of non-stop discussing bytes that must be considered a possibility.) Of course a per-module switch is much more useful. > An even crazier idea is to have that import change 'str' to be > an alias for 'unicode'. Now *that's* crazy talk. :-) It's probably easier to do that by placing a line str = unicode at the top of the file. Of course (like a good per-module switch should!) this won't affect code in other modules that you invoke so it's not clear that it always does the right thing. But it's a start. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Thu Feb 16 02:25:06 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 15 Feb 2006 20:25:06 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <43F2FE65.5040308@pollenation.net> <43F3B2E8.6090103@pollenation.net> Message-ID: I don't think this message is on-topic for python-dev. There are lots of great places to discuss the design of the python web site, but the list for developers doesn't seem like a good place for it. 
Do we need a different list for people to gripe^H^H^H^H^H discuss the web site? Jeremy On 2/15/06, Fredrik Lundh wrote: > Tim Parkin wrote: > > > As for cutting edge, it's using twisted, restructured text, nevow, clean > > urls, xhtml, semantic markup, css2, interfaces, adaption, eggs, the path > > module, moinmoin, yaml (to avoid xml), > > that's not cutting edge, that's buzzword bingo. > > > something I don't think I would be qualified to judge,never mind receive). > > no, you're not qualified. yet, someone gave you total control over the > future of python.org, and there's no way to make you give it up, despite > the fact that you're over a year late and the stuff you've delivered this > far is massively underwhelming. that's the problem. > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From thomas at xs4all.net Thu Feb 16 02:43:02 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 16 Feb 2006 02:43:02 +0100 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: References: Message-ID: <20060216014302.GN6027@xs4all.nl> On Wed, Feb 15, 2006 at 05:23:56PM -0800, Guido van Rossum wrote: > > from __future__ import unicode_strings > Didn't we have a command-line option to do this? I believe it was > removed because nobody could see the point. (Or am I hallucinating? > After several days of non-stop discussing bytes that must be > considered a possibility.) We do, and it's not been removed: the -U switch. Python 2.3.5 (#2, Nov 21 2005, 01:27:27) >>> "" u'' Python 2.4.2 (#2, Nov 21 2005, 02:24:28) >>> "" u'' Python 2.5a0 (trunk:42390, Feb 16 2006, 00:12:03) >>> "" u'' I've never seen it *used*, though, and IIRC there were quite a number of stdlib modules that broke when you used it, at least back when it was introduced. 
-- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From rrr at ronadam.com Thu Feb 16 03:11:59 2006 From: rrr at ronadam.com (Ron Adam) Date: Wed, 15 Feb 2006 20:11:59 -0600 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: <43F3BF50.103@canterbury.ac.nz> References: <43ed8aaf.493103945@news.gmane.org> <5.1.1.6.0.20060213131525.02100c00@mail.telecommunity.com> <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <5.1.1.6.0.20060213203407.03619350@mail.telecommunity.com> <43F28AFC.7050807@canterbury.ac.nz> <43F2A3D6.3060703@ronadam.com> <43F2CDBC.8090305@canterbury.ac.nz> <43F30152.3030905@ronadam.com> <43F3BF50.103@canterbury.ac.nz> Message-ID: <43F3DF6F.5080503@ronadam.com> Greg Ewing wrote: > I think you don't understand what an encoding is. Unicode > strings don't *have* an encoding, because they're not encoded! > Encoding is what happens when you go from a unicode string > to something else. Ah.. ok, my mental picture was a bit off. I had this reversed somewhat. > It's not just a matter of length. I'm not sure of the > details, but I believe longs are currently stored as an > array of 16-bit chunks, of which only 15 bits are used. > I'm having trouble imagining a use for low-level access > to that format, other than just treating it as an opaque > lump of data for turning back into a long later -- in > which case why not just leave it as a long in the first > place. I had a lapse, thinking Python's longs are the same as C longs. I know Python's longs can get much much bigger. The idea was to be able to show the byte data as is in whatever form it takes and not try to change it, whether it's longs, floats, strings, etc.
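[Editor's note: if the goal is a byte view of a long, Greg's point is that the interpreter's internal 15-bits-per-chunk digit array can stay opaque; an explicit conversion with a stated width and byte order is portable. A sketch, using the int.to_bytes/int.from_bytes methods that Python 3 later standardised for exactly this:]

```python
n = 2 ** 100 + 12345

# An explicit width and byte order make the result platform-independent,
# unlike dumping the interpreter's internal digit array.
width = (n.bit_length() + 7) // 8
raw = n.to_bytes(width, "big")

# Round-trip: the bytes unambiguously reconstruct the original long.
assert int.from_bytes(raw, "big") == n
assert len(raw) == 13   # 101 bits -> 13 bytes
```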
Cheers, Ron From aahz at pythoncraft.com Thu Feb 16 03:35:29 2006 From: aahz at pythoncraft.com (Aahz) Date: Wed, 15 Feb 2006 18:35:29 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] In-Reply-To: References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> <43F1C075.3060207@canterbury.ac.nz> <20E03E3D-CDDC-49E7-BF14-A6070FFC09C1@python.org> Message-ID: <20060216023529.GA28587@panix.com> On Tue, Feb 14, 2006, Guido van Rossum wrote: > > Anyway, I'm now convinced that bytes should act as an array of ints, > where the ints are restricted to range(0, 256) but have type int. range(0, 255)? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From greg.ewing at canterbury.ac.nz Thu Feb 16 03:47:55 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 15:47:55 +1300 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <43F2FDDD.3030200@gmail.com> Message-ID: <43F3E7DB.4010502@canterbury.ac.nz> Guido van Rossum wrote: > So how about > openbytes? This clearly links the resulting object with the bytes > type, which is mutually reassuring. That looks quite nice. Another thought -- what is going to happen to os.open? Will it change to return bytes, or will there be a new os.openbytes? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From nas at arctrix.com Thu Feb 16 03:49:11 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 15 Feb 2006 19:49:11 -0700 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: <20060216014302.GN6027@xs4all.nl> References: <20060216014302.GN6027@xs4all.nl> Message-ID: <20060216024911.GA363@mems-exchange.org> On Thu, Feb 16, 2006 at 02:43:02AM +0100, Thomas Wouters wrote: > On Wed, Feb 15, 2006 at 05:23:56PM -0800, Guido van Rossum wrote: > > > > from __future__ import unicode_strings > > > Didn't we have a command-line option to do this? I believe it was > > removed because nobody could see the point. (Or am I hallucinating? > > After several days of non-stop discussing bytes that must be > > considered a possibility.) > > We do, and it's not been removed: the -U switch. As Guido alluded, the global switch is useless. A per-module switch is something that could actually be useful. One nice advantage is that you could write code that works the same with Jython (wrt string literals anyhow). Neil From bob at redivi.com Thu Feb 16 03:49:39 2006 From: bob at redivi.com (Bob Ippolito) Date: Wed, 15 Feb 2006 18:49:39 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
In-Reply-To: <20060216023529.GA28587@panix.com> References: <5.1.1.6.0.20060213170737.020d73e8@mail.telecommunity.com> <43F11047.705@egenix.com> <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> <43F1C075.3060207@canterbury.ac.nz> <20E03E3D-CDDC-49E7-BF14-A6070FFC09C1@python.org> <20060216023529.GA28587@panix.com> Message-ID: <6D24EB74-501D-4F58-BCFB-8A839A73C31F@redivi.com> On Feb 15, 2006, at 6:35 PM, Aahz wrote: > On Tue, Feb 14, 2006, Guido van Rossum wrote: >> >> Anyway, I'm now convinced that bytes should act as an array of ints, >> where the ints are restricted to range(0, 256) but have type int. > > range(0, 255)? No, Guido was correct. range(0, 256) is [0, 1, 2, ..., 255]. -bob From nas at arctrix.com Thu Feb 16 03:55:16 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 15 Feb 2006 19:55:16 -0700 Subject: [Python-Dev] Pre-PEP: The "bytes" object Message-ID: <20060216025515.GA474@mems-exchange.org> This could be a replacement for PEP 332. At least I hope it can serve to summarize the previous discussion and help focus on the currently undecided issues. I'm too tired to dig up the rules for assigning it a PEP number. Also, there are probably silly typos, etc. Sorry. Neil -------------- next part -------------- PEP: XXX Title: The "bytes" object Version: $Revision$ Last-Modified: $Date$ Author: Neil Schemenauer Status: Draft Type: Standards Track Content-Type: text/plain Created: 15-Feb-2006 Python-Version: 2.5 Post-History: Abstract ======== This PEP outlines the introduction of a raw bytes sequence object. Adding the bytes object is one step in the transition to Unicode based str objects. Motivation ========== Python's current string objects are overloaded. They serve to hold both sequences of characters and sequences of bytes. This overloading of purpose leads to confusion and bugs. 
In future versions of Python, string objects will be used for holding character data. The bytes object will fulfil the role of a byte container. Eventually the unicode built-in will be renamed to str and the str object will be removed. Specification ============= A bytes object stores a mutable sequence of integers that are in the range 0 to 255. Unlike string objects, indexing a bytes object returns an integer. Assigning an element using an object that is not an integer causes a TypeError exception. Assigning an element to a value outside the range 0 to 255 causes a ValueError exception. The __len__ method of bytes returns the number of integers stored in the sequence (i.e. the number of bytes). The constructor of the bytes object has the following signature: bytes([initialiser[, encoding]]) If no arguments are provided then an object containing zero elements is created and returned. The initialiser argument can be a string or a sequence of integers. The pseudo-code for the constructor is:

    def bytes(initialiser=[], encoding=None):
        if isinstance(initialiser, basestring):
            if encoding is None or encoding.lower() == 'ascii':
                # raises UnicodeDecodeError if the string contains
                # non-ASCII characters
                initialiser = initialiser.encode('ascii')
            elif isinstance(initialiser, unicode):
                initialiser = initialiser.encode(encoding)
            else:
                # silently ignore the encoding argument if the
                # initialiser is a str object
                pass
            initialiser = [ord(c) for c in initialiser]
        elif encoding is not None:
            raise TypeError("explicit encoding invalid for non-string "
                            "initialiser")
        create bytes object and fill with integers from initialiser
        return bytes object

The __repr__ method returns a string that can be evaluated to generate a new bytes object containing the same sequence of integers. The sequence is represented by a list of ints. For example: >>> repr(bytes([10, 20, 30])) 'bytes([10, 20, 30])' The object has a decode method equivalent to the decode method of the str object.
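[Editor's note: a rough executable rendering of the constructor pseudo-code above, transcribed into Python 3 terms (str stands in for the unicode type, and a plain list stands in for the proposed mutable bytes object). It is an approximation for experimentation, not the spec itself.]

```python
def make_bytes(initialiser=(), encoding=None):
    """Approximate the proposed bytes() constructor from the draft PEP."""
    if isinstance(initialiser, str):
        if encoding is None or encoding.lower() == "ascii":
            # Raises UnicodeEncodeError if the string is not pure ASCII.
            initialiser = initialiser.encode("ascii")
        else:
            initialiser = initialiser.encode(encoding)
    elif encoding is not None:
        raise TypeError("explicit encoding invalid for non-string "
                        "initialiser")
    out = []
    for item in initialiser:
        item = int(item)
        if not 0 <= item <= 255:
            raise ValueError("byte value out of range: %d" % item)
        out.append(item)
    return out  # stand-in for the real mutable bytes object

assert make_bytes() == []
assert make_bytes("AB") == [65, 66]
assert make_bytes("\u00e9", "latin-1") == [233]
assert make_bytes([10, 20, 30]) == [10, 20, 30]
```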
The object has a classmethod fromhex that takes a string of characters from the set [0-9a-fA-F ] and returns a bytes object (similar to binascii.unhexlify). For example: >>> bytes.fromhex('5c5350ff') bytes([92, 83, 80, 255]) >>> bytes.fromhex('5c 53 50 ff') bytes([92, 83, 80, 255]) The object has a hex method that does the reverse conversion (similar to binascii.hexlify): >>> bytes([92, 83, 80, 255]).hex() '5c5350ff' The bytes object has methods similar to the list object: __add__ __contains__ __delitem__ __delslice__ __eq__ __ge__ __getitem__ __getslice__ __gt__ __hash__ __iadd__ __imul__ __iter__ __le__ __len__ __lt__ __mul__ __ne__ __reduce__ __reduce_ex__ __repr__ __rmul__ __setitem__ __setslice__ append count extend index insert pop remove Out of scope issues =================== * If we provide a literal syntax for bytes then it should look distinctly different from the syntax for literal strings. Also, a new type, even built-in, is much less drastic than a new literal (which requires lexer and parser support in addition to everything else). Since there appears to be no immediate need for a literal representation, designing and implementing one is out of the scope of this PEP. * Python 3k will have a much different I/O subsystem. Deciding how that I/O subsystem will work and interact with the bytes object is out of the scope of this PEP. * It has been suggested that a special method named __bytes__ be added to the language to allow objects to be converted into byte arrays. This decision is out of scope. Unresolved issues ================= * Perhaps the bytes object should be implemented as an extension module until we are more sure of the design (similar to how the set object was prototyped). * Should the bytes object implement the buffer interface? Probably, but we need to look into the implications of that (e.g. regex operations on byte arrays). * Should the object implement __reversed__ and reverse? Should it implement sort?
* Need to clarify what some of the methods do. How are comparisons done? Hashing? Pickling and marshalling? Questions and answers ===================== Q: Why have the optional encoding argument when the encode method of Unicode objects does the same thing? A: In the current version of Python, the encode method returns a str object and we cannot change that without breaking code. The construct bytes(s.encode(...)) is expensive because it has to copy the byte sequence multiple times. Also, Python generally provides two ways of converting an object of type A into an object of type B: ask an A instance to convert itself to a B, or ask the type B to create a new instance from an A. Depending on what A and B are, both APIs make sense; sometimes reasons of decoupling require that A can't know about B, in which case you have to use the latter approach; sometimes B can't know about A, in which case you have to use the former. Q: Why does bytes ignore the encoding argument if the initialiser is a str? A: There is no sane meaning that the encoding can have in that case. str objects *are* byte arrays and they know nothing about the encoding of character data they contain. We need to assume that the programmer has provided a str object that already uses the desired encoding. If you need something other than a pure copy of the bytes then you need to first decode the string. For example: bytes(s.decode(encoding1), encoding2) Q: Why not have the encoding argument default to Latin-1 (or some other encoding that covers the entire byte range) rather than ASCII? A: The system default encoding for Python is ASCII. It seems least confusing to use that default. Also, in Py3k, using Latin-1 as the default might not be what users expect. For example, they might prefer a Unicode encoding. Any default will not always work as expected. At least ASCII will complain loudly if you try to encode non-ASCII data. Copyright ========= This document has been placed in the public domain. ..
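[Editor's note: the fromhex/hex pair and the decode-then-encode idiom from the Q&A above can both be tried against the bytes type that eventually shipped in Python 3, which adopted these methods (bytes.hex from 3.5 on). A small sketch:]

```python
# fromhex accepts hex digits with optional spaces, as in the draft.
b = bytes.fromhex("5c 53 50 ff")
assert list(b) == [92, 83, 80, 255]
assert b.hex() == "5c5350ff"

# Transcoding per the Q&A: decode from one encoding, re-encode in another
# (the Python 3 spelling of bytes(s.decode(encoding1), encoding2)).
latin1_data = "caf\u00e9".encode("latin-1")
utf8_data = latin1_data.decode("latin-1").encode("utf-8")
assert utf8_data == b"caf\xc3\xa9"
```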
Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From guido at python.org Thu Feb 16 03:57:26 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2006 18:57:26 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F3E7DB.4010502@canterbury.ac.nz> References: <43F2FDDD.3030200@gmail.com> <43F3E7DB.4010502@canterbury.ac.nz> Message-ID: On 2/15/06, Greg Ewing wrote: > Guido van Rossum wrote: > > So how about > > openbytes? This clearly links the resulting object with the bytes > > type, which is mutually reassuring. > > That looks quite nice. > > Another thought -- what is going to happen to os.open? > Will it change to return bytes, or will there be a new > os.openbytes? Hm, I hadn't thought about that yet. On Windows, os.open has the ability to set text or binary mode. But IMO it's better to make this always use binary mode. My expectation is that the Py3k standard I/O library will do all of its own conversions on top of binary files anyway -- if you missed it, I'd like to get rid of any ties to C's stdio. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Thu Feb 16 04:00:16 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 16:00:16 +1300 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F364EA.3010507@egenix.com> References: <43F2FDDD.3030200@gmail.com> <43F364EA.3010507@egenix.com> Message-ID: <43F3EAC0.4050406@canterbury.ac.nz> M.-A. Lemburg wrote: > E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0 > renamed to str.openfile()) This seems wrong to me, because it creates an unnecessary dependency of the bytes/str/unicode types on the file type. These types should remain strictly focused on being just containers for data. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! 
| Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 16 04:06:45 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 16:06:45 +1300 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <1140025909.13758.43.camel@geddy.wooz.org> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> Message-ID: <43F3EC45.8090301@canterbury.ac.nz> Barry Warsaw wrote: > If we go with two functions, I'd much rather hang them off of the file > type object then add two new builtins. I really do think file.bytes() > and file.text() (a.k.a. open.bytes() and open.text()) is better than > opentext() or openbytes(). I'm worried about feeping creaturism of the file type here. To my mind, the file type already has too many features, and this hinders code that wants to define its own file-like objects. In 3.0 I'd like to see the file type reduced to having as simple an interface as possible (basically just read/write) and all other stuff (readlines, text codecs, etc.) implemented as wrappers around it. To be compatible with that model, opentext() etc. need to be factory functions returning the appropriate stack of objects. As such they shouldn't be class methods of any type. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 16 04:12:23 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 16:12:23 +1300 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] 
In-Reply-To: References: Message-ID: <43F3ED97.40901@canterbury.ac.nz> Jason Orendorff wrote: > Also the pseudo-encodings ('hex', > 'rot13', 'zip', 'uu', etc.) generally scare me. I think these will have to cease being implemented as encodings in 3.0. They should really never have been in the first place. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 16 04:29:32 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 16:29:32 +1300 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <20060215191856.GA7705@activestate.com> References: <43F27D25.7070208@canterbury.ac.nz> <20060215012214.GA31050@activestate.com> <20060215191856.GA7705@activestate.com> Message-ID: <43F3F19C.1000909@canterbury.ac.nz> Trent Mick wrote: > On Windows you download an MSI (it ends up in your browser downloads > folder), it starts the installation, and the end of the installation it > starts the app for you. Which then conveniently inserts a virus into my system. No, thanks. (Okay up until that last bit, though.) This isn't really a problem with the Mac, but with the Mac-Web interface. If there were a file format (e.g. .app.tar.gz) that the Mac would recognise as an app and unpack automatically and put in an appropriate place, things would be much the same. (Including running it automatically, if you were insane enough to turn that option on.) > ...anyway this is getting seriously OT for python-dev. :) Agreed. I will say no more about it here. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From kbk at shore.net Thu Feb 16 05:00:22 2006 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed, 15 Feb 2006 23:00:22 -0500 (EST) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200602160400.k1G40MGE014091@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 399 open ( +8) / 3042 closed ( +4) / 3441 total (+12) Bugs : 923 open ( +8) / 5553 closed (+13) / 6476 total (+21) RFE : 209 open ( +0) / 198 closed ( +1) / 407 total ( +1) New / Reopened Patches ______________________ urllib proxy_bypass broken (2006-02-07) http://python.org/sf/1426648 opened by Anthony Tuininga Implementation of PEP 357 (2006-02-09) CLOSED http://python.org/sf/1428778 opened by Travis Oliphant pdb: fix for 1326406 (import __main__ pdb failure) (2006-02-10) http://python.org/sf/1429539 opened by Ilya Sandler Implement PEP 357 for real (2006-02-11) http://python.org/sf/1429591 opened by Travis Oliphant PEP 338 implementation (2006-02-11) http://python.org/sf/1429601 opened by Nick Coghlan PEP 338 documentation (2006-02-11) http://python.org/sf/1429605 opened by Nick Coghlan Link Python modules to libpython on linux if --enable-shared (2006-02-11) http://python.org/sf/1429775 opened by Gustavo J. A. M. 
Carneiro trace.py needs to know about doctests (2006-02-11) http://python.org/sf/1429818 opened by Marius Gedminas Missing HCI sockets in bluetooth code from socketmodule (2006-02-15) http://python.org/sf/1432399 opened by Philippe Biondi Patches Closed ______________ Implementation of PEP 357 (2006-02-10) http://python.org/sf/1428778 closed by ncoghlan Prefer linking against ncursesw over ncurses library (2006-02-09) http://python.org/sf/1428494 closed by loewis Enhancing '-m' to support packages (PEP 338) (2004-10-09) http://python.org/sf/1043356 closed by ncoghlan File-iteration and read* method protection (2006-01-05) http://python.org/sf/1397960 closed by twouters New / Reopened Bugs ___________________ tarfile.open bug / corrupt data (2006-02-08) http://python.org/sf/1427552 opened by Chris86 List not initialized if used as default argument (2006-02-08) CLOSED http://python.org/sf/1427789 opened by Jason Crash on invalid coding pragma (2006-02-09) CLOSED http://python.org/sf/1428264 opened by ocean-city add /usr/local support (2006-02-09) CLOSED http://python.org/sf/1428789 opened by Karol Pietrzak set documentation deficiencies (2006-02-10) CLOSED http://python.org/sf/1429063 opened by Keith Briggs For loop exit early (2006-02-10) http://python.org/sf/1429481 opened by msmith segfault in FreeBSD (2006-02-11) CLOSED http://python.org/sf/1429585 opened by aix-d urllib.py: AttributeError on BadStatusLine (2006-02-11) http://python.org/sf/1429783 opened by Robert Kiendl smtplib: empty mail addresses (2006-02-12) http://python.org/sf/1430298 opened by Freek Dijkstra urlib2 (2006-02-13) http://python.org/sf/1430435 opened by halfik recursive __getattr__ in thread crashes OS X (2006-02-12) http://python.org/sf/1430436 opened by Aaron Swartz CSV Sniffer fails to report mismatch of column counts (2006-02-13) http://python.org/sf/1431091 opened by Vinko Logging hangs thread after detaching a StreamHandler's termi (2006-02-13) http://python.org/sf/1431253 opened by 
Yang Zhang long path support in win32 part of os.listdir(posixmodule.c) (2006-02-14) http://python.org/sf/1431582 opened by Sergey Dorofeev pydoc still doesn't handle lambda well (2006-02-15) http://python.org/sf/1432260 opened by William McVey Descript of file-object read() method is wrong. (2006-02-15) http://python.org/sf/1432343 opened by Grant Edwards arrayobject should use PyObject_VAR_HEAD (2006-02-15) http://python.org/sf/1432350 opened by Jim Jewett Bugs Closed ___________ Random stack corruption from socketmodule.c (2004-01-13) http://python.org/sf/876637 closed by nnorwitz patch for etree cdata and attr quoting (2006-02-04) http://python.org/sf/1424171 closed by effbot List not initialized if used as default argument (2006-02-08) http://python.org/sf/1427789 closed by birkenfeld msvccompiler.py modified to work with .NET 2005 on win64 (2006-02-06) http://python.org/sf/1425482 closed by loewis email.Message should supress warning from uu.decode (2006-01-18) http://python.org/sf/1409403 closed by bwarsaw Crash on invalid coding pragma (2006-02-09) http://python.org/sf/1428264 closed by birkenfeld add /usr/local support (2006-02-10) http://python.org/sf/1428789 closed by loewis set documentation deficiencies (2006-02-10) http://python.org/sf/1429063 closed by birkenfeld segfault in FreeBSD (2006-02-11) http://python.org/sf/1429585 closed by nnorwitz typo in tutorial (2006-02-12) http://python.org/sf/1430076 closed by effbot New / Reopened RFE __________________ itertools.any and itertools.all (2006-02-15) CLOSED http://python.org/sf/1432437 opened by paul cannon RFE Closed __________ itertools.any and itertools.all (2006-02-15) http://python.org/sf/1432437 closed by birkenfeld From aahz at pythoncraft.com Thu Feb 16 05:20:31 2006 From: aahz at pythoncraft.com (Aahz) Date: Wed, 15 Feb 2006 20:20:31 -0800 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?] 
In-Reply-To: <6D24EB74-501D-4F58-BCFB-8A839A73C31F@redivi.com> References: <5.1.1.6.0.20060213180846.020d8dd0@mail.telecommunity.com> <5.1.1.6.0.20060213185406.040bae90@mail.telecommunity.com> <2A620DEB-0CB2-4FE0-A57F-78F5DF76320C@python.org> <43F1C075.3060207@canterbury.ac.nz> <20E03E3D-CDDC-49E7-BF14-A6070FFC09C1@python.org> <20060216023529.GA28587@panix.com> <6D24EB74-501D-4F58-BCFB-8A839A73C31F@redivi.com> Message-ID: <20060216042031.GA13683@panix.com> On Wed, Feb 15, 2006, Bob Ippolito wrote: > On Feb 15, 2006, at 6:35 PM, Aahz wrote: >> On Tue, Feb 14, 2006, Guido van Rossum wrote: >>> >>> Anyway, I'm now convinced that bytes should act as an array of ints, >>> where the ints are restricted to range(0, 256) but have type int. >> >> range(0, 255)? > > No, Guido was correct. range(0, 256) is [0, 1, 2, ..., 255]. My mistake -- I wasn't thinking of the literal Python function. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From jcarlson at uci.edu Thu Feb 16 06:36:03 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 15 Feb 2006 21:36:03 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F3ED97.40901@canterbury.ac.nz> References: <43F3ED97.40901@canterbury.ac.nz> Message-ID: <20060215212629.5F6D.JCARLSON@uci.edu> Greg Ewing wrote: > Jason Orendorff wrote: > > > Also the pseudo-encodings ('hex', > > 'rot13', 'zip', 'uu', etc.) generally scare me. > > I think these will have to cease being implemented as > encodings in 3.0. They should really never have been > in the first place. I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex', and likely a few others that the two of you may be arguing against should stay as encodings, because strictly speaking, they are defined as encodings of data. 
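(Concretely: these transforms act on data rather than on unicode text, which can be seen from how they survive in today's Python as codecs reachable through the ``codecs`` module. A small sketch; ``rot_13`` is a text-to-text codec and ``hex_codec`` a bytes-to-bytes codec:)

```python
import codecs

# rot13 maps text to text; hex maps bytes to bytes. Neither one encodes
# unicode text into bytes, which is the distinction being drawn here.
assert codecs.encode('uryyb', 'rot_13') == 'hello'
assert codecs.decode('hello', 'rot_13') == 'uryyb'

assert codecs.encode(b'\x00\xff', 'hex_codec') == b'00ff'
assert codecs.decode(b'00ff', 'hex_codec') == b'\x00\xff'
```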
They may not be encodings of _unicode_ data, but that doesn't mean that they aren't useful encodings for other kinds of data, some text, some binary, ... - Josiah From nnorwitz at gmail.com Thu Feb 16 06:36:01 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 15 Feb 2006 21:36:01 -0800 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <1140026533.14811.2.camel@geddy.wooz.org> References: <1140026533.14811.2.camel@geddy.wooz.org> Message-ID: On 2/15/06, Barry Warsaw wrote: > > I haven't been following the AST stuff closely enough, but I'm not crazy > about putting access to this in the sys module. It seems like it > clutters that up with a name that will be rarely used by the average > Python programmer. Agreed. I'm hoping we can get rid of lots of code in the compiler module and use the AST provided from C. The compiler module seems the best place to put anything related to the AST. Regardless of what internal approach is used, we can still try to hash out a nice API. n From nnorwitz at gmail.com Thu Feb 16 06:44:19 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 15 Feb 2006 21:44:19 -0800 Subject: [Python-Dev] 2.5 PEP In-Reply-To: References: <200602150922.43810.alain.poirier@net-ng.com> <43F37FE1.2090201@v.loewis.de> Message-ID: On 2/15/06, Fredrik Lundh wrote: > > (is the xmlplus/xmlcore issue still an issue, btw?) What issue are you talking about? n From nnorwitz at gmail.com Thu Feb 16 06:50:47 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 15 Feb 2006 21:50:47 -0800 Subject: [Python-Dev] 2.5 PEP In-Reply-To: <200602150922.43810.alain.poirier@net-ng.com> References: <200602150922.43810.alain.poirier@net-ng.com> Message-ID: On 2/15/06, Alain Poirier wrote: > - isn't the current implementation of itertools.tee (cache of previous > generated values) incompatible with the new possibility to feed a > generator (PEP 342) ? I'm not sure what you are referring to. What is the issue?
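(To make the question concrete, a side-by-side sketch of the two features; this only illustrates the interaction being asked about, not a confirmed bug:)

```python
import itertools

# itertools.tee caches values pulled from the source so that each of the
# returned iterators can replay them independently.
source = (i * i for i in range(4))
a, b = itertools.tee(source)
assert list(a) == [0, 1, 4, 9]
assert list(b) == [0, 1, 4, 9]   # served from tee's internal cache

# A PEP 342 style generator that consumes values via send():
def accumulate():
    total = 0
    while True:
        total += yield total

g = accumulate()
assert next(g) == 0       # prime the generator
assert g.send(5) == 5
assert g.send(2) == 7

# The crux of the question: tee only ever calls next() on the iterator it
# wraps, and the tee'd copies offer no way to send() into the underlying
# generator, so the two features simply don't compose.
```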
n From brett at python.org Thu Feb 16 06:55:41 2006 From: brett at python.org (Brett Cannon) Date: Wed, 15 Feb 2006 21:55:41 -0800 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: References: <20060215220734.45db375a.simon@arrowtheory.com> Message-ID: On 2/15/06, Jeremy Hylton wrote: [SNIP] > How about we arrange for some open space time at PyCon to discuss? > Unfortunately, the compiler talk isn't until the last day and I can't > stay for sprints. It would be better to have the talk, then the open > space, then the sprint. I would definitely be interested in having an open space discussion on where we want to go with this. If we want to generate as much interest we should probably hold it the same day as your talk and have you announce it. Otherwise it could be scheduled at any time before the sprints. -Brett From aleaxit at gmail.com Thu Feb 16 06:59:55 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Wed, 15 Feb 2006 21:59:55 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <1140025909.13758.43.camel@geddy.wooz.org> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> Message-ID: <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> On Feb 15, 2006, at 9:51 AM, Barry Warsaw wrote: > On Wed, 2006-02-15 at 09:17 -0800, Guido van Rossum wrote: > >> Regarding open vs. opentext, I'm still not sure. I don't want to >> generalize from the openbytes precedent to openstr or openunicode >> (especially since the former is wrong in 2.x and the latter is wrong >> in 3.0). I'm tempting to hold out for open() since it's most >> compatible. > > If we go with two functions, I'd much rather hang them off of the file > type object then add two new builtins. I really do think file.bytes() > and file.text() (a.k.a. open.bytes() and open.text()) is better than > opentext() or openbytes(). I agree, or, MAL's idea of bytes.open() and unicode.open() is also good. 
My fondest dream is that we do NOT have an 'open' builtin which has proven to be very error-prone when used in Windows by newbies (as evidenced by beginner errors as seen on c.l.py, the python-help lists, and other venues) -- defaulting 'open' to text is errorprone, defaulting it to binary doesn't seem the greatest idea either, principle "when in doubt, resist the temptation to guess" strongly suggests not having 'open' as a built-in at all. (And namemangling into openthis and openthat seems less Pythonic to me than exploiting namespaces by making structured names, either this.open and that.open or open.this and open.that). IOW, I entirely agree with Barry and Marc Andre. Alex From brett at python.org Thu Feb 16 07:01:18 2006 From: brett at python.org (Brett Cannon) Date: Wed, 15 Feb 2006 22:01:18 -0800 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: References: <1140026533.14811.2.camel@geddy.wooz.org> Message-ID: On 2/15/06, Neal Norwitz wrote: > On 2/15/06, Barry Warsaw wrote: > > > > I haven't been following the AST stuff closely enough, but I'm not crazy > > about putting access to this in the sys module. It seems like it > > clutters that up with a name that will be rarely used by the average > > Python programmer. > > Agreed. I'm hoping we can get rid of lots of code in the compiler > module and use the AST provided from C. The compiler module seems the > best place to put anything related to the AST. > Sure, fine with me. I am not in love with the sys idea, just seemed reasonable. I just happen to think of the compiler module as this Python implementation of the bytecode compiler and not as this generic package where all compiler-related stuff goes. But if we move towards removing the parts of the compiler package that overlap with any AST being exposed that would be great. 
-Brett From brett at python.org Thu Feb 16 07:14:17 2006 From: brett at python.org (Brett Cannon) Date: Wed, 15 Feb 2006 22:14:17 -0800 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <43F2F444.4010604@gmail.com> References: <43F2E9CD.7000102@canterbury.ac.nz> <43F2F444.4010604@gmail.com> Message-ID: On 2/15/06, Nick Coghlan wrote: > Greg Ewing wrote: > > Brett Cannon wrote: > >> One protects us from ending up with an unusable AST since > >> the seralization can keep the original AST around and if the version > >> passed back in from Python code is junk it can be tossed and the > >> original version used. > > > > I don't understand why this is an issue. If Python code > > produces junk and tries to use it as an AST, then it's > > buggy and deserves what it gets. All the AST compiler > > should be responsible for is to try not to crash the > > interpreter under those conditions. But that's true > > whatever method is used for passing ASTs from Python > > to the compiler. > > I'd prefer the AST node be real Python objects. The arena approach seems to be > working reasonably well, but I still don't see a good reason for using a > specialised memory allocation scheme when it really isn't necessary and we > have a perfectly good memory management system for PyObject's. > If the compiler was hacked on by more people I would agree with this. But few people do and so I am not too worried about using a simple, custom memory system as long as its use is clearly written out for those few who do decide to work on it (and I am willing to be in charge of that, regardless of which solution we go with). Obviously it could be argued that more people don't because of its "special" coding style, but then again the old compiler wasn't special and very few people touched that beast. > On the 'unusable AST' front, if AST transformation code creates illegal > output, then the main thing is to raise an exception complaining about what's > wrong with it. 
I believe that may need a change to the compiler whether the > modified AST was serialised or not. > That's fine, but I wasn't sure where this exception would be raised. I guess it would come up during the import of a module if it was automatically passing the AST through a list of processing functions. Some might view it as not as bad as a segfault of the interpreter, but worse than just an ImportError. As I said, I am fine with allowing modification, but others have expressed reservations. > In terms of reverting back to the untransformed AST if the transformation > fails, then that option is up to the code doing the transformation. Instead of > serialising all the time (even for cases where the AST is just being inspected > instead of transformed), we can either let the AST objects support the > copy/deepcopy protocol, or else provide a method to clone a tree before trying > to transform it. > I view it as a one-time serialization and a one-time conversion back. So the compiler goes C -> Python objects. That is then subsequently passed into the first function registered to access the AST. The AST returned by that function is then immediately and directly passed to the next function in the list. This continues until the last function in which that returned AST is then converted back to the C representation, verified, and then sent on to the bytecode compiler. > A unified representation means we only have one API to learn, that is > accessible from both Python and C. It also eliminates any need to either > implement features twice (once in Python and once in C) or else let the Python > and C API's diverge to the point where what you can do with one differs from > what you can do with the other. > I suspect that any marshalling from C to Python will have a matching object design based on the AST node layout in the ASDL. So that API won't really be different from C to Python if we stick with the arena solution. 
And I also realized that marshalling might just go straight C to Python objects and not an intermediary step as I had in my head. Don't know why I thought it might need it or if anyone picked up on that being a possibility. -Brett From brett at python.org Thu Feb 16 07:15:33 2006 From: brett at python.org (Brett Cannon) Date: Wed, 15 Feb 2006 22:15:33 -0800 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <43F30299.3090708@gmail.com> References: <43F2E9CD.7000102@canterbury.ac.nz> <43F2F444.4010604@gmail.com> <20060215100309.GI6027@xs4all.nl> <43F30299.3090708@gmail.com> Message-ID: On 2/15/06, Nick Coghlan wrote: > Thomas Wouters wrote: > > On Wed, Feb 15, 2006 at 07:28:36PM +1000, Nick Coghlan wrote: > > > >> On the 'unusable AST' front, if AST transformation code creates illegal > >> output, then the main thing is to raise an exception complaining about > >> what's wrong with it. I believe that may need a change to the compiler > >> whether the modified AST was serialised or not. > > > > I would personally prefer the AST validation to be a separate part of the > > compiler. It means the one or the other can be out of sync, but it also > > means it can be accessed directly (validating AST before sending it to the > > compiler) and the compiler (or CFG generator, or something between AST and > > CFG) can decide not to validate internally generated AST for non-debug > > builds, for instance. > > > > I like both those reasons. > > Aye, I was thinking much the same thing. > Yeah, I would want it to be a separate part as well. 
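(For what it's worth, this separation is roughly how things later turned out: the ``ast`` module hands trees to Python code, and ``compile()`` validates any tree it is given, raising an exception for malformed input rather than crashing the interpreter. A sketch against a modern interpreter:)

```python
import ast

# A well-formed tree compiles and runs normally.
tree = ast.parse("x = 1 + 2")
namespace = {}
exec(compile(tree, "<example>", "exec"), namespace)
assert namespace["x"] == 3

# A malformed, hand-built tree (missing required fields such as the
# Name's ctx and location info) is rejected with an exception.
bad = ast.Module(body=[ast.Expr(value=ast.Name(id="y"))], type_ignores=[])
rejected = False
try:
    compile(bad, "<example>", "exec")
except (TypeError, ValueError):
    rejected = True
assert rejected
```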
-Brett From anthony at interlink.com.au Thu Feb 16 07:45:23 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 16 Feb 2006 17:45:23 +1100 Subject: [Python-Dev] 2.5 - I'm ok to do release management Message-ID: <200602161745.25795.anthony@interlink.com.au> I'm still catching up on the hundreds of python-dev messages from the last couple of days, but a quick note first that I'm ok to do release management for 2.5 Anthony -- Anthony Baxter It's never too late to have a happy childhood. From rasky at develer.com Thu Feb 16 08:01:25 2006 From: rasky at develer.com (Giovanni Bajo) Date: Thu, 16 Feb 2006 08:01:25 +0100 Subject: [Python-Dev] from __future__ import unicode_strings? References: <20060216014302.GN6027@xs4all.nl> Message-ID: <045901c632c6$cd545f90$1abf2997@bagio> Thomas Wouters wrote: >>> from __future__ import unicode_strings > >> Didn't we have a command-line option to do this? I believe it was >> removed because nobody could see the point. (Or am I hallucinating? >> After several days of non-stop discussing bytes that must be >> considered a possibility.) > > We do, and it's not been removed: the -U switch. It's not in the output of "python -h", though. Is it secret or what? Giovanni Bajo From fredrik at pythonware.com Thu Feb 16 08:18:59 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 08:18:59 +0100 Subject: [Python-Dev] 2.5 PEP References: <200602150922.43810.alain.poirier@net-ng.com><43F37FE1.2090201@v.loewis.de> Message-ID: (my mails to python-dev are bouncing; guess that's what you get when you question the PSF's ability to build web sites... trying again.) Neal Norwitz wrote: > > (is the xmlplus/xmlcore issue still an issue, btw?) > > What issue are you talking about? the changes described here http://mail.python.org/pipermail/python-dev/2005-December/058710.html "I'd like to propose that a new package be created in the standard library: xmlcore." 
which led to this response from a pyxml maintainer: http://mail.python.org/pipermail/python-dev/2005-December/058752.html "I don't agree with the change. You just broke source compatibility between the core package and PyXML." From arekm at pld-linux.org Thu Feb 16 08:28:41 2006 From: arekm at pld-linux.org (Arkadiusz Miskiewicz) Date: Thu, 16 Feb 2006 08:28:41 +0100 Subject: [Python-Dev] how bugfixes are handled? References: Message-ID: Guido van Rossum wrote: > We're all volunteers here, and we get a large volume of bugs. That's obvious (note, I'm not complaining, I'm asking ,,how it works for python''). > Unfortunately, bugfixes are reviewed on a voluntary basis. > > Are you aware of the standing offer that if you review 5 bugs/patches > some of the developers will pay attention to your bug/patch? I wasn't, thanks for information. Still few questions... one of developers/commiters reviews patch and commit it? Few developers has to review single patch? Thanks, -- Arkadiusz Miśkiewicz PLD/Linux Team http://www.t17.ds.pwr.wroc.pl/~misiek/ http://ftp.pld-linux.org/ From nnorwitz at gmail.com Thu Feb 16 08:32:19 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 15 Feb 2006 23:32:19 -0800 Subject: [Python-Dev] how bugfixes are handled? In-Reply-To: References: Message-ID: On 2/15/06, Arkadiusz Miskiewicz wrote: > > Still few questions... one of developers/commiters reviews patch and commit > it? Few developers has to review single patch? One developer can review and commit a patch. Sometimes we request more input from other developers or interested parties.
n From stefan.rank at ofai.at Thu Feb 16 08:37:56 2006 From: stefan.rank at ofai.at (Stefan Rank) Date: Thu, 16 Feb 2006 08:37:56 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> Message-ID: <43F42BD4.3000703@ofai.at> on 16.02.2006 06:59 Alex Martelli said the following: > On Feb 15, 2006, at 9:51 AM, Barry Warsaw wrote: > >> On Wed, 2006-02-15 at 09:17 -0800, Guido van Rossum wrote: >> >>> Regarding open vs. opentext, I'm still not sure. I don't want to >>> generalize from the openbytes precedent to openstr or openunicode >>> (especially since the former is wrong in 2.x and the latter is wrong >>> in 3.0). I'm tempting to hold out for open() since it's most >>> compatible. >> If we go with two functions, I'd much rather hang them off of the file >> type object then add two new builtins. I really do think file.bytes() >> and file.text() (a.k.a. open.bytes() and open.text()) is better than >> opentext() or openbytes(). > > I agree, or, MAL's idea of bytes.open() and unicode.open() is also > good. My fondest dream is that we do NOT have an 'open' builtin > which has proven to be very error-prone when used in Windows by > newbies (as evidenced by beginner errors as seen on c.l.py, the > python-help lists, and other venues) -- defaulting 'open' to text is > errorprone, defaulting it to binary doesn't seem the greatest idea > either, principle "when in doubt, resist the temptation to guess" > strongly suggests not having 'open' as a built-in at all. (And > namemangling into openthis and openthat seems less Pythonic to me > than exploiting namespaces by making structured names, either > this.open and that.open or open.this and open.that). IOW, I entirely > agree with Barry and Marc Andre. > `open`ing a file, i.e. constructing a `file` object, always requires a path argument. 
In case Py3k manages to incorporate a `Path` object, it could be more natural to have `.openbytes` and `.opentext` as methods on Path objects. But `bytes.open` and `text/unicode/str.open` look nice too. Just for the record, stefan From bokr at oz.net Thu Feb 16 08:54:41 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 16 Feb 2006 07:54:41 GMT Subject: [Python-Dev] str object going in Py3K References: <43F2FDDD.3030200@gmail.com> <43F3E7DB.4010502@canterbury.ac.nz> Message-ID: <43f41426.921510492@news.gmane.org> On Wed, 15 Feb 2006 18:57:26 -0800, Guido van Rossum wrote: >On 2/15/06, Greg Ewing wrote: >> Guido van Rossum wrote: >> > So how about >> > openbytes? This clearly links the resulting object with the bytes >> > type, which is mutually reassuring. >> >> That looks quite nice. >> >> Another thought -- what is going to happen to os.open? >> Will it change to return bytes, or will there be a new >> os.openbytes? > >Hm, I hadn't thought about that yet. On Windows, os.open has the >ability to set text or binary mode. But IMO it's better to make this >always use binary mode. > >My expectation is that the Py3k standard I/O library will do all of >its own conversions on top of binary files anyway -- if you missed it, >I'd like to get rid of any ties to C's stdio. > Would the standard I/O module have low level utility stream-processing generators to do things like linesep normalization in text or splitlines etc? I.e., primitives that could be composed for unforeseen usefulness, like unix pipeable stuff? Maybe they could even be composable with '|' for unixy left->right piping, e.g., on windows for line in (os.open('somepath') | linechunker | decoder('latin-1')): ... where os.open('path').__or__(linechunker) returns linechunker(os.open('path')), which in turn has an __or__ to do similarly. Just had this bf, but ISTM it reads ok.
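(A minimal sketch of how such left-to-right '|' composition could be wired up. ``Stage``, ``strip`` and ``upper`` are invented names for illustration, not an existing API; the workable trick is ``__ror__`` on the stage, since plain iterables don't define ``__or__`` themselves:)

```python
class Stage:
    """One pipeline stage: `iterable | stage` applies the stage's function.

    Plain iterables don't define __or__, so Python falls back to the
    stage's __ror__, which is what makes left-to-right piping work
    without touching the iterable types themselves.
    """
    def __init__(self, func):
        self.func = func

    def __ror__(self, iterable):
        return self.func(iterable)

# Invented stages for illustration:
strip = Stage(lambda items: (s.strip() for s in items))
upper = Stage(lambda items: (s.upper() for s in items))

lines = [' spam\n', ' eggs\n']
assert list(lines | strip | upper) == ['SPAM', 'EGGS']
```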
The equivalent nested generator expression with same assumed primitives would I guess be for line in decoder('latin-1')(linechunker(binaryfile('path'))): ... which doesn't have the same natural left to right reading order to match processing order. Regards, Bengt Richter From talin at acm.org Thu Feb 16 08:05:09 2006 From: talin at acm.org (Talin) Date: Wed, 15 Feb 2006 23:05:09 -0800 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda Message-ID: <43F42425.3070807@acm.org> First off, let me apologize for bringing up a topic that I am sure that everyone is sick of: Lambda. I broached this subject to a couple of members of this list privately, and I got wise feedback on my suggestions which basically amounted to "don't waste your time." However, after having thought about this for several weeks, I came to the conclusion that I felt so strongly about this issue that the path of wisdom simply would not do, and I would have to choose the path of folly. Which I did. In other words, I went ahead and implemented it. Actually, it wasn't too bad, it only took about an hour of reading the ast.c code and the Grammar file (neither of which I had ever looked at before) to get the general sense of what's going on. So the general notion is similar to the various proposals on the Wiki - an inline keyword which serves the function of lambda. I chose the keyword "given" because it reminds me of math textbooks, e.g. "given x, solve for y". And I like the idea of syntactical structures that make sense when you read them aloud. Here's an interactive console session showing it in action. The first example shows a simple closure that returns the square of a number. 
>>> a = (x*x given x) >>> a(9) 81 You can also put parens around the argument list if you like: >>> a = (x*x given (x)) >>> a(9) 81 Same thing with two arguments, and with the optional parens: >>> a = (x*y given x,y) >>> a(9, 10) 90 >>> a = (x*y given (x,y)) >>> a(9, 10) 90 Yup, keyword arguments work too: >>> a = (x*y given (x=3,y=4)) >>> a(9, 10) 90 >>> a(9) 36 >>> a() 12 Use an empty paren-list to indicate that you want to define a closure with no arguments: >>> a = (True given ()) >>> a() True Note that there are some cases where you have to use the parens around the arguments to avoid a syntactical ambiguity: >>> map( str(x) given x, (1, 2, 3, 4) ) File "", line 1 map( str(x) given x, (1, 2, 3, 4) ) ^ SyntaxError: invalid syntax As you can see, adding the parens makes this work: >>> map( str(x) given (x), (1, 2, 3, 4) ) ['1', '2', '3', '4'] More fun with "map": >>> map( str(x)*3 given (x), (1, 2, 3, 4) ) ['111', '222', '333', '444'] Here's an example that uses the **args syntax: >>> a = (("%s=%s" % pair for pair in kwargs.items()) given **kwargs) >>> list( a(color="red") ) ['color=red'] >>> list( a(color="red", sky="blue") ) ['color=red', 'sky=blue'] I have to say, the more I use it, the more I like it, but I'm sure that this is just a personal taste issue. It looks a lot more natural to me than lambda. I should also mention that I resisted the temptation to make the 'given' keyword an optional generator suffix as in "(a for a in l given l). As I started working with the code, I started to realize that generators and closures, although they have some aspects in common, are very different beasts and should not be conflated lightly. (Plus the implementation would have been messy. I took that as a clue :)) Anyway, if anyone wants to play around with the patch, it is rather small - a couple of lines in Grammar, and a small new function in ast.c, plus a few mods to other functions to get them to call it. The context diff is less than two printed pages. 
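(For comparison, the examples in the session above written with the existing ``lambda`` syntax; the ``given`` form exists only in this experimental patch, not in any released Python:)

```python
# The same closures, one per example from the console session above.
a = lambda x: x * x
assert a(9) == 81

a = lambda x, y: x * y
assert a(9, 10) == 90

a = lambda x=3, y=4: x * y
assert (a(9, 10), a(9), a()) == (90, 36, 12)

a = lambda: True
assert a() is True

assert list(map(lambda x: str(x) * 3, (1, 2, 3, 4))) == ['111', '222', '333', '444']

a = lambda **kwargs: ("%s=%s" % pair for pair in kwargs.items())
assert list(a(color="red")) == ['color=red']
```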
I can post it somewhere if people are interested. Anyway, I am not going to lobby for a language change or write a PEP (unless someone asks me to.) I just wanted to throw this out there and see what people think of it. I definitely don't want to start a flame war, although I suspect I already have :/ Now I can stop thinking about this and go back to my TurboGears-based Thesaurus editor :) -- Talin From fredrik at pythonware.com Thu Feb 16 09:33:47 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 09:33:47 +0100 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda References: <43F42425.3070807@acm.org> Message-ID: Talin wrote: > So the general notion is similar to the various proposals on the Wiki - > an inline keyword which serves the function of lambda. I chose the > keyword "given" because it reminds me of math textbooks, e.g. "given x, > solve for y". And I like the idea of syntactical structures that make > sense when you read them aloud. but that's about the only advantage you get from writing (x*x given x) instead of lambda x: x*x right ? or did I miss some subtle detail here ?
> I definitely don't want to start a flame war, although I suspect I already > have :/ I think most about everything has already been said wrt lambda already, but I guess we could have a little war on spelling issues ;-) From p.f.moore at gmail.com Thu Feb 16 10:28:56 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 16 Feb 2006 09:28:56 +0000 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: References: <43F42425.3070807@acm.org> Message-ID: <79990c6b0602160128g311b0646t7307cb7f6c9ccffe@mail.gmail.com> On 2/16/06, Fredrik Lundh wrote: > Talin wrote: > > I definitely don't want to start a flame war, although I suspect I already > > have :/ > > I think most about everything has already been said wrt lambda already, > but I guess we could have a little war on spelling issues ;-) Agreed, but credit to Talin for actually implementing his suggestion. And it's nice to see that the AST makes this sort of experimentation easier. Paul. From gh at ghaering.de Thu Feb 16 10:11:50 2006 From: gh at ghaering.de (Gerhard Häring) Date: Thu, 16 Feb 2006 10:11:50 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: References: <43F2FE65.5040308@pollenation.net> <43F3B2E8.6090103@pollenation.net> Message-ID: <43F441D6.3000702@ghaering.de> Jeremy Hylton wrote: > I don't think this message is on-topic for python-dev. There are lots > of great places to discuss the design of the python web site, but the > list for developers doesn't seem like a good place for it. Do we need > a different list for people to gripe^H^H^H^H^H discuss the web site? [...] Such as http://mail.python.org/mailman/listinfo/pydotorg-redesign ? -- Gerhard From greg.ewing at canterbury.ac.nz Thu Feb 16 10:43:25 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 22:43:25 +1300 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
In-Reply-To: <20060215212629.5F6D.JCARLSON@uci.edu> References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> Message-ID: <43F4493D.8090902@canterbury.ac.nz> Josiah Carlson wrote: > They may not be encodings of _unicode_ data, But if they're not encodings of unicode data, what business do they have being available through someunicodestring.encode(...)? Greg From greg.ewing at canterbury.ac.nz Thu Feb 16 10:55:42 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2006 22:55:42 +1300 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: References: <43F2E9CD.7000102@canterbury.ac.nz> <43F2F444.4010604@gmail.com> Message-ID: <43F44C1E.8040107@canterbury.ac.nz> Brett Cannon wrote: > If the compiler was hacked on by more people I would agree with this. > But few people do This has the potential to be a self-perpetuating situation. There may be few people hacking on it now, but more people may want to in the future. Those people may look at the funky coding style and get discouraged, so there remains only few people working on it, thus apparently justifying the decision to keep the funky coding style. Whereas if there weren't any funky coding style in the first place, more potential compiler hackers might be encouraged to have a go. Also I'm still wondering why we're going to all this effort to build a whole new AST and compiler structure if the purpose isn't to *avoid* all this transformation between different representations. Greg From mal at egenix.com Thu Feb 16 11:24:35 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 16 Feb 2006 11:24:35 +0100 Subject: [Python-Dev] from __future__ import unicode_strings? 
In-Reply-To: <20060216024911.GA363@mems-exchange.org> References: <20060216014302.GN6027@xs4all.nl> <20060216024911.GA363@mems-exchange.org> Message-ID: <43F452E3.3040004@egenix.com> Neil Schemenauer wrote: > On Thu, Feb 16, 2006 at 02:43:02AM +0100, Thomas Wouters wrote: >> On Wed, Feb 15, 2006 at 05:23:56PM -0800, Guido van Rossum wrote: >> >>>> from __future__ import unicode_strings >>> Didn't we have a command-line option to do this? I believe it was >>> removed because nobody could see the point. (Or am I hallucinating? >>> After several days of non-stop discussing bytes that must be >>> considered a possibility.) >> We do, and it's not been removed: the -U switch. > > As Guido alluded, the global switch is useless. A per-module switch > something that could actually useful. One nice advantage is that > you would write code that works the same with Jython (wrt to string > literals anyhow). The global switch is not useless. Its purpose is to test the standard library (or any other piece of Python code) for Unicode compatibility. Since we're not even close to such compatibility, I'm not sure how useful a per-module switch would be. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 16 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From ncoghlan at gmail.com Thu Feb 16 11:57:14 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2006 20:57:14 +1000 Subject: [Python-Dev] PEP 338 issue finalisation (was Re: 2.5 PEP) In-Reply-To: References: <43F3A2EE.8060208@gmail.com> Message-ID: <43F45A8A.3050900@gmail.com> Guido van Rossum wrote: > On 2/15/06, Nick Coghlan wrote: >> PEP 338 is pretty much ready to go, too - just waiting on Guido's review and >> pronouncement on the specific API used in the latest update (his last PEP >> parade said he was OK with the general concept, but I only posted the PEP 302 >> compliant version after that). > > I like the PEP and the implementation (which I downloaded from SF). > Here are some comments in the form of diffs (attached). > > Do you have unit tests for everything? I believe I fixed a bug in the > code that reads a bytecode file (it wasn't skipping the timestamp). I haven't worked the filesystem based tests into the unit tests yet, and even the manual tests I was using managed to leave out compiled bytecode files (as you noticed). I'll fix that. Given I do my testing on Linux, I probably still would have forgotten the 'rb' mode definitions on the relevant calls to open() though. . . > +++ pep-0338.txt (working copy) > - The optional argument ``init_globals`` may be used to pre-populate > + The optional argument ``init_globals`` may be a dictionary used to pre-populate > the globals dictionary before the code is executed. The supplied > dictionary will not be modified. I just realised that anything that's a legal argument to "dict.update" will work. I'll fix the function description in the PEP (and the docs patch as well). > --- runpy.py Wed Feb 15 15:56:07 2006 > def get_data(self, pathname): > ! # XXX Unfortunately PEP 302 assumes text data :-( > ! return open(pathname).read() Hmm. The PEP itself requests that a string be returned from get_data(), but doesn't require that the file be opened in text mode. 
Perhaps the PEP 302 emulation should use binary mode here? Otherwise there could be strange data corruption bugs on Windows. > --- 337,349 ---- > > # This helper is needed as both the PEP 302 emulation and the > # main file execution functions want to read compiled files > + # XXX marshal can also raise EOFError; perhaps that should be > + # turned into ValueError? Some callers expect ValueError. > def _read_compiled_file(compiled_file): > magic = compiled_file.read(4) > if magic != imp.get_magic(): > raise ValueError("File not compiled for this Python version") > + compiled_file.read(4) # Throw away timestamp > return marshal.load(compiled_file) I'm happy to convert EOFError to ValueError here if you'd prefer (using the string representation of the EOFError as the message in the ValueError). Or did you mean changing the behaviour in marshal itself? > --- 392,407 ---- > loader = _get_loader(mod_name) > if loader is None: > raise ImportError("No module named " + mod_name) > + # XXX get_code() is an *optional* loader feature. Is that okay? > code = loader.get_code(mod_name) If the loader doesn't provide access to the code object or the source code, then runpy can't really do anything useful with that module (e.g. if it's a C builtin module). Given that PEP 302 states that if you provide get_source() you should also provide get_code(), this check should be sufficient to let runpy.run_module get to everything it can handle. A case could be made for converting the attribute error to an ImportError, I guess. . . > filename = _get_filename(loader, mod_name) > if run_name is None: > run_name = mod_name > + # else: > + # XXX Should we also set sys.modules[run_name] = sys.modules[mod_name]? > + # I know of code that does "import __main__". It should probably > + # get the substitute __main__ rather than the original __main__, > + # if run_name != mod_name > return run_module_code(code, init_globals, run_name, > filename, loader, as_script) Hmm, good point. 
How about a different solution, though: in run_module_code, I could create a new module object and put it temporarily in sys.modules, and then remove it when done (restoring the original module if necessary). That would mean any module with code that looks up "sys.modules[__name__]" would still work when run via runpy.run_module or runpy.run_file. I also realised that sys.argv[0] should be restored to its original value, too. I'd then change the "as_script" flag to "alter_sys" and have it affect both of the above operations (and grab the import lock to avoid other import or run_module_code operations seeing the altered version of sys.modules). > --- 439,457 ---- > > Returns the resulting top level namespace dictionary > First tries to run as a compiled file, then as a source file > + XXX That's not the same algorithm as used by regular import; > + if the timestamp in the compiled file is not equal to the > + source file's mtime, the compiled file is ignored > + (unless there is no source file -- then the timestamp > + is ignored) They're doing different things though - the import process uses that algorithm to decide which filename to use (.pyo, .pyc or .py). This code in run_file is trying to decide whether the supplied filename points to a compiled file or a source file without a tight coupling to the specific file extension used (e.g. so it works for Unix Python scripts that rely on the shebang line to identify which interpreter to use to run them). I'll add a comment to that effect. Another problem that occurred to me is that the module isn't thread safe at the moment. The PEP 302 emulation isn't protected by the import lock, and the changes to sys.argv in run_module_code will be visible across threads (and may clobber each other or the original if multiple threads invoke the function). On that front, I'll change _get_path_loader to acquire and release the import lock, and the same for run_module_code when "alter_sys" is set to True. Cheers, Nick. 
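[Archive note: the temporary-module dance described above can be sketched roughly as follows. This is a hedged illustration only -- the function name is invented, the import-lock handling Nick mentions is omitted for brevity, and it is not the actual runpy code.]

```python
import sys
import types

def run_code_with_temp_module(code, mod_name):
    # Install a fresh module object under mod_name while the code runs,
    # then restore the previous state (including sys.argv[0]) in a
    # finally block, as described in the message above.
    temp_module = types.ModuleType(mod_name)
    saved_module = sys.modules.get(mod_name)
    saved_argv0 = sys.argv[0]
    sys.modules[mod_name] = temp_module
    sys.argv[0] = '<run_code_with_temp_module>'
    try:
        exec(code, temp_module.__dict__)
        return temp_module.__dict__.copy()
    finally:
        sys.argv[0] = saved_argv0
        if saved_module is None:
            del sys.modules[mod_name]
        else:
            sys.modules[mod_name] = saved_module
```

With this shape, code that looks up "sys.modules[__name__]" sees the temporary module while it runs, and both sys.modules and sys.argv[0] are back to normal afterwards.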
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From rhamph at gmail.com Thu Feb 16 12:00:38 2006 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 16 Feb 2006 04:00:38 -0700 Subject: [Python-Dev] bytes type discussion In-Reply-To: <43F37B27.8070008@v.loewis.de> References: <007e01c631c8$82c1b170$b83efea9@RaymondLaptop1> <43F27FBC.1000209@v.loewis.de> <43F2E065.3080807@v.loewis.de> <43F37B27.8070008@v.loewis.de> Message-ID: On 2/15/06, "Martin v. Löwis" wrote: > Adam Olsen wrote: > > Making it an error to have 8-bit str literals in 2.x would help > > educate the user that they will change behavior in 3.0 and not be > > 8-bit str literals anymore. > > You would like to ban string literals from the language? Remember: > all string literals are currently 8-bit (byte) strings. That's a rather literal interpretation of what I said. ;) What I meant was to only accept 7-bit characters, namely ascii. -- Adam Olsen, aka Rhamphoryncus From ncoghlan at gmail.com Thu Feb 16 12:06:59 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2006 21:06:59 +1000 Subject: [Python-Dev] 2.5 PEP In-Reply-To: References: <200602150922.43810.alain.poirier@net-ng.com> Message-ID: <43F45CD3.9090504@gmail.com> Neal Norwitz wrote: > On 2/15/06, Alain Poirier wrote: >> - isn't the current implementation of itertools.tee (cache of previous >> generated values) incompatible with the new possibility to feed a >> generator (PEP 342) ? > > I'm not sure what you are referring to. What is the issue? The 'tee' object doesn't have a "send" method. (This is true for all of the itertools iterators, I believe). The request is misguided though - the itertools module is designed to operate on output-only iterators, not on generators that expect input via send(). 
Because the output values might depend on the values sent, then it makes no sense to cache them (or do most of the other things itertools does). The relevant functionality would actually make the most sense as a fork() method on generators, but PEP 342 was trying to be fairly minimalist. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From rhamph at gmail.com Thu Feb 16 12:51:06 2006 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 16 Feb 2006 04:51:06 -0700 Subject: [Python-Dev] Rename str/unicode to text [Was: Re: str object going in Py3K] Message-ID: On 2/15/06, Guido van Rossum wrote: > On 2/15/06, M.-A. Lemburg wrote: > > Barry Warsaw wrote: > > > On Wed, 2006-02-15 at 18:29 +0100, M.-A. Lemburg wrote: > > > > > >> Maybe a weird idea, but why not use static methods on the > > >> bytes and str type objects for this ?! > > >> > > >> E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0 > > >> renamed to str.openfile()) > > > > > > That's also not a bad idea, but I'd leave off one or the other of the > > > redudant "open" and "file" parts. E.g. bytes.open() and unicode.open() > > > seem fine to me (we all know what 'open' means, right? :). > > > > Thinking about it, I like your idea better (file.bytes() > > and file.text()). > > This is better than making it a static/class method on file (which has > the problem that it might return something that's not a file at all -- > file is a particular stream implementation, there may be others) but I > don't like the tight coupling it creates between a data type and an > I/O library. I still think that having global (i.e. built-in) factory > functions for creating various stream types makes the most sense. While we're at it, any chance of renaming str/unicode to text in 3.0? It's a MUCH better name, as evidenced by the opentext/openbytes names. str is just some odd C-ism. 
Obviously it's a form of gratuitous breakage, but I think the long term benefits are enough that we need to be *sure* that the breakage would be too much before we discount it. This seems the right time to discuss that. (And no, I'm not suggesting any special semantics for text. It's just the name I want.) str literal -> text literal unicode literal -> text literal text file -> text file (duh!) tutorial section called "Strings" -> tutorial section called "Text" Documentation Strings -> Documentation Text String Pattern Matching -> Text Pattern Matching String Services -> Text Services. Actually this is a problem. struct should be used on bytes, not unicode/text. textwrap -> textwrap stringprep -> textprep? Doesn't seem like a descriptive name linecache "Random access to text lines" gettext (not getstring!) -- Adam Olsen, aka Rhamphoryncus From barry at python.org Thu Feb 16 13:38:34 2006 From: barry at python.org (Barry Warsaw) Date: Thu, 16 Feb 2006 07:38:34 -0500 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <43F42425.3070807@acm.org> References: <43F42425.3070807@acm.org> Message-ID: <25C8B5BD-FCBC-45EC-8081-ACD0F472C7CB@python.org> On Feb 16, 2006, at 2:05 AM, Talin wrote: > > Anyway, if anyone wants to play around with the patch, it is rather > small - a couple of lines in Grammar, and a small new function in > ast.c, > plus a few mods to other functions to get them to call it. The context > diff is less than two printed pages. I can post it somewhere if people > are interested. > Please submit a SourceForge patch so others can play with it! 
-Barry From ncoghlan at gmail.com Thu Feb 16 13:45:29 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2006 22:45:29 +1000 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <79990c6b0602160128g311b0646t7307cb7f6c9ccffe@mail.gmail.com> References: <43F42425.3070807@acm.org> <79990c6b0602160128g311b0646t7307cb7f6c9ccffe@mail.gmail.com> Message-ID: <43F473E9.9030508@gmail.com> Paul Moore wrote: > On 2/16/06, Fredrik Lundh wrote: >> Talin wrote: >>> I definately don't want to start a flame war, although I suspect I already >>> have :/ >> I think most about everything has already been said wrt lambda already, >> but I guess we could have a little war on spelling issues ;-) > > Agreed, but credit to Talin for actually implementing his suggestion. > And it's nice to see that the AST makes this sort of experimentation > easier. Aye to both of those comments (and the infix spelling really is kind of pretty). Who knows, maybe Guido will decide he wants to change the spelling some day. Probably only if the sky fell on him or something, though ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From jeremy at alum.mit.edu Thu Feb 16 13:49:08 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 16 Feb 2006 07:49:08 -0500 Subject: [Python-Dev] C AST to Python discussion In-Reply-To: <43F44C1E.8040107@canterbury.ac.nz> References: <43F2E9CD.7000102@canterbury.ac.nz> <43F2F444.4010604@gmail.com> <43F44C1E.8040107@canterbury.ac.nz> Message-ID: On 2/16/06, Greg Ewing wrote: > Whereas if there weren't any funky coding style in the > first place, more potential compiler hackers might be > encouraged to have a go. I'm trying to make the code simple. The style of code is different than other parts of Python, but a compiler is different than a bytecode engine or implementations of basic types. 
Different problem domains lead to different program structure. > Also I'm still wondering why we're going to all this effort > to build a whole new AST and compiler structure if the > purpose isn't to *avoid* all this transformation between > different representations. The goal is to get the right representation for the problem. It was harder to understand and modify the compiler when it worked on the concrete parse trees. The compiler now has a couple of abstractions that are well suited to particular phases of compilation. Jeremy From exarkun at divmod.com Thu Feb 16 15:10:20 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Thu, 16 Feb 2006 09:10:20 -0500 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: <43F452E3.3040004@egenix.com> Message-ID: <20060216141020.6122.443333891.divmod.quotient.541@ohm> On Thu, 16 Feb 2006 11:24:35 +0100, "M.-A. Lemburg" wrote: >Neil Schemenauer wrote: >> On Thu, Feb 16, 2006 at 02:43:02AM +0100, Thomas Wouters wrote: >>> On Wed, Feb 15, 2006 at 05:23:56PM -0800, Guido van Rossum wrote: >>> >>>>> from __future__ import unicode_strings >>>> Didn't we have a command-line option to do this? I believe it was >>>> removed because nobody could see the point. (Or am I hallucinating? >>>> After several days of non-stop discussing bytes that must be >>>> considered a possibility.) >>> We do, and it's not been removed: the -U switch. >> >> As Guido alluded, the global switch is useless. A per-module switch >> something that could actually useful. One nice advantage is that >> you would write code that works the same with Jython (wrt to string >> literals anyhow). > >The global switch is not useless. It's purpose is to test the >standard library (or any other piece of Python code) for Unicode >compatibility. > >Since we're not even close to such compatibility, I'm not sure >how useful a per-module switch would be. 
Just what Neil suggested: developers writing new code benefit from having the behavior which will ultimately be Python's default, rather than the behavior that is known to be destined for obsolescence. Being able to turn this on per-module is useful for the same reason the rest of the future system is useful on a per-module basis. It's easier to convert things incrementally than monolithically. Jean-Paul From guido at python.org Thu Feb 16 16:07:41 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 07:07:41 -0800 Subject: [Python-Dev] PEP 338 issue finalisation (was Re: 2.5 PEP) In-Reply-To: <43F45A8A.3050900@gmail.com> References: <43F3A2EE.8060208@gmail.com> <43F45A8A.3050900@gmail.com> Message-ID: On 2/16/06, Nick Coghlan wrote: > Guido van Rossum wrote: > > Do you have unit tests for everything? I believe I fixed a bug in the > > code that reads a bytecode file (it wasn't skipping the timestamp). [Hey, I thought I sent that just to you. Is python-dev really interested in this?] > I haven't worked the filesystem based tests into the unit tests yet, and even > the manual tests I was using managed to leave out compiled bytecode files (as > you noticed). I'll fix that. > > Given I do my testing on Linux, I probably still would have forgotten the 'rb' > mode definitions on the relevant calls to open() though. . . But running the unit tests on Windows would have revealed the problem. > > +++ pep-0338.txt (working copy) > > - The optional argument ``init_globals`` may be used to pre-populate > > + The optional argument ``init_globals`` may be a dictionary used to pre-populate > > the globals dictionary before the code is executed. The supplied > > dictionary will not be modified. > > I just realised that anything that's a legal argument to "dict.update" will > work. I'll fix the function description in the PEP (and the docs patch as well). I'm not sure that's a good idea -- you'll never be able to switch to a different implementation then. 
> > --- runpy.py Wed Feb 15 15:56:07 2006 > > def get_data(self, pathname): > > ! # XXX Unfortunately PEP 302 assumes text data :-( > > ! return open(pathname).read() > > Hmm. > > The PEP itself requests that a string be returned from get_data(), but doesn't > require that the file be opened in text mode. Perhaps the PEP 302 emulation > should use binary mode here? Otherwise there could be strange data corruption > bugs on Windows. But PEP 302 shows as its only example reading from a file with a .txt extension. Adding spurious \r characters is also data corruption. We should probably post to python-dev a request for clarification of PEP 302, but in the mean time I vote for text mode. > > --- 337,349 ---- > > > > # This helper is needed as both the PEP 302 emulation and the > > # main file execution functions want to read compiled files > > + # XXX marshal can also raise EOFError; perhaps that should be > > + # turned into ValueError? Some callers expect ValueError. > > def _read_compiled_file(compiled_file): > > magic = compiled_file.read(4) > > if magic != imp.get_magic(): > > raise ValueError("File not compiled for this Python version") > > + compiled_file.read(4) # Throw away timestamp > > return marshal.load(compiled_file) > > I'm happy to convert EOFError to ValueError here if you'd prefer (using the > string representation of the EOFError as the message in the ValueError). > > Or did you mean changing the behaviour in marshal itself? No -- the alternative is to catch EOFError in _read_compiled_file()'s caller, but that seems worse. You should check marshal.c if it can raise any *other* errors (perhaps OverflowError?). Also, *perhaps* it makes more sense to return None instead of raising ValueError? Since you're always catching it? (Or are you?) > > --- 392,407 ---- > > loader = _get_loader(mod_name) > > if loader is None: > > raise ImportError("No module named " + mod_name) > > + # XXX get_code() is an *optional* loader feature. Is that okay? 
> > code = loader.get_code(mod_name) > > If the loader doesn't provide access to the code object or the source code, > then runpy can't really do anything useful with that module (e.g. if its a C > builtin module). Given that PEP 302 states that if you provide get_source() > you should also provide get_code(), this check should be sufficient to let > runpy.run_module get to everything it can handle. OK. But a loader could return None from get_code() -- do you check for that? (I don't have the source handy here.) > A case could be made for converting the attribute error to an ImportError, I > guess. . . I'm generally not keen on that; leave it. > > filename = _get_filename(loader, mod_name) > > if run_name is None: > > run_name = mod_name > > + # else: > > + # XXX Should we also set sys.modules[run_name] = sys.modules[mod_name]? > > + # I know of code that does "import __main__". It should probably > > + # get the substitute __main__ rather than the original __main__, > > + # if run_name != mod_name > > return run_module_code(code, init_globals, run_name, > > filename, loader, as_script) > > Hmm, good point. How about a different solution, though: in run_module_code, I > could create a new module object and put it temporarily in sys.modules, and > then remove it when done (restoring the original module if necessary). That might work too. What happens when you execute "foo.py" as __main__ and then (perhaps indirectly) something does "import foo"? Does a second copy of foo.py get loaded by the regular loader? > That would mean any module with code that looks up "sys.modules[__name__]" > would still work when run via runpy.run_module or runpy.run_file. Yeah, except if they do that, they're likely to also *assign* to that. Well, maybe that would just work, too... > I also realised that sys.argv[0] should be restored to its original value, too. Yup. 
> I'd then change the "as_script" flag to "alter_sys" and have it affect both of > the above operations (and grab the import lock to avoid other import or > run_module_code operations seeing the altered version of sys.modules). Makes sense. I do wonder if runpy.py isn't getting a bit over-engineered -- it seems a lot of the functionality isn't actually necessary to implement -m foo.bar, and the usefulness in other circumstances is as yet unproven. What do you think of taking a dose of YAGNI here? (Especially since I notice that most of the public APIs are very thin layers over exec or execfile -- people can just use those directly.) > > --- 439,457 ---- > > > > Returns the resulting top level namespace dictionary > > First tries to run as a compiled file, then as a source file > > + XXX That's not the same algorithm as used by regular import; > > + if the timestamp in the compiled file is not equal to the > > + source file's mtime, the compiled file is ignored > > + (unless there is no source file -- then the timestamp > > + is ignored) > > They're doing different things though - the import process uses that algorithm > to decide which filename to use (.pyo, .pyc or .py). This code in run_file is > trying to decide whether the supplied filename points to a compiled file or a > source file without a tight coupling to the specific file extension used (e.g. > so it works for Unix Python scripts that rely on the shebang line to identify > which interpreter to use to run them). > > I'll add a comment to that effect. Ah, good point. So you never go from foo.py to foo.pyc, right? > Another problem that occurred to me is that the module isn't thread safe at > the moment. The PEP 302 emulation isn't protected by the import lock, and the > changes to sys.argv in run_module_code will be visible across threads (and may > clobber each other or the original if multiple threads invoke the function). 
Another reason to consider cutting it down to only what's needed by -m; -m doesn't need thread-safety (I think). > On that front, I'll change _get_path_loader to acquire and release the import > lock, and the same for run_module_code when "alter_sys" is set to True. OK, just be very, very careful. The import lock is not a regular mutex and if you don't release it you're stuck forever. Just use try/finally... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Thu Feb 16 16:23:10 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 16:23:10 +0100 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda References: <43F42425.3070807@acm.org> <79990c6b0602160128g311b0646t7307cb7f6c9ccffe@mail.gmail.com> Message-ID: Paul Moore wrote: > > I think most about everything has already been said wrt lambda already, > > but I guess we could have a little war on spelling issues ;-) > > Agreed, but credit to Talin for actually implementing his suggestion. > And it's nice to see that the AST makes this sort of experimentation > easier. absolutely! +1 on experimentation! From mal at egenix.com Thu Feb 16 16:29:40 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 16 Feb 2006 16:29:40 +0100 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: <20060216141020.6122.443333891.divmod.quotient.541@ohm> References: <20060216141020.6122.443333891.divmod.quotient.541@ohm> Message-ID: <43F49A64.90308@egenix.com> Jean-Paul Calderone wrote: > On Thu, 16 Feb 2006 11:24:35 +0100, "M.-A. Lemburg" wrote: >> Neil Schemenauer wrote: >>> On Thu, Feb 16, 2006 at 02:43:02AM +0100, Thomas Wouters wrote: >>>> On Wed, Feb 15, 2006 at 05:23:56PM -0800, Guido van Rossum wrote: >>>> >>>>>> from __future__ import unicode_strings >>>>> Didn't we have a command-line option to do this? I believe it was >>>>> removed because nobody could see the point. (Or am I hallucinating? 
>>>>> After several days of non-stop discussing bytes that must be >>>>> considered a possibility.) >>>> We do, and it's not been removed: the -U switch. >>> As Guido alluded, the global switch is useless. A per-module switch >>> something that could actually useful. One nice advantage is that >>> you would write code that works the same with Jython (wrt to string >>> literals anyhow). >> The global switch is not useless. It's purpose is to test the >> standard library (or any other piece of Python code) for Unicode >> compatibility. >> >> Since we're not even close to such compatibility, I'm not sure >> how useful a per-module switch would be. > > Just what Neil suggested: developers writing new code benefit from having the behavior which will ultimately be Python's default, rather than the behavior that is known to be destined for obsolescence. > > Being able to turn this on per-module is useful for the same reason the rest of the future system is useful on a per-module basis. It's easier to convert things incrementally than monolithicly. Sure, but in this case the option would not only affect the module you define it in, but also all other code that now gets Unicode objects instead of strings as a result of the Unicode literals defined in these modules. It is rather likely that you'll start hitting Unicode-related compatibility bugs in the standard lib more often than you'd like. It's usually better to switch to Unicode in a controlled manner: not by switching all literals to Unicode, but only some, then test things, then switch over some more, etc. This can be done by prepending the literal with the u"" modifier. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 16 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From p.f.moore at gmail.com Thu Feb 16 16:39:30 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 16 Feb 2006 15:39:30 +0000 Subject: [Python-Dev] PEP 338 issue finalisation (was Re: 2.5 PEP) In-Reply-To: References: <43F3A2EE.8060208@gmail.com> <43F45A8A.3050900@gmail.com> Message-ID: <79990c6b0602160739j297b775eq358095cf530cea3b@mail.gmail.com> On 2/16/06, Guido van Rossum wrote: > On 2/16/06, Nick Coghlan wrote: > > The PEP itself requests that a string be returned from get_data(), but doesn't > > require that the file be opened in text mode. Perhaps the PEP 302 emulation > > should use binary mode here? Otherwise there could be strange data corruption > > bugs on Windows. > > But PEP 302 shows as its only example reading from a file with a .txt > extension. Adding spurious \r characters is also data corruption. We > should probably post to python-dev a request for clarification of PEP > 302, but in the mean time I vote for text mode. FWIW, the .txt example was just a toy example. I'd say that binary mode makes sense, as I can imagine using the get_data interface to load image files, for example. It makes getting text files a bit harder (you have to munge CRLF manually) but at least you have the *option* of getting binary files. On reflection, get_data should probably have been given a mode argument. But given that it didn't, binary seems safest. OTOH, I don't know who actually *uses* get_data for real (PJE, for eggs? py2exe?). Their opinions are likely to be of more importance. On the third hand, doing whatever the zipimport module does is likely to be right, as that's the key bit of prior art. Regardless, the PEP should be clarified. I'll make the change once agreement is reached. Paul. 
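[Archive note: the binary-mode behaviour Paul argues for is easy to sketch. This is a hedged illustration, not PEP 302 wording; `get_text` is a made-up helper showing the manual CRLF munging he mentions.]

```python
def get_data(pathname):
    # Binary mode: image files, bytecode, etc. come back byte-for-byte
    # unchanged, with no \r\n translation on Windows.
    with open(pathname, 'rb') as f:
        return f.read()

def get_text(pathname, encoding='ascii'):
    # Hypothetical convenience for callers that do want text: decode,
    # then normalise line endings by hand.
    data = get_data(pathname)
    return data.decode(encoding).replace('\r\n', '\n').replace('\r', '\n')
```

The trade-off matches Paul's point: text callers do a little extra work, but binary callers at least have the *option* of getting their data back intact.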
From 2005a at usenet.alexanderweb.de Thu Feb 16 17:16:36 2006 From: 2005a at usenet.alexanderweb.de (Alexander Schremmer) Date: Thu, 16 Feb 2006 17:16:36 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available References: <43F2FE65.5040308@pollenation.net> Message-ID: <19igus5puu6e2$.dlg@usenet.alexanderweb.de> On Wed, 15 Feb 2006 21:13:14 +0100, Georg Brandl wrote: > If something like Fredrik's new doc system is adopted, it would be extremely > convenient to refer someone to just > > docs.python.org/os.path.join In fact, PHP does it like php.net/functionname which is even shorter, i.e. they fallback to the documentation if that path does not exist otherwise. Kind regards, Alexander From fredrik at pythonware.com Thu Feb 16 07:18:02 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 07:18:02 +0100 Subject: [Python-Dev] 2.5 PEP In-Reply-To: References: <200602150922.43810.alain.poirier@net-ng.com> <43F37FE1.2090201@v.loewis.de> Message-ID: <368a5cd50602152218n39dec5a5hf68c09eecc119aad@mail.gmail.com> > > (is the xmlplus/xmlcore issue still an issue, btw?) > > What issue are you talking about? the changes described here http://mail.python.org/pipermail/python-dev/2005-December/058710.html "I'd like to propose that a new package be created in the standard library: xmlcore." which led to this response from a pyxml maintainer: http://mail.python.org/pipermail/python-dev/2005-December/058752.html "I don't agree with the change. You just broke source compatibility between the core package and PyXML." 
From nicolas.chauvat at logilab.fr Thu Feb 16 09:57:00 2006 From: nicolas.chauvat at logilab.fr (Nicolas Chauvat) Date: Thu, 16 Feb 2006 09:57:00 +0100 Subject: [Python-Dev] [Python-projects] AST in Python 2.5 In-Reply-To: References: Message-ID: <20060216085700.GB22366@logilab.fr> On Wed, Feb 15, 2006 at 09:40:17PM -0800, Neal Norwitz wrote: > I'm not sure if anyone here is following the AST discussion on > python-dev, but it would be great if you had any input. pylint is a > pretty big consumer of the compiler module and the decisions with > respect to the AST could impact you. > > http://mail.python.org/pipermail/python-dev/2006-February/060994.html We will jump in with better comments, but I just wanted to make sure you knew about: http://www.logilab.org/projects/astng and the work being done in PyPy: http://codespeak.net/pypy/dist/pypy/doc/parser.html http://codespeak.net/pypy/dist/pypy/module/recparser/ http://codespeak.net/pypy/dist/pypy/doc/interpreter.html http://codespeak.net/pypy/dist/pypy/interpreter/astcompiler/ Here is a bit from our EU reports that is about Workpackage 10 "Aspects and Contracts in Python": WP10 Status =========== Extend language with aspects and contracts * researched how other languages do it (AspectJ, HyperJ, AspectS, etc.) * started allowing AST manipulation (for weaving code and function calls) * started allowing grammar manipulation (for experimenting with syntax) WP10 Status (cont.) =================== AST and grammar manipulation * needed for both WP9 and WP10 * AST nodes are exposed at application-level and a compiler hook * allows to modify the AST at compile-time * syntax can be modified at run-time, but still limited because grammar objects are not fully exposed at application-level WP10 Status (cont.) 
=================== AST manipulation example:: >>>> 3 + 3 6 >>>> from parser import install_compiler_hook >>>> from hooks import _3becomes2 >>>> install_compiler_hook(_3becomes2) >>>> 3 + 3 4 >>>> -- Nicolas Chauvat logilab.fr - services en informatique avancée et gestion de connaissances From fredrik at pythonware.com Thu Feb 16 11:27:49 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 11:27:49 +0100 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <79990c6b0602160128g311b0646t7307cb7f6c9ccffe@mail.gmail.com> References: <43F42425.3070807@acm.org> <79990c6b0602160128g311b0646t7307cb7f6c9ccffe@mail.gmail.com> Message-ID: <368a5cd50602160227y1ddfeab2p9456db86337ceebb@mail.gmail.com> Paul Moore wrote: > > > I definately don't want to start a flame war, although I suspect I already > > > have :/ > > > > I think most about everything has already been said wrt lambda already, > > but I guess we could have a little war on spelling issues ;-) > > Agreed, but credit to Talin for actually implementing his suggestion. > And it's nice to see that the AST makes this sort of experimentation > easier. absolutely! +1 for experimentation. From bsder at allcaps.org Thu Feb 16 11:36:05 2006 From: bsder at allcaps.org (Andrew Lentvorski) Date: Thu, 16 Feb 2006 02:36:05 -0800 Subject: [Python-Dev] nice() In-Reply-To: <000e01c63226$fc342660$7c2c4fca@csmith> References: <000e01c63226$fc342660$7c2c4fca@csmith> Message-ID: <43F45595.7060803@allcaps.org> Smith wrote: > Everyone knows that fp numbers must be compared with caution, but > there is a void in the relative-error department for exercising such > caution, thus the proposal for something like 'areclose'. 
The problem > with areclose(), however, is that it only solves one part of the > problem that needs to be solved if two fp's *are* going to be > compared: if you are going to check if a < b you would need to do > something like > > not areclose(a,b) and a < b -1 This kind of function, at best, delays the newbie pain of learning about binary floating point very slightly. No matter how you set your test, I can make a pathological case which will catch at the boundary. The standard deviation formula; the area of triangle formula which fails on slivers; ill-conditioned linear equations--the examples are endless which can trip up newbies. On the other hand, people who do care about accurate numerical analysis will not trust that the people who wrote the library really had enough numerical sophistication and will simply rewrite the test *anyhow*. The "best" solution would be to optimize the Decimal module into something sufficiently fast that binary floating point goes away by default in Python. A nice reference about binary floating point is: "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg (available *everywhere*) For truly pedantic details about the gory nastiness of binary floating point, see William Kahan's homepage at Berkeley: http://www.cs.berkeley.edu/~wkahan/ -a From mal at egenix.com Thu Feb 16 17:30:32 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 16 Feb 2006 17:30:32 +0100 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: <045901c632c6$cd545f90$1abf2997@bagio> References: <20060216014302.GN6027@xs4all.nl> <045901c632c6$cd545f90$1abf2997@bagio> Message-ID: <43F4A8A8.5090606@egenix.com> Giovanni Bajo wrote: > Thomas Wouters wrote: > >>>> from __future__ import unicode_strings >>> Didn't we have a command-line option to do this? I believe it was >>> removed because nobody could see the point. (Or am I hallucinating? 
>>> After several days of non-stop discussing bytes that must be >>> considered a possibility.) >> We do, and it's not been removed: the -U switch. > > > It's not in the output of "python -h", though. Is it secret or what? Yes. We removed it from the help output to not confuse users who are not aware of the fact that this is an experimental switch. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 16 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From John.Marshall at ec.gc.ca Thu Feb 16 17:30:02 2006 From: John.Marshall at ec.gc.ca (John Marshall) Date: Thu, 16 Feb 2006 16:30:02 +0000 Subject: [Python-Dev] Does eval() leak? Message-ID: <43F4A88A.7050100@ec.gc.ca> Hi, Should I expect the virtual memory allocation to go up if I do the following? ----- raw = open("data").read() while True: d = eval(raw) ----- I would have expected the memory allocated to the object referenced by d to be deallocated, garbage collected, and reallocated for the new eval(raw) results, assigned to d. The file contains a large, SIMPLE (no self refs; all native python types/objects) dictionary (>300K). While doing 'd = eval(raw)' in the python interpreter I am monitoring the VIRT column of top and it keeps increasing until I run out of memory. When I use a safe_eval() from: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/364469 I have no memory problems. I see this under python 2.3.5 (fast and obvious). 
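The safe-eval approach John fell back on was later standardized: Python 2.6 added ast.literal_eval, which parses only literal syntax and so avoids both the code-execution risk and (in practice) the pathological memory behaviour seen with eval() on large data files. A minimal sketch with hypothetical data:

```python
import ast

raw = "{'name': 'spam', 'sizes': [1, 2, 3], 'enabled': True}"

# literal_eval accepts only Python literals (strings, numbers, tuples,
# lists, dicts, sets, booleans, None) -- no function calls, no names.
d = ast.literal_eval(raw)
print(d["sizes"])  # [1, 2, 3]

# Arbitrary expressions are rejected rather than executed.
try:
    ast.literal_eval("__import__('os')")
except ValueError:
    print("rejected")
```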
Thanks, John From fredrik at pythonware.com Thu Feb 16 18:13:53 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 18:13:53 +0100 Subject: [Python-Dev] bytes type discussion References: <20060215223943.GL6027@xs4all.nl> <1140049757.14818.45.camel@geddy.wooz.org> Message-ID: Barry Warsaw wrote: > We know at least there will never be a 2.10, so I think we still have > time. because there's no way to count to 10 if you only have one digit? we used to think that back when the gas price was just below 10 SEK/L, but they found a way... From walter at livinglogic.de Thu Feb 16 18:14:45 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 16 Feb 2006 18:14:45 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43f41426.921510492@news.gmane.org> References: <43F2FDDD.3030200@gmail.com> <43F3E7DB.4010502@canterbury.ac.nz> <43f41426.921510492@news.gmane.org> Message-ID: <43F4B305.3060003@livinglogic.de> Bengt Richter wrote: > On Wed, 15 Feb 2006 18:57:26 -0800, Guido van Rossum wrote: > >> [...] >> My expectation is that the Py3k standard I/O library will do all of >> its own conversions on top of binary files anyway -- if you missed it, >> I'd like to get rid of any ties to C's stdio. >> > Would the standard I/O module have low level utility stream-processing generators > to do things like linesep normalization in text or splitlines etc? I.e., primitives > that could be composed for unforseen usefulness, like unix pipeable stuff? > > Maybe they could even be composable with '|' for unixy left->right piping, e.g., on windows > > for line in (os.open('somepath') | linechunker | decoder('latin-1')): ... > > where os.open('path').__or__(linechunker) returns linechunker(os.open('path')), > which in turn has an __or__ to do similarly. Just had this bf, but ISTM it reads ok. 
> The equivalent nested generator expression with same assumed primitives would I guess be > > for line in decoder('latin-1')(linechunker(binaryfile('path'))): ... > > which doesn't have the same natural left to right reading order to match processing order. I'm currently implementing something like this, which might go into IPython. See http://styx.livinglogic.de/~walter/IPython/ipipe.py for code. (This requires the current IPython svn trunk) Examples: for f in ils("/usr/lib/python2.3/") | ifilter("name.endswith('.py')"): print f.name, f.size for p in ipwd | ifilter("shell=='/bin/false'") | isort("uid") | \ ieval('"%s (%s)" % (_.name, _.gecos)'): print p The other part of the project is a curses based browser for the output of these pipelines. See http://styx.livinglogic.de/~walter/IPython/newdir.gif for a screenshot of the result of ils("/usr/lib/python2.3/") | ifilter("name.endswith('.py')") Bye, Walter Dörwald From shane.holloway at ieee.org Thu Feb 16 18:20:24 2006 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Thu, 16 Feb 2006 10:20:24 -0700 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F3EC45.8090301@canterbury.ac.nz> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <43F3EC45.8090301@canterbury.ac.nz> Message-ID: On Feb 15, 2006, at 20:06, Greg Ewing wrote: > Barry Warsaw wrote: > >> If we go with two functions, I'd much rather hang them off of the >> file >> type object than add two new builtins. I really do think >> file.bytes() >> and file.text() (a.k.a. open.bytes() and open.text()) is better than >> opentext() or openbytes(). > > I'm worried about feeping creaturism of the file type > here. To my mind, the file type already has too many > features, and this hinders code that wants to define > its own file-like objects. 
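Greg's "simple raw interface plus wrappers" design is roughly what later shipped as the io module: a complete file-like object can be had by implementing only the low-level raw methods, with read(), readline(), and iteration supplied by the base class. A sketch using the modern ABCs (the class and sample data are hypothetical):

```python
import io

class UpperCaseReader(io.RawIOBase):
    """A toy file-like object: implement only readinto() and
    readable(); io.RawIOBase supplies read(), readline(), and
    iteration on top of them."""

    def __init__(self, data):
        self._buf = io.BytesIO(data.upper())

    def readable(self):
        return True

    def readinto(self, b):
        chunk = self._buf.read(len(b))
        b[:len(chunk)] = chunk
        return len(chunk)

f = UpperCaseReader(b"spam\neggs\n")
print(f.readline())  # b'SPAM\n'
print(f.read())      # b'EGGS\n'
```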
> > In 3.0 I'd like to see the file type reduced to having > as simple an interface as possible (basically just > read/write) and all other stuff (readlines, text codecs, > etc.) implemented as wrappers around it. I'd like to put my 2 cents in a agree with Greg here. Implementing a "complete" file-like object has come to be something of a pain. Perhaps we can do something akin to UserDict -- perhaps UserTextFile and UserBinaryFile? It would be nice if it could handle the default implementation of everything but read and write. Thanks, -Shane From fredrik at pythonware.com Thu Feb 16 18:15:47 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 18:15:47 +0100 Subject: [Python-Dev] Rename str/unicode to text [Was: Re: str object goingin Py3K] References: Message-ID: Adam Olsen wrote: > While we're at it, any chance of renaming str/unicode to text in 3.0? > It's a MUCH better name, as evidenced by the opentext/openbytes names. > str is just some odd C-ism. > > Obviously it's a form of gratuitous breakage, but I think the long > term benefits are enough that we need to be *sure* that the breakage > would be too much before we discount it. it's a very common variable name... From fredrik at pythonware.com Thu Feb 16 18:25:40 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 16 Feb 2006 18:25:40 +0100 Subject: [Python-Dev] Off-topic: www.python.org References: <43F2FE65.5040308@pollenation.net><43F3B2E8.6090103@pollenation.net> <20060216005548.GB8957@panix.com> Message-ID: Aahz wrote: > In all fairness to Tim (and despite the fact that emotionally I agree > with you), the fact is that there had been essentially no forward motion > on www.python.org redesign until he went to work. Even if we end up > chucking out all his work in favor of something else, I'll consider the > PSF's money well-spent for bringing the community energy into it. 
the problem isn't the work that has already been done, the problem is that things change, and choices that were made years ago are not necessarily true today. more on this in another forum, at some other time. I'll concentrate on the library reference for now... From thomas at xs4all.net Thu Feb 16 18:43:26 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 16 Feb 2006 18:43:26 +0100 Subject: [Python-Dev] Test failures in test_timeout Message-ID: <20060216174326.GA23859@xs4all.nl> I'm seeing spurious test failures in test_timeout, on my own workstation and on macteagle.python.org (now that it crashes less; Apple sent over some new memory.) The problem is pretty simple: both macteagle and my workstation live too closely, network-wise, to www.python.org: class TimeoutTestCase(unittest.TestCase): [...] def setUp(self): [...] self.addr_remote = ('www.python.org', 80) [...] def testConnectTimeout(self): # Test connect() timeout _timeout = 0.001 self.sock.settimeout(_timeout) _t1 = time.time() self.failUnlessRaises(socket.error, self.sock.connect, self.addr_remote) In other words, the test fails because www.python.org responds too quickly. The test on my workstation only fails occasionally, but I do expect macteagle's failure to be more common (since it's connected to www.python.org through (literally) a pair of gigabit switches, whereas my workstation has to pass through a few more switches, two Junipers and some dark fiber.) Lowering the timeout has no effect, as far as I can tell, which is probably a granularity issue. I'm thinking that it could probably try to connect to a less reliable website, but that's just moving the problem around (and possibly harassing an unsuspecting website, particularly around release-time.) Perhaps the test should try to connect to a known unconnecting site, like a firewalled port on www.python.org? Not something that refuses connections, just something that times out. -- Thomas Wouters Hi! I'm a .signature virus! 
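One standard remedy for the flakiness Thomas describes is to aim the connect-timeout test at an address that is reserved never to answer, instead of a real website. This is a sketch, not the actual test_timeout code; 192.0.2.0/24 is the documentation/TEST-NET block (RFC 5737), so packets to it should be dropped regardless of how close www.python.org happens to be:

```python
import socket

# 192.0.2.1 is in a reserved block that should never answer, so
# connect() fails reliably rather than depending on network distance.
UNREACHABLE = ("192.0.2.1", 80)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(0.5)
try:
    sock.connect(UNREACHABLE)
    connected = True
except OSError as exc:  # socket.timeout is an OSError subclass
    connected = False
    print("connect failed:", type(exc).__name__)
finally:
    sock.close()
```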
copy me into your .signature file to help me spread! From lists at janc.be Thu Feb 16 19:09:37 2006 From: lists at janc.be (Jan Claeys) Date: Thu, 16 Feb 2006 19:09:37 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> Message-ID: <1140113378.13739.75.camel@localhost.localdomain> On Wed, 2006-02-15 at 11:23 -0800, Bob Ippolito wrote: > On Feb 15, 2006, at 4:49 AM, Jan Claeys wrote: > > > On Wed, 2006-02-15 at 14:00 +1300, Greg Ewing wrote: > >> I'm disappointed that the various Linux distributions > >> still don't seem to have caught onto the very simple > >> idea of *not* scattering files all over the place when > >> installing something. > > Those directories might be mounted on entirely different hardware > > (even over a network), often with different characteristics (access speed, > > writeability, etc.). > > Huh? What does that have to do with anything? I've never seen a > system where /usr/include, /usr/lib, /usr/bin, etc. are not all on > the same mount. It's not really any different with OS X either. Paths like /etc, /var, /srv, /usr/include and /usr/share are good candidates to be on another mount than the bin & lib directories... BTW, Mac-style packages do exist for Linux too, if you prefer that. Look e.g. at Klik: -- Jan Claeys From guido at python.org Thu Feb 16 19:27:53 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 10:27:53 -0800 Subject: [Python-Dev] 2.5 - I'm ok to do release management In-Reply-To: <200602161745.25795.anthony@interlink.com.au> References: <200602161745.25795.anthony@interlink.com.au> Message-ID: On 2/15/06, Anthony Baxter wrote: > I'm still catching up on the hundreds of python-dev messages from the > last couple of days, but a quick note first that I'm ok to do release > management for 2.5 Thanks! While catching up, you can ignore the bytes discussion except for Neil Schemenauer's proto-pep. 
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 16 19:33:19 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 10:33:19 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> Message-ID: On 2/15/06, Alex Martelli wrote: > I agree, or, MAL's idea of bytes.open() and unicode.open() is also > good. No, the bytes and text data types shouldn't have to be tied to the I/O system. (The latter tends to evolve at a much faster rate so should be isolated.) > My fondest dream is that we do NOT have an 'open' builtin > which has proven to be very error-prone when used in Windows by > newbies (as evidenced by beginner errors as seen on c.l.py, the > python-help lists, and other venues) -- defaulting 'open' to text is > errorprone, defaulting it to binary doesn't seem the greatest idea > either, principle "when in doubt, resist the temptation to guess" > strongly suggests not having 'open' as a built-in at all. Bill Janssen has expressed this sentiment too. But this is because open() *appears* to work for both types to Unix programmers. If open() is *only* usable for text data, even Unix programmers will be using openbytes() from the start. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Thu Feb 16 19:33:36 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Thu, 16 Feb 2006 10:33:36 -0800 Subject: [Python-Dev] [Python-checkins] r42396 - peps/trunk/pep-0011.txt In-Reply-To: <43F45039.2050308@egenix.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> Message-ID: [Moving to python-dev] I don't have a strong opinion. Any one else have an opinion about removing --with-wctype-functions from configure? n -- On 2/16/06, M.-A. 
Lemburg wrote: > neal.norwitz wrote: > > Author: neal.norwitz > > Date: Thu Feb 16 06:25:37 2006 > > New Revision: 42396 > > > > Modified: > > peps/trunk/pep-0011.txt > > Log: > > MAL says this option should go away in bug report 874534: > > > > The reason for the removal is that the option causes > > semantical problems and makes Unicode work in non-standard > > ways on platforms that use locale-aware extensions to the > > wc-type functions. > > > > Since it wasn't previously announced, we can keep the option until 2.6 > > unless someone feels strong enough to rip it out. > > I've been wanting to rip this out for some time now, but > you're right: I forgot to add this to PEP 11, so let's > wait for another release. > > OTOH, this normally only affects system builders, so perhaps > we could do this a little faster, e.g. add a warning in the > first alpha and then rip it out with one of the last betas ?! > > > Modified: peps/trunk/pep-0011.txt > > > > + Name: Systems using --with-wctype-functions > > + Unsupported in: Python 2.6 > > + Code removed in: Python 2.6 From guido at python.org Thu Feb 16 19:35:27 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 10:35:27 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43f41426.921510492@news.gmane.org> References: <43F2FDDD.3030200@gmail.com> <43F3E7DB.4010502@canterbury.ac.nz> <43f41426.921510492@news.gmane.org> Message-ID: On 2/15/06, Bengt Richter wrote: > On Wed, 15 Feb 2006 18:57:26 -0800, Guido van Rossum wrote: > >My expectation is that the Py3k standard I/O library will do all of > >its own conversions on top of binary files anyway -- if you missed it, > >I'd like to get rid of any ties to C's stdio. > > > Would the standard I/O module have low level utility stream-processing generators > to do things like linesep normalization in text or splitlines etc? I.e., primitives > that could be composed for unforseen usefulness, like unix pipeable stuff? Yes. 
To get a (very limited) idea of what I'm talking about, see the sio package in the sandbox: http://svn.python.org/view/sandbox/trunk/sio/ -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Feb 16 19:50:08 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 10:50:08 -0800 Subject: [Python-Dev] Rename str/unicode to text [Was: Re: str object going in Py3K] In-Reply-To: References: Message-ID: On 2/16/06, Adam Olsen wrote: > While we're at it, any chance of renaming str/unicode to text in 3.0? > It's a MUCH better name, as evidenced by the opentext/openbytes names. > str is just some odd C-ism. > > Obviously it's a form of gratuitous breakage, but I think the long > term benefits are enough that we need to be *sure* that the breakage > would be too much before we discount it. This seems the right time to > discuss that. I'm +/-0 on this. ABC used text. In almost every other currently popular language it's called string. But the advantage of text is that it's not an abbreviation, and it reinforces the notion that it's not binary data. "Binary string" is a common colloquialism; "binary text" is an oxymoron. Mechanical conversion of code using 'str' (or 'unicode') to use 'text' seems simply enough. OTOH, even if we didn't rename str/unicode to text, opentext would still be a good name for the function that opens a text file. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Thu Feb 16 20:24:54 2006 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 16 Feb 2006 14:24:54 -0500 Subject: [Python-Dev] 2.5 - I'm ok to do release management In-Reply-To: <200602161745.25795.anthony@interlink.com.au> References: <200602161745.25795.anthony@interlink.com.au> Message-ID: <1f7befae0602161124x322d20fcx5479a6f2cb6b9d2f@mail.gmail.com> [Anthony Baxter] > I'm still catching up on the hundreds of python-dev messages from the > last couple of days, but a quick note first that I'm ok to do release > management for 2.5 I, for one, am delighted to see that Australian millionaires don't give up tech work after winning an Olympic gold medal. Congratulations to Anthony on his! I didn't even know that Human-Kangaroo Doubles Luge was a sport until last night. Damn gutsy move letting the roo take top position, and I hope to see more bold thinking like that after Anthony's ribs heal. From jason.orendorff at gmail.com Thu Feb 16 21:00:42 2006 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Thu, 16 Feb 2006 15:00:42 -0500 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: References: Message-ID: On 2/15/06, Guido van Rossum wrote: > > > Actually users trying to figure out Unicode would probably be better > served > > if bytes.encode() and text.decode() did not exist. > [...] > It would be better if the signature of text.encode() always returned a > bytes object. But why deny the bytes object a decode() method if text > objects have an encode() method? I agree, text.encode() and bytes.decode() are both swell. It's the other two that bother me. 
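The asymmetry Jason favors is in fact what Python 3 eventually adopted: str has encode() but no decode(), and bytes has decode() but no encode(), so each conversion can only run in its sensible direction. A quick sketch of the surviving pair:

```python
t = "Grüße"

b = t.encode("utf-8")       # text -> bytes
print(b)                    # b'Gr\xc3\xbc\xc3\x9fe'
print(b.decode("utf-8"))    # bytes -> text, back to 'Grüße'

# The two confusing methods no longer exist in Python 3:
print(hasattr(t, "decode"), hasattr(b, "encode"))  # False False
```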
I'd say there are two "symmetric" API flavors possible (t and b are > text and bytes objects, respectively, where text is a string type, > either str or unicode; enc is an encoding name): > > - b.decode(enc) -> t; t.encode(enc) -> b > - b = bytes(t, enc); t = text(b, enc) > > I'm not sure why one flavor would be preferred over the other, > although having both would probably be a mistake. > I prefer constructor flavor; the word "bytes" feels more concrete than "encode". But I worry about constructors being too overloaded. >>> text(b, enc) # decode >>> text(mydict) # repr >>> text(b) # uh... decode with default encoding? -j -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060216/5138b8fa/attachment.htm From guido at python.org Thu Feb 16 21:09:06 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 12:09:06 -0800 Subject: [Python-Dev] PEP 338 issue finalisation (was Re: 2.5 PEP) In-Reply-To: <79990c6b0602160739j297b775eq358095cf530cea3b@mail.gmail.com> References: <43F3A2EE.8060208@gmail.com> <43F45A8A.3050900@gmail.com> <79990c6b0602160739j297b775eq358095cf530cea3b@mail.gmail.com> Message-ID: On 2/16/06, Paul Moore wrote: > On 2/16/06, Guido van Rossum wrote: > > On 2/16/06, Nick Coghlan wrote: > > > > The PEP itself requests that a string be returned from get_data(), but doesn't > > > require that the file be opened in text mode. Perhaps the PEP 302 emulation > > > should use binary mode here? Otherwise there could be strange data corruption > > > bugs on Windows. > > > > But PEP 302 shows as its only example reading from a file with a .txt > > extension. Adding spurious \r characters is also data corruption. We > > should probably post to python-dev a request for clarification of PEP > > 302, but in the mean time I vote for text mode. > > FWIW, the .txt example was just a toy example. 
I'd say that binary > mode makes sense, as I can imagine using the get_data interface to > load image files, for example. It makes getting text files a bit > harder (you have to munge CRLF manually) but at least you have the > *option* of getting binary files. > > On reflection, get_data should probably have been given a mode > argument. But given that it didn't, binary seems safest. > > OTOH, I don't know who actually *uses* get_data for real (PJE, for > eggs? py2exe?). Their opinions are likely to be of more importance. > > On the third hand, doing whatever the zipimport module does is likely > to be right, as that's the key bit of prior art. It doesn't do any CRLF -> LF translation so this supports the binary theory. > Regardless, the PEP should be clarified. I'll make the change once > agreement is reached. Thanks. Based on the zipimport precedent I propose to make it binary. The example could be changed to read a GIF image. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bh at intevation.de Thu Feb 16 20:59:10 2006 From: bh at intevation.de (Bernhard Herzog) Date: Thu, 16 Feb 2006 20:59:10 +0100 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods In-Reply-To: (Travis E. Oliphant's message of "Tue, 14 Feb 2006 20:41:19 -0700") References: Message-ID: "Travis E. Oliphant" writes: > 2) The __index__ special method will have the signature > > def __index__(self): > return obj > > Where obj must be either an int or a long or another object > that has the __index__ special method (but not self). So int objects will not have an __index__ method (assuming that ints won't return a different but equal int object). However: > 4) A new operator.index(obj) function will be added that calls > equivalent of obj.__index__() and raises an error if obj does not > implement the special method. So operator.index(1) will raise an exception. I would expect operator.index to be implemented using PyNumber_index. 
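PEP 357 as eventually shipped answers Bernhard's point: int does implement __index__ (returning itself), so operator.index(1) simply returns 1 rather than raising. A sketch of the protocol with a hypothetical integer-like type:

```python
import operator

class Dimension:
    """A hypothetical integer-like type that is not an int subclass."""
    def __init__(self, n):
        self._n = n
    def __index__(self):
        return self._n

data = list(range(10))
d = Dimension(3)

print(operator.index(d))   # 3
print(data[d])             # 3 -- sequence indexing goes through __index__
print(operator.index(7))   # 7 -- plain ints work too

try:
    operator.index(3.0)    # floats are deliberately rejected
except TypeError:
    print("floats rejected")
```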
Bernhard -- Intevation GmbH http://intevation.de/ Skencil http://skencil.org/ Thuban http://thuban.intevation.org/ From benji at benjiyork.com Thu Feb 16 20:35:26 2006 From: benji at benjiyork.com (Benji York) Date: Thu, 16 Feb 2006 14:35:26 -0500 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <19igus5puu6e2$.dlg@usenet.alexanderweb.de> References: <43F2FE65.5040308@pollenation.net> <19igus5puu6e2$.dlg@usenet.alexanderweb.de> Message-ID: <43F4D3FE.4040905@benjiyork.com> Alexander Schremmer wrote: > In fact, PHP does it like php.net/functionname which is even shorter, i.e. > they fallback to the documentation if that path does not exist otherwise. Like many things PHP, that seems a bit too magical for my tastes. -- Benji York From guido at python.org Thu Feb 16 21:47:22 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 12:47:22 -0800 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: <20060216025515.GA474@mems-exchange.org> References: <20060216025515.GA474@mems-exchange.org> Message-ID: On 2/15/06, Neil Schemenauer wrote: > This could be a replacement for PEP 332. At least I hope it can > serve to summarize the previous discussion and help focus on the > currently undecided issues. > > I'm too tired to dig up the rules for assigning it a PEP number. > Also, there are probably silly typos, etc. Sorry. I may check it in for you, although right now it would be good if we had some more feedback. I noticed one behavior in your pseudo-code constructor that seems questionable: while in the Q&A section you explain why the encoding is ignored when the argument is a str instance, in fact you require an encoding (and one that's not "ascii") if the str instance contains any non-ASCII bytes. So bytes("\xff") would fail, but bytes("\xff", "blah") would succeed. I think that's a bit strange -- if you ignore the encoding, you should always ignore it. 
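For comparison, the rules Python 3 finally settled on for the bytes constructor are stricter than either option discussed in this thread: a str argument always requires an explicit encoding, and the encoding is never ignored:

```python
# A str argument without an encoding is an error, full stop:
try:
    bytes("\xff")
except TypeError as exc:
    print("rejected:", exc)

# With an encoding, the text is encoded normally:
print(bytes("\xff", "latin-1"))  # b'\xff'
print(bytes("\xff", "utf-8"))    # b'\xc3\xbf'

# An iterable of ints in range(256) also works, as in the proto-PEP:
print(bytes([255]))              # b'\xff'
```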
So IMO bytes("\xff") and bytes("\xff", "ascii") should both return the same as bytes([255]). Also, there's a code path where the initializer is a unicode instance and its encode() method is called with None as the argument. I think both could be fixed by setting the encoding to sys.getdefaultencoding() if it is None and the argument is a unicode instance: def bytes(initialiser=[], encoding=None): if isinstance(initialiser, basestring): if isinstance(initialiser, unicode): if encoding is None: encoding = sys.getdefaultencoding() initialiser = initialiser.encode(encoding) initialiser = [ord(c) for c in initialiser] elif encoding is not None: raise TypeError("explicit encoding invalid for non-string " "initialiser") create bytes object and fill with integers from initialiser return bytes object BTW, for folks who want to experiment, it's quite simple to create a working bytes implementation by inheriting from array.array. Here's a quick draft (which only takes str instance arguments): from array import array class bytes(array): def __new__(cls, data=None): b = array.__new__(cls, "B") if data is not None: b.fromstring(data) return b def __str__(self): return self.tostring() def __repr__(self): return "bytes(%s)" % repr(list(self)) def __add__(self, other): if isinstance(other, array): return bytes(super(bytes, self).__add__(other)) return NotImplemented -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Thu Feb 16 21:50:10 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 16 Feb 2006 21:50:10 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> Message-ID: <43F4E582.1040201@egenix.com> Guido van Rossum wrote: > On 2/15/06, Alex Martelli wrote: >> I agree, or, MAL's idea of bytes.open() and unicode.open() is also >> good. 
> > No, the bytes and text data types shouldn't have to be tied to the I/O > system. (The latter tends to evolve at a much faster rate so should be > isolated.) > >> My fondest dream is that we do NOT have an 'open' builtin >> which has proven to be very error-prone when used in Windows by >> newbies (as evidenced by beginner errors as seen on c.l.py, the >> python-help lists, and other venues) -- defaulting 'open' to text is >> errorprone, defaulting it to binary doesn't seem the greatest idea >> either, principle "when in doubt, resist the temptation to guess" >> strongly suggests not having 'open' as a built-in at all. > > Bill Janssen has expressed this sentiment too. But this is because > open() *appears* to work for both types to Unix programmers. If open() > is *only* usable for text data, even Unix programmers will be using > openbytes() from the start. All the variations aside: What will be the explicit way to open a file in bytes mode and in text mode (I for one would like to move away from open() completely as well) ? Will we have a single file type with two different modes or two different types ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 16 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From guido at python.org Thu Feb 16 22:11:49 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 13:11:49 -0800 Subject: [Python-Dev] Proposal: defaultdict Message-ID: A bunch of Googlers were discussing the best way of doing the following (a common idiom when maintaining a dict of lists of values relating to a key, sometimes called a multimap): if key not in d: d[key] = [] d[key].append(value) An alternative way to spell this uses setdefault(), but it's not very readable: d.setdefault(key, []).append(value) and it also suffers from creating an unnecessary list instance. (Timings were inconclusive; the approaches are within 5-10% of each other in speed.) My conclusion is that setdefault() is a failure -- it was a well-intentioned construct, but doesn't actually create more readable code. Google has an internal data type called a DefaultDict which gets passed a default value upon construction. Its __getitem__ method, instead of raising KeyError, inserts a shallow copy (!) of the given default value into the dict when the value is not found. So the above code, after d = DefaultDict([]) can be written as simply d[key].append(value) Note that of all the possible semantics for __getitem__ that could have produced similar results (e.g. not inserting the default in the underlying dict, or not copying the default value), the chosen semantics are the only ones that makes this example work. Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor. 
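The behaviour Guido describes can be sketched in a few lines: __getitem__ inserts a shallow copy of the default on a miss, while membership tests and get() are untouched. The class name and details below are illustrative, not Google's actual code (e.g. the read-only property for the default is omitted):

```python
import copy

class DefaultDict(dict):
    def __init__(self, default, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default = default

    def __getitem__(self, key):
        if key not in self:
            # Insert a *shallow copy*, so each key gets its own list
            # rather than sharing the one default instance.
            dict.__setitem__(self, key, copy.copy(self.default))
        return dict.__getitem__(self, key)

d = DefaultDict([])
d["spam"].append(1)
d["spam"].append(2)
d["eggs"].append(3)
print(d)             # {'spam': [1, 2], 'eggs': [3]}
print("ham" in d)    # False -- membership tests don't insert
print(d.get("ham"))  # None  -- get() doesn't insert either
```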
Some more design subtleties: - "key in d" still returns False if the key isn't there - "d.get(key)" still returns None if the key isn't there - "d.default" should be a read-only attribute giving the default value Feedback? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at xs4all.net Thu Feb 16 22:21:33 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 16 Feb 2006 22:21:33 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <20060216212133.GB23859@xs4all.nl> On Thu, Feb 16, 2006 at 01:11:49PM -0800, Guido van Rossum wrote: > Over lunch with Alex Martelli, he proposed that a subclass of dict > with this behavior (but implemented in C) would be a good addition to > the language. It looks like it wouldn't be hard to implement. It could > be a builtin named defaultdict. The first, required, argument to the > constructor should be the default value. Remaining arguments (even > keyword args) are passed unchanged to the dict constructor. Should a dict subclass really change the constructor/initializer signature in an incompatible way? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tdelaney at avaya.com Thu Feb 16 22:27:04 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Fri, 17 Feb 2006 08:27:04 +1100 Subject: [Python-Dev] Proposal: defaultdict Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com> Guido van Rossum wrote: > Over lunch with Alex Martelli, he proposed that a subclass of dict > with this behavior (but implemented in C) would be a good addition to > the language. It looks like it wouldn't be hard to implement. It could > be a builtin named defaultdict. The first, required, argument to the > constructor should be the default value. Remaining arguments (even > keyword args) are passed unchanged to the dict constructor. > > Feedback? 
On behalf of everyone who has answered this question on c.l.py, may I say WOOHOO! FWIW, my usual spelling is: try: v = d[key] except: v = d[key] = value which breaks the principle of "write it once". Tim Delaney From exarkun at divmod.com Thu Feb 16 22:53:57 2006 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Thu, 16 Feb 2006 16:53:57 -0500 Subject: [Python-Dev] from __future__ import unicode_strings? In-Reply-To: <43F49A64.90308@egenix.com> Message-ID: <20060216215357.6122.733406986.divmod.quotient.864@ohm> On Thu, 16 Feb 2006 16:29:40 +0100, "M.-A. Lemburg" wrote: >Jean-Paul Calderone wrote: >> On Thu, 16 Feb 2006 11:24:35 +0100, "M.-A. Lemburg" wrote: >>> Neil Schemenauer wrote: >>>> On Thu, Feb 16, 2006 at 02:43:02AM +0100, Thomas Wouters wrote: >>>>> On Wed, Feb 15, 2006 at 05:23:56PM -0800, Guido van Rossum wrote: >>>>> >>>>>>> from __future__ import unicode_strings >>>>>> Didn't we have a command-line option to do this? I believe it was >>>>>> removed because nobody could see the point. (Or am I hallucinating? >>>>>> After several days of non-stop discussing bytes that must be >>>>>> considered a possibility.) >>>>> We do, and it's not been removed: the -U switch. >>>> As Guido alluded, the global switch is useless. A per-module switch >>>> something that could actually useful. One nice advantage is that >>>> you would write code that works the same with Jython (wrt to string >>>> literals anyhow). >>> The global switch is not useless. It's purpose is to test the >>> standard library (or any other piece of Python code) for Unicode >>> compatibility. >>> >>> Since we're not even close to such compatibility, I'm not sure >>> how useful a per-module switch would be. >> >> Just what Neil suggested: developers writing new code benefit from having the behavior which will ultimately be Python's default, rather than the behavior that is known to be destined for obsolescence. 
>> >> Being able to turn this on per-module is useful for the same reason the rest of the future system is useful on a per-module basis. It's easier to convert things incrementally than monolithically. > >Sure, but in this case the option would not only affect the module >you define it in, but also all other code that now gets Unicode >objects instead of strings as a result of the Unicode literals >defined in these modules. This is precisely correct. It is also exactly parallel to the only other __future__ import which changes any behavior. Personally, I _also_ like future division. Is it generally considered to have been a mistake? > >It is rather likely that you'll start hitting Unicode-related >compatibility bugs in the standard lib more often than you'd >like. You can guess this. I'll guess that it isn't the case. And who's to say how often I'd like that to happen, anyway? :) Anyone who's afraid that will happen can avoid using the import. Voila, problem solved. > >It's usually better to switch to Unicode in a controlled manner: >not by switching all literals to Unicode, but only some, then >test things, then switch over some more, etc. There's nothing uncontrolled about this proposed feature, so this statement doesn't hold any meaning. > >This can be done by prepending the literal with the u"" modifier. > Anyone who is happier converting one string literal at a time can do this. Anyone who would rather convert a module at a time can use the future import.
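[Editorial note: the per-module switch discussed here eventually shipped in Python 2.6 as `unicode_literals` — not this thread's hypothetical `unicode_strings` — so the sketch below uses the name that actually landed, which postdates this discussion:]

```python
# Per-module opt-in to text (unicode) string literals, as it later shipped.
# On Python 2.6+ this changes the meaning of plain literals in this module
# only; on Python 3 it is accepted as a harmless no-op, since all string
# literals are already text.
from __future__ import unicode_literals

s = "abc\xf6"  # a text string of 4 characters, not 4-or-5 bytes

assert isinstance(s, type(u""))  # text type on both 2.6+ (with import) and 3
assert len(s) == 4
```

This is exactly the incremental, per-module migration path argued for above: one module opts in, the rest of the program is untouched.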
Jean-Paul From martin at v.loewis.de Thu Feb 16 23:06:15 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 16 Feb 2006 23:06:15 +0100 Subject: [Python-Dev] 2.5 PEP In-Reply-To: <368a5cd50602152218n39dec5a5hf68c09eecc119aad@mail.gmail.com> References: <200602150922.43810.alain.poirier@net-ng.com> <43F37FE1.2090201@v.loewis.de> <368a5cd50602152218n39dec5a5hf68c09eecc119aad@mail.gmail.com> Message-ID: <43F4F757.1000603@v.loewis.de> Fredrik Lundh wrote: > http://mail.python.org/pipermail/python-dev/2005-December/058752.html > > "I don't agree with the change. You just broke source compatibility > between the core package and PyXML." I'm still unhappy with that change, and still nobody has told me how to maintain PyXML so that it can continue to work both for 2.5 and for 2.4. Regards, Martin From greg.ewing at canterbury.ac.nz Thu Feb 16 23:55:33 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 17 Feb 2006 11:55:33 +1300 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F502E5.5030004@canterbury.ac.nz> Guido van Rossum wrote: > The first, required, argument to the > constructor should be the default value. I'd like to suggest that this argument be a function for creating default values, rather than an actual default value. This would avoid any confusion over exactly how the default value is copied. (Shallow or deep? How deep?) In an earlier discussion it was pointed out that this would be no less convenient for many common use cases, e.g. in your example, d = defaultdict(list) Also I'm not sure about the name "defaultdict". When I created a class like this recently, I called it an "autodict" (i.e. a dict that automatically extends itself with new entries). And perhaps the value should be called an "initial value" rather than a default value, to more strongly suggest that it becomes a permanent part of the dict. 
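[Editorial note: Greg's factory-function design is essentially what later shipped as collections.defaultdict, which postdates this thread. A short sketch of that design, showing how a factory sidesteps the copy question and preserves the lookup subtleties Guido listed:]

```python
from collections import defaultdict

d = defaultdict(list)        # a factory, not a value: nothing to copy
assert "x" not in d          # "key in d" still returns False if absent
assert d.get("x") is None    # d.get(key) still returns None if absent
d["x"].append(1)             # a miss calls the factory for a fresh list...
assert "x" in d              # ...and the created value becomes a real entry
assert d["x"] is not d["y"]  # each miss gets its own list, never a shared one
```

Because the factory is called on every miss, "how deep should the default be copied?" never arises: there is no prototype value to copy.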
Greg From bokr at oz.net Fri Feb 17 00:15:04 2006 From: bokr at oz.net (Bengt Richter) Date: Thu, 16 Feb 2006 23:15:04 GMT Subject: [Python-Dev] str object going in Py3K References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> Message-ID: <43f4c067.965607660@news.gmane.org> On Wed, 15 Feb 2006 21:59:55 -0800, Alex Martelli wrote: > >On Feb 15, 2006, at 9:51 AM, Barry Warsaw wrote: > >> On Wed, 2006-02-15 at 09:17 -0800, Guido van Rossum wrote: >> >>> Regarding open vs. opentext, I'm still not sure. I don't want to >>> generalize from the openbytes precedent to openstr or openunicode >>> (especially since the former is wrong in 2.x and the latter is wrong >>> in 3.0). I'm tempted to hold out for open() since it's most >>> compatible. >> >> If we go with two functions, I'd much rather hang them off of the file >> type object than add two new builtins. I really do think file.bytes() >> and file.text() (a.k.a. open.bytes() and open.text()) is better than >> opentext() or openbytes(). > >I agree, or, MAL's idea of bytes.open() and unicode.open() is also >good. My fondest dream is that we do NOT have an 'open' builtin >which has proven to be very error-prone when used in Windows by >newbies (as evidenced by beginner errors as seen on c.l.py, the >python-help lists, and other venues) -- defaulting 'open' to text is >error-prone, defaulting it to binary doesn't seem the greatest idea >either, principle "when in doubt, resist the temptation to guess" >strongly suggests not having 'open' as a built-in at all. (And >name-mangling into openthis and openthat seems less Pythonic to me >than exploiting namespaces by making structured names, either >this.open and that.open or open.this and open.that). IOW, I entirely >agree with Barry and Marc Andre.
> FWIW, I'd vote for file.text and file.bytes I don't like bytes.open or unicode.open because I think types in general should not know about I/O (IIRC Guido said that, so pay attention ;-) Especially unicode. E.g., why should unicode pull in a whole wad of I/O-related code if the user is only using it as intermediary in some encoding change between low level binary input and low level binary output? E.g., consider what you could do with one statement like (untested) s_str.translate(table, delch).encode('utf-8') especially if you didn't have to introduce a phony latin-1 decoding and write it as (untested) s_str.translate(table, delch).decode('latin-1').encode('utf-8') # use str.translate or s_str.decode('latin-1').translate(mapping).encode('utf-8') # use unicode.translate also for delch to avoid exceptions if you have non-ascii in your s_str translation It seems s_str.translate(table, delchars) wants to convert the s_str to unicode if table is unicode, and then use unicode.translate (which bombs on delchars!) instead of just effectively defining str.translate as def translate(self, table, deletechars=None): return ''.join(table[ord(x)] for x in self if deletechars is None or x not in deletechars) IMO, if you want unicode.translate, then write unicode(s_str).translate and use that. Let str.translate just use the str ords, so simple custom decodes can be written without the annoyance of UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 3: ordinal not in range(128) Can we change this? Or what am I missing? I certainly would like to miss the above message for str.translate :-( BTW This would also allow taking advantage of features of both translates if desired, e.g. by s_str.translate(unichartable256, strdelchrs).translate(uniord_to_ustr_mapping). 
(e.g., the latter permits single to multiple-character substitution) This makes me think a translate method for bytes would be good for py3k (on topic ;-) It is just too handy a high speed conversion goodie to forgo IMO. ___________ BTW, ISTM that it would be nice to have a chunking-iterator-wrapper-returning-method (as opposed to buffering specification) for file.bytes, so you could plug in file.bytes('path').chunk(1) # maybe keyword opts for simple common record chunking also? in places where you might now have to have (untested) (ord(x) for x in iter(lambda f=open('path','rb'): f.read(1), '') if x) or write a helper like def by_byte_ords(path, bufsize=8192): f = open(path, 'rb') buf = f.read(bufsize) while buf: for x in buf: yield ord(x) buf = f.read(bufsize) and plug in by_byte_ords(path) ___________ BTW, bytes([]) would presumably be the file.bytes EOF? Regards, Bengt Richter From martin at v.loewis.de Fri Feb 17 00:21:26 2006 From: martin at v.loewis.de (=?ISO-8859-2?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 00:21:26 +0100 Subject: [Python-Dev] how bugfixes are handled? In-Reply-To: References: Message-ID: <43F508F6.6040801@v.loewis.de> Arkadiusz Miskiewicz wrote: > I wasn't, thanks for information. > > Still a few questions... one of the developers/committers reviews a patch and commits > it? Do several developers have to review a single patch? As Neal says, a single committer can review and commit. However, non-committers can also review; this is the point of asking for patch reviews. In many cases, the initial patch will not be "good enough": it will lack documentation and test cases, it will contain bugs, not follow the code formatting guidelines, and it will make changes irrelevant to the issue being addressed ("gratuitous changes"). A reviewer is supposed to sort these all out, and then end up with a final recommendation ("accept" or "reject"). Of course, if it is going to be "reject", there is little point in making the submitter comply with formal criteria.
Ideally, a committer then will only have to read the entire review process, agree with it step-by-step, and commit the proposed change. As a historical note: people doing a lot of reviews eventually end up as committers, just because it is easier for the other committers if they also do the final step. Regards, Martin From martin at v.loewis.de Fri Feb 17 00:27:16 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 00:27:16 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F3E7DB.4010502@canterbury.ac.nz> References: <43F2FDDD.3030200@gmail.com> <43F3E7DB.4010502@canterbury.ac.nz> Message-ID: <43F50A54.4070609@v.loewis.de> Greg Ewing wrote: > Another thought -- what is going to happen to os.open? > Will it change to return bytes, or will there be a new > os.openbytes? Nit-pickingly: os.open will continue to return integers. I think it should return OS handles on Windows, instead of C library handles. (also notice that this has nothing to do with stdio: os.open does not use stdio; this is POSIX open). Regards, Martin From martin at v.loewis.de Fri Feb 17 00:33:49 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 00:33:49 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <20060215212629.5F6D.JCARLSON@uci.edu> References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> Message-ID: <43F50BDD.4010106@v.loewis.de> Josiah Carlson wrote: > I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex', > and likely a few others that the two of you may be arguing against > should stay as encodings, because strictly speaking, they are defined as > encodings of data. They may not be encodings of _unicode_ data, but > that doesn't mean that they aren't useful encodings for other kinds of > data, some text, some binary, ... 
To support them, the bytes type would have to gain a .encode method, and I'm -1 on supporting bytes.encode, or string.decode. Why is s.encode("uu") any better than binascii.b2a_uu(s)? Regards, Martin From bokr at oz.net Fri Feb 17 03:25:25 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 02:25:25 GMT Subject: [Python-Dev] str.translate vs unicode.translate (was: Re: str object going in Py3K) References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> Message-ID: <43f507f2.983922896@news.gmane.org> If str becomes unicode for PY 3000, and we then have bytes as our coding-agnostic byte data, then I think bytes should have the str translation method, with a tweak that I would hope could also be done to str now. BTW, str.translate will presumably become unicode.translate, so perhaps unicode.translate should grow a compatible deletechars parameter. But that's not the tweak. The tweak is to eliminate unavoidable pre-conversion to unicode in str(something).translate(u'...', delchars) (and preemptively bytes(something).translate(u'...', delchars)) E.g. suppose you now want to write: s_str.translate(table, delch).encode('utf-8') Note that s_str has no encoding information, and translate is conceptually just a 1:1 substitution minus characters in delch. But if we want to do one-chr:one-unichr substitution by specifying a 256-long table of unicode characters, we cannot. It would be simple to allow it, and that's the tweak I would like. It would allow easy custom decodes. At the moment, if you want to write the above, you have to introduce a phony latin-1 decoding and write it as (not typo-proof) s_str.translate(table, delch).decode('latin-1').encode('utf-8') # use str.translate or s_str.decode('latin-1').translate(mapping).encode('utf-8') # use unicode.translate also for delch to avoid exceptions if you have non-ascii in your s_str (even if delch would have removed them!!)
It seems s_str.translate(table, delchars) wants to convert the s_str to unicode if table is unicode, and then use unicode.translate (which bombs on delchars!) instead of just effectively defining str.translate as def translate(self, table, deletechars=None): return ''.join((table or isinstance(table,unicode) and uidentity or sidentity)[ord(x)] for x in self if not deletechars or x not in deletechars) # For convenience in just pruning with deletechars, s_str.translate('', deletechars) deletes without translating, # and s_str.translate(u'', deletechars) does the same and then maps to same-ord unicode characters # given # sidentity = ''.join(chr(i) for i in xrange(256)) # and # uidentity = u''.join(unichr(i) for i in xrange(256)). IMO, if you want unicode.translate, then it doesn't hurt to write unicode(s_str).translate and use that. Let str.translate just use the str ords, so simple custom decodes can be written without the annoyance of e.g., UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 3: ordinal not in range(128) Can we change this for bytes? And why couldn't we change this for str.translate now? Or what am I missing? I certainly would like to miss the above message for str.translate :-( BTW This would also allow taking advantage of features of both translates if desired, e.g. by s_str.translate(unichartable256, strdelchrs).translate(uniord_to_ustr_or_uniord_mapping). (e.g., the latter permits single to multiple-character substitution) I think at least a tweaked translate method for bytes would be good for py3k, and I hope we can do it for str.translate now. It is just too handy a high speed conversion goodie to forgo IMO.
Regards, Bengt Richter From python at rcn.com Fri Feb 17 05:27:17 2006 From: python at rcn.com (Raymond Hettinger) Date: Thu, 16 Feb 2006 23:27:17 -0500 Subject: [Python-Dev] Proposal: defaultdict References: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com> Message-ID: <002301c6337a$7001f140$b83efea9@RaymondLaptop1> >> Over lunch with Alex Martelli, he proposed that a subclass of dict >> with this behavior (but implemented in C) would be a good addition to >> the language I would like to add something like this to the collections module, but a PEP is probably needed to deal with issues like: * implications of a __getitem__ succeeding while get(value, x) returns x (possibly different from the overall default) * implications of a __getitem__ succeeding while __contains__ would fail * whether to add this to the collections module (I would say yes) * whether to allow default functions as well as default values (so you could instantiate a new default list) * comparing all the existing recipes and third-party modules that have already done this * evaluating its fitness for common use cases (i.e. bags and dict of lists). * lay out a few examples: # bag like behavior dd = collections.default_dict() dd.default(0) for elem in collection: dd[elem] += 1 # setdefault-like behavior dd = collections.default_dict() dd.default(list) # instantiate a new list for empty cells for page_number, page in enumerate(book): for word in page.split(): dd[word].append(word) Raymond From guido at python.org Fri Feb 17 05:44:43 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 16 Feb 2006 20:44:43 -0800 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F4E582.1040201@egenix.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> Message-ID: On 2/16/06, M.-A. 
Lemburg wrote: > What will be the explicit way to open a file in bytes mode > and in text mode (I for one would like to move away from > open() completely as well) ? > > Will we have a single file type with two different modes > or two different types ? I'm currently thinking of an I/O stack somewhat like Java's. At the bottom there's a class that lets you do raw unbuffered reads and writes (and seek/tell) on binary files using bytes arrays. We can layer onto this buffering, text encoding/decoding, and more. (Windows CRLF<->LF conversion is also an encoding of sorts). Years ago I wrote a prototype; checkout sandbox/sio/. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jack at performancedrivers.com Fri Feb 17 05:50:38 2006 From: jack at performancedrivers.com (Jack Diederich) Date: Thu, 16 Feb 2006 23:50:38 -0500 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <1140049757.14818.45.camel@geddy.wooz.org> Message-ID: <20060217045038.GE6100@performancedrivers.com> On Thu, Feb 16, 2006 at 06:13:53PM +0100, Fredrik Lundh wrote: > Barry Warsaw wrote: > > > We know at least there will never be a 2.10, so I think we still have > > time. > > because there's no way to count to 10 if you only have one digit? > > we used to think that back when the gas price was just below 10 SEK/L, > but they found a way... > Of course they found a way. The alternative was cutting taxes. whish-I-was-winking, -Jack From jcarlson at uci.edu Fri Feb 17 06:20:30 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 16 Feb 2006 21:20:30 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] 
In-Reply-To: <43F4493D.8090902@canterbury.ac.nz> References: <20060215212629.5F6D.JCARLSON@uci.edu> <43F4493D.8090902@canterbury.ac.nz> Message-ID: <20060216210827.5F88.JCARLSON@uci.edu> Greg Ewing wrote: > > Josiah Carlson wrote: > > > They may not be encodings of _unicode_ data, > > But if they're not encodings of unicode data, what > business do they have being available through > someunicodestring.encode(...)? I had always presumed that bytes objects are going to be able to be a source for encode AND decode, like current non-unicode strings are able to be today. In that sense, if I have a bytes object which is an encoding of rot13, hex, uu, etc., or I have a bytes object which I would like to be in one of those encodings, I should be able to do b.encode(...) or b.decode(...), given that 'b' is a bytes object. Are 'encodings' going to become a mechanism to encode and decode _unicode_ strings, rather than a mechanism to encode and decode _text and data_ strings? That would seem like a backwards step to me, as the email package would need to package their own base-64 encode/decode API and implementation, and similarly for any other package which uses any one of the encodings already available. - Josiah From steve at holdenweb.com Fri Feb 17 06:43:50 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 17 Feb 2006 00:43:50 -0500 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <20060215223943.GL6027@xs4all.nl> <1140049757.14818.45.camel@geddy.wooz.org> Message-ID: Fredrik Lundh wrote: > Barry Warsaw wrote: > > >>We know at least there will never be a 2.10, so I think we still have >>time. > > > because there's no way to count to 10 if you only have one digit? > > we used to think that back when the gas price was just below 10 SEK/L, > but they found a way... > IIRC Guido is on record as saying "There will be no Python 2.10 because I hate the ambiguity of double-digit minor release numbers", or words to that effect. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From steve at holdenweb.com Fri Feb 17 06:59:25 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 17 Feb 2006 00:59:25 -0500 Subject: [Python-Dev] Test failures in test_timeout In-Reply-To: <20060216174326.GA23859@xs4all.nl> References: <20060216174326.GA23859@xs4all.nl> Message-ID: Thomas Wouters wrote: > I'm seeing spurious test failures in test_timeout, on my own workstation and > on macteagle.python.org (now that it crashes less; Apple sent over some new > memory.) The problem is pretty simple: both macteagle and my workstation > live too closely, network-wise, to www.python.org: > > class TimeoutTestCase(unittest.TestCase): > [...] > def setUp(self): > [...] > self.addr_remote = ('www.python.org', 80) > [...] > def testConnectTimeout(self): > # Test connect() timeout > _timeout = 0.001 > self.sock.settimeout(_timeout) > > _t1 = time.time() > self.failUnlessRaises(socket.error, self.sock.connect, > self.addr_remote) > > In other words, the test fails because www.python.org responds too quickly. > > The test on my workstation only fails occasionally, but I do expect > macteagle's failure to be more common (since it's connected to > www.python.org through (literally) a pair of gigabit switches, whereas my > workstation has to pass through a few more switches, two Junipers and some > dark fiber.) Lowering the timeout has no effect, as far as I can tell, which > is probably a granularity issue. > > I'm thinking that it could probably try to connect to a less reliable > website, but that's just moving the problem around (and possibly harassing > an unsuspecting website, particularly around release-time.) Perhaps the test > should try to connect to a known unconnecting site, like a firewalled port > on www.python.org? Not something that refuses connections, just something > that times out. 
> Couldn't the test use subprocess to start a reliably slow server on localhost? It might even be possible to retrieve the ephemeral port number used by the server, to avoid conflicts with already-used ports on the testing machine. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From steve at holdenweb.com Fri Feb 17 06:55:42 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 17 Feb 2006 00:55:42 -0500 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <43F42425.3070807@acm.org> References: <43F42425.3070807@acm.org> Message-ID: <43F5655E.7060901@holdenweb.com> Talin wrote: > First off, let me apologize for bringing up a topic that I am sure that > everyone is sick of: Lambda. > > I broached this subject to a couple of members of this list privately, > and I got wise feedback on my suggestions which basically amounted to > "don't waste your time." > > However, after having thought about this for several weeks, I came to > the conclusion that I felt so strongly about this issue that the path of > wisdom simply would not do, and I would have to choose the path of > folly. Which I did. > Thereby proving the truth of the old Scottish adage "The better the advice, the worse it's wasted". Not that I haven't been wasted myself a time or two. > In other words, I went ahead and implemented it. Actually, it wasn't too > bad, it only took about an hour of reading the ast.c code and the > Grammar file (neither of which I had ever looked at before) to get the > general sense of what's going on. > > So the general notion is similar to the various proposals on the Wiki - > an inline keyword which serves the function of lambda. I chose the > keyword "given" because it reminds me of math textbooks, e.g. "given x, > solve for y". And I like the idea of syntactical structures that make > sense when you read them aloud. 
> > Here's an interactive console session showing it in action. > > The first example shows a simple closure that returns the square of a > number. > > >>> a = (x*x given x) > >>> a(9) > 81 > > You can also put parens around the argument list if you like: > > >>> a = (x*x given (x)) > >>> a(9) > 81 > > Same thing with two arguments, and with the optional parens: > > >>> a = (x*y given x,y) > >>> a(9, 10) > 90 > >>> a = (x*y given (x,y)) > >>> a(9, 10) > 90 > > Yup, keyword arguments work too: > > >>> a = (x*y given (x=3,y=4)) > >>> a(9, 10) > 90 > >>> a(9) > 36 > >>> a() > 12 > > Use an empty paren-list to indicate that you want to define a closure > with no arguments: > > >>> a = (True given ()) > >>> a() > True > > Note that there are some cases where you have to use the parens around > the arguments to avoid a syntactical ambiguity: > > >>> map( str(x) given x, (1, 2, 3, 4) ) > File "", line 1 > map( str(x) given x, (1, 2, 3, 4) ) > ^ > SyntaxError: invalid syntax > > As you can see, adding the parens makes this work: > > >>> map( str(x) given (x), (1, 2, 3, 4) ) > ['1', '2', '3', '4'] > > More fun with "map": > > >>> map( str(x)*3 given (x), (1, 2, 3, 4) ) > ['111', '222', '333', '444'] > > Here's an example that uses the **args syntax: > > >>> a = (("%s=%s" % pair for pair in kwargs.items()) given **kwargs) > >>> list( a(color="red") ) > ['color=red'] > >>> list( a(color="red", sky="blue") ) > ['color=red', 'sky=blue'] > > I have to say, the more I use it, the more I like it, but I'm sure that > this is just a personal taste issue. It looks a lot more natural to me > than lambda. > > I should also mention that I resisted the temptation to make the 'given' > keyword an optional generator suffix as in "(a for a in l given l). As I > started working with the code, I started to realize that generators and > closures, although they have some aspects in common, are very different > beasts and should not be conflated lightly. 
(Plus the implementation > would have been messy. I took that as a clue :)) > > Anyway, if anyone wants to play around with the patch, it is rather > small - a couple of lines in Grammar, and a small new function in ast.c, > plus a few mods to other functions to get them to call it. The context > diff is less than two printed pages. I can post it somewhere if people > are interested. > > Anyway, I am not going to lobby for a language change or write a PEP > (unless someone asks me to.) I just wanted to throw this out there and > see what people think of it. I definitely don't want to start a flame > war, although I suspect I already have :/ > > Now I can stop thinking about this and go back to my TurboGears-based > Thesaurus editor :) > Whether or not Guido can steel himself to engage in yet another round of this seemingly interminable discussion, at least this proposal has the merit of being concrete and not hypothetical. It appears to hang together, but I'm not sure I see how it overcomes objections to lambda by replacing it with another keyword. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From steve at holdenweb.com Fri Feb 17 07:09:26 2006 From: steve at holdenweb.com (Steve Holden) Date: Fri, 17 Feb 2006 01:09:26 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <20060216212133.GB23859@xs4all.nl> References: <20060216212133.GB23859@xs4all.nl> Message-ID: Thomas Wouters wrote: > On Thu, Feb 16, 2006 at 01:11:49PM -0800, Guido van Rossum wrote: > > >>Over lunch with Alex Martelli, he proposed that a subclass of dict >>with this behavior (but implemented in C) would be a good addition to >>the language. It looks like it wouldn't be hard to implement. It could >>be a builtin named defaultdict. The first, required, argument to the >>constructor should be the default value. Remaining arguments (even >>keyword args) are passed unchanged to the dict constructor.
> > > Should a dict subclass really change the constructor/initializer signature > in an incompatible way? > Dict is a particularly difficult type to subclass anyway, given that it can take an arbitrary number of arbitrarily-named keyword arguments (among many other argument styles). The proposed behavior is exactly how Icon tables behaved, and it was indeed useful in that language. Guido is right about setdefault being a busted flush. If there's no way to resolve the signature issue (which there may not be, given that dict({'one': 2, 'two': 3}) dict({'one': 2, 'two': 3}.items()) dict({'one': 2, 'two': 3}.iteritems()) dict(zip(('one', 'two'), (2, 3))) dict([['two', 3], ['one', 2]]) dict(one=2, two=3) dict([(['one', 'two'][i-2], i) for i in (2, 3)]) are all valid calls to the type) then a factory function would be a very acceptable substitute, no? (The function could make use of a subclass - there's surely no necessity to provide the default as an initializer argument: it could be provided as an argument to a method present only in the subclass). wishing-i-could-have-lunch-with-alex-ly y'rs - steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From stephen at xemacs.org Fri Feb 17 07:11:12 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 17 Feb 2006 15:11:12 +0900 Subject: [Python-Dev] bytes type discussion In-Reply-To: (Guido van Rossum's message of "Wed, 15 Feb 2006 12:33:10 -0800") References: <20060215002446.GD6027@xs4all.nl> <43F2A13F.4030604@canterbury.ac.nz> <200602142323.45930.fdrake@acm.org> <43F2CDC4.4060700@canterbury.ac.nz> Message-ID: <871wy2wm4f.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Guido" == Guido van Rossum writes: Guido> I think that the implementation of encoding-guessing or Guido> auto-encoding-upgrade techniques should be left out of the Guido> standard library design for now. As far as I can see, little new design is needed. 
There's no reason why an encoding-guesser couldn't be written as a codec that detects the coding, then dispatches to the appropriate codec. The only real issue I know of is that if you ask such a codec "who are you?", there are two plausible answers: "autoguess" and the codec actually being used to translate the stream. If there's no API to ask for both of those, the API might want generalization. Guido> As far as searching bytes objects, that shouldn't be a Guido> problem as long as the search 'string' is also specified as Guido> a bytes object. You do need to be a little careful in implementation, as (for example) "case insensitive" should be meaningless for searching bytes objects. This would be especially important if searching and collation become more Unicode conformant. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From bokr at oz.net Fri Feb 17 07:24:57 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 06:24:57 GMT Subject: [Python-Dev] Pre-PEP: The "bytes" object References: <20060216025515.GA474@mems-exchange.org> Message-ID: <43f53a07.996743761@news.gmane.org> On Thu, 16 Feb 2006 12:47:22 -0800, Guido van Rossum wrote: >On 2/15/06, Neil Schemenauer wrote: >> This could be a replacement for PEP 332. At least I hope it can >> serve to summarize the previous discussion and help focus on the >> currently undecided issues. >> >> I'm too tired to dig up the rules for assigning it a PEP number. >> Also, there are probably silly typos, etc. Sorry. > >I may check it in for you, although right now it would be good if we >had some more feedback. 
> >I noticed one behavior in your pseudo-code constructor that seems >questionable: while in the Q&A section you explain why the encoding is >ignored when the argument is a str instance, in fact you require an >encoding (and one that's not "ascii") if the str instance contains any >non-ASCII bytes. So bytes("\xff") would fail, but bytes("\xff", >"blah") would succeed. I think that's a bit strange -- if you ignore >the encoding, you should always ignore it. So IMO bytes("\xff") and >bytes("\xff", "ascii") should both return the same as bytes([255]). >Also, there's a code path where the initializer is a unicode instance >and its encode() method is called with None as the argument. I think >both could be fixed by setting the encoding to >sys.getdefaultencoding() if it is None and the argument is a unicode >instance: > > def bytes(initialiser=[], encoding=None): > if isinstance(initialiser, basestring): > if isinstance(initialiser, unicode): > if encoding is None: > encoding = sys.getdefaultencoding() > initialiser = initialiser.encode(encoding) > initialiser = [ord(c) for c in initialiser] > elif encoding is not None: > raise TypeError("explicit encoding invalid for non-string " > "initialiser") > create bytes object and fill with integers from initialiser > return bytes object Two things: [1]-------- As the above shows, str is encoding-agnostic and passes through unmodified to bytes (except by ord). I am wondering what it would hurt to allow the same for unicode ords, since unicode is also encoding-agnostic. Please read [2] before deciding that you have already decided this ;-) The beauty of a unicode literal IMO is that it launders away the source encoding into a coding-agnostic character sequence that has stable ords across the universe, so why not use them? It also solves a lot of escaping grief. But see [2] After all, in either case, an encoding can be specified if so desired.
Thus

    def bytes(initialiser=[], encoding=None):
        if isinstance(initialiser, basestring):
            if encoding:
                initialiser = initialiser.encode(encoding)  # XXX for str ?? see [2]
            initialiser = [ord(c) for c in initialiser]
        elif encoding is not None:
            raise TypeError("explicit encoding invalid for non-string "
                            "initialiser")
        create bytes object and fill with integers from initialiser
        return bytes object

[2]-------
One thing I wonder is where sys.getdefaultencoding() gets its info, and
whether a module_encoding is also necessary for str arguments with
encoding. E.g. if the source encoding is utf-8, and you want
sys.getdefaultencoding() finally, don't you first have to decode from
the source encoding, rather than let the default decoding assumption
for that be ascii? E.g. for utf-8 source,

    initialiser.decode('utf-8').encode(sys.getdefaultencoding())

works, but

    initialiser.encode(sys.getdefaultencoding())

bombs, because it tries to do .decode('ascii') in place of
.decode('utf-8').

Notice where the following fails (where utf-8 source is written to
tutf8.py by tutf.py and using latin-1 as a standin for
sys.getdefaultencoding()):

----< tutf.py >-------------------------------------------
def test():
    latin_1_src = """\
# -*- coding: utf-8 -*-
print '\\nfrom tutf8 import:'
print map(hex,map(ord, 'abc\xf6'))
print map(hex,map(ord,'abc\xf6'.decode('utf-8').encode('latin-1')))
print map(hex,map(ord,repr('abc\xf6'.encode('latin-1'))))
"""
    open('tutf8.py','wb').write(latin_1_src.decode('latin-1').encode('utf-8'))

if __name__ == '__main__':
    test()
    print '\ntutf8.py utf-8 binary line reprs:'
    print '\n'.join(repr(L) for L in open('tutf8.py','rb').read().splitlines())
    import tutf8
----------------------------------------------------------

The result:

[20:17] C:\pywk\pydev\pep0332>py24 tutf.py

tutf8.py utf-8 binary line reprs:
'# -*- coding: utf-8 -*-'
"print '\\nfrom tutf8 import:'"
"print map(hex,map(ord, 'abc\xc3\xb6'))"
"print map(hex,map(ord,'abc\xc3\xb6'.decode('utf-8').encode('latin-1')))"
"print map(hex,map(ord,repr('abc\xc3\xb6'.encode('latin-1'))))"

from tutf8 import:
['0x61', '0x62', '0x63', '0xc3', '0xb6']
['0x61', '0x62', '0x63', '0xf6']
Traceback (most recent call last):
  File "tutf.py", line 15, in ?
    import tutf8
  File "C:\pywk\pydev\pep0332\tutf8.py", line 5, in ?
    print map(hex,map(ord,repr('abc+¦'.encode('latin-1'))))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3:
ordinal not in range(128)

I.e., if you leave out encoding for a str, you apparently get the
native source str representation of the literal, so it would seem that
that must be undone if you want to re-encode to anything else. Should
there be tutf8.__encoding__ available for this after import tutf8?

But that's interesting when str becomes unicode, and all literals will
presumably have an internal uniform unicode encoding, so the
'literal'.decode(source_encoding) will in effect already have been
done. What does a decode mean on unicode? It seems to mean blow up on
non-ascii, so that's not very portable. Why not use latin-1 as the
default intermediate str representation when doing a
u'something'.decode(enc)? The restriction to ascii in that context
seems artificial.

IMHO and with all due respect ISTM the pain of all these considerations
is not worth it when the simple practicality of just prefixing a "u" on
any ascii literal freely sprinkled with escapes gets you exactly the
byte values you specify in any hex escapes. That's normally what you
want. If by 'abc\xf6' you really mean the character with ord value 0xf6
in some encoding, then bytes('abc\xf6'.decode(someenc), destenc) would
be the way, so no one is stuck.
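Rendered in today's Python, where str is the unicode type Bengt is arguing from, the constructor behaviour under discussion can be sketched as a plain function. The name make_bytes and its details are illustrative only — this mirrors the pseudo-code above, not the bytes type that eventually shipped:

```python
import sys

def make_bytes(initialiser=(), encoding=None):
    # Strings are encoded first (falling back to the default encoding),
    # then reduced to a sequence of byte values; a non-string
    # initialiser must not carry an encoding, as in the pseudo-code.
    if isinstance(initialiser, str):
        if encoding is None:
            encoding = sys.getdefaultencoding()
        initialiser = initialiser.encode(encoding)
    elif encoding is not None:
        raise TypeError("explicit encoding invalid for non-string "
                        "initialiser")
    return bytes(initialiser)
```

With latin-1, make_bytes('abc\xf6', 'latin-1') gives the four byte values 0x61 0x62 0x63 0xf6 that the tutf.py experiment prints.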
One danger is that someone is writing in an incomplete source character
set and wants to stick in some byte values in hex, happily sticking to
the ascii subset plus escapes, but a decode from the source encoding
can fail on a non-existent character if the "ascii escape" is not in
the source character set. E.g., cp1252 is pretty complete, but

    >>> '\x81'.decode('cp1252')
    Traceback (most recent call last):
      File "", line 1, in ?
      File "d:\python-2.4b1\lib\encodings\cp1252.py", line 22, in decode
        return codecs.charmap_decode(input,errors,decoding_map)
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in
    position 0: character maps to

This can't happen with the same literal of ascii plus escapes passed as
a unicode literal, given that map(ord, literal) is done on it to get
bytes when no encoding is specified. You just get what you expect. It
seems practical to me. I'm really trying to help, not piss you off ;-)

BTW, I recently posted re str.translate vs unicode.translate, which has
some tie-in with this, since I anticipate that bytes.translate would be
a useful thing in the absence of str.translate. unicode.translate won't
do all one might like to do with bytes.translate, I believe. Both have
uses.

>
>BTW, for folks who want to experiment, it's quite simple to create a
>working bytes implementation by inheriting from array.array. Here's a
>quick draft (which only takes str instance arguments):
>
>    from array import array
>    class bytes(array):
>        def __new__(cls, data=None):
>            b = array.__new__(cls, "B")
>            if data is not None:
>                b.fromstring(data)
>            return b
>        def __str__(self):
>            return self.tostring()
>        def __repr__(self):
>            return "bytes(%s)" % repr(list(self))
>        def __add__(self, other):
>            if isinstance(other, array):
>                return bytes(super(bytes, self).__add__(other))
>            return NotImplemented
>

Cool, thanks.

Regards,
Bengt Richter

From stephen at xemacs.org Fri Feb 17 07:40:53 2006
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 17 Feb 2006 15:40:53 +0900
Subject: [Python-Dev] bytes.from_hex()
In-Reply-To: (Guido van Rossum's message of "Wed, 15 Feb 2006 11:16:51 -0800")
References: 
Message-ID: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Guido" == Guido van Rossum writes:

    Guido> I'd say there are two "symmetric" API flavors possible (t
    Guido> and b are text and bytes objects, respectively, where text
    Guido> is a string type, either str or unicode; enc is an encoding
    Guido> name):

    Guido> - b.decode(enc) -> t; t.encode(enc) -> b

-0  When taking a binary file and attaching it to the text of a mail
message using BASE64, the tendency to say you're "encoding the file in
BASE64" is very strong. I just don't see how such usages can be
avoided in discussion, which makes the types of decode and encode hard
to remember, and easy to mistake in some contexts.

    Guido> - b = bytes(t, enc); t = text(b, enc)

+1  The coding conversion operation has always felt like a constructor
to me, and in this particular usage that's exactly what it is. I
prefer the nomenclature to reflect that.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business; ask what your business
can "do for" free software.

From stephen at xemacs.org Fri Feb 17 07:43:48 2006
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 17 Feb 2006 15:43:48 +0900
Subject: [Python-Dev] bdist_* to stdlib?
In-Reply-To: (Bob Ippolito's message of "Wed, 15 Feb 2006 11:23:22 -0800")
References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain>
Message-ID: <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Bob" == Bob Ippolito writes:

    Bob> Huh? What does that have to do with anything? I've never
    Bob> seen a system where /usr/include, /usr/lib, /usr/bin,
    Bob> etc. are not all on the same mount. It's not really any
    Bob> different with OS X either.
/usr/share often is on a different mount; that's the whole rationale for /usr/share. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From martin at v.loewis.de Fri Feb 17 08:09:23 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 08:09:23 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F576A3.1030604@v.loewis.de> Guido van Rossum wrote: > Feedback? I would like this to be part of the standard dictionary type, rather than being a subtype. d.setdefault([]) (one argument) should install a default value, and d.cleardefault() should remove that setting; d.default should be read-only. Alternatively, d.default could be assignable and del-able. Also, I think has_key/in should return True if there is a default. Regards, Martin From martin at v.loewis.de Fri Feb 17 08:13:04 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 08:13:04 +0100 Subject: [Python-Dev] Does eval() leak? In-Reply-To: <43F4A88A.7050100@ec.gc.ca> References: <43F4A88A.7050100@ec.gc.ca> Message-ID: <43F57780.5050300@v.loewis.de> John Marshall wrote: > Should I expect the virtual memory allocation > to go up if I do the following? python-dev is a list for discussing development of Python, not the development with Python. Please post this question to python-list at python.org. For python-dev, a message explaining where the memory leak is and how to correct it would be more appropriate. Most likely, there is no memory leak in eval. 
Regards,
Martin

From bokr at oz.net Fri Feb 17 08:41:37 2006
From: bokr at oz.net (Bengt Richter)
Date: Fri, 17 Feb 2006 07:41:37 GMT
Subject: [Python-Dev] Proposal: defaultdict
References: 
Message-ID: <43f57a43.1013187726@news.gmane.org>

On Thu, 16 Feb 2006 13:11:49 -0800, Guido van Rossum wrote:

>A bunch of Googlers were discussing the best way of doing the
>following (a common idiom when maintaining a dict of lists of values
>relating to a key, sometimes called a multimap):
>
>    if key not in d: d[key] = []
>    d[key].append(value)
>
>An alternative way to spell this uses setdefault(), but it's not very readable:
>
>    d.setdefault(key, []).append(value)
>
>and it also suffers from creating an unnecessary list instance.
>(Timings were inconclusive; the approaches are within 5-10% of each
>other in speed.)
>
>My conclusion is that setdefault() is a failure -- it was a
>well-intentioned construct, but doesn't actually create more readable
>code.
>
>Google has an internal data type called a DefaultDict which gets
>passed a default value upon construction. Its __getitem__ method,
>instead of raising KeyError, inserts a shallow copy (!) of the given
>default value into the dict when the value is not found. So the above
>code, after
>
>    d = DefaultDict([])
>
>can be written as simply
>
>    d[key].append(value)
>

Wouldn't it be more generally powerful to pass a type or factory
function to use to instantiate a default object when a missing key is
encountered, e.g.

    d = DefaultDict(list)

then

    d[key].append(value)

but then you can also do

    d = DefaultDict(dict)
    d[key].update(a=123)

or

    class Foo(object): pass
    d = DefaultDict(Foo)
    d[key].phone = '415-555-1212'

etc. No worries about generalizing shallow copying either ;-)

>Note that of all the possible semantics for __getitem__ that could
>have produced similar results (e.g. not inserting the default in the
>underlying dict, or not copying the default value), the chosen
>semantics are the only ones that makes this example work.
>
>Over lunch with Alex Martelli, he proposed that a subclass of dict
>with this behavior (but implemented in C) would be a good addition to
>the language. It looks like it wouldn't be hard to implement. It could
>be a builtin named defaultdict. The first, required, argument to the
>constructor should be the default value. Remaining arguments (even
>keyword args) are passed unchanged to the dict constructor.
>
>Some more design subtleties:
>
>- "key in d" still returns False if the key isn't there
>- "d.get(key)" still returns None if the key isn't there
>- "d.default" should be a read-only attribute giving the default value
>
>Feedback?
>

See above.

Regards,
Bengt Richter

From lists at hlabs.spb.ru Fri Feb 17 08:27:47 2006
From: lists at hlabs.spb.ru (Dmitry Vasiliev)
Date: Fri, 17 Feb 2006 10:27:47 +0300
Subject: [Python-Dev] Test failures in test_timeout
In-Reply-To: 
References: <20060216174326.GA23859@xs4all.nl>
Message-ID: <43F57AF3.6060600@hlabs.spb.ru>

Steve Holden wrote:
> Thomas Wouters wrote:
>> I'm thinking that it could probably try to connect to a less reliable
>> website, but that's just moving the problem around (and possibly harassing
>> an unsuspecting website, particularly around release-time.) Perhaps the test
>> should try to connect to a known unconnecting site, like a firewalled port
>> on www.python.org? Not something that refuses connections, just something
>> that times out.
>>
> Couldn't the test use subprocess to start a reliably slow server on
> localhost? It might even be possible to retrieve the ephemeral port
> number used by the server, to avoid conflicts with already-used ports on
> the testing machine.

About 3 years ago I submitted a patch for test_timeout which fixed some
of these issues:

http://sourceforge.net/tracker/index.php?func=detail&aid=728815&group_id=5470&atid=305470

Now I think the patch needs more review and needs to be updated for the
current Python version and maybe some new ideas.
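Steve's suggestion works without even a subprocess: a listener on an ephemeral localhost port that never answers lets a client connect (the kernel completes the handshake via the listen backlog) and then time out deterministically. A minimal sketch along those lines — hypothetical helper code, not the actual test_timeout test:

```python
import socket

def start_silent_server():
    # Bind to port 0 so the OS picks a free ephemeral port (no conflicts
    # with ports already in use), listen, but never accept or reply.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv, srv.getsockname()[1]

srv, port = start_silent_server()
client = socket.create_connection(("127.0.0.1", port), timeout=1.0)
client.settimeout(0.25)
try:
    client.recv(1)          # no data will ever arrive
    timed_out = False
except socket.timeout:
    timed_out = True
finally:
    client.close()
    srv.close()
```

Because nothing ever sends data, the recv() reliably hits the 0.25 s timeout, with no dependence on an external host.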
-- 
Dmitry Vasiliev (dima at hlabs.spb.ru)
http://hlabs.spb.ru

From bob at redivi.com Fri Feb 17 10:07:02 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 17 Feb 2006 01:07:02 -0800
Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available
In-Reply-To: <43F4D3FE.4040905@benjiyork.com>
References: <43F2FE65.5040308@pollenation.net> <19igus5puu6e2$.dlg@usenet.alexanderweb.de> <43F4D3FE.4040905@benjiyork.com>
Message-ID: <0B6E419A-90F3-4E9D-8548-B4E317D907AF@redivi.com>

On Feb 16, 2006, at 11:35 AM, Benji York wrote:

> Alexander Schremmer wrote:
>> In fact, PHP does it like php.net/functionname which is even
>> shorter, i.e.
>> they fallback to the documentation if that path does not exist
>> otherwise.
>
> Like many things PHP, that seems a bit too magical for my tastes.

Not only does it fall back to documentation, it falls back to a
search for documentation if there isn't a function of that name.

It's a convenient feature, I'm sure people would use it if it was
there... even if it was something like http://python.org/doc/name

-bob

From g.brandl at gmx.net Fri Feb 17 10:10:32 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 17 Feb 2006 10:10:32 +0100
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To: 
References: 
Message-ID: 

Guido van Rossum wrote:

>    d = DefaultDict([])
>
> can be written as simply
>
>    d[key].append(value)

> Feedback?

Probably a good idea, has been proposed multiple times on clpy.
One good thing would be to be able to specify either a default value
or a factory function.

While at it, other interesting dict subclasses could be:
* sorteddict, practically reinvented by every larger project
* keytransformdict, such as d = keytransformdict(str.lower).
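The keytransformdict idea is small enough to sketch directly. This hypothetical version (no such class ever landed in the stdlib, and the modern zero-argument super() is assumed) also covers the case-insensitive lookup use case raised elsewhere in the thread:

```python
class keytransformdict(dict):
    # Every key is passed through a transform function both when
    # storing and when looking up, so d["A"] and d["a"] coincide
    # for transform=str.lower.
    def __init__(self, transform, *args, **kwargs):
        super().__init__()
        self._transform = transform
        for k, v in dict(*args, **kwargs).items():
            self[k] = v

    def __setitem__(self, key, value):
        super().__setitem__(self._transform(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._transform(key))

    def __contains__(self, key):
        return super().__contains__(self._transform(key))

d = keytransformdict(str.lower)
d["Content-Type"] = "text/plain"
```

A full implementation would also route get(), __delitem__(), and update() through the transform; the sketch only shows the idea.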
Georg

From walter at livinglogic.de Fri Feb 17 10:31:25 2006
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 17 Feb 2006 10:31:25 +0100
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To: 
References: 
Message-ID: <43F597ED.7060308@livinglogic.de>

Guido van Rossum wrote:
> A bunch of Googlers were discussing the best way of doing the
> following (a common idiom when maintaining a dict of lists of values
> relating to a key, sometimes called a multimap):
>
>    if key not in d: d[key] = []
>    d[key].append(value)
>
> An alternative way to spell this uses setdefault(), but it's not very readable:
>
>    d.setdefault(key, []).append(value)
>
> and it also suffers from creating an unnecessary list instance.
> (Timings were inconclusive; the approaches are within 5-10% of each
> other in speed.)
>
> My conclusion is that setdefault() is a failure -- it was a
> well-intentioned construct, but doesn't actually create more readable
> code.
>
> Google has an internal data type called a DefaultDict which gets
> passed a default value upon construction. Its __getitem__ method,
> instead of raising KeyError, inserts a shallow copy (!) of the given
> default value into the dict when the value is not found. So the above
> code, after
>
>    d = DefaultDict([])
>
> can be written as simply
>
>    d[key].append(value)

Using a shallow copy of the default seems a bit too magical to me. How
would this be done? Via copy.copy? And passing [] to the constructor of
dict has a different meaning already. Fetching the default via a
static/class method would solve both problems:

    class default_dict(dict):
        def __getitem__(self, key):
            if key in self:
                return dict.__getitem__(self, key)
            else:
                default = self.getdefault()
                self[key] = default
                return default

    class multi_map(default_dict):
        @staticmethod
        def getdefault():
            return []

    class counting_dict(default_dict):
        @staticmethod
        def getdefault():
            return 0

> [...]
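The factory-function variant that several posters prefer can be sketched in a few lines. This is illustrative only — the collections.defaultdict that Python 2.5 eventually grew takes the same factory-style argument but hooks __missing__ rather than overriding __getitem__:

```python
class DefaultDict(dict):
    """Sketch: a dict whose __getitem__ inserts factory() on a miss."""
    def __init__(self, factory, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.factory = factory

    def __getitem__(self, key):
        if key not in self:
            # Each missing key gets a fresh object from the factory,
            # so there is no shallow-copy magic to worry about.
            self[key] = self.factory()
        return dict.__getitem__(self, key)

d = DefaultDict(list)
d["key"].append(1)
d["key"].append(2)
```

Note that "key in d" and d.get(key) keep their ordinary meanings, matching the design subtleties listed earlier in the thread.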
Bye,
Walter Dörwald

From theller at python.net Fri Feb 17 10:42:40 2006
From: theller at python.net (Thomas Heller)
Date: Fri, 17 Feb 2006 10:42:40 +0100
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To: 
References: 
Message-ID: 

> Guido van Rossum wrote:
>
>>    d = DefaultDict([])
>>
>> can be written as simply
>>
>>    d[key].append(value)
>
>> Feedback?
>

Ok, setdefault is a horrible name. Would it make sense to come up with
a better name?

Georg Brandl wrote:
> Probably a good idea, has been proposed multiple times on clpy.
> One good thing would be to be able to specify either a default value
> or a factory function.
>
> While at it, other interesting dict subclasses could be:
> * sorteddict, practically reinvented by every larger project

You mean ordereddict, not sorteddict, I hope.

> * keytransformdict, such as d = keytransformdict(str.lower).

Not sure what you mean by that.

What *I* would like is probably more ambitious: I want a dict that
allows case-insensitive lookup of string keys, plus ideally I want to
use it as class or instance dictionary. Use case: COM wrappers.

Thomas

From bob at redivi.com Fri Feb 17 10:50:15 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 17 Feb 2006 01:50:15 -0800
Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
In-Reply-To: <20060216210827.5F88.JCARLSON@uci.edu>
References: <20060215212629.5F6D.JCARLSON@uci.edu> <43F4493D.8090902@canterbury.ac.nz> <20060216210827.5F88.JCARLSON@uci.edu>
Message-ID: <44D7B752-6D88-4513-A651-771073741AE2@redivi.com>

On Feb 16, 2006, at 9:20 PM, Josiah Carlson wrote:
>
> Greg Ewing wrote:
>>
>> Josiah Carlson wrote:
>>
>>> They may not be encodings of _unicode_ data,
>>
>> But if they're not encodings of unicode data, what
>> business do they have being available through
>> someunicodestring.encode(...)?
> > I had always presumed that bytes objects are going to be able to be a > source for encode AND decode, like current non-unicode strings are > able > to be today. In that sense, if I have a bytes object which is an > encoding of rot13, hex, uu, etc., or I have a bytes object which I > would > like to be in one of those encodings, I should be able to do > b.encode(...) > or b.decode(...), given that 'b' is a bytes object. > > Are 'encodings' going to become a mechanism to encode and decode > _unicode_ strings, rather than a mechanism to encode and decode _text > and data_ strings? That would seem like a backwards step to me, as > the > email package would need to package their own base-64 encode/decode > API > and implementation, and similarly for any other package which uses any > one of the encodings already available. It would be VERY useful to separate the two concepts. bytes<->bytes transforms should be one function pair, and bytes<->text transforms should be another. The current situation is totally insane: str.decode(codec) -> str or unicode or UnicodeDecodeError or ZlibError or TypeError.. who knows what else str.encode(codec) -> str or unicode or UnicodeDecodeError or TypeError... probably other exceptions Granted, unicode.encode(codec) and unicode.decode(codec) are actually somewhat sane in that the return type is always a str and the exceptions are either UnicodeEncodeError or UnicodeDecodeError. I think that rot13 is the only conceptually text<->text transform (though the current implementation is really bytes<->bytes), everything else is either bytes<->text or bytes<->bytes. 
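Bob's taxonomy maps cleanly onto the modern codecs module, where the byte-oriented transforms survive under explicit codec names. The spellings below are today's registry names, assumed here rather than taken from the 2006 stdlib:

```python
import codecs

hexed = codecs.encode(b"abc", "hex_codec")   # bytes -> bytes transform
rotted = codecs.encode("abc", "rot_13")      # text -> text transform
encoded = "abc\xf6".encode("latin-1")        # text -> bytes (a real character encoding)
decoded = encoded.decode("latin-1")          # bytes -> text
```

Keeping the four cases behind distinct call shapes is exactly the separation argued for here: str.encode/bytes.decode handle only character encodings, while the named-codec transforms stay explicit.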
-bob From g.brandl at gmx.net Fri Feb 17 10:56:20 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 10:56:20 +0100 Subject: [Python-Dev] http://www.python.org/dev/doc/devel still available In-Reply-To: <0B6E419A-90F3-4E9D-8548-B4E317D907AF@redivi.com> References: <43F2FE65.5040308@pollenation.net> <19igus5puu6e2$.dlg@usenet.alexanderweb.de> <43F4D3FE.4040905@benjiyork.com> <0B6E419A-90F3-4E9D-8548-B4E317D907AF@redivi.com> Message-ID: Bob Ippolito wrote: > On Feb 16, 2006, at 11:35 AM, Benji York wrote: > >> Alexander Schremmer wrote: >>> In fact, PHP does it like php.net/functionname which is even >>> shorter, i.e. >>> they fallback to the documentation if that path does not exist >>> otherwise. >> >> Like many things PHP, that seems a bit too magical for my tastes. > > Not only does it fall back to documentation, it falls back to a > search for documentation if there isn't a function of that name. > > It's a convenient feature, I'm sure people would use it if it was > there... even if it was something like http://python.org/doc/name Yes. Either that or docs.python.org/... would be nice. (alongside with the "custom markers" I proposed one time so that there can be "speaking" URLs like docs.python.org/mutable-default-arguments ) Georg From g.brandl at gmx.net Fri Feb 17 11:00:26 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 11:00:26 +0100 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods In-Reply-To: References: Message-ID: Bernhard Herzog wrote: > "Travis E. Oliphant" writes: > >> 2) The __index__ special method will have the signature >> >> def __index__(self): >> return obj >> >> Where obj must be either an int or a long or another object >> that has the __index__ special method (but not self). > > So int objects will not have an __index__ method (assuming that ints > won't return a different but equal int object). 
However: > >> 4) A new operator.index(obj) function will be added that calls >> equivalent of obj.__index__() and raises an error if obj does not >> implement the special method. > > So operator.index(1) will raise an exception. I would expect > operator.index to be implemented using PyNumber_index. I'd expect that __index__ won't be called on an int in the first place. Georg From ncoghlan at gmail.com Fri Feb 17 11:37:57 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2006 20:37:57 +1000 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods In-Reply-To: References: Message-ID: <43F5A785.9080104@gmail.com> Georg Brandl wrote: > Bernhard Herzog wrote: >> "Travis E. Oliphant" writes: >> >>> 2) The __index__ special method will have the signature >>> >>> def __index__(self): >>> return obj >>> >>> Where obj must be either an int or a long or another object >>> that has the __index__ special method (but not self). >> So int objects will not have an __index__ method (assuming that ints >> won't return a different but equal int object). However: >> >>> 4) A new operator.index(obj) function will be added that calls >>> equivalent of obj.__index__() and raises an error if obj does not >>> implement the special method. >> So operator.index(1) will raise an exception. I would expect >> operator.index to be implemented using PyNumber_index. > > I'd expect that __index__ won't be called on an int in the first place. The PEP has been updated to cover adding the __index__ slot to int/long so that "one check finds all". The slot will just get bypassed for ints and longs by a lot of the C code in the interpreter. Cheers, Nick. 
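The protocol under discussion is easy to exercise with a toy class; Idx below is illustrative only, not an example from the PEP:

```python
import operator

class Idx:
    # An object usable anywhere a sequence index is expected:
    # __index__ must return a true int.
    def __init__(self, n):
        self.n = n

    def __index__(self):
        return self.n

seq = list(range(10))
value = seq[Idx(3)]              # the slot makes indexing/slicing work
via_operator = operator.index(Idx(3))
```

operator.index(5) simply returns 5, matching Nick's point that int itself grows the slot so that "one check finds all".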
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Fri Feb 17 11:55:36 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 11:55:36 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: Thomas Heller wrote: >> Probably a good idea, has been proposed multiple times on clpy. >> One good thing would be to be able to specify either a default value >> or a factory function. >> >> While at it, other interesting dict subclasses could be: >> * sorteddict, practically reinvented by every larger project > > You mean ordereddict, not sorteddict, I hope. Well, yes. >> * keytransformdict, such as d = keytransformdict(str.lower). > > Not sure what you mean by that. > > What *I* would like is probably more ambitious: I want a dict that allows case-insensitive > lookup of string keys This is exactly what this would do. All keys are transformed to lowercase when setting and looking up. > plus ideally I want to use it as class or instance dictionary. > Use case: COM wrappers. regards, Georg From mwh at python.net Fri Feb 17 11:57:25 2006 From: mwh at python.net (Michael Hudson) Date: Fri, 17 Feb 2006 10:57:25 +0000 Subject: [Python-Dev] Rename str/unicode to text In-Reply-To: (Guido van Rossum's message of "Thu, 16 Feb 2006 10:50:08 -0800") References: Message-ID: <2mu0ay9rsa.fsf@starship.python.net> Guido van Rossum writes: > OTOH, even if we didn't rename str/unicode to text, opentext would > still be a good name for the function that opens a text file. Hnnrgh, not really. You're not opening a 'text', nor are you constructing something that might reasonably be called an 'opentext'. textfile() seems better. Cheers, mwh -- Q: What are 1000 lawyers at the bottom of the ocean? A: A good start. (A lawyer told me this joke.) 
-- Michael Ströder, comp.lang.python

From p.f.moore at gmail.com Fri Feb 17 12:04:17 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 17 Feb 2006 11:04:17 +0000
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To: <002301c6337a$7001f140$b83efea9@RaymondLaptop1>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com> <002301c6337a$7001f140$b83efea9@RaymondLaptop1>
Message-ID: <79990c6b0602170304t54fd54d9tf99a871d5219ef17@mail.gmail.com>

On 2/17/06, Raymond Hettinger wrote:
> >> Over lunch with Alex Martelli, he proposed that a subclass of dict
> >> with this behavior (but implemented in C) would be a good addition to
> >> the language
>
> I would like to add something like this to the collections module,

+1

> but a PEP is probably needed to deal with issues like:

+0 (You're probably right, but I fear there's no "perfect answer", so
discussions could go round in circles...)

Paul.

From fuzzyman at voidspace.org.uk Fri Feb 17 12:02:05 2006
From: fuzzyman at voidspace.org.uk (Fuzzyman)
Date: Fri, 17 Feb 2006 11:02:05 +0000
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To: <43F576A3.1030604@v.loewis.de>
References: <43F576A3.1030604@v.loewis.de>
Message-ID: <43F5AD2D.1000205@voidspace.org.uk>

Martin v. Löwis wrote:
> Guido van Rossum wrote:
>
>> Feedback?
>>
>
> I would like this to be part of the standard dictionary type,
> rather than being a subtype.
>
> d.setdefault([]) (one argument) should install a default value,
> and d.cleardefault() should remove that setting; d.default
> should be read-only. Alternatively, d.default could be assignable
> and del-able.
>
> Also, I think has_key/in should return True if there is a default.
>
>

And exactly what use would it then be ?
Michael Foord > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060217/d2b578b4/attachment.htm From fredrik at pythonware.com Fri Feb 17 12:14:27 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 12:14:27 +0100 Subject: [Python-Dev] Proposal: defaultdict References: Message-ID: Guido van Rossum wrote: > A bunch of Googlers were discussing the best way of doing the > following (a common idiom when maintaining a dict of lists of values > relating to a key, sometimes called a multimap): > > if key not in d: d[key] = [] > d[key].append(value) /.../ > Feedback? +1. check it in, already (as collections.defaultdict, perhaps?) alternatively, you could specialize even further: collections.multimap, which deals with list values only (that shallow copy thing feels a bit questionable, but all alternatives feel slightly overgeneralized...) From fredrik at pythonware.com Fri Feb 17 12:23:44 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 12:23:44 +0100 Subject: [Python-Dev] Rename str/unicode to text References: <2mu0ay9rsa.fsf@starship.python.net> Message-ID: Michael Hudson wrote: > > OTOH, even if we didn't rename str/unicode to text, opentext would > > still be a good name for the function that opens a text file. > > Hnnrgh, not really. You're not opening a 'text', nor are you > constructing something that might reasonably be called an 'opentext'. > textfile() seems better. except that in Python, file is a type, and open is an action. but I agree that textfile reads better (haven't we been through this a couple of times already, btw? 
iirc, my original textfile proposal was posted in 1846, or so) From fredrik at pythonware.com Fri Feb 17 12:39:17 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 12:39:17 +0100 Subject: [Python-Dev] Proposal: defaultdict References: <43F576A3.1030604@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Also, I think has_key/in should return True if there is a default. and keys should return all possible key values! From g.brandl at gmx.net Fri Feb 17 13:00:29 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 13:00:29 +0100 Subject: [Python-Dev] Deprecate ``multifile``? Message-ID: Hi, as Jim Jewett noted, multifile is supplanted by email as much as mimify etc. but it is not marked as deprecated. Should it be deprecated in 2.5? Georg From mal at egenix.com Fri Feb 17 13:03:29 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 17 Feb 2006 13:03:29 +0100 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: References: <20060216025515.GA474@mems-exchange.org> Message-ID: <43F5BB91.3060609@egenix.com> Guido van Rossum wrote: > On 2/15/06, Neil Schemenauer wrote: >> This could be a replacement for PEP 332. At least I hope it can >> serve to summarize the previous discussion and help focus on the >> currently undecided issues. >> >> I'm too tired to dig up the rules for assigning it a PEP number. >> Also, there are probably silly typos, etc. Sorry. > > I may check it in for you, although right now it would be good if we > had some more feedback. > > I noticed one behavior in your pseudo-code constructor that seems > questionable: while in the Q&A section you explain why the encoding is > ignored when the argument is a str instance, in fact you require an > encoding (and one that's not "ascii") if the str instance contains any > non-ASCII bytes. So bytes("\xff") would fail, but bytes("\xff", > "blah") would succeed. I think that's a bit strange -- if you ignore > the encoding, you should always ignore it. 
So IMO bytes("\xff") and > bytes("\xff", "ascii") should both return the same as bytes([255]). > Also, there's a code path where the initializer is a unicode instance > and its encode() method is called with None as the argument. I think > both could be fixed by setting the encoding to > sys.getdefaultencoding() if it is None and the argument is a unicode > instance: > > def bytes(initialiser=[], encoding=None): > if isinstance(initialiser, basestring): > if isinstance(initialiser, unicode): > if encoding is None: > encoding = sys.getdefaultencoding() > initialiser = initialiser.encode(encoding) > initialiser = [ord(c) for c in initialiser] > elif encoding is not None: > raise TypeError("explicit encoding invalid for non-string " > "initialiser") > create bytes object and fill with integers from initialiser > return bytes object > > BTW, for folks who want to experiment, it's quite simple to create a > working bytes implementation by inheriting from array.array. Here's a > quick draft (which only takes str instance arguments): > > from array import array > class bytes(array): > def __new__(cls, data=None): > b = array.__new__(cls, "B") > if data is not None: > b.fromstring(data) > return b > def __str__(self): > return self.tostring() > def __repr__(self): > return "bytes(%s)" % repr(list(self)) > def __add__(self, other): > if isinstance(other, array): > return bytes(super(bytes, self).__add__(other)) > return NotImplemented Another hint: If you want to play around with the migration to all Unicode in Py3k, start Python with the -U switch and monkey-patch the builtin str to be an alias for unicode. Ideally, the bytes type should work under both the Py3k conditions and the Py2.x default ones. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From python at rcn.com Fri Feb 17 13:19:11 2006 From: python at rcn.com (Raymond Hettinger) Date: Fri, 17 Feb 2006 07:19:11 -0500 Subject: [Python-Dev] Proposal: defaultdict References: Message-ID: <000f01c633bc$5c1f4820$b83efea9@RaymondLaptop1> > My conclusion is that setdefault() is a failure -- it was a > well-intentioned construct, but doesn't actually create more readable > code. It was an across the board failure: naming, clarity, efficiency. Can we agree to slate dict.setdefault() to disappear in Py3.0? Raymond From walter at livinglogic.de Fri Feb 17 13:45:22 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 17 Feb 2006 13:45:22 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> Message-ID: <43F5C562.4040105@livinglogic.de> Guido van Rossum wrote: > On 2/16/06, M.-A. Lemburg wrote: >> What will be the explicit way to open a file in bytes mode >> and in text mode (I for one would like to move away from >> open() completely as well) ? >> >> Will we have a single file type with two different modes >> or two different types ? > > I'm currently thinking of an I/O stack somewhat like Java's. At the > bottom there's a class that lets you do raw unbuffered reads and > writes (and seek/tell) on binary files using bytes arrays. We can > layer onto this buffering, text encoding/decoding, and more. (Windows > CRLF<->LF conversion is also an encoding of sorts). > > Years ago I wrote a prototype; checkout sandbox/sio/. However sio.DecodingInputFilter and sio.EncodingOutputFilter don't work for encodings that need state (e.g. when reading/writing UTF-16). 
Switching to stateful encoders/decoders isn't so easy, because the stateful codecs require a stream-API, which brings in a whole bunch of other functionality (readline() etc.), which we'd probably like to keep separate. I have a patch (http://bugs.python.org/1101097) that should fix this problem (at least for all codecs derived from codecs.StreamReader/codecs.StreamWriter). Additionally it would make stateful codecs more useful in the context for iterators/generators. I'd like this patch to go into 2.5. Bye, Walter Dörwald From pje at telecommunity.com Fri Feb 17 13:52:41 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 17 Feb 2006 07:52:41 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <5.1.1.6.0.20060217075002.021e0d00@mail.telecommunity.com> At 10:10 AM 02/17/2006 +0100, Georg Brandl wrote: >Guido van Rossum wrote: > > > d = DefaultDict([]) > > > > can be written as simply > > > > d[key].append(value) > > > Feedback? >Probably a good idea, has been proposed multiple times on clpy. >One good thing would be to be able to specify either a default value >or a factory function. +1 on factory function, e.g. "DefaultDict(list)". A default value isn't very useful, because for immutable defaults, setdefault() works well enough. If what you want is a copy of some starting object, you can always do something like DefaultDict({1:2,3:4}.copy). From fredrik at pythonware.com Fri Feb 17 13:50:18 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 13:50:18 +0100 Subject: [Python-Dev] Deprecate ``multifile``? References: Message-ID: Georg Brandl wrote: > as Jim Jewett noted, multifile is supplanted by email as much as mimify etc. > but it is not marked as deprecated. Should it be deprecated in 2.5? -0.5 (gratuitous breakage). I think the current "see also/supersedes" link is good enough. From mal at egenix.com Fri Feb 17 13:53:39 2006 From: mal at egenix.com (M.-A.
Lemburg) Date: Fri, 17 Feb 2006 13:53:39 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> Message-ID: <43F5C753.3070603@egenix.com> Guido van Rossum wrote: > On 2/16/06, M.-A. Lemburg wrote: >> What will be the explicit way to open a file in bytes mode >> and in text mode (I for one would like to move away from >> open() completely as well) ? >> >> Will we have a single file type with two different modes >> or two different types ? > > I'm currently thinking of an I/O stack somewhat like Java's. At the > bottom there's a class that lets you do raw unbuffered reads and > writes (and seek/tell) on binary files using bytes arrays. We can > layer onto this buffering, text encoding/decoding, and more. (Windows > CRLF<->LF conversion is also an encoding of sorts). Sounds like the stackable StreamWriters and -Readers would nicely integrate into this design. > Years ago I wrote a prototype; checkout sandbox/sio/. Thanks. Maybe one of these days I'll get around to having a look - unlike many of the pydev folks, I don't work for Google and can't spend 20% or 50% of my time on Python core development :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From g.brandl at gmx.net Fri Feb 17 14:01:22 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 14:01:22 +0100 Subject: [Python-Dev] Deprecate ``multifile``? 
In-Reply-To: References: Message-ID: Fredrik Lundh wrote: > Georg Brandl wrote: > >> as Jim Jewett noted, multifile is supplanted by email as much as mimify etc. >> but it is not marked as deprecated. Should it be deprecated in 2.5? > > -0.5 (gratuitous breakage). > > I think the current "see also/supersedes" link is good enough. Well, it would be deprecated like the other email modules, that is, only a note is added to the docs and it is added to PEP 4. There would be no warning. Georg From mal at egenix.com Fri Feb 17 14:10:43 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 17 Feb 2006 14:10:43 +0100 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F5C562.4040105@livinglogic.de> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> Message-ID: <43F5CB53.4000301@egenix.com> Walter Dörwald wrote: > Guido van Rossum wrote: > >> On 2/16/06, M.-A. Lemburg wrote: >>> What will be the explicit way to open a file in bytes mode >>> and in text mode (I for one would like to move away from >>> open() completely as well) ? >>> >>> Will we have a single file type with two different modes >>> or two different types ? >> >> I'm currently thinking of an I/O stack somewhat like Java's. At the >> bottom there's a class that lets you do raw unbuffered reads and >> writes (and seek/tell) on binary files using bytes arrays. We can >> layer onto this buffering, text encoding/decoding, and more. (Windows >> CRLF<->LF conversion is also an encoding of sorts). >> >> Years ago I wrote a prototype; checkout sandbox/sio/. > > However sio.DecodingInputFilter and sio.EncodingOutputFilter don't work > for encodings that need state (e.g. when reading/writing UTF-16).
> Switching to stateful encoders/decoders isn't so easy, because the > stateful codecs require a stream-API, which brings in a whole bunch of > other functionality (readline() etc.), which we'd probably like to keep > separate. I have a patch (http://bugs.python.org/1101097) that should > fix this problem (at least for all codecs derived from > codecs.StreamReader/codecs.StreamWriter). Additionally it would make > stateful codecs more useful in the context for iterators/generators. > > I'd like this patch to go into 2.5. The patch as-is won't go into 2.5. It's simply the wrong approach: StreamReaders and -Writers work on streams (hence the name). It doesn't make sense adding functionality to side-step this behavior, since it undermines the design. Like I suggested in the patch discussion, such functionality could be factored out of the implementations of StreamReaders/Writers and put into new StatefulEncoder/Decoder classes, the objects of which then get used by StreamReader/Writer. In addition to that we could extend the codec registry to also maintain slots for the stateful encoders and decoders, if needed. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From mwh at python.net Fri Feb 17 14:15:05 2006 From: mwh at python.net (Michael Hudson) Date: Fri, 17 Feb 2006 13:15:05 +0000 Subject: [Python-Dev] Rename str/unicode to text In-Reply-To: (Fredrik Lundh's message of "Fri, 17 Feb 2006 12:23:44 +0100") References: <2mu0ay9rsa.fsf@starship.python.net> Message-ID: <2mpslm9leu.fsf@starship.python.net> "Fredrik Lundh" writes: > Michael Hudson wrote: > >> > OTOH, even if we didn't rename str/unicode to text, opentext would >> > still be a good name for the function that opens a text file. >> >> Hnnrgh, not really. You're not opening a 'text', nor are you >> constructing something that might reasonably be called an 'opentext'. >> textfile() seems better. > > except that in Python, file is a type, and open is an action. Well, yeah, but you can interpret each name in a sane way and try to ignore the fact that they refer to the same object... > but I agree that textfile reads better (haven't we been through this > a couple of times already, btw? iirc, my original textfile proposal was > posted in 1846, or so) Yes, that sounds about right. Cheers, mwh -- I'm sorry, was my bias showing again? :-) -- William Tanksley, 13 May 2000 From fredrik at pythonware.com Fri Feb 17 14:16:59 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 14:16:59 +0100 Subject: [Python-Dev] Proposal: defaultdict References: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com> <002301c6337a$7001f140$b83efea9@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > I would like to add something like this to the collections module, but a PEP is > probably needed to deal with issues like: frankly, now that Guido is working 50% on Python, do we really have to use the full PEP process also for simple things like this? I'd say we let the BDFL roam free. (if he adds something really lousy, it can always be tweaked/removed before the next final release. not every checkin needs to be final...). 
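The "something like this" Raymond wants in the collections module is the default dictionary under discussion throughout this thread. For concreteness, the behaviour on the table can be sketched in a few lines of Python; this is a hypothetical illustration, not code proposed verbatim by anyone above:

```python
class defaultdict(dict):
    """Toy sketch: a dict that fills in missing keys from a factory."""

    def __init__(self, default_factory, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Create, *store*, and return the default for a missing key.
            value = self[key] = self.default_factory()
            return value

# The multimap idiom from the start of the thread collapses to one line:
d = defaultdict(list)
for key, value in [("a", 1), ("a", 2), ("b", 3)]:
    d[key].append(value)
assert d == {"a": [1, 2], "b": [3]}
```

Passing a factory (list, int, or any callable) rather than a value also sidesteps the shared-mutable-default problem that makes a DefaultDict([]) spelling dubious.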
From mal at egenix.com Fri Feb 17 14:24:51 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 17 Feb 2006 14:24:51 +0100 Subject: [Python-Dev] str.translate vs unicode.translate In-Reply-To: <43f507f2.983922896@news.gmane.org> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43f507f2.983922896@news.gmane.org> Message-ID: <43F5CEA3.4020601@egenix.com> Bengt Richter wrote: > If str becomes unicode for PY 3000, and we then have bytes as our coding-agnostic > byte data, then I think bytes should have the str translation method, with a tweak > that I would hope could also be done to str now. > > BTW, str.translate will presumably become unicode.translate, so > perhaps unicode.translate should grow a compatible deletechars parameter. I'd much rather like to see the .translate() method deprecated. Writing a codec for the task is much more effective - the builtin charmap codec will do all the mapping for you, if you have a need to go from bytes to Unicode and vice-versa. We could also have a bytemap codec for doing bytes to bytes conversions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From g.brandl at gmx.net Fri Feb 17 14:26:45 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 14:26:45 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com> <002301c6337a$7001f140$b83efea9@RaymondLaptop1> Message-ID: Fredrik Lundh wrote: > Raymond Hettinger wrote: > >> I would like to add something like this to the collections module, but a PEP is >> probably needed to deal with issues like: > > frankly, now that Guido is working 50% on Python, do we really have to use > the full PEP process also for simple things like this? > > I'd say we let the BDFL roam free. > > (if he adds something really lousy, it can always be tweaked/removed before > the next final release. not every checkin needs to be final...). +1. Georg From mal at egenix.com Fri Feb 17 14:30:14 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 17 Feb 2006 14:30:14 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F50BDD.4010106@v.loewis.de> References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> <43F50BDD.4010106@v.loewis.de> Message-ID: <43F5CFE6.3040502@egenix.com> Martin v. Löwis wrote: > Josiah Carlson wrote: >> I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex', >> and likely a few others that the two of you may be arguing against >> should stay as encodings, because strictly speaking, they are defined as >> encodings of data. They may not be encodings of _unicode_ data, but >> that doesn't mean that they aren't useful encodings for other kinds of >> data, some text, some binary, ... > > To support them, the bytes type would have to gain a .encode method, > and I'm -1 on supporting bytes.encode, or string.decode.
> > Why is > > s.encode("uu") > > any better than > > binascii.b2a_uu(s) The .encode() and .decode() methods are merely convenience interfaces to the registered codecs (with some extra logic to make sure that only a pre-defined set of return types are allowed). It's up to the user to use them for e.g. UU-encoding or not. The reason we have codecs for UU, zip and the others is that you can use their StreamWriters/Readers in stackable streams. Just because some codecs don't fit into the string.decode() or bytes.encode() scenario doesn't mean that these codecs are useless or that the methods should be banned. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Fri Feb 17 14:39:46 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 17 Feb 2006 14:39:46 +0100 Subject: [Python-Dev] [Python-checkins] r42396 - peps/trunk/pep-0011.txt In-Reply-To: References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> Message-ID: <43F5D222.2030402@egenix.com> Neal Norwitz wrote: > [Moving to python-dev] > > I don't have a strong opinion. Any one else have an opinion about > removing --with-wctype-functions from configure? FWIW, I announced this plan in Dec 2004: http://mail.python.org/pipermail/python-dev/2004-December/050193.html I didn't get any replies back then, so assumed that no-one would object, but forgot to add this to the PEP 11. 
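Both spellings from the exchange above can be checked directly against the stdlib — Martin's explicit binascii call, which needs no codec machinery at all, and the registry lookup functions that the .encode()/.decode() convenience methods wrap (a small sketch):

```python
import binascii
import codecs

# Martin's explicit spelling: a bytes-to-bytes transform, no codec involved.
line = binascii.b2a_uu(b"hello world")   # one uuencoded, newline-terminated line
assert binascii.a2b_uu(line) == b"hello world"

# The registry interface behind the convenience methods: getencoder()
# returns a function yielding (encoded_object, length_consumed).
encode = codecs.getencoder("utf-8")
encoded, consumed = encode("h\u00e9llo")
assert encoded == "h\u00e9llo".encode("utf-8")
assert consumed == 5
```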
The reason I'd like to get this removed early rather than later is that some Linux distros happen to use the config switch causing the Python Unicode implementation on those distros to behave inconsistently with regular Python builds. After all we've put a lot of effort into making sure that the Unicode implementation does work independently of the platform, even on platforms that don't have Unicode support at all. Another candidate for removal is the --disable-unicode switch. We should probably add a deprecation warning for that in Py 2.5 and then remove the hundreds of #ifdef Py_USING_UNICODE from the source code in time for Py 2.6. > n > -- > > On 2/16/06, M.-A. Lemburg wrote: >> neal.norwitz wrote: >>> Author: neal.norwitz >>> Date: Thu Feb 16 06:25:37 2006 >>> New Revision: 42396 >>> >>> Modified: >>> peps/trunk/pep-0011.txt >>> Log: >>> MAL says this option should go away in bug report 874534: >>> >>> The reason for the removal is that the option causes >>> semantical problems and makes Unicode work in non-standard >>> ways on platforms that use locale-aware extensions to the >>> wc-type functions. >>> >>> Since it wasn't previously announced, we can keep the option until 2.6 >>> unless someone feels strong enough to rip it out. >> I've been wanting to rip this out for some time now, but >> you're right: I forgot to add this to PEP 11, so let's >> wait for another release. >> >> OTOH, this normally only affects system builders, so perhaps >> we could do this a little faster, e.g. add a warning in the >> first alpha and then rip it out with one of the last betas ?!
>> >>> Modified: peps/trunk/pep-0011.txt >>> >>> + Name: Systems using --with-wctype-functions >>> + Unsupported in: Python 2.6 >>> + Code removed in: Python 2.6 > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mal%40egenix.com -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at gmail.com Fri Feb 17 14:55:47 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2006 23:55:47 +1000 Subject: [Python-Dev] PEP 338 issue finalisation (was Re: 2.5 PEP) In-Reply-To: References: <43F3A2EE.8060208@gmail.com> <43F45A8A.3050900@gmail.com> Message-ID: <43F5D5E3.8070301@gmail.com> Guido van Rossum wrote: > [Hey, I thought I sent that just to you. Is python-dev really > interested in this?] Force of habit on my part - I saw the python-dev header and automatically dropped "pyd" into the To: field of the reply. Given Paul's contribution on the get_data front, it turned out to be a fortuitous accident :) [init_globals argument] >> I just realised that anything that's a legal argument to "dict.update" will >> work. I'll fix the function description in the PEP (and the docs patch as well). > > I'm not sure that's a good idea -- you'll never be able to switch to a > different implementation then. Good point - I'll change the wording so that (officially, at least) it has to be a dictionary. 
[_read_compiled_file() error handling] > Also, *perhaps* it makes more sense to return None instead of raising > ValueError? Since you're always catching it? (Or are you?) I've changed this in my local copy. That provides a means for dealing with marshal, too - catching any Exception from marshal.load and convert it to returning None. This approach loses some details on what exactly was wrong with the file, but that doesn't seem like a big issue (and it cleans up some of the other code). [run_module() error handling] > OK. But a loader could return None from get_code() -- do you check for > that? (I don't have the source handy here.) The current version on SF doesn't check it, but I just updated my local copy to fix that. [run_module() interaction with import] > What happens when you execute "foo.py" as __main__ and then (perhaps > indirectly) something does "import foo"? Does a second copy of foo.py > get loaded by the regular loader? Yes - this is the same as if foo.py was run directly from the command line via its filename. [YAGNI and 6 public functions where only 1 has a demonstrated use case] > I do wonder if runpy.py isn't getting a bit over-engineered -- it > seems a lot of the functionality isn't actually necessary to implement > -m foo.bar, and the usefulness in other circumstances is as yet > unproven. What do you think of taking a dose of YAGNI here? > (Especially since I notice that most of the public APIs are very thin > layers over exec or execfile -- people can just use those directly.) I had a look at pdb and profile, and the runpy functions really wouldn't help with either of those. Since I don't have any convincing use cases, I'll demote run_code and run_module_code to be private helper functions and remove the three run*file methods (I might throw those three up on ASPN as a cookbook recipe instead). That leaves the public API containing only run_module, which is all -m really needs. 
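For reference, run_module as it eventually shipped in Python 2.5's runpy module is used like this (a sketch against the released API; the stdlib platform module is just a harmless target):

```python
import runpy

# Execute the stdlib 'platform' module by name and collect its globals.
# run_name defaults to the module name, so platform's
# "if __name__ == '__main__':" block stays dormant.
globs = runpy.run_module("platform")
assert globs["__name__"] == "platform"
assert callable(globs["system"])   # the module's top-level names are all there
```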
[thread safety and the import lock] >> Another problem that occurred to me is that the module isn't thread safe at >> the moment. The PEP 302 emulation isn't protected by the import lock, and the >> changes to sys.argv in run_module_code will be visible across threads (and may >> clobber each other or the original if multiple threads invoke the function). > > Another reason to consider cutting it down to only what's needed by > -m; -m doesn't need thread-safety (I think). Yeah, thread-safety is only an issue if invoking runpy.run_module from threaded Python code. However, I think this is one of those nasty threading problems where it will work for 99.9% of cases and produce intractable bugs for the remaining 0.1%. If an extra try-finally block can categorically rule out those kinds of problems, then I think it's nicer to include it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Fri Feb 17 14:56:44 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 14:56:44 +0100 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: Message-ID: Georg Brandl wrote: > Hi, > > it has been proposed before, but there was no conclusive answer last time: > is there any chance for 2.5 to include commonly used decorators in a module? No interest at all? 
Georg From rhamph at gmail.com Fri Feb 17 15:13:51 2006 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 17 Feb 2006 07:13:51 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/16/06, Guido van Rossum wrote: > A bunch of Googlers were discussing the best way of doing the > following (a common idiom when maintaining a dict of lists of values > relating to a key, sometimes called a multimap): > > if key not in d: d[key] = [] > d[key].append(value) > > An alternative way to spell this uses setdefault(), but it's not very readable: > > d.setdefault(key, []).append(value) I'd like to see it done passing a factory function (and with a better name): d.getorset(key, list).append(value) The name is slightly odd but it is effective. Plus it avoids creating a new class when a slight tweak to an existing one will do. > Over lunch with Alex Martelli, he proposed that a subclass of dict > with this behavior (but implemented in C) would be a good addition to > the language. It looks like it wouldn't be hard to implement. It could > be a builtin named defaultdict. The first, required, argument to the > constructor should be the default value. Remaining arguments (even > keyword args) are passed unchanged to the dict constructor. -1 (atleast until you can explain why that's better than .getorset()) -- Adam Olsen, aka Rhamphoryncus From ncoghlan at gmail.com Fri Feb 17 15:27:48 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Feb 2006 00:27:48 +1000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <5.1.1.6.0.20060217075002.021e0d00@mail.telecommunity.com> References: <5.1.1.6.0.20060217075002.021e0d00@mail.telecommunity.com> Message-ID: <43F5DD64.9040100@gmail.com> Phillip J. Eby wrote: > At 10:10 AM 02/17/2006 +0100, Georg Brandl wrote: >> Guido van Rossum wrote: >> >>> d = DefaultDict([]) >>> >>> can be written as simply >>> >>> d[key].append(value) >>> Feedback? 
>> Probably a good idea, has been proposed multiple times on clpy. >> One good thing would be to be able to specify either a default value >> or a factory function. > > +1 on factory function, e.g. "DefaultDict(list)". A default value isn't > very useful, because for immutable defaults, setdefault() works well > enough. If what you want is a copy of some starting object, you can always > do something like DefaultDict({1:2,3:4}.copy). +1 here, too (for permitting a factory function only). This doesn't really limit usage, as you can still supply DefaultDict(partial(copy, x)) or DefaultDict(partial(deepcopy, x)), or (heaven forbid) a lambda expression. . . As others have mentioned, the basic types are all easy, since the typename can be used directly. +1 on supplying that factory function to the constructor, too (the default value is a fundamental part of the defaultdict). That is, I'd prefer:

    d = defaultdict(func)  # The defaultdict is fully defined, but not yet populated
    d.update(init_values)

over:

    d = defaultdict(init_values)  # The defaultdict is partially populated, but not yet fully defined!
    d.default(func)

That is, something that is the same as the normal dict except for:

    def __init__(self, default):
        self.default = default

    def __getitem__(self, key):
        return self.get(key, self.default())

Considering some of Raymond's questions in light of the above > * implications of a __getitem__ succeeding while get(value, x) returns x > (possibly different from the overall default) > * implications of a __getitem__ succeeding while __contains__ would fail These behaviours seem reasonable for a default dictionary - "containment" is based on whether or not the key actually exists in the dictionary as it currently stands, and the default is really a "default default" that can be overridden using 'get'.
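These are in fact the semantics that collections.defaultdict shipped with in Python 2.5, and they can be checked against the released type:

```python
from collections import defaultdict

dd = defaultdict(list)
assert "k" not in dd                          # __contains__ fails before first access
assert dd.get("k", "fallback") == "fallback"  # get() bypasses the default factory
dd["k"].append(1)                             # __getitem__ creates *and stores* []
assert "k" in dd and dd["k"] == [1]
```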
> * whether to add this to the collections module (I would say yes) > * whether to allow default functions as well as default values (so you could > instantiate a new default list) My preference is for factory functions only, to eliminate ambiguity.

    # bag like behavior
    dd = collections.default_dict(int)
    for elem in collection:
        dd[elem] += 1

    # setdefault-like behavior
    dd = collections.default_dict(list)
    for page_number, page in enumerate(book):
        for word in page.split():
            dd[word].append(word)

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From walter at livinglogic.de Fri Feb 17 15:38:24 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 17 Feb 2006 15:38:24 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F5CB53.4000301@egenix.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> Message-ID: <43F5DFE0.6090806@livinglogic.de> M.-A. Lemburg wrote: > Walter Dörwald wrote: >> Guido van Rossum wrote: >> >>> [...] >>> Years ago I wrote a prototype; checkout sandbox/sio/. >> However sio.DecodingInputFilter and sio.EncodingOutputFilter don't work >> for encodings that need state (e.g. when reading/writing UTF-16). >> Switching to stateful encoders/decoders isn't so easy, because the >> stateful codecs require a stream-API, which brings in a whole bunch of >> other functionality (readline() etc.), which we'd probably like to keep >> separate. I have a patch (http://bugs.python.org/1101097) that should >> fix this problem (at least for all codecs derived from >> codecs.StreamReader/codecs.StreamWriter). Additionally it would make >> stateful codecs more useful in the context for iterators/generators.
>> >> I'd like this patch to go into 2.5. > > The patch as-is won't go into 2.5. It's simply the wrong approach: > StreamReaders and -Writers work on streams (hence the name). It > doesn't make sense adding functionality to side-step this behavior, > since it undermines the design. I agree that using a StreamWriter without a stream somehow feels wrong. > Like I suggested in the patch discussion, such functionality could > be factored out of the implementations of StreamReaders/Writers > and put into new StatefulEncoder/Decoder classes, the objects of > which then get used by StreamReader/Writer. > > In addition to that we could extend the codec registry to also > maintain slots for the stateful encoders and decoders, if needed. We *have* to do it like this otherwise there would be no way to get a StatefulEncoder/Decoder from an encoding name. Does this mean that codecs.lookup() would have to return a 6-tuple? But this would break if someone uses codecs.lookup("foo")[-1]. So maybe codecs.lookup() should return an instance of a subclass of tuple which has the StatefulEncoder/Decoder as attributes. But then codecs.lookup() must be able to handle old 4-tuples returned by old search functions and update those to the new 6-tuples. (But we could drop this again after several releases, once all third party codecs are updated). Bye, Walter Dörwald From ncoghlan at gmail.com Fri Feb 17 15:46:45 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Feb 2006 00:46:45 +1000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F5E1D5.50208@gmail.com> Adam Olsen wrote: >> Over lunch with Alex Martelli, he proposed that a subclass of dict >> with this behavior (but implemented in C) would be a good addition to >> the language. It looks like it wouldn't be hard to implement. It could >> be a builtin named defaultdict. The first, required, argument to the >> constructor should be the default value.
Remaining arguments (even >> keyword args) are passed unchanged to the dict constructor. > > -1 (atleast until you can explain why that's better than .getorset()) Because the "default default" is a fundamental characteristic of the default dictionary (meaning it works with normal indexing syntax), whereas "getorset" makes it a characteristic of the method call. Besides, if there are going to be any method changes on normal dicts, I'd rather see a boolean third argument "set" to the get method. That is (for a normal dict):

    def get(self, key, *args):
        set = False
        no_default = False
        if len(args) == 2:
            default, set = args
        elif args:
            default, = args
        else:
            no_default = True
        if key in self:
            return self[key]
        if no_default:
            raise KeyError(repr(key))
        if set:
            self[key] = default
        return default

Using Guido's original example: d.get(key, [], True).append(value) I don't really think this is a replacement for defaultdict, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Fri Feb 17 15:50:10 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 15:50:10 +0100 Subject: [Python-Dev] Proposal: defaultdict References: <43F5E1D5.50208@gmail.com> Message-ID: Nick Coghlan wrote: > Using Guido's original example: > > d.get(key, [], True).append(value) hmm. are you sure you didn't just reinvent setdefault ? From mal at egenix.com Fri Feb 17 16:04:09 2006 From: mal at egenix.com (M.-A.
Lemburg) Date: Fri, 17 Feb 2006 16:04:09 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F5DFE0.6090806@livinglogic.de> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> Message-ID: <43F5E5E9.2040809@egenix.com> Walter D?rwald wrote: > M.-A. Lemburg wrote: > >> Walter D?rwald wrote: >>> Guido van Rossum wrote: >>> >>>> [...] >>>> Years ago I wrote a prototype; checkout sandbox/sio/. >>> However sio.DecodingInputFilter and sio.EncodingOutputFilter don't work >>> for encodings that need state (e.g. when reading/writing UTF-16). >>> Switching to stateful encoders/decoders isn't so easy, because the >>> stateful codecs require a stream-API, which brings in a whole bunch of >>> other functionality (readline() etc.), which we'd probably like to keep >>> separate. I have a patch (http://bugs.python.org/1101097) that should >>> fix this problem (at least for all codecs derived from >>> codecs.StreamReader/codecs.StreamWriter). Additionally it would make >>> stateful codecs more useful in the context for iterators/generators. >>> >>> I'd like this patch to go into 2.5. >> >> The patch as-is won't go into 2.5. It's simply the wrong approach: >> StreamReaders and -Writers work on streams (hence the name). It >> doesn't make sense adding functionality to side-step this behavior, >> since it undermines the design. > > I agree that using a StreamWriter without a stream somehow feels wrong. > >> Like I suggested in the patch discussion, such functionality could >> be factored out of the implementations of StreamReaders/Writers >> and put into new StatefulEncoder/Decoder classes, the objects of >> which then get used by StreamReader/Writer. 
>> >> In addition to that we could extend the codec registry to also >> maintain slots for the stateful encoders and decoders, if needed. > > We *have* to do it like this otherwise there would be no way to get a > StatefulEncoder/Decoder from an encoding name. > > Does this mean that codecs.lookup() would have to return a 6-tuple? > But this would break if someone uses codecs.lookup("foo")[-1]. Right; though I'd much rather see that people use the direct codecs module lookup APIs: getencoder(), getdecoder(), getreader() and getwriter() instead of using codecs.lookup() directly. > So maybe > codecs.lookup() should return an instance of a subclass of tuple which > has the StatefulEncoder/Decoder as attributes. But then codecs.lookup() > must be able to handle old 4-tuples returned by old search functions and > update those to the new 6-tuples. (But we could drop this again after > several releases, once all third party codecs are updated). This was a design error: I should have not made codecs.lookup() a documented function. I'd suggest we keep codecs.lookup() the way it is and instead add new functions to the codecs module, e.g. codecs.getencoderobject() and codecs.getdecoderobject(). Changing the codec registration is not much of a problem: we could simply allow 6-tuples to be passed into the registry. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From skip at pobox.com Fri Feb 17 16:05:27 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 17 Feb 2006 09:05:27 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <20060216212133.GB23859@xs4all.nl> References: <20060216212133.GB23859@xs4all.nl> Message-ID: <17397.58935.8669.271616@montanaro.dyndns.org> Guido> Over lunch with Alex Martelli, he proposed that a subclass of Guido> dict with this behavior (but implemented in C) would be a good Guido> addition to the language. Instead, why not define setdefault() the way it should have been done in the first place? When you create a dict it has the current behavior. If you then call its setdefault() method, that becomes the default value for missing keys.

d = {'a': 1}
d['b']            # raises KeyError
d.get('c')        # evaluates to None
d.setdefault(42)
d['b']            # evaluates to 42
d.get('c')        # evaluates to 42

For symmetry, setdefault() should probably be undoable: deldefault(), removedefault(), nodefault(), default_free(), whatever. The only question in my mind is whether or not getting a non-existent value under the influence of a given default value should stick that value in the dictionary or not. down-with-more-builtins-ly, y'rs, Skip From ncoghlan at gmail.com Fri Feb 17 16:28:55 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Feb 2006 01:28:55 +1000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F5E1D5.50208@gmail.com> Message-ID: <43F5EBB7.7080907@gmail.com> Fredrik Lundh wrote: > Nick Coghlan wrote: > >> Using Guido's original example: >> >> d.get(key, [], True).append(value) > > hmm. are you sure you didn't just reinvent setdefault ? I'm reasonably sure I copied it on purpose, only with a name that isn't 100% misleading as to what it does ;) I think collections.defaultdict is a better approach, though. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Fri Feb 17 16:29:59 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 16:29:59 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <17397.58935.8669.271616@montanaro.dyndns.org> References: <20060216212133.GB23859@xs4all.nl> <17397.58935.8669.271616@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > Guido> Over lunch with Alex Martelli, he proposed that a subclass of > Guido> dict with this behavior (but implemented in C) would be a good > Guido> addition to the language. > > Instead, why not define setdefault() the way it should have been done in the > first place? When you create a dict it has the current behavior. If you > then call its setdefault() method that becomes the default value for missing > keys. That puts it off until 3.0. From what I read I think defaultdict won't become a builtin anyway. Georg From fdrake at acm.org Fri Feb 17 16:36:11 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 17 Feb 2006 16:36:11 -0500 Subject: [Python-Dev] 2.5 PEP In-Reply-To: <43F4F757.1000603@v.loewis.de> References: <368a5cd50602152218n39dec5a5hf68c09eecc119aad@mail.gmail.com> <43F4F757.1000603@v.loewis.de> Message-ID: <200602171036.11400.fdrake@acm.org> On Thursday 16 February 2006 17:06, Martin v. Löwis wrote: > I'm still unhappy with that change, and still nobody has told me how to > maintain PyXML so that it can continue to work both for 2.5 and for 2.4. Martin, I do intend to write a proper response for you, but have been massively swamped. python-dev topics occasionally pop up for me, but time has been too limited to get back to the important items, like this one. -Fred -- Fred L. Drake, Jr.
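[Skip's proposal above — a dict-wide default installed via a one-argument setdefault() — can be sketched as a pure-Python dict subclass. The class name, the sentinel, and the choice *not* to store looked-up defaults are assumptions on my part (the last point is exactly the open question in Skip's mail); note also that a one-argument setdefault(key) already means "insert key mapped to None" on today's dicts, so this overload would change existing behavior, which may be part of why Georg expects it to wait for 3.0.]

```python
_MISSING = object()  # sentinel: "no dict-wide default has been set"

class DefaultableDict(dict):
    """Sketch of Skip's proposal: d.setdefault(v) with a single
    argument makes v the default for all missing keys."""

    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self._default = _MISSING

    def setdefault(self, *args):
        if len(args) == 1:
            # One argument: set the dict-wide default (Skip's overload;
            # clashes with dict.setdefault(key), which inserts key -> None).
            self._default = args[0]
            return args[0]
        # Two arguments: classic per-key setdefault.
        return dict.setdefault(self, *args)

    def nodefault(self):
        # One of Skip's suggested "undo" spellings.
        self._default = _MISSING

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            if self._default is _MISSING:
                raise
            # Deliberately NOT stored -- the open question in Skip's mail.
            return self._default

    def get(self, key, default=None):
        if key in self:
            return dict.__getitem__(self, key)
        if self._default is _MISSING:
            return default
        return self._default

d = DefaultableDict({'a': 1})
d.setdefault(42)   # 42 becomes the dict-wide default for missing keys
```

With these semantics Skip's transcript works as written: d['b'] and d.get('c') both evaluate to 42 after the call, and neither lookup inserts anything.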
From walter at livinglogic.de Fri Feb 17 16:57:01 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 17 Feb 2006 16:57:01 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F5E5E9.2040809@egenix.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> Message-ID: <43F5F24D.2000802@livinglogic.de> M.-A. Lemburg wrote: > Walter D?rwald wrote: >> M.-A. Lemburg wrote: >> >>> [...] >>> Like I suggested in the patch discussion, such functionality could >>> be factored out of the implementations of StreamReaders/Writers >>> and put into new StatefulEncoder/Decoder classes, the objects of >>> which then get used by StreamReader/Writer. >>> >>> In addition to that we could extend the codec registry to also >>> maintain slots for the stateful encoders and decoders, if needed. >> We *have* to do it like this otherwise there would be no way to get a >> StatefulEncoder/Decoder from an encoding name. >> >> Does this mean that codecs.lookup() would have to return a 6-tuple? >> But this would break if someone uses codecs.lookup("foo")[-1]. > > Right; though I'd much rather see that people use the direct > codecs module lookup APIs: > > getencoder(), getdecoder(), getreader() and getwriter() > > instead of using codecs.lookup() directly. OK. >> So maybe >> codecs.lookup() should return an instance of a subclass of tuple which >> has the StatefulEncoder/Decoder as attributes. But then codecs.lookup() >> must be able to handle old 4-tuples returned by old search functions and >> update those to the new 6-tuples. (But we could drop this again after >> several releases, once all third party codecs are updated). 
> > This was a design error: I should have not made > codecs.lookup() a documented function. > > I'd suggest we keep codecs.lookup() the way it is and > instead add new functions to the codecs module, e.g. > codecs.getencoderobject() and codecs.getdecoderobject(). > > Changing the codec registration is not much of a problem: > we could simply allow 6-tuples to be passed into the > registry. OK, so codecs.lookup() returns 4-tuples, but the registry stores 6-tuples and the search functions must return 6-tuples. And we add codecs.getencoderobject() and codecs.getdecoderobject() as well as new classes codecs.StatefulEncoder and codecs.StatefulDecoder. What about old search functions that return 4-tuples? Bye, Walter D?rwald From bokr at oz.net Fri Feb 17 17:04:10 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 16:04:10 GMT Subject: [Python-Dev] bytes type discussion References: <20060215223943.GL6027@xs4all.nl> <1140049757.14818.45.camel@geddy.wooz.org> Message-ID: <43f5f3bf.1044287125@news.gmane.org> On Fri, 17 Feb 2006 00:43:50 -0500, Steve Holden wrote: >Fredrik Lundh wrote: >> Barry Warsaw wrote: >> >> >>>We know at least there will never be a 2.10, so I think we still have >>>time. >> >> >> because there's no way to count to 10 if you only have one digit? >> >> we used to think that back when the gas price was just below 10 SEK/L, >> but they found a way... >> >IIRC Guido is on record as saying "There will be no Python 2.10 because >I hate the ambiguity of double-digit minor release numbers", or words to >that effect. > Hex? Regards, Bengt Richter From mal at egenix.com Fri Feb 17 17:12:36 2006 From: mal at egenix.com (M.-A. 
Lemburg) Date: Fri, 17 Feb 2006 17:12:36 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F5F24D.2000802@livinglogic.de> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> Message-ID: <43F5F5F4.2000906@egenix.com> Walter D?rwald wrote: > M.-A. Lemburg wrote: >> Walter D?rwald wrote: >>> M.-A. Lemburg wrote: >>> >>>> [...] >>>> Like I suggested in the patch discussion, such functionality could >>>> be factored out of the implementations of StreamReaders/Writers >>>> and put into new StatefulEncoder/Decoder classes, the objects of >>>> which then get used by StreamReader/Writer. >>>> >>>> In addition to that we could extend the codec registry to also >>>> maintain slots for the stateful encoders and decoders, if needed. >>> We *have* to do it like this otherwise there would be no way to get a >>> StatefulEncoder/Decoder from an encoding name. >>> >>> Does this mean that codecs.lookup() would have to return a 6-tuple? >>> But this would break if someone uses codecs.lookup("foo")[-1]. >> >> Right; though I'd much rather see that people use the direct >> codecs module lookup APIs: >> >> getencoder(), getdecoder(), getreader() and getwriter() >> >> instead of using codecs.lookup() directly. > > OK. > >>> So maybe >>> codecs.lookup() should return an instance of a subclass of tuple which >>> has the StatefulEncoder/Decoder as attributes. But then codecs.lookup() >>> must be able to handle old 4-tuples returned by old search functions and >>> update those to the new 6-tuples. (But we could drop this again after >>> several releases, once all third party codecs are updated). 
>> >> This was a design error: I should have not made >> codecs.lookup() a documented function. >> >> I'd suggest we keep codecs.lookup() the way it is and >> instead add new functions to the codecs module, e.g. >> codecs.getencoderobject() and codecs.getdecoderobject(). >> >> Changing the codec registration is not much of a problem: >> we could simply allow 6-tuples to be passed into the >> registry. > > OK, so codecs.lookup() returns 4-tuples, but the registry stores > 6-tuples and the search functions must return 6-tuples. And we add > codecs.getencoderobject() and codecs.getdecoderobject() as well as new > classes codecs.StatefulEncoder and codecs.StatefulDecoder. What about > old search functions that return 4-tuples? The registry should then simply set the missing entries to None and the getencoderobject()/getdecoderobject() would then have to raise an error. Perhaps we should also deprecate codecs.lookup() in Py 2.5 ?! -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From jack at performancedrivers.com Fri Feb 17 17:19:32 2006 From: jack at performancedrivers.com (Jack Diederich) Date: Fri, 17 Feb 2006 11:19:32 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <20060217161932.GF6100@performancedrivers.com> On Thu, Feb 16, 2006 at 01:11:49PM -0800, Guido van Rossum wrote: [snip] > Google has an internal data type called a DefaultDict which gets > passed a default value upon construction. Its __getitem__ method, > instead of raising KeyError, inserts a shallow copy (!) 
of the given > default value into the dict when the value is not found. So the above > code, after > > d = DefaultDict([]) > > can be written as simply > > d[key].append(value) > > Note that of all the possible semantics for __getitem__ that could > have produced similar results (e.g. not inserting the default in the > underlying dict, or not copying the default value), the chosen > semantics are the only ones that makes this example work. Having __getitem__ insert the returned default value allows it to work with a larger variety of classes. My own ForgivingDict does not do this and works fine for ints and lists but not much else.

fd = ForgivingDict(list)
fd[key] += [val]  # extends the list and does a __setitem__

The += operator isn't useful for dicts. How can you make a defaultdict with a defaultdict as the default? My head asploded when I tried it with the constructor arg. It does seem possible with the 'd.default = func' syntax

# empty defaultdict constructor
d = defaultdict()
d.default = d

tree = defaultdict()
tree.default = d.copy

-jackdied

From arigo at tunes.org Fri Feb 17 17:29:32 2006 From: arigo at tunes.org (Armin Rigo) Date: Fri, 17 Feb 2006 17:29:32 +0100 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods In-Reply-To: References: Message-ID: <20060217162932.GA13840@code0.codespeak.net> Hi Travis, On Tue, Feb 14, 2006 at 08:41:19PM -0700, Travis E. Oliphant wrote: > 2) The __index__ special method will have the signature > > def __index__(self): > return obj > > Where obj must be either an int or a long or another object that has the > __index__ special method (but not self). The "anything but not self" rule is not consistent with any other special method's behavior. IMHO we should just do the same as __nonzero__(): * __nonzero__(x) must return exactly a bool or an int. This ensures that there is no infinite loop in C created by a __nonzero__ that returns something that has a further __nonzero__ method.
The rule that the PEP proposes for __index__ (returns anything but not 'self') is not useful, because you can still get infinite loops (you just have to work slightly harder, and even not much). We should just say that __index__ must return an int or a long. A bientot, Armin From monpublic at gmail.com Fri Feb 17 17:27:33 2006 From: monpublic at gmail.com (CM) Date: Fri, 17 Feb 2006 09:27:33 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: +1 It's about time! - C On 2/16/06, Guido van Rossum wrote: > > A bunch of Googlers were discussing the best way of doing the > following (a common idiom when maintaining a dict of lists of values > relating to a key, sometimes called a multimap): > > if key not in d: d[key] = [] > d[key].append(value) > > An alternative way to spell this uses setdefault(), but it's not very > readable: > > d.setdefault(key, []).append(value) > > and it also suffers from creating an unnecessary list instance. > (Timings were inconclusive; the approaches are within 5-10% of each > other in speed.) > > My conclusion is that setdefault() is a failure -- it was a > well-intentioned construct, but doesn't actually create more readable > code. > > Google has an internal data type called a DefaultDict which gets > passed a default value upon construction. Its __getitem__ method, > instead of raising KeyError, inserts a shallow copy (!) of the given > default value into the dict when the value is not found. So the above > code, after > > d = DefaultDict([]) > > can be written as simply > > d[key].append(value) > > Note that of all the possible semantics for __getitem__ that could > have produced similar results (e.g. not inserting the default in the > underlying dict, or not copying the default value), the chosen > semantics are the only ones that makes this example work. 
> > Over lunch with Alex Martelli, he proposed that a subclass of dict > with this behavior (but implemented in C) would be a good addition to > the language. It looks like it wouldn't be hard to implement. It could > be a builtin named defaultdict. The first, required, argument to the > constructor should be the default value. Remaining arguments (even > keyword args) are passed unchanged to the dict constructor. > > Some more design subtleties: > > - "key in d" still returns False if the key isn't there > - "d.get(key)" still returns None if the key isn't there > - "d.default" should be a read-only attribute giving the default value > > Feedback? > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/monpublic%40gmail.com > -- "A programmer learning programming from Perl is like a chemistry student learning the definition of 'exothermic' with dynamite." - evilpenguin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060217/73f153d5/attachment.htm From fredrik at pythonware.com Fri Feb 17 17:37:20 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 17:37:20 +0100 Subject: [Python-Dev] bytes type discussion References: <20060215223943.GL6027@xs4all.nl> <1140049757.14818.45.camel@geddy.wooz.org> <43f5f3bf.1044287125@news.gmane.org> Message-ID: Bengt Richter wrote: > >> because there's no way to count to 10 if you only have one digit? > >> > >> we used to think that back when the gas price was just below 10 SEK/L, > >> but they found a way... > >> > >IIRC Guido is on record as saying "There will be no Python 2.10 because > >I hate the ambiguity of double-digit minor release numbers", or words to > >that effect. > > Hex? 
or Roman numerals. I've paid X.35 SEK/L for gas... From thomas at xs4all.net Fri Feb 17 17:41:11 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 17 Feb 2006 17:41:11 +0100 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods In-Reply-To: <20060217162932.GA13840@code0.codespeak.net> References: <20060217162932.GA13840@code0.codespeak.net> Message-ID: <20060217164111.GD23859@xs4all.nl> On Fri, Feb 17, 2006 at 05:29:32PM +0100, Armin Rigo wrote: > > Where obj must be either an int or a long or another object that has the > > __index__ special method (but not self). > The "anything but not self" rule is not consistent with any other > special method's behavior. IMHO we should just do the same as > __nonzero__(): > * __nonzero__(x) must return exactly a bool or an int. Yes, very much so. And in case people worry that this makes wrapping objects harder: proxy objects (for instance) would do 'return operator.index(self._real)'. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From walter at livinglogic.de Fri Feb 17 17:44:56 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 17 Feb 2006 17:44:56 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F5F5F4.2000906@egenix.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> Message-ID: <43F5FD88.8090605@livinglogic.de> M.-A. Lemburg wrote: > Walter Dörwald wrote: >> M.-A. Lemburg wrote: >>> Walter Dörwald wrote: >>>> [...]
>>>> So maybe >>>> codecs.lookup() should return an instance of a subclass of tuple which >>>> has the StatefulEncoder/Decoder as attributes. But then codecs.lookup() >>>> must be able to handle old 4-tuples returned by old search functions and >>>> update those to the new 6-tuples. (But we could drop this again after >>>> several releases, once all third party codecs are updated). >>> This was a design error: I should have not made >>> codecs.lookup() a documented function. >>> >>> I'd suggest we keep codecs.lookup() the way it is and >>> instead add new functions to the codecs module, e.g. >>> codecs.getencoderobject() and codecs.getdecoderobject(). >>> >>> Changing the codec registration is not much of a problem: >>> we could simply allow 6-tuples to be passed into the >>> registry. >> OK, so codecs.lookup() returns 4-tuples, but the registry stores >> 6-tuples and the search functions must return 6-tuples. And we add >> codecs.getencoderobject() and codecs.getdecoderobject() as well as new >> classes codecs.StatefulEncoder and codecs.StatefulDecoder. What about >> old search functions that return 4-tuples? > > The registry should then simply set the missing entries to None > and the getencoderobject()/getdecoderobject() would then have > to raise an error. Sounds simple enough and we don't lose backwards compatibility. > Perhaps we should also deprecate codecs.lookup() in Py 2.5 ?! +1, but I'd like to have a replacement for this, i.e. a function that returns all info the registry has about an encoding:

1. Name
2. Encoder function
3. Decoder function
4. Stateful encoder factory
5. Stateful decoder factory
6. Stream writer factory
7. Stream reader factory

and if this is an object with attributes, we won't have any problems if we extend it in the future. BTW, if we change the API, can we fix the return value of the stateless functions? As the stateless function always encodes/decodes the complete string, returning the length of the string doesn't make sense.
codecs.getencoder() and codecs.getdecoder() would have to continue to return the old variant of the functions, but codecs.getinfo("latin-1").encoder would be the new encoding function. Bye, Walter Dörwald From chris at atlee.ca Fri Feb 17 17:48:28 2006 From: chris at atlee.ca (Chris AtLee) Date: Fri, 17 Feb 2006 11:48:28 -0500 Subject: [Python-Dev] Copying zlib compression objects Message-ID: <7790b6530602170848oe892897s4157c39b94082ce5@mail.gmail.com> I'm writing a program in python that creates tar files of a certain maximum size (to fit onto CD/DVD). One of the problems I'm running into is that when using compression, it's pretty much impossible to determine if a file, once added to an archive, will cause the archive size to exceed the maximum size. I believe that to do this properly, you need to copy the state of the tar file (basically the current file offset as well as the state of the compression object), then add the file. If the new size of the archive exceeds the maximum, you need to restore the original state. The critical part is being able to copy the compression object. Without compression it is trivial to determine if a given file will "fit" inside the archive. When using compression, the compression ratio of a file depends partially on all the data that has been compressed prior to it. The current implementation in the standard library does not allow you to copy these compression objects in a useful way, so I've made some minor modifications (patch attached) to the standard 2.4.2 library:

- Add copy() method to zlib compression object. This returns a new compression object with the same internal state. I named it copy() to keep it consistent with things like sha.copy().
- Add snapshot() / restore() methods to GzipFile and TarFile. These work only in write mode. snapshot() returns a state object. Passing in this state object to restore() will restore the state of the GzipFile / TarFile to the state represented by the object.
Future work:

- Decompression objects could use a copy() method too
- Add support for copying bzip2 compression objects

Although this patch isn't complete, does this seem like a good approach? Cheers, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060217/5f26b769/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: snapshots.diff Type: text/x-patch Size: 3500 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20060217/5f26b769/attachment-0001.bin From bokr at oz.net Fri Feb 17 18:28:45 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 17:28:45 GMT Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> <43F50BDD.4010106@v.loewis.de> Message-ID: <43f5f598.1044760656@news.gmane.org> On Fri, 17 Feb 2006 00:33:49 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >Josiah Carlson wrote: >> I would agree that zip is questionable, but 'uu', 'rot13', perhaps 'hex', >> and likely a few others that the two of you may be arguing against >> should stay as encodings, because strictly speaking, they are defined as >> encodings of data. They may not be encodings of _unicode_ data, but >> that doesn't mean that they aren't useful encodings for other kinds of >> data, some text, some binary, ... > >To support them, the bytes type would have to gain a .encode method, >and I'm -1 on supporting bytes.encode, or string.decode. > >Why is > >s.encode("uu") > >any better than > >binascii.b2a_uu(s) > One aspect is that dotted notation method calling is serially composable, whereas function calls nest, and you have to find and read from the innermost, which gets hard quickly unless you use multiline formatting.
But even then you can't read top to bottom as processing order. If we had a general serial composition syntax for function calls something like unix piping (which is a big part of the power of unix shells IMO) we could make the choice of appropriate composition semantics better. Decorators already compose functions in a limited way, but processing order would read like Forth horizontally. Maybe '->' ? How about foo(x, y) -> bar() -> baz(z) as sugar for baz.__get__(bar.__get__(foo(x, y))())(z) ? (Hope I got that right ;-) I.e., you'd have self-like args to receive results from upstream. E.g.,

>>> def foo(x, y): return 'foo(%s, %s)'%(x,y)
...
>>> def bar(stream): return 'bar(%s)'%stream
...
>>> def baz(stream, z): return 'baz(%s, %s)'%(stream,z)
...
>>> x = 'ex'; y='wye'; z='zed'
>>> baz.__get__(bar.__get__(foo(x, y))())(z)
'baz(bar(foo(ex, wye)), zed)'

Regards, Bengt Richter

From arigo at tunes.org Fri Feb 17 18:30:43 2006 From: arigo at tunes.org (Armin Rigo) Date: Fri, 17 Feb 2006 18:30:43 +0100 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: References: Message-ID: <20060217173043.GA14607@code0.codespeak.net> Hi, On Tue, Feb 14, 2006 at 09:24:57PM -0800, Neal Norwitz wrote: > http://www.python.org/peps/pep-0356.html There is at least one SF bug, namely "#1333982 Bugs of the new AST compiler", that in my humble opinion absolutely needs to be fixed before the release, even though I won't hide that I have no intention of fixing it myself. Should I raise the issue here in python-dev, and see if we agree that it is critical? (Sorry if I should know about the procedure. Does it then go in the PEP's Planned Features list?)
A bientot, Armin From skip at pobox.com Fri Feb 17 18:35:45 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 17 Feb 2006 11:35:45 -0600 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <43F5655E.7060901@holdenweb.com> References: <43F42425.3070807@acm.org> <43F5655E.7060901@holdenweb.com> Message-ID: <17398.2417.668291.784027@montanaro.dyndns.org> Steve> It appears to hang together, but I'm not sure I see how it Steve> overcomes objections to lambda by replacing it with another Steve> keyword. Well, it does replace it with a word which has meaning in common English. FWIW, I would require the parens around the arguments and avoid the ambiguity altogether. Skip From tjreedy at udel.edu Fri Feb 17 18:39:31 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 17 Feb 2006 12:39:31 -0500 Subject: [Python-Dev] Proposal: defaultdict References: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com><002301c6337a$7001f140$b83efea9@RaymondLaptop1> Message-ID: "Fredrik Lundh" wrote in message news:dt4icb$e9n$1 at sea.gmane.org... > Raymond Hettinger wrote: > >> I would like to add something like this to the collections module, but a >> PEP is >> probably needed to deal with issues like: > > frankly, now that Guido is working 50% on Python, do we really have to > use > the full PEP process also for simple things like this? > > I'd say we let the BDFL roam free. PEPs are useful for question-answering purposes even after approval. The design phase can be cut short by simply posting the approved design doc. From jeremy at alum.mit.edu Fri Feb 17 18:40:14 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 17 Feb 2006 12:40:14 -0500 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: <20060217173043.GA14607@code0.codespeak.net> References: <20060217173043.GA14607@code0.codespeak.net> Message-ID: It is critical, but I hadn't seen the bug report. Feel free to assign AST bugs to me and assign them a > 5 priority. 
Jeremy On 2/17/06, Armin Rigo wrote: > Hi, > > On Tue, Feb 14, 2006 at 09:24:57PM -0800, Neal Norwitz wrote: > > http://www.python.org/peps/pep-0356.html > > There is at least one SF bug, namely "#1333982 Bugs of the new AST > compiler", that in my humble opinion absolutely needs to be fixed before > the release, even though I won't hide that I have no intention of fixing > it myself. Should I raise the issue here in python-dev, and see if we > agree that it is critical? > > (Sorry if I should know about the procedure. Does it then go in the > PEP's Planned Features list?) > > > A bientot, > > Armin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From jeremy at alum.mit.edu Fri Feb 17 18:44:15 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 17 Feb 2006 12:44:15 -0500 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: References: <20060217173043.GA14607@code0.codespeak.net> Message-ID: Actually, it might be easier to assign separate bugs. A number of the old bugs appear to have been fixed. It's hard to track individual items within a bug report. Jeremy On 2/17/06, Jeremy Hylton wrote: > It is critical, but I hadn't seen the bug report. Feel free to assign > AST bugs to me and assign them a > 5 priority. > > Jeremy > > On 2/17/06, Armin Rigo wrote: > > Hi, > > > > On Tue, Feb 14, 2006 at 09:24:57PM -0800, Neal Norwitz wrote: > > > http://www.python.org/peps/pep-0356.html > > > > There is at least one SF bug, namely "#1333982 Bugs of the new AST > > compiler", that in my humble opinion absolutely needs to be fixed before > > the release, even though I won't hide that I have no intention of fixing > > it myself. Should I raise the issue here in python-dev, and see if we > > agree that it is critical? > > > > (Sorry if I should know about the procedure. 
Does it then go in the > > PEP's Planned Features list?) > > > > > > A bientot, > > > > Armin > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > > > From ianb at colorstudy.com Fri Feb 17 18:58:30 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 11:58:30 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <002301c6337a$7001f140$b83efea9@RaymondLaptop1> References: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com> <002301c6337a$7001f140$b83efea9@RaymondLaptop1> Message-ID: <43F60EC6.9080100@colorstudy.com> Raymond Hettinger wrote: >>>Over lunch with Alex Martelli, he proposed that a subclass of dict >>>with this behavior (but implemented in C) would be a good addition to >>>the language > > > I would like to add something like this to the collections module, but a PEP is > probably needed to deal with issues like: > > * implications of a __getitem__ succeeding while get(value, x) returns x > (possibly different from the overall default) > * implications of a __getitem__ succeeding while __contains__ would fail > * whether to add this to the collections module (I would say yes) > * whether to allow default functions as well as default values (so you could > instantiate a new default list) > * comparing all the existing recipes and third-party modules that have already > done this > * evaluating its fitness for common use cases (i.e. bags and dict of lists). It doesn't seem that useful for bags, assuming we're talking about an {object: count} implementation of bags; bags should really have a more set-like interface than a dict-like interface. A dict of lists typically means a multi-valued dict. 
In that case it seems like x[key_not_found] should return the empty list, as that means zero values; even though zero values also means that x.has_key(key_not_found) should return False as well. *but* getting x[key_not_found] does not (for a multi-valued dict) mean that suddenly has_key should return true. I find the side-effect nature of __getitem__ as proposed in default_dict to be rather confusing, and when reading code it will very much break my expectations. I assume that attribute access and [] access will not have side effects. Coming at it from that direction, I'm -1, though I'm +1 on dealing with the specific use case that started this (x.setdefault(key, []).append(value)). An implementation targeted specifically at multi-valued dictionaries seems like it would be better. Incidentally, on Web-SIG we've discussed wsgiref, and it includes a multi-valued, ordered, case-insensitive dictionary. Such a dictionary(ish) object has clear applicability for HTTP headers, but certainly it is something I've used many times elsewhere. In a case-sensitive form it applies to URL variables. Really there are several combinations of features, each with different uses. So we have now...

dicts: unordered, key:value (associative), single-value
sets: unordered, not key:value, single-value
lists: ordered, not key:value, multi-value

We don't have...

bags: unordered, not key:value, multi-value
multi-dict: unordered, key:value, multi-value
ordered-dict: ordered, key:value, single-value
ordered-multi-dict: ordered, key:value, multi-value

For all key:value collections, normalized keys can be useful. (Though notably the wsgiref Headers object does not have normalized keys, but instead does case-insensitive comparisons.) I don't know where dict-of-dict best fits in here.
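The multi-valued use case discussed here can already be emulated with plain dicts of lists; a minimal sketch (the helper names madd/mremove are invented for illustration, not a proposed API):

```python
# Multi-valued dict sketch: a plain dict mapping each key to a list of values.
def madd(d, key, value):
    # setdefault is exactly the idiom that started this thread
    d.setdefault(key, []).append(value)

def mremove(d, key, value):
    d[key].remove(value)      # raises KeyError/ValueError if absent
    if not d[key]:
        del d[key]            # drop empty keys so membership stays truthful

headers = {}
madd(headers, "accept", "text/html")
madd(headers, "accept", "text/plain")
mremove(headers, "accept", "text/html")
```

This keeps `key in d` honest (a key with zero values is simply absent), at the cost of every caller knowing the values are lists.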
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From fredrik at pythonware.com Fri Feb 17 19:10:03 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 17 Feb 2006 19:10:03 +0100 Subject: [Python-Dev] Proposal: defaultdict References: <2773CAC687FD5F4689F526998C7E4E5F4DB960@au3010avexu1.global.avaya.com><002301c6337a$7001f140$b83efea9@RaymondLaptop1> Message-ID: Terry Reedy wrote: > > I'd say we let the BDFL roam free. > > PEPs are useful for question-answering purposes even after approval. The > design phase can be cut short by simply posting the approved design doc. not for trivialities. it'll take Guido more time to write a PEP than to implement the damn thing. is that really a good use of his time ? why is python-dev suddenly full of control freaks ? From skip at pobox.com Fri Feb 17 19:20:54 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 17 Feb 2006 12:20:54 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F576A3.1030604@v.loewis.de> Message-ID: <17398.5126.939988.242626@montanaro.dyndns.org> >> Also, I think has_key/in should return True if there is a default. Fredrik> and keys should return all possible key values! I think keys() and in should reflect reality. Only when you do something like x = d['nonexistent'] or x = d.get('nonexistent') should the default value come into play. Skip From ianb at colorstudy.com Fri Feb 17 19:21:54 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 12:21:54 -0600 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: Message-ID: <43F61442.6050003@colorstudy.com> Georg Brandl wrote: > Hi, > > it has been proposed before, but there was no conclusive answer last time: > is there any chance for 2.5 to include commonly used decorators in a module? One peculiar aspect is that decorators are a programming technique, not a particular kind of functionality. So the module seems kind of funny as a result. 
> Of course not everything that jumps around should go in, only pretty basic > stuff that can be widely used. > > Candidates are: > - @decorator. This properly wraps up a decorator function to change the > signature of the new function according to the decorated one's. Yes, I like this, and it is purely related to "decorators" not anything else. Without this, decorators really hurt introspectability. > - @contextmanager, see PEP 343. This is abstract enough that it doesn't belong anywhere in particular. > - @synchronized/@locked/whatever, for thread safety. Seems better in the threading module. Plus contexts and with make it much less important as a decorator. > - @memoize Also abstract, so I suppose it would make sense. > - Others from wiki:PythonDecoratorLibrary and Michele Simionato's decorator > module at . redirecting_stdout is better implemented using contexts/with. @threaded (which runs the decorated function in a thread) seems strange to me. @blocking seems like it is going into async directions that don't really fit in with "decorators" (as a general concept). I like @tracing, though it doesn't seem like it is really implemented there, it's just an example? > Unfortunately, a @property decorator is impossible... It already works! But only if you want a read-only property. Which is actually about 50%+ of the properties I create. So the status quo is not really that bad. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From skip at pobox.com Fri Feb 17 19:26:00 2006 From: skip at pobox.com (skip at pobox.com) Date: Fri, 17 Feb 2006 12:26:00 -0600 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: Message-ID: <17398.5432.959908.657882@montanaro.dyndns.org> >> it has been proposed before, but there was no conclusive answer last >> time: is there any chance for 2.5 to include commonly used decorators >> in a module? Georg> No interest at all? 
I would think the decorators that allow proper introspection (func_name, __doc__, etc) should be available, probably in a decorators module. Beyond that I'm not sure. I think it would be better to be conservative. Skip From g.brandl at gmx.net Fri Feb 17 19:35:55 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 19:35:55 +0100 Subject: [Python-Dev] The decorator(s) module In-Reply-To: <43F61442.6050003@colorstudy.com> References: <43F61442.6050003@colorstudy.com> Message-ID: Ian Bicking wrote: >> Unfortunately, a @property decorator is impossible... > > It already works! But only if you want a read-only property. Which is > actually about 50%+ of the properties I create. So the status quo is > not really that bad. I have abused it this way too and felt bad every time. Kind of like keeping your hat on in the church. :) Georg From bokr at oz.net Fri Feb 17 19:37:08 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 18:37:08 GMT Subject: [Python-Dev] Proposal: defaultdict References: <43F576A3.1030604@v.loewis.de> Message-ID: <43f613df.1052511501@news.gmane.org> On Fri, 17 Feb 2006 08:09:23 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >Guido van Rossum wrote: >> Feedback? > >I would like this to be part of the standard dictionary type, >rather than being a subtype. > >d.setdefault([]) (one argument) should install a default value, >and d.cleardefault() should remove that setting; d.default >should be read-only. Alternatively, d.default could be assignable >and del-able. I like the latter, but d.default_factory = callable # or None > >Also, I think has_key/in should return True if there is a default. That seems iffy. ISTM potential should not define actual status. 
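The "potential vs. actual" distinction can be made concrete with a small dict subclass that serves a default on lookup without recording the key (a sketch only; it leans on the __missing__ hook that dict subclasses can define, and the class name is invented):

```python
class DefaultingDict(dict):
    """Lookup falls back to a default, but membership and keys()
    keep reporting only what was actually stored."""
    default = 0

    def __missing__(self, key):       # called by dict.__getitem__ on a miss
        return self.default           # returned, deliberately NOT inserted

d = DefaultingDict(a=1)
missing = d["b"]      # 0 -- the *potential* value
present = "b" in d    # False -- the *actual* status is unchanged
```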
Regards, Bengt Richter From guido at python.org Fri Feb 17 20:09:47 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 11:09:47 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/16/06, Guido van Rossum wrote: > Over lunch with Alex Martelli, he proposed that a subclass of dict > with this behavior (but implemented in C) would be a good addition to > the language. It looks like it wouldn't be hard to implement. It could > be a builtin named defaultdict. The first, required, argument to the > constructor should be the default value. Remaining arguments (even > keyword args) are passed unchanged to the dict constructor. Thanks for all the constructive feedback. Here are some responses and a new proposal.

- Yes, I'd like to kill setdefault() in 3.0 if not sooner.

- It would indeed be nice if this was an optional feature of the standard dict type.

- I'm ignoring the request for other features (ordering, key transforms). If you want one of these, write a PEP!

- Many, many people suggested to use a factory function instead of a default value. This is indeed a much better idea (although slightly more cumbersome for the simplest cases).

- Some people seem to think that a subclass constructor signature must match the base class constructor signature. That's not so. The subclass constructor must just be careful to call the base class constructor with the correct arguments. Think of the subclass constructor as a factory function.

- There's a fundamental difference between associating the default value with the dict object, and associating it with the call. So proposals to invent a better name/signature for setdefault() don't compete.
(As to one specific such proposal, adding an optional bool as the 3rd argument to get(), I believe I've explained enough times in the past that flag-like arguments that always get a constant passed in at the call site are a bad idea and should usually be refactored into two separate methods.)

- The inconsistency introduced by __getitem__() returning a value for keys while get(), __contains__(), and keys() etc. don't show it, cannot be resolved usefully. You'll just have to live with it. Modifying get() to do the same thing as __getitem__() doesn't seem useful -- it just takes away a potentially useful operation.

So here's a new proposal. Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None. The implementation is like this (but in C):

def on_missing(self, key):
    if self.default_factory is not None:
        value = self.default_factory()
        self[key] = value
        return value
    raise KeyError(key)

When __getitem__() (and *only* __getitem__()) finds that the requested key is not present in the dict, it calls self.on_missing(key) and returns whatever it returns -- or raises whatever it raises. __getitem__() doesn't need to raise KeyError any more, that's done by on_missing(). The on_missing() method can be overridden to implement any semantics you want when the key isn't found: return a value without inserting it, insert a value without copying it, only do it for certain key types/values, make the default incorporate the key, etc. But the default implementation is designed so that we can write

d = {}
d.default_factory = list

to create a dict that inserts a new list whenever a key is not found in __getitem__(), which is most useful in the original use case: implementing a multiset so that one can write d[key].append(value) to add a new key/value to the multiset without having to handle the case separately where the key isn't in the dict yet. This also works for sets instead of lists:

d = {}
d.default_factory = set
...
d[key].add(value)

I went through several iterations to obtain this design; my first version of on_missing() would just raise KeyError(key), requiring you to always provide a subclass; this is more minimalistic but less useful and would probably raise the bar for using the feature to some extent. To save you attempts to simplify this, here are some near-misses I considered that didn't quite work out:

- def on_missing(self, key):
      if self.default_factory is not None:
          return self.default_factory()
      raise KeyError(key)

  This would require the multiset example to subclass, since default_factory doesn't see the key so it can't insert it.

- def on_missing(self, key):
      if self.default_factory is not None:
          return self.default_factory(key)
      raise KeyError(key)

  This appears to fix that problem, but now you can't write "d.default_factory = list" since (a) list(key) doesn't return an empty list, and (b) it also doesn't insert the key into the dict; attempting to assign a callback function to default_factory that solves these issues fails because the callback doesn't have access to the dict instance (unless there's only one).

- Do away with on_missing() and just include its body at the end of __getitem__(), to be invoked when the key isn't found. This is less general in case you want different default semantics (e.g. not inserting the default, or making the default a function of the key) -- you'd have to override __getitem__() for that, which means you'd be paying overhead even for keys that *are* present.

I'll try to cook up an implementation on SF after I've dug myself out of the day's email barrage. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Fri Feb 17 20:26:40 2006 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 17 Feb 2006 14:26:40 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <200602171426.41124.fdrake@acm.org> On Friday 17 February 2006 14:09, Guido van Rossum wrote: > So here's a new proposal. I like the version you came up with. It has sufficient functionality to make it easy to use, and enough flexibility to be useful in more specialized cases. I'm quite certain it would handle all the cases I've actually dealt with where I wanted a variation of a mapping with default values. -Fred -- Fred L. Drake, Jr. From aleaxit at gmail.com Fri Feb 17 20:34:46 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 17 Feb 2006 11:34:46 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/16/06, Guido van Rossum wrote: > A bunch of Googlers were discussing the best way of doing the ... Wow, what a great discussion! As you'll recall, I had also mentioned the "callable factory" as a live possibility, and there seems to be a strong sentiment in favor of that; not really a "weakness case" for HOFs, as you feared it might be during the lunchtime discussion. Out of all I've read here, I like the idea of having a collections.autodict (a much nicer name than defaultdict, a better collocation for 2.5 than the builtins). One point I think nobody has made is that whenever reasonably possible the setting of a callback (the callable factory here) should include *a and **k to use when calling back. So, for example: ad = collections.autodict(copy.copy, whatever) would easily cover the use case of Google's DefaultDict (yes, partial would also cover this use case, but having *a and **k is usefully more general). If you're convinced copy.copy is an overwhelmingly popular use case (I'm not, personally), then this specific idiom might also be summarized in a classmethod, a la ad = collections.autodict.by_copy(whatever) This way, all autodicts would start out empty (and be filled by update if needed). 
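The copy-based factory described above can be sketched with functools.partial standing in for built-in *a/**k support (the AutoDict class and its wiring are illustrative only, not the proposed API):

```python
import copy
import functools

class AutoDict(dict):
    """Sketch: every missing key gets a value from a zero-argument factory."""
    def __init__(self, factory):
        dict.__init__(self)
        self.factory = factory

    def __missing__(self, key):       # called by dict.__getitem__ on a miss
        self[key] = value = self.factory()
        return value

# The "copy a template" use case: bind copy.copy to its argument up front.
template = {"retries": 3, "verbose": False}
ad = AutoDict(functools.partial(copy.copy, template))
ad["job1"]["verbose"] = True          # mutates a fresh copy, not the template
```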
An alternative would be to have autodict's ctor have the same signature as dict's, with a separate .set_initial method to pass the factory (and *a, **k) -- this way an autodict might start out populated, but would always start with some default factory, such as lambda:None I guess. I think the first alternative (autodict always starts empty, but with a specifically chosen factory [including *a, **k]) is more useful. Alex From ianb at colorstudy.com Fri Feb 17 20:51:04 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 13:51:04 -0600 Subject: [Python-Dev] Counter proposal: multidict (was: Proposal: defaultdict) In-Reply-To: References: Message-ID: <43F62928.90809@colorstudy.com> I really don't like that defaultdict (or a dict extension) means that x[not_found] will have noticeable side effects. This all seems to be a roundabout way to address one important use case of a dictionary with multiple values for each key, and in the process breaking an important quality of good Python code, that attribute and getitem access not have noticeable side effects. So, here's a proposed interface for a new multidict object, borrowing some methods from Set but mostly from dict. Some things that seemed particularly questionable to me are marked with ??.

class multidict:

    def __init__([mapping], [**kwargs]):
        """
        Create a multidict:
        multidict() -> new empty multidict
        multidict(mapping) -> equivalent to:
            ob = multidict()
            ob.update(mapping)
        multidict(**kwargs) -> equivalent to:
            ob = multidict()
            ob.update(kwargs)
        """

    def __contains__(key):
        """
        True if ``self[key]`` is true
        """

    def __getitem__(key):
        """
        Returns a list of items associated with the given key.
        If nothing, then the empty list.
        ??: Is the list mutable, and to what effect?
        """

    def __delitem__(key):
        """
        Removes any instances of key from the dictionary. Does not
        raise an error if there are no values associated.
        ??: Should this raise a KeyError sometimes?
        """

    def __setitem__(key, value):
        """
        Same as:
            del self[key]
            self.add(key, value)
        """

    def get(key, default=[]):
        """
        Returns a list of items associated with the given key, or if
        that list would be empty it returns default
        """

    def getfirst(key, default=None):
        """
        Equivalent to:
            if key in self:
                return self[key][0]
            else:
                return default
        """

    def add(key, value):
        """
        Adds the value with the given key, so that
        self[key][-1] == value
        """

    def remove(key, value):
        """
        Remove (key, value) from the mapping (raising KeyError if not
        present).
        """

    def discard(key, value):
        """
        Remove like self.remove(key, value), except do not raise
        KeyError if missing.
        """

    def pop(key):
        """
        Removes key and returns the value; returns [] and does nothing
        if the key is not found.
        """

    def keys():
        """
        Returns all the keys which have some associated value.
        """

    def items():
        """
        Returns [(key, value)] for every key/value pair. Keys that
        have multiple values will be returned as multiple (key, value)
        tuples.
        """

    def __len__():
        """
        Equivalent to len(self.items())
        ??: Not len(self.keys())?
        """

    def update(E, **kwargs):
        """
        if E has iteritems then::
            for k, v in E.iteritems():
                self.add(k, v)
        elif E has keys:
            for k in E:
                self.add(k, E[k])
        else:
            for k, v in E:
                self.add(k, v)
        ??: Should **kwargs be allowed? If so, should the values be
        sequences?
        """

    # iteritems, iterkeys, iter, has_key, copy, popitem, values, clear
    # with obvious implementations

From guido at python.org Fri Feb 17 20:58:10 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 11:58:10 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/17/06, Alex Martelli wrote: > On 2/16/06, Guido van Rossum wrote: > > A bunch of Googlers were discussing the best way of doing the > ... > Wow, what a great discussion!
> As you'll recall, I had also mentioned > the "callable factory" as a live possibility, and there seems to be a > strong sentiment in favor of that; not really a "weakness case" for > HOFs, as you feared it might be during the lunchtime discussion.

:-) You seem to have missed my revised proposal.

> Out of all I've read here, I like the idea of having a > collections.autodict (a much nicer name than defaultdict, a better > collocation for 2.5 than the builtins). One point I think nobody has > made is that whenever reasonably possible the setting of a callback > (the callable factory here) should include *a and **k to use when > calling back.

That's your C/C++ brain talking. :-) If you need additional data passed to a callback (to be provided at the time the callback is *set*, not when it is *called*) the customary approach is to make the callback a parameterless lambda; you can also use a bound method, etc. There's no need to complicate every piece of code that calls a callback with the machinery to store and use arbitrary arguments and keyword arguments.

I forgot to mention in my revised proposal that the API for setting the default_factory is slightly odd:

d = {}  # or dict()
d.default_factory = list

rather than

d = dict(default_factory=list)

This is of course because we cut off that way when we defined what arbitrary keyword arguments to the dict constructor would do. My original proposal solved this by creating a subclass. But there were several suggestions that this would be fine functionality to add to the standard dict type -- and then I really don't see any other way to do this. (Yes, I could have a set_default_factory() method -- but a simple settable attribute seems more pythonic!)
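A pure-Python sketch of the proposed semantics (the proposal is for C code in the built-in dict; here it is wired through a subclass's __missing__ hook so the sketch actually runs, and the class name Dict is invented):

```python
class Dict(dict):
    """Sketch of the proposal: a settable default_factory attribute
    plus an overridable on_missing() hook."""
    default_factory = None

    def on_missing(self, key):
        if self.default_factory is not None:
            value = self.default_factory()
            self[key] = value             # insert, then return
            return value
        raise KeyError(key)

    __missing__ = on_missing              # dict.__getitem__ calls this on a miss

d = Dict()
d.default_factory = list                  # the settable attribute, as proposed
d["k"].append(1)                          # first access inserts a fresh []
d["k"].append(2)                          # second access finds the stored list
```

With no factory set, lookups of absent keys still raise KeyError, so plain-dict behavior is unchanged.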
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Feb 17 21:02:15 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 12:02:15 -0800 Subject: [Python-Dev] Counter proposal: multidict (was: Proposal: defaultdict) In-Reply-To: <43F62928.90809@colorstudy.com> References: <43F62928.90809@colorstudy.com> Message-ID: On 2/17/06, Ian Bicking wrote: > I really don't like that defaultdict (or a dict extension) means that > x[not_found] will have noticeable side effects. This all seems to be a > roundabout way to address one important use case of a dictionary with > multiple values for each key, and in the process breaking an important > quality of good Python code, that attribute and getitem access not have > noticeable side effects. > > So, here's a proposed interface for a new multidict object, borrowing > some methods from Set but mostly from dict. Some things that seemed > particularly questionable to me are marked with ??. Have you seen my revised proposal (which is indeed an addition to the standard dict rather than a subclass)? Your multidict addresses only one use case for the proposed behavior; what's so special about dicts of lists that they should have special support? What about dicts of dicts, dicts of sets, dicts of user-defined objects? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Fri Feb 17 21:03:06 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 17 Feb 2006 15:03:06 -0500 Subject: [Python-Dev] Counter proposal: multidict (was: Proposal: defaultdict) In-Reply-To: <43F62928.90809@colorstudy.com> References: <43F62928.90809@colorstudy.com> Message-ID: <200602171503.06407.fdrake@acm.org> On Friday 17 February 2006 14:51, Ian Bicking wrote: > This all seems to be a > roundabout way to address one important use case of a dictionary with > multiple values for each key, I think there are use cases that do not involve multiple values per key. 
That is one place where this commonly comes up, but not the only one.

> and in the process breaking an important > quality of good Python code, that attribute and getitem access not have > noticeable side effects.

I'm not sure that's quite as well-defined or agreed upon as you suggest. -Fred -- Fred L. Drake, Jr. From theller at python.net Fri Feb 17 21:04:42 2006 From: theller at python.net (Thomas Heller) Date: Fri, 17 Feb 2006 21:04:42 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F62C5A.7030806@python.net> Guido van Rossum wrote: > So here's a new proposal. > > Let's add a generic missing-key handling method to the dict class, as > well as a default_factory slot initialized to None. The implementation > is like this (but in C):
>
> def on_missing(self, key):
>     if self.default_factory is not None:
>         value = self.default_factory()
>         self[key] = value
>         return value
>     raise KeyError(key)
>
> When __getitem__() (and *only* __getitem__()) finds that the requested > key is not present in the dict, it calls self.on_missing(key) and > returns whatever it returns -- or raises whatever it raises. > __getitem__() doesn't need to raise KeyError any more, that's done by > on_missing().

Will this also work when PyDict_GetItem() does not find the key? Thomas From guido at python.org Fri Feb 17 21:11:29 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 12:11:29 -0800 Subject: [Python-Dev] Copying zlib compression objects In-Reply-To: <7790b6530602170848oe892897s4157c39b94082ce5@mail.gmail.com> References: <7790b6530602170848oe892897s4157c39b94082ce5@mail.gmail.com> Message-ID: Please submit your patch to SourceForge. On 2/17/06, Chris AtLee wrote: > I'm writing a program in python that creates tar files of a certain > maximum size (to fit onto CD/DVD).
One of the problems I'm running > into is that when using compression, it's pretty much impossible to > determine if a file, once added to an archive, will cause the archive > size to exceed the maximum size. > > > I believe that to do this properly, you need to copy the state of tar > file (basically the current file offset as well as the state of the > compression object), then add the file. If the new size of the archive > exceeds the maximum, you need to restore the original state. > > > The critical part is being able to copy the compression object. > Without compression it is trivial to determine if a given file will > "fit" inside the archive. When using compression, the compression > ratio of a file depends partially on all the data that has been > compressed prior to it. > > > The current implementation in the standard library does not allow you > to copy these compression objects in a useful way, so I've made some > minor modifications (patch attached) to the standard 2.4.2 library: > - Add copy() method to zlib compression object. This returns a new > compression object with the same internal state. I named it copy() to > keep it consistent with things like sha.copy(). > - Add snapshot() / restore() methods to GzipFile and TarFile. These > work only in write mode. snapshot() returns a state object. Passing > in this state object to restore() will restore the state of the > GzipFile / TarFile to the state represented by the object. > > > Future work: > - Decompression objects could use a copy() method too > - Add support for copying bzip2 compression objects > > > Although this patch isn't complete, does this seem like a good approach? 
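Assuming a copy() method on the compression object, as the patch adds, the snapshot approach would look roughly like this (the would_fit helper and the sizes are illustrative only):

```python
import zlib

def would_fit(comp, data, written, limit):
    """Compress `data` on a snapshot of the compressor and report whether
    the finished archive would still fit under `limit` bytes, leaving the
    live compressor state untouched."""
    trial = comp.copy()                   # snapshot of the compression state
    extra = trial.compress(data) + trial.flush(zlib.Z_FINISH)
    return written + len(extra) <= limit

comp = zlib.compressobj()
member = b"hello world\n" * 1000
if would_fit(comp, member, 0, 4096):
    out = comp.compress(member)           # commit for real; comp was undisturbed
```

Because compression ratio depends on everything compressed so far, this trial-run-on-a-copy is the only reliable way to answer "will it fit?" before committing.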
> > > Cheers, > Chris > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Fri Feb 17 21:18:42 2006 From: theller at python.net (Thomas Heller) Date: Fri, 17 Feb 2006 21:18:42 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F62C5A.7030806@python.net> Message-ID: <43F62FA2.80302@python.net> [cc to py-dev again] Guido van Rossum wrote: > On 2/17/06, Thomas Heller wrote: >> Guido van Rossum wrote: >>> So here's a new proposal. >>> >>> Let's add a generic missing-key handling method to the dict class, as >>> well as a default_factory slot initialized to None. The implementation >>> is like this (but in C): >>> >>> def on_missing(self, key): >>> if self.default_factory is not None: >>> value = self.default_factory() >>> self[key] = value >>> return value >>> raise KeyError(key) >>> >>> When __getitem__() (and *only* __getitem__()) finds that the requested >>> key is not present in the dict, it calls self.on_missing(key) and >>> returns whatever it returns -- or raises whatever it raises. >>> __getitem__() doesn't need to raise KeyError any more, that's done by >>> on_missing(). >> Will this also work when PyDict_GetItem() does not find the key? > > Ouch, tricky. It should, of course, but the code will be a tad tricky > because it's not supposed to inc the refcount. Thanks for reminding > me! > Ahem, I'm still looking for ways to 'overtake' the dict to implement weird and fancy things. Can on_missing be overridden in subclasses (writing the subclass in C would not be a problem)? 
Thanks, Thomas From guido at python.org Fri Feb 17 21:23:06 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 12:23:06 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F62FA2.80302@python.net> References: <43F62C5A.7030806@python.net> <43F62FA2.80302@python.net> Message-ID: On 2/17/06, Thomas Heller wrote: > Ahem, I'm still looking for ways to 'overtake' the dict to implement > weird and fancy things. Can on_missing be overridden in subclasses (writing > the subclass in C would not be a problem)? Why ahem? The answer is yes. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jack at performancedrivers.com Fri Feb 17 21:25:14 2006 From: jack at performancedrivers.com (Jack Diederich) Date: Fri, 17 Feb 2006 15:25:14 -0500 Subject: [Python-Dev] Counter proposal: multidict (was: Proposal: defaultdict) In-Reply-To: <200602171503.06407.fdrake@acm.org> References: <43F62928.90809@colorstudy.com> <200602171503.06407.fdrake@acm.org> Message-ID: <20060217202514.GL6100@performancedrivers.com> On Fri, Feb 17, 2006 at 03:03:06PM -0500, Fred L. Drake, Jr. wrote: > On Friday 17 February 2006 14:51, Ian Bicking wrote: > > and in the process breaking an important > > quality of good Python code, that attribute and getitem access not have > > noticeable side effects. > > I'm not sure that's quite as well-defined or agreed upon as you do. Without the __getitem__ side effect, default objects that don't support the needed operators would have problems: d[key] += val works fine when the default is a list or int, but fails for dicts and presumably many user-defined objects. By assigning the default value in __getitem__, the returned value can be manipulated via its methods.
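The contrast can be seen side by side with two toy dict subclasses (names invented; both use the __missing__ hook dict subclasses can define):

```python
class Returning(dict):
    def __missing__(self, key):
        return {}                 # fresh default, merely returned, NOT stored

class Inserting(dict):
    def __missing__(self, key):
        self[key] = value = {}    # default is inserted before being returned
        return value

r = Returning()
r["a"]["x"] = 1                   # mutates a throwaway dict...
lost = "a" in r                   # False -- the update is silently lost

i = Inserting()
i["a"]["x"] = 1                   # same expression against the inserting variant
kept = i["a"]                     # {'x': 1} -- the update sticks
```

For list/int defaults, `d[key] += val` happens to work either way because augmented assignment stores the result back; method calls like update() or append() only work when the default was actually inserted.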
-Jack From theller at python.net Fri Feb 17 21:27:36 2006 From: theller at python.net (Thomas Heller) Date: Fri, 17 Feb 2006 21:27:36 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F62C5A.7030806@python.net> <43F62FA2.80302@python.net> Message-ID: <43F631B8.30306@python.net> Guido van Rossum wrote: > On 2/17/06, Thomas Heller wrote: >> Ahem, I'm still looking for ways to 'overtake' the dict to implement >> weird and fancy things. Can on_missing be overridden in subclasses (writing >> the subclass in C would not be a problem)? > > Why ahem? > > The answer is yes. Ok, so that allows to pass the key, for example, to the default_factory - allowing the case insensitive lookup in namespaces. Thomas From martin at v.loewis.de Fri Feb 17 21:35:25 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 21:35:25 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F5CFE6.3040502@egenix.com> References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> <43F50BDD.4010106@v.loewis.de> <43F5CFE6.3040502@egenix.com> Message-ID: <43F6338D.8050300@v.loewis.de> M.-A.
Lemburg wrote: > Just because some codecs don't fit into the string.decode() > or bytes.encode() scenario doesn't mean that these codecs are > useless or that the methods should be banned. No. The reason to ban string.decode and bytes.encode is that it confuses users. Regards, Martin From ianb at colorstudy.com Fri Feb 17 21:51:26 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 14:51:26 -0600 Subject: [Python-Dev] Counter proposal: multidict In-Reply-To: References: <43F62928.90809@colorstudy.com> Message-ID: <43F6374E.5020708@colorstudy.com> Guido van Rossum wrote: > On 2/17/06, Ian Bicking wrote: > >>I really don't like that defaultdict (or a dict extension) means that >>x[not_found] will have noticeable side effects. This all seems to be a >>roundabout way to address one important use case of a dictionary with >>multiple values for each key, and in the process breaking an important >>quality of good Python code, that attribute and getitem access not have >>noticeable side effects. >> >>So, here's a proposed interface for a new multidict object, borrowing >>some methods from Set but mostly from dict. Some things that seemed >>particularly questionable to me are marked with ??. > > > Have you seen my revised proposal (which is indeed an addition to the > standard dict rather than a subclass)? Yes, and though it is more general it has the same issue of side effects. Doesn't it seem strange that getting an item will change the values of .keys(), .items(), and .has_key()? > Your multidict addresses only one use case for the proposed behavior; > what's so special about dicts of lists that they should have special > support? What about dicts of dicts, dicts of sets, dicts of > user-defined objects? What's so special? 95% (probably more!) of current use of .setdefault() is .setdefault(key, []).append(value). Also, since when do features have to address all possible cases? 
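The setdefault() idiom Ian is referring to, for reference:

```python
d = {}
for key, value in [('a', 1), ('b', 2), ('a', 3)]:
    # the common case: accumulate multiple values per key
    d.setdefault(key, []).append(value)
# d is now {'a': [1, 3], 'b': [2]}
```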
Certainly there are other cases, and I think they can be answered with other classes. Here are some current options: .setdefault() -- works with any subtype; slightly less efficient than what you propose. Awkward to read; doesn't communicate intent very well. UserDict -- works for a few cases where you want to make dict-like objects. Messes up the concept of identity and containment -- resulting objects both "are" dictionaries, and "contain" a dictionary (obj.data). DictMixin -- does anything you can possibly want, requiring only the overriding of a couple methods. dict subclassing -- does anything you want as well, but you typically have to override many more methods than with DictMixin (and if you don't have to override every method, that's not documented in any way). Isn't written with subclassing in mind. Really, you are proposing that one specific kind of override be made feasible, either with subclassing or injecting a method. That said, I'm not saying that several kinds of behavior shouldn't be supported. I just don't see why dict should support them all (or multidict). And I also think dict will support them poorly. multidict implements one behavior *well*. In a documented way, with a name people can refer to. I can say "multidict", I can't say "a dict where I set default_factory to list" (well, I can say that, but that just opens up yet more questions and clarifications). Some ways multidict differs from default_factory=list: * __contains__ works (you have to use .get() with default_factory to get a meaningful result) * Barring cases where there are exceptions, x[key] and x.get(key) return the same value for multidict; with default_factory one returns [] and the other returns None when the key isn't found. But if you do x[key]; x.get(key) then x.get(key) always returns []. * You can't use __setitem__ to put non-list items into a multidict; with multidict you don't have to guard against non-sequence values.
* [] is meaningful not just as the default value, but as a null value; the multidict implementation respects both aspects. * Specific method x.add(key, value) that indicates intent in a way that x[key].append(value) does not. * items and iteritems return values meaningful to the context (a list of (key, single_value) -- this is usually what I want, and avoids a nested for loop). __len__ also usefully different than in dict. * .update() handles iteritems sensibly, and updates from dictionaries sensibly -- if you mix a default_factory=list dict with a "normal" (single-value) dictionary you'll get an effectively corrupted dictionary (where some keys are lists) * x.getfirst(key) is useful * I think this will be much easier to reason about in situations with threads -- dict acts very predictably with threads, and people rely upon that * multidict can be written either with subclassing intended, or with an abstract superclass, so that other kinds of specializations of this superset of the dict interface can be made more easily (if DictMixin itself isn't already sufficient) So, I'm saying: multidict handles one very common collection need that dict handles awkwardly now. multidict is a meaningful and useful class with its own identity/name and meaning separate from dict, and has methods that represent both the intersection and the difference between the two classes. multidict does not in any way preclude other collection objects for other situations; it is entirely unfair to expect a new class to solve all issues. multidict suggests an interface that other related classes can use (e.g., an ordered version). multidict, unlike default_factory, is not just a recipe for creating a specific and commonly needed object, it is a class for creating it. 
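A minimal sketch of the interface Ian describes (add() and getfirst() are his names; the containment rule treats an empty value list as absent; the rest is an illustrative reduction of the full proposal, not its actual implementation):

```python
class multidict(dict):
    """Each key maps to a list of values."""

    def add(self, key, value):
        dict.setdefault(self, key, []).append(value)

    def getfirst(self, key):
        values = dict.get(self, key)
        if not values:
            raise KeyError(key)
        return values[0]

    def __contains__(self, key):
        # [] is a null value as well as a default: no values means "not in"
        return bool(dict.get(self, key))

m = multidict()
m.add('tag', 'a')
m.add('tag', 'b')
# m.getfirst('tag') == 'a'; 'tag' in m; 'other' not in m
```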
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Fri Feb 17 22:03:49 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 15:03:49 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F63A35.5080500@colorstudy.com> Guido van Rossum wrote: > d = {} > d.default_factory = set > ... > d[key].add(value) Another option would be: d = {} d.default_factory = set d.get_default(key).add(value) Unlike .setdefault, this would use a factory associated with the dictionary, and no default value would get passed in. Unlike the proposal, this would not override __getitem__ (not overriding __getitem__ is really the only difference with the proposal). It would be clear reading the code that you were not implicitly asserting they "key in d" was true. "get_default" isn't the best name, but another name isn't jumping out at me at the moment. Of course, it is not a Pythonic argument to say that an existing method should be overridden, or functionality made nameless simply because we can't think of a name (looking to anonymous functions of course ;) -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From mal at egenix.com Fri Feb 17 22:11:35 2006 From: mal at egenix.com (M.-A. 
Lemburg) Date: Fri, 17 Feb 2006 22:11:35 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F5FD88.8090605@livinglogic.de> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> <43F5FD88.8090605@livinglogic.de> Message-ID: <43F63C07.9030901@egenix.com> Walter D?rwald wrote: >>>> I'd suggest we keep codecs.lookup() the way it is and >>>> instead add new functions to the codecs module, e.g. >>>> codecs.getencoderobject() and codecs.getdecoderobject(). >>>> >>>> Changing the codec registration is not much of a problem: >>>> we could simply allow 6-tuples to be passed into the >>>> registry. >>> OK, so codecs.lookup() returns 4-tuples, but the registry stores >>> 6-tuples and the search functions must return 6-tuples. And we add >>> codecs.getencoderobject() and codecs.getdecoderobject() as well as new >>> classes codecs.StatefulEncoder and codecs.StatefulDecoder. What about >>> old search functions that return 4-tuples? >> >> The registry should then simply set the missing entries to None >> and the getencoderobject()/getdecoderobject() would then have >> to raise an error. > > Sounds simple enough and we don't loose backwards compatibility. > >> Perhaps we should also deprecate codecs.lookup() in Py 2.5 ?! > > +1, but I'd like to have a replacement for this, i.e. a function that > returns all info the registry has about an encoding: > > 1. Name > 2. Encoder function > 3. Decoder function > 4. Stateful encoder factory > 5. Stateful decoder factory > 6. Stream writer factory > 7. Stream reader factory > > and if this is an object with attributes, we won't have any problems if > we extend it in the future. 
Shouldn't be a problem: just expose the registry dictionary via the _codecs module. The rest can then be done in a Python function defined in codecs.py using a CodecInfo class. > BTW, if we change the API, can we fix the return value of the stateless > functions? As the stateless function always encodes/decodes the complete > string, returning the length of the string doesn't make sense. > codecs.getencoder() and codecs.getdecoder() would have to continue to > return the old variant of the functions, but > codecs.getinfo("latin-1").encoder would be the new encoding function. No: you can still write stateless encoders or decoders that do not process the whole input string. Just because we don't have any of those in Python, doesn't mean that they can't be written and used. A stateless codec might want to leave the work of buffering bytes at the end of the input data which cannot be processed to the caller. It is also possible to write stateful codecs on top of such stateless encoding and decoding functions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 17 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From aahz at pythoncraft.com Fri Feb 17 22:18:54 2006 From: aahz at pythoncraft.com (Aahz) Date: Fri, 17 Feb 2006 13:18:54 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <20060217211854.GA14428@panix.com> On Fri, Feb 17, 2006, Guido van Rossum wrote: > > But the default implementation is designed so that we can write > > d = {} > d.default_factory = list +1 I actually like the fact that you're forced to use a separate statement for setting the default_factory. 
From my POV, this can go into 2.5. (I was only +0 on the previous proposal and I was -1 on making it a built-in; this extension is much nicer.) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From guido at python.org Fri Feb 17 22:26:30 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 13:26:30 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: On 2/16/06, Stephen J. Turnbull wrote: > /usr/share often is on a different mount; that's the whole rationale > for /usr/share. I don't think I've worked at a place where something like that was done for at least 10 years. Isn't this argument outdated? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Fri Feb 17 22:54:08 2006 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 17 Feb 2006 14:54:08 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/17/06, Guido van Rossum wrote: > - There's a fundamental difference between associating the default > value with the dict object, and associating it with the call. So > proposals to invent a better name/signature for setdefault() don't > compete. That's a feature, not a bug. :) See below. > - The inconsistency introduced by __getitem__() returning a value for > keys while get(), __contains__(), and keys() etc. don't show it, > cannot be resolved usefully. You'll just have to live with it. > Modifying get() to do the same thing as __getitem__() doesn't seem > useful -- it just takes away a potentially useful operation. Again, see below. > So here's a new proposal. 
> > Let's add a generic missing-key handling method to the dict class, as > well as a default_factory slot initialized to None. The implementation > is like this (but in C): > > def on_missing(self, key): > if self.default_factory is not None: > value = self.default_factory() > self[key] = value > return value > raise KeyError(key) > > When __getitem__() (and *only* __getitem__()) finds that the requested > key is not present in the dict, it calls self.on_missing(key) and > returns whatever it returns -- or raises whatever it raises. > __getitem__() doesn't need to raise KeyError any more, that's done by > on_missing(). Still -1. It's better, but it violates the principle of encapsulation by mixing how-you-use-it state with what-it-stores state. In doing that it has the potential to break an API documented as accepting a dict. Code that expects d[key] to raise an exception (and catches the resulting KeyError) will now silently "succeed". I believe that necessitates a PEP to document it. It also makes it harder to read code. You may expect d[key] to raise an exception, but it won't because of a single line up several pages (or in another file entirely!) d.getorset(key, func) has no such problems and has a much simpler specification: def getorset(self, key, func): try: return self[key] except KeyError: value = self[key] = func() return value -- Adam Olsen, aka Rhamphoryncus From bokr at oz.net Fri Feb 17 22:59:19 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 21:59:19 GMT Subject: [Python-Dev] Serial function call composition syntax foo(x, y) -> bar() -> baz(z) Message-ID: <43f61ed8.1055320110@news.gmane.org> Cut to the chase: how about being able to write baz(bar(foo(x, y)),z) serially as foo(x, y) -> bar() -> baz(z) via the above as sugar for baz.__get__(bar.__get__(foo(x, y))())(z) ? I.e., you'd have self-like args to receive results from upstream. E.g., >>> def foo(x, y): return 'foo(%s, %s)'%(x,y) ...
>>> def bar(stream): return 'bar(%s)'%stream ... >>> def baz(stream, z): return 'baz(%s, %s)'%(stream,z) ... >>> x = 'ex'; y='wye'; z='zed' then (faked) >>> foo(x, y) -> bar() -> baz(z) 'baz(bar(foo(ex, wye)), zed)' would do (actual) >>> baz.__get__(bar.__get__(foo(x, y))())(z) 'baz(bar(foo(ex, wye)), zed)' (or if the callable has no __get__, use new.instancemethod methodology behind the scenes) This is to provide an alternative to serial composition of function calls as methods of returned objects, which sometimes looks nice, but may have strange coupling of types and functionality. E.g. you could define classes to be able to write the above as foo(x, y).bar().baz(z) and that's effectively what is being done by the -> notation. The __get__ stuff is really just on-the-fly bound method generation without passing the instance class argument. But -> allows the composition without creating knowledge coupling between the foo, bar, and baz sources. It just has to be realized that this way of composition works via the first argument in passing through prior results. BTW, note that in the above foo(x, y) is just the first expression result being fed into the chain, so a constant or any expression can be the first, since it just becomes the argument for the innermost nested call. I.e., 'abcd' -> binascii.hexlify() for >>> new.instancemethod(binascii.hexlify, 'abcd', str)() '61626364' Note that it's valid to leave off the () -- IOW simply, a->b is sugar for b.__get__(a) (or the instancemethod equivalent) (faked) 'expr' -> baz (actual) >>> baz.__get__('expr') and then >>> baz.__get__('expr')('zee') 'baz(expr, zee)' What do you think? 
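The desugared form Bengt gives already runs, because plain Python functions are descriptors: f.__get__(obj) returns a bound method with obj bound in as the first argument. Checking his expansion directly:

```python
def foo(x, y):
    return 'foo(%s, %s)' % (x, y)

def bar(stream):
    return 'bar(%s)' % stream

def baz(stream, z):
    return 'baz(%s, %s)' % (stream, z)

# proposed:  foo(x, y) -> bar() -> baz(z)
# desugars:  baz.__get__(bar.__get__(foo(x, y))())(z)
result = baz.__get__(bar.__get__(foo('ex', 'wye'))())('zed')
# result == 'baz(bar(foo(ex, wye)), zed)'
```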
Regards, Bengt Richter From martin at v.loewis.de Fri Feb 17 23:00:36 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 23:00:36 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F5AD2D.1000205@voidspace.org.uk> References: <43F576A3.1030604@v.loewis.de> <43F5AD2D.1000205@voidspace.org.uk> Message-ID: <43F64784.4060700@v.loewis.de> Fuzzyman wrote: >>Also, I think has_key/in should return True if there is a default. >> >> >> > And exactly what use would it then be ? Code that checks if d.has_key(k): print d[k] would work correctly. IOW, you could use a dictionary with a default key just as if it were a normal dictionary - which is a useful property, IMO. Regards, Martin From guido at python.org Fri Feb 17 23:08:41 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 14:08:41 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/17/06, Adam Olsen wrote: > It also makes it harder to read code. You may expect d[key] to > raise an exception, but it won't because of a single line up several > pages (or in another file entirely!) Such are the joys of writing polymorphic code. I don't really see how you can avoid this kind of confusion -- I could have given you some other mapping object that does weird stuff.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From g.brandl at gmx.net Fri Feb 17 23:06:12 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2006 23:06:12 +0100 Subject: [Python-Dev] Serial function call composition syntax foo(x, y) -> bar() -> baz(z) In-Reply-To: <43f61ed8.1055320110@news.gmane.org> References: <43f61ed8.1055320110@news.gmane.org> Message-ID: Bengt Richter wrote: > Cut to the chase: how about being able to write > > baz(bar(foo(x, y)),z) > > serially as > > foo(x, y) -> bar() -> baz(z) > > via the above as sugar for > > baz.__get__(bar.__get__(foo(x, y))())(z) Reminds me of http://dev.perl.org/perl6/doc/design/syn/S03.html#Piping_operators Georg From martin at v.loewis.de Fri Feb 17 23:13:14 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 23:13:14 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F64A7A.3060400@v.loewis.de> Adam Olsen wrote: > Still -1. It's better, but it violates the principle of encapsulation > by mixing how-you-use-it state with what-it-stores state. In doing > that it has the potential to break an API documented as accepting a > dict. Code that expects d[key] to raise an exception (and catches the > resulting KeyError) will now silently "succeed". Of course it will, and without quotes. That's the whole point. > I believe that necessitates a PEP to document it. You are missing the rationale of the PEP process. The point is *not* documentation. The point of the PEP process is to channel and collect discussion, so that the BDFL can make a decision. The BDFL is not bound at all to the PEP process. To document things, we use (or should use) documentation. 
Regards, Martin From guido at python.org Fri Feb 17 23:15:39 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 14:15:39 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F63A35.5080500@colorstudy.com> References: <43F63A35.5080500@colorstudy.com> Message-ID: On 2/17/06, Ian Bicking wrote: > Guido van Rossum wrote: > > d = {} > > d.default_factory = set > > ... > > d[key].add(value) > > Another option would be: > > d = {} > d.default_factory = set > d.get_default(key).add(value) > > Unlike .setdefault, this would use a factory associated with the > dictionary, and no default value would get passed in. Unlike the > proposal, this would not override __getitem__ (not overriding > __getitem__ is really the only difference with the proposal). It would > be clear reading the code that you were not implicitly asserting they > "key in d" was true. > > "get_default" isn't the best name, but another name isn't jumping out at > me at the moment. Of course, it is not a Pythonic argument to say that > an existing method should be overridden, or functionality made nameless > simply because we can't think of a name (looking to anonymous functions > of course ;) I'm torn. While trying to implement this I came across some ugliness in PyDict_GetItem() -- it would make sense if this also called on_missing(), but it must return a value without incrementing its refcount, and isn't supposed to raise exceptions -- so what to do if on_missing() returns a value that's not inserted in the dict? If the __getattr__()-like operation that supplies and inserts a dynamic default was a separate method, we wouldn't have this problem. OTOH most reviewers here seem to appreciate on_missing() as a way to do various other ways of alterning a dict's __getitem__() behavior behind a caller's back -- perhaps it could even be (ab)used to implement case-insensitive lookup. 
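The case-insensitive (ab)use Guido alludes to could look like this, assuming on_missing() can be overridden in a subclass (a sketch only; the hook does not exist in any released Python at this point, so __getitem__ stands in for the proposed C behaviour):

```python
class CaseInsensitiveDict(dict):
    """Illustrative sketch: retry a failed lookup with a lowercased key."""

    def on_missing(self, key):
        if isinstance(key, str) and not key.islower():
            return self[key.lower()]   # may itself raise KeyError
        raise KeyError(key)

    def __getitem__(self, key):        # stands in for the proposed C hook
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.on_missing(key)

d = CaseInsensitiveDict()
d['spam'] = 1
# d['SPAM'] == 1, yet 'SPAM' in d is still False -- exactly the
# __getitem__/__contains__ asymmetry being debated in this thread
```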
I'm not going to do a point-by-point to your longer post (I don't have the time); let's (again) agree to disagree and I'll sleep on it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Fri Feb 17 23:18:39 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 17 Feb 2006 14:18:39 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6338D.8050300@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> Message-ID: <20060217141138.5F99.JCARLSON@uci.edu> "Martin v. L?wis" wrote: > M.-A. Lemburg wrote: > > Just because some codecs don't fit into the string.decode() > > or bytes.encode() scenario doesn't mean that these codecs are > > useless or that the methods should be banned. > > No. The reason to ban string.decode and bytes.encode is that > it confuses users. How are users confused? bytes.encode CAN only produce bytes. Though string.decode (or bytes.decode) MAY produce strings (or bytes) or unicode, depending on the codec, I think it is quite reasonable to expect that users will understand that string.decode('utf-8') is different than string.decode('base-64'), and that they may produce different output. In a similar fashion, dict.get(1) may produce different results than dict.get(2) for some dictionaries. If some users can't understand this (passing different arguments to a function may produce different output), then I think that some users are broken beyond repair. - Josiah From guido at python.org Fri Feb 17 23:17:41 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 14:17:41 -0800 Subject: [Python-Dev] Serial function call composition syntax foo(x, y) -> bar() -> baz(z) In-Reply-To: <43f61ed8.1055320110@news.gmane.org> References: <43f61ed8.1055320110@news.gmane.org> Message-ID: Cut to the chase: -1000. 
On 2/17/06, Bengt Richter wrote: > Cut to the chase: how about being able to write > > baz(bar(foo(x, y)),z) > > serially as > > foo(x, y) -> bar() -> baz(z) > > via the above as sugar for > > baz.__get__(bar.__get__(foo(x, y))())(z) > > ? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Feb 17 23:22:37 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 23:22:37 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43F64CAD.5020407@v.loewis.de> Guido van Rossum wrote: > On 2/16/06, Stephen J. Turnbull wrote: > >>/usr/share often is on a different mount; that's the whole rationale >>for /usr/share. > > > I don't think I've worked at a place where something like that was > done for at least 10 years. Isn't this argument outdated? It still *is* the rationale for putting things into /usr/share, even though I agree that probably nobody actually does that. That, in turn, is because nobody is so short of disk space that you really *have* to share /usr/share across architectures, and because trying to do the sharing still causes problems (e.g. what if the packaging systems of different architectures all decide to put the same files into /usr/share?) Regards, Martin From bokr at oz.net Fri Feb 17 23:34:00 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 22:34:00 GMT Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> <43F50BDD.4010106@v.loewis.de> <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> Message-ID: <43f649fd.1066365712@news.gmane.org> On Fri, 17 Feb 2006 21:35:25 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >M.-A. 
Lemburg wrote: >> Just because some codecs don't fit into the string.decode() >> or bytes.encode() scenario doesn't mean that these codecs are >> useless or that the methods should be banned. > >No. The reason to ban string.decode and bytes.encode is that >it confuses users. Well, that's because of semantic overloading. Assuming you mean string as characters and bytes as binary bytes. The trouble is encoding and decoding have to have bytes to represent the coded info, whichever direction. Characters per se aren't coded info, so string.decode doesn't make sense without faking it with string.encode().decode() and bytes.encode() likewise first has to have a hidden .decode to become a string that makes sense to encode. And the hidden stuff restricts to ascii, for further grief :-( So yes, please ban string.decode and bytes.encode. And maybe introduce bytes.recode for bytes->bytes transforms? (strings don't have any codes to recode). Regards, Bengt Richter From ianb at colorstudy.com Fri Feb 17 23:38:09 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 16:38:09 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F65051.5000308@colorstudy.com> Guido van Rossum wrote: > On 2/17/06, Adam Olsen wrote: > >>It also makes it harder to read code. You may expect d[key] to >>raise an exception, but it won't because of a single line up several >>pages (or in another file entirely!) > > > Such are the joys of writing polymorphic code. I don't really see how > you can avoid this kind of confusion -- I could have given you some > other mapping object that does weird stuff. The way you avoid confusion is by not working with code or programmers who write bad code. Python and polymorphic code in general pushes the responsibility for many errors from the language structure onto the programmer -- it is the programmers' responsibility to write good code. Python has never kept people from writing obscenely horrible code.
We ought to have an obfuscated Python contest just to prove that point -- it is through practice and convention that readable Python code happens, not through the restrictions of the language. (Honestly, I think such a contest would be a good idea.) I know *I* at least don't like code that mixes up access and modification. Maybe not everyone does (or maybe not everyone thinks of getitem as "access", but that's unlikely). I will assert that it is Pythonic to keep access and modification separate, which is why methods and attributes are different things, and why assignment is not an expression, and why functions with side effects typically return None, or have names that are very explicit about the side effect, with names containing command verbs like "update" or "set". All of these distinguish access from modification. Note that all of what I'm saying *only* applies to the overriding of __getitem__, not the addition of any new method. I think multidict is better for the places it applies, but I see no problem at all with a new method on dictionaries that calls on_missing. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From martin at v.loewis.de Fri Feb 17 23:37:59 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 23:37:59 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F63A35.5080500@colorstudy.com> Message-ID: <43F65047.3050200@v.loewis.de> Guido van Rossum wrote: > I'm torn. While trying to implement this I came across some ugliness > in PyDict_GetItem() -- it would make sense if this also called > on_missing(), but it must return a value without incrementing its > refcount, and isn't supposed to raise exceptions -- so what to do if > on_missing() returns a value that's not inserted in the dict? I think there should be a guideline to use PyObject_GetItem/PyMapping_GetItemString "normally", i.e. in all cases where you would write d[k] in Python code. 
It should be considered a bug if PyDict_GetItem is used in a place that "should" invoke defaulting; IOW, the function should be reserved to really low-level cases (e.g. if it is known that the dict doesn't have any defaulting, e.g. the string interned dictionary). There should be a policy whether name-lookup invokes defaulting (i.e. __dict__ access); I think it should. This would cause __getattr__ to have no effect if the object's dictionary has a default factory (unless that raises a KeyError). Regards, Martin From martin at v.loewis.de Fri Feb 17 23:52:15 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2006 23:52:15 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <20060217141138.5F99.JCARLSON@uci.edu> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> Message-ID: <43F6539F.5040707@v.loewis.de> Josiah Carlson wrote: > How are users confused? Users do py> "Martin v. L?wis".encode("utf-8") Traceback (most recent call last): File "", line 1, in ? UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: ordinal not in range(128) because they want to convert the string "to Unicode", and they have found a text telling them that .encode("utf-8") is a reasonable method. What it *should* tell them is py> "Martin v. L?wis".encode("utf-8") Traceback (most recent call last): File "", line 1, in ? AttributeError: 'str' object has no attribute 'encode' > bytes.encode CAN only produce bytes. I don't understand MAL's design, but I believe in that design, bytes.encode could produce anything (say, a list). A codec can convert anything to anything else. > If some users > can't understand this (passing different arguments to a function may > produce different output), It's worse than that. The return *type* depends on the *value* of the argument. 
I think there is little precedent for that: normally, the return values depend on the argument values, and, in a polymorphic function, the return type might depend on the argument types (e.g. the arithmetic operations). Also, the return type may depend on the number of arguments (e.g. by requesting a return type in a keyword argument). > then I think that some users are broken beyond repair. Hmm. I'm speechless. Regards, Martin From rhamph at gmail.com Fri Feb 17 23:56:24 2006 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 17 Feb 2006 15:56:24 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/17/06, Guido van Rossum wrote: > On 2/17/06, Adam Olsen wrote: > > It also makes it harder to read code. You may expect d[key] to > > raise an exception, but it won't because of a single line up several > > pages (or in another file entirely!) > > Such are the joys of writing polymorphic code. I don't really see how > you can avoid this kind of confusion -- I could have given you some > other mapping object that does weird stuff. You could pass a float in as well. But if the function is documented as taking a dict, and the programmer expects a dict.. that now has to be changed to "dict without a default". Or they have to code defensively since d[key] may or may not raise KeyError, so they must avoid depending on it either way. -- Adam Olsen, aka Rhamphoryncus From guido at python.org Fri Feb 17 23:58:34 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 14:58:34 -0800 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <43F64CAD.5020407@v.loewis.de> References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> <43F64CAD.5020407@v.loewis.de> Message-ID: On 2/17/06, "Martin v. Löwis" wrote: > Guido van Rossum wrote: > > On 2/16/06, Stephen J.
Turnbull wrote: > >>/usr/share often is on a different mount; that's the whole rationale > >>for /usr/share. > > > > I don't think I've worked at a place where something like that was > > done for at least 10 years. Isn't this argument outdated? > > It still *is* the rationale for putting things into /usr/share, > even though I agree that probably nobody actually does that. > > That, in turn, is because nobody is so short of disk space that > you really *have* to share /usr/share across architectures, and > because trying to do the sharing still causes problems (e.g. > what if the packaging systems of different architectures > all decide to put the same files into /usr/share?) I believe /usr/share was intended only to be used for platform-independent files (e.g. docs, or .py files). Another reason why nobody does this is because NFS is slow and unreliable. It's no fun when your NFS server goes down and your machine hangs because someone wanted to save 50 MB per workstation by sharing it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Feb 18 00:00:22 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 15:00:22 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/17/06, Adam Olsen wrote: > > Such are the joys of writing polymorphic code. I don't really see how > > you can avoid this kind of confusion -- I could have given you some > > other mapping object that does weird stuff. > > You could pass a float in as well. But if the function is documented > as taking a dict, and the programmer expects a dict.. that now has to > be changed to "dict without a default". Or they have to code > defensively since d[key] may or may not raise KeyError, so they must > avoid depending on it either way. I'd like to see a real-life example of code that would break this way. I believe that *most* code that takes a dict will work just fine if that dict has a default factory. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sat Feb 18 00:06:20 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 00:06:20 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F65051.5000308@colorstudy.com> References: <43F65051.5000308@colorstudy.com> Message-ID: <43F656EC.6070107@v.loewis.de> Ian Bicking wrote: > I know *I* at least don't like code that mixes up access and > modification. Maybe not everyone does (or maybe not everyone thinks of > getitem as "access", but that's unlikely). I will assert that it is > Pythonic to keep access and modification separate, which is why methods > and attributes are different things, and why assignment is not an > expression, and why functions with side effects typically return None, > or have names that are very explicit about the side effect, with names > containing command verbs like "update" or "set". All of these > distinguish access from modification. Do you never write d[some_key].append(some_value) This is modification and access, all in a single statement, and all without assignment operator. I don't see the setting of the default value as a modification. The default value has been there, all the time. It only is incarnated lazily. Regards, Martin From martin at v.loewis.de Sat Feb 18 00:07:48 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 00:07:48 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F65744.7050102@v.loewis.de> Adam Olsen wrote: > You could pass a float in as well. But if the function is documented > as taking a dict, and the programmer expects a dict.. that now has to > be changed to "dict without a default". Or they have to code > defensively since d[key] may or may not raise KeyError, so they must > avoid depending on it either way. 
Can you give an example of real, existing code that will break if such a dict is passed? Regards, Martin From rhamph at gmail.com Sat Feb 18 00:08:35 2006 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 17 Feb 2006 16:08:35 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F64A7A.3060400@v.loewis.de> References: <43F64A7A.3060400@v.loewis.de> Message-ID: On 2/17/06, "Martin v. Löwis" wrote: > Adam Olsen wrote: > > Still -1. It's better, but it violates the principle of encapsulation > > by mixing how-you-use-it state with what-it-stores state. In doing > > that it has the potential to break an API documented as accepting a > > dict. Code that expects d[key] to raise an exception (and catches the > > resulting KeyError) will now silently "succeed". > > Of course it will, and without quotes. That's the whole point. Consider these two pieces of code: if key in d: dosomething(d[key]) else: dosomethingelse() try: dosomething(d[key]) except KeyError: dosomethingelse() Before they were the same (assuming dosomething() won't raise KeyError). Now they would behave differently. The latter is even the preferred form, since it only invokes a single dict lookup: On 2/16/06, Delaney, Timothy (Tim) wrote: > try: > v = d[key] > except: > v = d[key] = value Obviously this example could be changed to use default_factory, but I find it hard to believe the only use of that pattern is to set default keys. Of course you could just assume that of all the people passing your function a dict, none of them will ever use the default_factory when they build the dict. Should be easy, right? -- Adam Olsen, aka Rhamphoryncus From ianb at colorstudy.com Sat Feb 18 00:13:51 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 17:13:51 -0600 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
In-Reply-To: <43F6539F.5040707@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> Message-ID: <43F658AF.8070801@colorstudy.com> Martin v. Löwis wrote: > Users do > > py> "Martin v. Löwis".encode("utf-8") > Traceback (most recent call last): > File "<stdin>", line 1, in ? > UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: > ordinal not in range(128) > > because they want to convert the string "to Unicode", and they have > found a text telling them that .encode("utf-8") is a reasonable > method. > > What it *should* tell them is > > py> "Martin v. Löwis".encode("utf-8") > Traceback (most recent call last): > File "<stdin>", line 1, in ? > AttributeError: 'str' object has no attribute 'encode' I think it would be even better if they got "ValueError: utf8 can only encode unicode objects". AttributeError is not much clearer than the UnicodeDecodeError. That str.encode(unicode_encoding) implicitly decodes strings seems like a flaw in the unicode encodings, quite separate from the existence of str.encode. I for one really like s.encode('zlib').encode('base64') -- and if the zlib encoding raised an error when it was passed a unicode object (instead of implicitly encoding the string with the ascii encoding) that would be fine. The pipe-like nature of .encode and .decode works very nicely for certain transformations, applicable to both unicode and byte objects. Let's not throw the baby out with the bath water. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Sat Feb 18 00:21:52 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 17:21:52 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F656EC.6070107@v.loewis.de> References: <43F65051.5000308@colorstudy.com> <43F656EC.6070107@v.loewis.de> Message-ID: <43F65A90.2010009@colorstudy.com> Martin v.
Löwis wrote: >>I know *I* at least don't like code that mixes up access and >>modification. Maybe not everyone does (or maybe not everyone thinks of >>getitem as "access", but that's unlikely). I will assert that it is >>Pythonic to keep access and modification separate, which is why methods >>and attributes are different things, and why assignment is not an >>expression, and why functions with side effects typically return None, >>or have names that are very explicit about the side effect, with names >>containing command verbs like "update" or "set". All of these >>distinguish access from modification. > > > Do you never write > > d[some_key].append(some_value) > > This is modification and access, all in a single statement, and all > without assignment operator. (d[some_key]) is access. (...).append(some_value) is modification. Expressions are compound; of course you can mix both access and modification in a single expression. d[some_key] is access that returns something, and .append(some_value) modifies that something, it doesn't modify d. > I don't see the setting of the default value as a modification. > The default value has been there, all the time. It only is incarnated > lazily. It is lazily incarnated for multidict, because there is no *noticeable* side effect -- if there are any internal side effects, that is an implementation detail. However for default_factory=list, the result of .keys(), .has_key(), and .items() changes when you do d[some_key].
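The change Ian describes is directly observable with the collections.defaultdict that this proposal eventually became in Python 2.5 (a sketch, not part of the patch under discussion):

```python
from collections import defaultdict

d = defaultdict(list)
assert list(d.keys()) == []      # nothing stored yet

d["x"]                           # a plain read, no assignment...
assert list(d.keys()) == ["x"]   # ...yet keys() now reports the key
assert d["x"] == []
```

The read on the empty key invokes the factory and stores its result, so .keys() and .items() change as a side effect of access.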
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Sat Feb 18 00:34:29 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 17:34:29 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F64A7A.3060400@v.loewis.de> Message-ID: <43F65D85.7080102@colorstudy.com> Adam Olsen wrote: > The latter is even the preferred form, since it only invokes a single > dict lookup: > > On 2/16/06, Delaney, Timothy (Tim) wrote: > >> try: >> v = d[key] >> except: >> v = d[key] = value > > > Obviously this example could be changed to use default_factory, but I > find it hard to believe the only use of that pattern is to set default > keys. I'd go further -- I doubt many cases where try:except KeyError: is used could be refactored to use default_factory -- default_factory can only be used to set default keys to something that can be determined sometime close to the time the dictionary is created, where the default does not depend on the context in which the key is fetched, and where the default value will not cause unintended side effects if the dictionary leaks out of the code where it was initially used (like if the dictionary is returned to someone). Any default factory is more often an algorithmic detail than truly part of the nature of the dictionary itself. For instance, here is something I do often: try: value = cache[key] except KeyError: ... calculate value ... cache[key] = value Realistically, factoring "... calculate value ..." into a factory that calculates the value would be difficult, produce highly unreadable code, perform worse, and have more bugs. For simple factories like "list" and "dict" the factory works okay. For immutable values like 0 and None, the factories (lambda: 0 and lambda: None, respectively) are a wasteful way to create a default value (because storing the value in the dictionary is unnecessary).
For non-trivial factories the whole thing falls apart, and one can just hope that no one will try to use this feature and will instead stick with the try:except KeyError: technique. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From bokr at oz.net Sat Feb 18 00:36:26 2006 From: bokr at oz.net (Bengt Richter) Date: Fri, 17 Feb 2006 23:36:26 GMT Subject: [Python-Dev] bdist_* to stdlib? References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> <43F64CAD.5020407@v.loewis.de> Message-ID: <43f65a46.1070534106@news.gmane.org> On Fri, 17 Feb 2006 14:58:34 -0800, "Guido van Rossum" wrote: >On 2/17/06, "Martin v. Löwis" wrote: >> Guido van Rossum wrote: >> > On 2/16/06, Stephen J. Turnbull wrote: >> >>/usr/share often is on a different mount; that's the whole rationale >> >>for /usr/share. >> > >> > I don't think I've worked at a place where something like that was >> > done for at least 10 years. Isn't this argument outdated? >> >> It still *is* the rationale for putting things into /usr/share, >> even though I agree that probably nobody actually does that. >> >> That, in turn, is because nobody is so short of disk space that >> you really *have* to share /usr/share across architectures, and >> because trying to do the sharing still causes problems (e.g. >> what if the packaging systems of different architectures >> all decide to put the same files into /usr/share?) > >I believe /usr/share was intended only to be used for >platform-independent files (e.g. docs, or .py files). > linuxbase.org agrees with you, via ref to http://www.pathname.com/fhs/pub/fhs-2.3.html and more specifically http://www.pathname.com/fhs/pub/fhs-2.3.html#USRSHAREARCHITECTUREINDEPENDENTDATA >Another reason why nobody does this is because NFS is slow and unreliable.
It's no fun when your NFS server goes down and your >machine hangs because someone wanted to save 50 MB per workstation by >sharing it. > Sometimes a separate mount could be a separate hard disk in the same box, I guess. Apparently it's read-only, so I guess it could also temporarily be a cdrom even. Regards, Bengt Richter From oliphant.travis at ieee.org Sat Feb 18 00:38:16 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 17 Feb 2006 16:38:16 -0700 Subject: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods In-Reply-To: <20060217164111.GD23859@xs4all.nl> References: <20060217162932.GA13840@code0.codespeak.net> <20060217164111.GD23859@xs4all.nl> Message-ID: Thomas Wouters wrote: > On Fri, Feb 17, 2006 at 05:29:32PM +0100, Armin Rigo wrote: >>> Where obj must be either an int or a long or another object that has the >>> __index__ special method (but not self). > > >>The "anything but not self" rule is not consistent with any other >>special method's behavior. IMHO we should just do the same as >>__nonzero__(): Agreed. I implemented the code, then realized this possible recursion problem while writing the specification. I didn't know how it would be viewed. It is easy enough to require __index__ to return an actual Python integer because for anything that has the nb_index slot you would just return obj.__index__() instead of obj. I'll change the PEP and the implementation. I have an updated implementation that uses the ssize_t patch instead. There seem to be some issues with the ssize_t patch still, though. Shouldn't a lot of checks for INT_MAX be replaced with PY_SSIZE_T_MAX? But I noticed that the PY_SSIZE_T_MAX definition in pyport.h raises errors. I don't think it even makes sense.
-Travis From mcherm at mcherm.com Fri Feb 17 23:47:33 2006 From: mcherm at mcherm.com (Michael Chermside) Date: Fri, 17 Feb 2006 14:47:33 -0800 Subject: [Python-Dev] Proposal: defaultdict Message-ID: <20060217144733.8u2hx61yfs2s40s8@login.werra.lunarpages.com> Martin v. Löwis writes: > You are missing the rationale of the PEP process. The point is > *not* documentation. The point of the PEP process is to channel > and collect discussion, so that the BDFL can make a decision. > The BDFL is not bound at all to the PEP process. > > To document things, we use (or should use) documentation. You are oversimplifying significantly. The purpose of the PEP process is to lay out and document the reasoning that went into forming the decision. The BDFL is *allowed* to be capricious, but he's sensible enough to choose not to: in cases where it matters, he tries to document the reasoning behind his decisions. In fact, he does better than that... he gets the PEP author to document it for him! The PEP (whether accepted, rejected, or in-progress) serves as the official documentation of how the decision was made (or of what option it is that is still undecided). If a _trivial_ decision is already made, there's no need for a PEP, but if a difficult decision has been made, then documenting it in a PEP saves years of having to justify it to newbies. -- Michael Chermside From jcarlson at uci.edu Sat Feb 18 00:51:32 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 17 Feb 2006 15:51:32 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6539F.5040707@v.loewis.de> References: <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> Message-ID: <20060217151851.5F9F.JCARLSON@uci.edu> "Martin v. Löwis" wrote: > > Josiah Carlson wrote: > > How are users confused? > > Users do > > py> "Martin v. Löwis".encode("utf-8") > Traceback (most recent call last): > File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: > ordinal not in range(128) > > because they want to convert the string "to Unicode", and they have > found a text telling them that .encode("utf-8") is a reasonable > method. Removing functionality because some users read bad instructions somewhere is a bit like kicking your kitten because your puppy peed on the floor. You are punishing the wrong group, for something that shouldn't result in punishment: it should result in education. Users are always going to get bad instructions, and removing utility because some users fail to think before they act, or complain when their lack of thinking doesn't work, will give us a language where we are removing features because *new* users have no idea what they are doing. > What it *should* tell them is > > py> "Martin v. Löwis".encode("utf-8") > Traceback (most recent call last): > File "<stdin>", line 1, in ? > AttributeError: 'str' object has no attribute 'encode' I disagree. I think the original error was correct, and we should be educating users to prefix their literals with a 'u' if they want unicode, or they should get their data from a unicode source (wxPython with unicode, StreamReader, etc.) > > bytes.encode CAN only produce bytes. > > I don't understand MAL's design, but I believe in that design, > bytes.encode could produce anything (say, a list). A codec > can convert anything to anything else. That seems to me to be a little overkill... In any case, I personally find data.encode('base-64') and edata.decode('base-64') to be more convenient than binascii.b2a_base64(data) and binascii.a2b_base64(edata). Ditto for hexlify/unhexlify, etc. > > If some users > > can't understand this (passing different arguments to a function may > > produce different output), > > It's worse than that. The return *type* depends on the *value* of > the argument.
I think there is little precedent for that: normally, > the return values depend on the argument values, and, in a polymorphic > function, the return type might depend on the argument types (e.g. > the arithmetic operations). Also, the return type may depend on the > number of arguments (e.g. by requesting a return type in a keyword > argument). You only need to look at dictionaries, where different values passed into a function call may very well return results of different types, yet there have been no restrictions on mapping to and from single types per dictionary. Many dict-like interfaces for configuration files do this, things like config.get('remote_host') and config.get('autoconnect') not being uncommon. - Josiah From oliphant.travis at ieee.org Sat Feb 18 00:40:08 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 17 Feb 2006 16:40:08 -0700 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: <43F3A7E4.1090505@v.loewis.de> References: <43F3A7E4.1090505@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Just in case you haven't noticed, I just merged > the ssize_t branch (PEP 353). > > If you have any corrections to the code to make which > you would consider bug fixes, just go ahead. > > If you are uncertain how specific problems should be resolved, > feel free to ask. > > If you think certain API changes should be made, please > discuss them here - they would need to be reflected in the > PEP as well. What is PY_SSIZE_T_MAX supposed to be? The definition in pyport.h doesn't compile. Shouldn't a lot of checks for INT_MAX be replaced with PY_SSIZE_T_MAX? Like in the slice indexing code? Thanks for all your effort on ssize_t fixing. This is a *big* deal for 64-bit number crunching with Python.
-Travis From martin at v.loewis.de Sat Feb 18 00:52:51 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 00:52:51 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F64A7A.3060400@v.loewis.de> Message-ID: <43F661D3.3010108@v.loewis.de> Adam Olsen wrote: > Consider these two pieces of code: > > if key in d: > dosomething(d[key]) > else: > dosomethingelse() > > try: > dosomething(d[key]) > except KeyError: > dosomethingelse() > > Before they were the same (assuming dosomething() won't raise > KeyError). Now they would behave differently. I personally think they should continue to do the same thing, i.e. "in" should return True if there is a default; in the current proposal, it should invoke the default factory. But that's beside the point: Where is the real example where this difference would matter? (I'm not asking for a realistic example, I'm asking for a real one) Regards, Martin From martin at v.loewis.de Sat Feb 18 00:58:35 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 00:58:35 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F658AF.8070801@colorstudy.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F658AF.8070801@colorstudy.com> Message-ID: <43F6632B.6000600@v.loewis.de> Ian Bicking wrote: > That str.encode(unicode_encoding) implicitly decodes strings seems like > a flaw in the unicode encodings, quite seperate from the existance of > str.encode. I for one really like s.encode('zlib').encode('base64') -- > and if the zlib encoding raised an error when it was passed a unicode > object (instead of implicitly encoding the string with the ascii > encoding) that would be fine. 
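The divergence between the two idioms contrasted in this thread (the "if key in d" test versus try/except KeyError) is easy to observe with the defaultdict that eventually entered the stdlib: the membership test does not invoke the default factory, while subscripting silently succeeds (a sketch):

```python
from collections import defaultdict

def lbyl(d, key):
    # look-before-you-leap version
    if key in d:
        return "dosomething"
    return "dosomethingelse"

def eafp(d, key):
    # try/except version
    try:
        d[key]
        return "dosomething"
    except KeyError:
        return "dosomethingelse"

plain = {}
assert lbyl(plain, "k") == eafp(plain, "k") == "dosomethingelse"

dd = defaultdict(list)
assert lbyl(dd, "k") == "dosomethingelse"  # "in" does not create a default
assert eafp(dd, "k") == "dosomething"      # d[key] no longer raises
```

On a plain dict the two functions agree; with a default factory they give different answers for the same missing key.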
> > The pipe-like nature of .encode and .decode works very nicely for > certain transformations, applicable to both unicode and byte objects. > Let's not throw the baby out with the bath water. The way you use it, it's a matter of notation only: why is zlib(base64(s)) any worse? I think it's better: it doesn't use string literals to denote function names. If there is a point to this overgeneralized codec idea, it is the streaming aspect: that you don't need to process all data at once, but can feed data sequentially. Of course, you are not using this in your example. Regards, Martin From ianb at colorstudy.com Sat Feb 18 01:00:09 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 18:00:09 -0600 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <20060217151851.5F9F.JCARLSON@uci.edu> References: <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <20060217151851.5F9F.JCARLSON@uci.edu> Message-ID: <43F66389.3000902@colorstudy.com> Josiah Carlson wrote: >>>If some users >>>can't understand this (passing different arguments to a function may >>>produce different output), >> >>It's worse than that. The return *type* depends on the *value* of >>the argument. I think there is little precedence for that: normally, >>the return values depend on the argument values, and, in a polymorphic >>function, the return type might depend on the argument types (e.g. >>the arithmetic operations). Also, the return type may depend on the >>number of arguments (e.g. by requesting a return type in a keyword >>argument). > > > You only need to look to dictionaries where different values passed into > a function call may very well return results of different types, yet > there have been no restrictions on mapping to and from single types per > dictionary. 
> > Many dict-like interfaces for configuration files do this, things like > config.get('remote_host') and config.get('autoconnect') not being > uncommon. I think there is *some* justification, if you don't understand up front that the codec you refer to (using a string) is just a way of avoiding an import (thankfully -- dynamically importing unicode codecs is obviously infeasible). Now, if you understand the argument refers to some algorithm, it's not so bad. The other aspect is that there should be something consistent about the return types -- the Python type is not what we generally rely on, though. In this case they are all "data". Unicode and bytes are both data, and you could probably argue lists of ints is data too (but an arbitrary list definitely isn't data). On the outer end of data might be an ElementTree structure (but that's getting fishy). An open file object is not data. A tuple probably isn't data. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From bokr at oz.net Sat Feb 18 01:02:15 2006 From: bokr at oz.net (Bengt Richter) Date: Sat, 18 Feb 2006 00:02:15 GMT Subject: [Python-Dev] Serial function call composition syntax foo(x, y) -> bar() -> baz(z) References: <43f61ed8.1055320110@news.gmane.org> Message-ID: <43f663bd.1072957381@news.gmane.org> Is that a record? ;-) BTW, does python-dev have different expectations re top-posting? I've seen more here than on c.l.p I think, so I'm wondering what to do. When-in-Rome'ly, Regards, Bengt Richter On Fri, 17 Feb 2006 14:17:41 -0800, "Guido van Rossum" wrote: >Cut to the chase: -1000. > >On 2/17/06, Bengt Richter wrote: >> Cut to the chase: how about being able to write >> >> baz(bar(foo(x, y)),z) >> >> serially as >> >> foo(x, y) -> bar() -> baz(z) >> >> via the above as sugar for >> >> baz.__get__(bar.__get__(foo(x, y))())(z) >> >> ? 
> >-- >--Guido van Rossum (home page: http://www.python.org/~guido/) >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > From aleaxit at gmail.com Sat Feb 18 01:02:05 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 17 Feb 2006 16:02:05 -0800 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: <43F61442.6050003@colorstudy.com> Message-ID: On 2/17/06, Georg Brandl wrote: > Ian Bicking wrote: > > >> Unfortunately, a @property decorator is impossible... > > > > It already works! But only if you want a read-only property. Which is > > actually about 50%+ of the properties I create. So the status quo is > > not really that bad. > > I have abused it this way too and felt bad every time. > Kind of like keeping your hat on in the church. :) It's not ideal, because the resulting r-o property has no docstring: >>> class ex(object): ... @property ... def amp(self): ... ''' a nice docstring ''' ... return 23 ... >>> ex.amp.__doc__ >>> class xe(object): ... def amp(self): return 23 ... amp=property(amp, doc='whatever!') ... >>> xe.amp.__doc__ 'whatever!' >>> Maybe we could fix that by having property(getfunc) use getfunc.__doc__ as the __doc__ of the resulting property object (easily overridable in more normal property usage by the doc= argument, which, I feel, should almost invariably be there). Alex From guido at python.org Sat Feb 18 01:03:20 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 16:03:20 -0800 Subject: [Python-Dev] Serial function call composition syntax foo(x, y) -> bar() -> baz(z) In-Reply-To: <43f663bd.1072957381@news.gmane.org> References: <43f61ed8.1055320110@news.gmane.org> <43f663bd.1072957381@news.gmane.org> Message-ID: It's only me that's allowed to top-post. 
:-) On 2/17/06, Bengt Richter wrote: > Is that a record? ;-) > > BTW, does python-dev have different expectations re top-posting? > I've seen more here than on c.l.p I think, so I'm wondering what to do. > > When-in-Rome'ly, > > Regards, > Bengt Richter > > On Fri, 17 Feb 2006 14:17:41 -0800, "Guido van Rossum" wrote: > > >Cut to the chase: -1000. > > > >On 2/17/06, Bengt Richter wrote: > >> Cut to the chase: how about being able to write > >> > >> baz(bar(foo(x, y)),z) > >> > >> serially as > >> > >> foo(x, y) -> bar() -> baz(z) > >> > >> via the above as sugar for > >> > >> baz.__get__(bar.__get__(foo(x, y))())(z) > >> > >> ? > > > >-- > >--Guido van Rossum (home page: http://www.python.org/~guido/) > >_______________________________________________ > >Python-Dev mailing list > >Python-Dev at python.org > >http://mail.python.org/mailman/listinfo/python-dev > >Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Sat Feb 18 01:06:13 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 18:06:13 -0600 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6632B.6000600@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F658AF.8070801@colorstudy.com> <43F6632B.6000600@v.loewis.de> Message-ID: <43F664F5.5040504@colorstudy.com> Martin v. 
Löwis wrote: > Ian Bicking wrote: > >>That str.encode(unicode_encoding) implicitly decodes strings seems like >>a flaw in the unicode encodings, quite separate from the existence of >>str.encode. I for one really like s.encode('zlib').encode('base64') -- >>and if the zlib encoding raised an error when it was passed a unicode >>object (instead of implicitly encoding the string with the ascii >>encoding) that would be fine. >> >>The pipe-like nature of .encode and .decode works very nicely for >>certain transformations, applicable to both unicode and byte objects. >>Let's not throw the baby out with the bath water. > > > The way you use it, it's a matter of notation only: why > is > > zlib(base64(s)) > > any worse? I think it's better: it doesn't use string literals to > denote function names. Maybe it isn't worse, but the real alternative is: import zlib import base64 base64.b64encode(zlib.compress(s)) Encodings cover up eclectic interfaces, where those interfaces fit a basic pattern -- data in, data out. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Sat Feb 18 01:07:58 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 18:07:58 -0600 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: <43F61442.6050003@colorstudy.com> Message-ID: <43F6655E.1040308@colorstudy.com> Alex Martelli wrote: > Maybe we could fix that by having property(getfunc) use > getfunc.__doc__ as the __doc__ of the resulting property object > (easily overridable in more normal property usage by the doc= > argument, which, I feel, should almost invariably be there).
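Alex's suggestion matches the behavior property later gained: with no doc= argument, the property object falls back to the getter's __doc__ (a quick check, runnable on a modern Python):

```python
class Ex(object):
    @property
    def amp(self):
        """a nice docstring"""
        return 23

# property(getfunc) picks up getfunc.__doc__ when doc= is omitted,
# which is exactly the fix proposed here.
assert Ex.amp.__doc__ == "a nice docstring"
```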
+1 -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From martin at v.loewis.de Sat Feb 18 01:12:21 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 01:12:21 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F65A90.2010009@colorstudy.com> References: <43F65051.5000308@colorstudy.com> <43F656EC.6070107@v.loewis.de> <43F65A90.2010009@colorstudy.com> Message-ID: <43F66665.7000102@v.loewis.de> Ian Bicking wrote: > It is lazily incarnated for multidict, because there is no *noticeable* > side effect -- if there is any internal side effects that is an > implementation detail. However for default_factory=list, the result of > .keys(), .has_key(), and .items() changes when you do d[some_key]. That's why I think has_key and in should return True for any key. This leaves keys(), items(), and values(). From a pure point of view, they should return infinite sets. Practicality beats purity, so yes, d[k] could be considered a modifying operation. If you look carefully, you find that many access operations also have side effects. For example, .read() on a file not only returns some data, but also advances the file position. Queue.get not only returns the next item, but also removes it from the queue. Regards, Martin From martin at v.loewis.de Sat Feb 18 01:20:15 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 01:20:15 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] 
In-Reply-To: <43F664F5.5040504@colorstudy.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F658AF.8070801@colorstudy.com> <43F6632B.6000600@v.loewis.de> <43F664F5.5040504@colorstudy.com> Message-ID: <43F6683F.2060806@v.loewis.de> Ian Bicking wrote: > Maybe it isn't worse, but the real alternative is: > > import zlib > import base64 > > base64.b64encode(zlib.compress(s)) > > Encodings cover up eclectic interfaces, where those interfaces fit a > basic pattern -- data in, data out. So should I write 3.1415.encode("sin") or would that be 3.1415.decode("sin") What about "http://www.python.org".decode("URL") It's "data in, data out", after all. Who needs functions? Regards, Martin From guido at python.org Sat Feb 18 01:20:46 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 16:20:46 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F66665.7000102@v.loewis.de> References: <43F65051.5000308@colorstudy.com> <43F656EC.6070107@v.loewis.de> <43F65A90.2010009@colorstudy.com> <43F66665.7000102@v.loewis.de> Message-ID: On 2/17/06, "Martin v. Löwis" wrote: > That's why I think has_key and in should return True for any key. > This leaves keys(), items(), and values(). From a pure point of > view, they should return infinite sets. Practicality beats purity, > so yes, d[k] could be considered a modifying operation. I think practicality beats purity even for has_key/in; IMO these operations are more useful when they match keys() instead of always returning True. But someone should start writing some code to play with this. I have a working patch (including a hack for PyDict_GetItem()): python.org/sf/1433928 So there's no excuse to be practical now.
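Guido's patch is the real testbed, but for anyone who wants to play along without applying it, the semantics being debated can be approximated in pure Python. This is a rough sketch only, not the actual C implementation; the on_missing and default_factory names follow the thread:

```python
class DefaultDict(dict):
    # Pure-Python sketch of the proposal under discussion -- NOT the
    # actual patch.  Missing keys are routed through on_missing(), which
    # consults a settable default_factory attribute.
    default_factory = None

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.on_missing(key)

    def on_missing(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value
        return value


d = DefaultDict()
d.default_factory = list
d['missing'].append(1)

assert d['missing'] == [1]
# In this sketch 'in' reflects only the stored keys, the behavior Guido
# argues for above; Martin's alternative would make it always True.
assert 'other' not in d
```

Note that in this sketch the factory fires only on `d[key]`, so `get()` and `in` keep their usual meaning.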
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Feb 18 01:21:54 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 16:21:54 -0800 Subject: [Python-Dev] The decorator(s) module In-Reply-To: <43F6655E.1040308@colorstudy.com> References: <43F61442.6050003@colorstudy.com> <43F6655E.1040308@colorstudy.com> Message-ID: WFM. Patch anyone? On 2/17/06, Ian Bicking wrote: > Alex Martelli wrote: > > Maybe we could fix that by having property(getfunc) use > > getfunc.__doc__ as the __doc__ of the resulting property object > > (easily overridable in more normal property usage by the doc= > > argument, which, I feel, should almost invariably be there). > > +1 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Sat Feb 18 01:32:23 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 18:32:23 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F661D3.3010108@v.loewis.de> References: <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> Message-ID: <43F66B17.9080504@colorstudy.com> Martin v. Löwis wrote: > Adam Olsen wrote: > >>Consider these two pieces of code: >> >>if key in d: >> dosomething(d[key]) >>else: >> dosomethingelse() >> >>try: >> dosomething(d[key]) >>except KeyError: >> dosomethingelse() >> >>Before they were the same (assuming dosomething() won't raise >>KeyError). Now they would behave differently. > > > I personally think they should continue to do the same thing, > i.e. "in" should return True if there is a default; in the > current proposal, it should invoke the default factory. As I believe Fredrik implied, this would break the symmetry between "x in d" and "x in d.keys()" (unless d.keys() enumerates all possible keys), and either .get() would become useless, or it would also act in inconsistent ways. I think these broken expectations are much worse than what Adam's talking about.
> But that's beside the point: Where is the real example > where this difference would matter? (I'm not asking for > a realistic example, I'm asking for a real one) Well, here's a kind of an example: WSGI specifies that the environment must be a dictionary, and nothing but a dictionary. I think it would have to be updated to say that it must be a dictionary with default_factory not set, as default_factory would break the predictability that was the reason WSGI specified exactly a dictionary (and not a dictionary-like object or subclass). So there's something that becomes brokenish. I think this is the primary kind of breakage -- dictionaries with default_factory set are not acceptable objects when a "plain" dictionary is expected. Of course, it can always be claimed that it's the fault of the person who passes in such a dictionary (they could have passed in None and it would probably also be broken). But now all of the sudden I have to say "x(a) takes a dictionary argument. Oh, and don't you dare use the default_factory feature!" where before I could just say "dictionary". And KeyError just... disappears. KeyError is one of those errors that you *expect* to happen (maybe the "Error" part is a misnomer); having it disappear is a major change. Also, I believe there's two ways to handle thread safety, both of which are broken: 1) d[key] gets the GIL, and thus while default_factory is being called the GIL is locked 2) d[key] doesn't get the GIL and so d[key].append(1) may not actually lead to 1 being in d[key] if another thread is appending something to the same key at the same time, and the key is not yet present in d. Admittedly I don't understand the ins and outs of the GIL, so the first case might not actually need to acquire the GIL. 
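Ian's point that KeyError quietly disappears is easy to demonstrate with any mapping whose __getitem__ fabricates values. AutoDict below is a hypothetical stand-in for a dict with default_factory = list; the two idioms from Adam's earlier snippets now disagree:

```python
class AutoDict(dict):
    # Hypothetical stand-in for a dict with default_factory = list
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            value = []
            self[key] = value
            return value


d = AutoDict()

# The two "equivalent" idioms now diverge:
if 'k' in d:
    lbyl = 'present'
else:
    lbyl = 'absent'

try:
    d['k']
    eafp = 'present'
except KeyError:
    eafp = 'absent'

assert lbyl == 'absent'    # LBYL saw no key and had no side effect
assert eafp == 'present'   # the expected KeyError never fired...
assert 'k' in d            # ...and the lookup quietly inserted a value
```

The same divergence is what code expecting a "plain" dictionary, such as the WSGI environment above, would trip over.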
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Sat Feb 18 01:44:56 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2006 18:44:56 -0600 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6683F.2060806@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F658AF.8070801@colorstudy.com> <43F6632B.6000600@v.loewis.de> <43F664F5.5040504@colorstudy.com> <43F6683F.2060806@v.loewis.de> Message-ID: <43F66E08.70605@colorstudy.com> Martin v. Löwis wrote: >>Maybe it isn't worse, but the real alternative is: >> >> import zlib >> import base64 >> >> base64.b64encode(zlib.compress(s)) >> >>Encodings cover up eclectic interfaces, where those interfaces fit a >>basic pattern -- data in, data out. > > > So should I write > > 3.1415.encode("sin") > > or would that be > > 3.1415.decode("sin") The ambiguity shows that "sin" is clearly not an encoding. Doesn't read right anyway. [0.3, 0.35, ...].encode('fourier') would be sensible though. Except of course lists don't have an encode method; but that's just a convenience of strings and unicode because those objects are always data, where lists are only sometimes data. If extended indefinitely, the namespace issue is notable. But it's not going to be extended indefinitely, so that's just a theoretical problem. > What about > > "http://www.python.org".decode("URL") you mean 'a%20b'.decode('url') == 'a b'?
That's not what you meant, but nevertheless that would be an excellent encoding ;) -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From thomas at xs4all.net Sat Feb 18 01:51:49 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sat, 18 Feb 2006 01:51:49 +0100 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: References: <43F3A7E4.1090505@v.loewis.de> Message-ID: <20060218005149.GE23859@xs4all.nl> On Fri, Feb 17, 2006 at 04:40:08PM -0700, Travis Oliphant wrote: > What is PY_SSIZE_T_MAX supposed to be? The definition in pyport.h > doesn't compile. Why not? Does it give an error for your particular platform? What platform is that? What are HAVE_SSIZE_T, SIZEOF_VOID_P and SIZEOF_SIZE_T defined to be? While looking at the piece of code in Include/pyport.h I do notice that the fallback (when ssize_t is not available) is to Py_uintptr_t... Which is an unsigned type, while ssize_t is supposed to be signed. Martin, is that on purpose? I don't have any systems that lack ssize_t. ;P That should prevent the PY_SSIZE_T_MAX definition from compiling though. > Shouldn't a lot of checks for INT_MAX be replaced with PY_SSIZE_T_MAX? > Like in the slice indexing code? Yes, ideally. (Actually, I think slice indexing was changed earlier today.) But while changing it would have little to no effect on 32-bit machines, doing it the wrong way may break the code on 64-bit machines in subtle ways, so it's not all done blindly, or in one big shot. Also, because some output parameters to PyArg_Parse* change size (s#/t#), code will have to be reviewed to make use of the full address range anyway. (There's some preprocessor hackery that checks for PY_SSIZE_T_CLEAN to see if it's safe to use the large output versions.) Adapting all code in the right way isn't finished yet (not in the last place because some of the code is... how shall I put it... 'creative'.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
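For readers following along, the pyport.h definition Travis asks about expands to ((Py_ssize_t)(((size_t)-1)>>1)), the largest signed value that fits in size_t's width. The arithmetic can be sketched in Python, with ctypes supplying the platform width (illustrative only; the real macro is C):

```python
import ctypes

# Width of the platform's size_t, in bits
width = 8 * ctypes.sizeof(ctypes.c_size_t)

# In C, (size_t)-1 wraps around to an all-ones bit pattern;
# shifting right by one then clears the top (sign) bit.
all_ones = (1 << width) - 1
py_ssize_t_max = all_ones >> 1

# The result is the largest positive value of a signed type of that width
assert py_ssize_t_max == 2 ** (width - 1) - 1
```

On a 32-bit platform this yields 2**31 - 1, on a 64-bit platform 2**63 - 1.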
From bob at redivi.com Sat Feb 18 02:12:17 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 17 Feb 2006 17:12:17 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6683F.2060806@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F658AF.8070801@colorstudy.com> <43F6632B.6000600@v.loewis.de> <43F664F5.5040504@colorstudy.com> <43F6683F.2060806@v.loewis.de> Message-ID: <107C82DC-6554-4C56-B81D-5E8415D833DB@redivi.com> On Feb 17, 2006, at 4:20 PM, Martin v. Löwis wrote: > Ian Bicking wrote: >> Maybe it isn't worse, but the real alternative is: >> >> import zlib >> import base64 >> >> base64.b64encode(zlib.compress(s)) >> >> Encodings cover up eclectic interfaces, where those interfaces fit a >> basic pattern -- data in, data out. > > So should I write > > 3.1415.encode("sin") > > or would that be > > 3.1415.decode("sin") > > What about > > "http://www.python.org".decode("URL") > > It's "data in, data out", after all. Who needs functions? Well, 3.1415.decode("sin") is of course NaN, because 3.1415.encode ("sinh") is not defined for numbers outside of [-1, 1] :) -bob From rhamph at gmail.com Sat Feb 18 02:29:34 2006 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 17 Feb 2006 18:29:34 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F65744.7050102@v.loewis.de> References: <43F65744.7050102@v.loewis.de> Message-ID: On 2/17/06, "Martin v. Löwis" wrote: > Adam Olsen wrote: > > You could pass a float in as well. But if the function is documented > > as taking a dict, and the programmer expects a dict.. that now has to > > be changed to "dict without a default". Or they have to code > > defensively since d[key] may or may not raise KeyError, so they must > > avoid depending on it either way.
> > Can you give an example of real, existing code that will break > if such a dict is passed? I only got halfway through the "grep KeyError" results, but.. Demo/metaclass/Meta.py:55 Demo/tkinter/guido/AttrDialog.py:121 # Subclasses override self.classes Lib/ConfigParser.py:623 Lib/random.py:315 Lib/string.py:191 Lib/weakref.py:56 # Currently uses UserDict but I assume it will switch to dict eventually And the pièce de résistance.. Doc/tools/anno-api.py:51 It has this: try: info = rcdict[s] except KeyError: sys.stderr.write("No refcount data for %s\n" % s) else: ... rcdict is loaded from refcounts.load(). refcounts.load() calls refcounts.loadfile(), which has this (inside a loop): try: entry = d[function] except KeyError: entry = d[function] = Entry(function) A prime candidate for a default. Perhaps the KeyError shouldn't ever get triggered in this case, I'm not sure. I think that's beside the point though. The programmer clearly expected it would. -- Adam Olsen, aka Rhamphoryncus From oliphant.travis at ieee.org Sat Feb 18 02:37:32 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 17 Feb 2006 18:37:32 -0700 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: <20060218005149.GE23859@xs4all.nl> References: <43F3A7E4.1090505@v.loewis.de> <20060218005149.GE23859@xs4all.nl> Message-ID: Thomas Wouters wrote: > On Fri, Feb 17, 2006 at 04:40:08PM -0700, Travis Oliphant wrote: > > >>What is PY_SSIZE_T_MAX supposed to be? The definition in pyport.h >>doesn't compile. > Maybe I have the wrong version of code. In my pyport.h (checked out from svn trunk) I have. #define PY_SSIZE_T_MAX ((Py_ssize_t)(((size_t)-1)>>1)) What is size_t? Is this supposed to be sizeof(size_t)? I get a syntax error when I actually use PY_SSIZE_T_MAX somewhere in the code. > While looking at the piece of code in Include/pyport.h I do notice that the > fallback (when ssize_t is not available) is to Py_uintptr_t...
Which is an > unsigned type, while ssize_t is supposed to be signed. Martin, is that on > purpose? I don't have any systems that lack ssize_t. ;P I saw the same thing and figured it was an error. > > Adapting all code in the right way isn't finished yet (not in the last place > because some of the code is... how shall I put it... 'creative'.) I'm just trying to adapt my __index__ patch to use ssize_t. I realize this was a big change and will take some "adjusting." I can help with that if needed as I do have some experience here. I just want to make sure I fully understand what issues Martin and others are concerned about. -Travis From greg.ewing at canterbury.ac.nz Sat Feb 18 04:13:45 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 18 Feb 2006 16:13:45 +1300 Subject: [Python-Dev] str object going in Py3K In-Reply-To: <43F50A54.4070609@v.loewis.de> References: <43F2FDDD.3030200@gmail.com> <43F3E7DB.4010502@canterbury.ac.nz> <43F50A54.4070609@v.loewis.de> Message-ID: <43F690E9.7020605@canterbury.ac.nz> Martin v. Löwis wrote: >>Another thought -- what is going to happen to os.open? >>Will it change to return bytes, or will there be a new >>os.openbytes? > > Nit-pickingly: os.open will continue to return integers. Sorry, what I meant was will os.read return bytes. Greg From ncoghlan at gmail.com Sat Feb 18 04:34:35 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Feb 2006 13:34:35 +1000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F65744.7050102@v.loewis.de> Message-ID: <43F695CB.3060100@gmail.com> Adam Olsen wrote: > And the pièce de résistance.. > Doc/tools/anno-api.py:51 > > It has this: > try: > info = rcdict[s] > except KeyError: > sys.stderr.write("No refcount data for %s\n" % s) > else: > ... > rcdict is loaded from refcounts.load().
refcounts.load() calls > refcounts.loadfile(), which has this (inside a loop): > try: > entry = d[function] > except KeyError: > entry = d[function] = Entry(function) > A prime candidate for a default. > > Perhaps the KeyError shouldn't ever get triggered in this case, I'm > not sure. I think that's beside the point though. The programmer > clearly expected it would. Assuming the following override: class EntryDict(dict): def on_missing(self, key): value = Entry(key) self[key] = value return value Then what it means is that the behaviour of "missing functions get an empty refcount entry" propagates to the rcdict code. So the consequence is that the code in anno-api will never print an error message - all functions are deemed to have associated refcount data in refcount.dat. But that would be a bug in refcounts.loadfile: if it returns an EntryDict instead of a normal dict it is, in effect, returning an *infinite* dictionary that contains refcount definitions for every possible function name (some of them are just populated on demand). So *if* refcounts.loadfile was converted to use an EntryDict, it would need to return dict(d) instead of returning d directly. And this is where the question of whether has_key/__contains__ return True or False when default_factory is set is important. If they return False, then the LBYL (if key in d:) and EAFP (try/except) approaches give *different answers*. More importantly, LBYL will never have side effects, whereas EAFP may. If the methods always return True (as Martin suggests), then we retain the current behaviour where there is no real difference between the two approaches.
Given the amount of time spent in recent years explaining this fact, I don't think it is an equivalence that should be broken lightly (IOW, I've persuaded myself that I agree with Martin) The alternative would be to have an additional query API "will_default" that reflects whether or not a given key is actually present in the dictionary ("if key not in d.keys()" would serve a similar purpose, but requires building the list of keys). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From tim.peters at gmail.com Sat Feb 18 04:37:21 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 17 Feb 2006 22:37:21 -0500 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: References: <43F3A7E4.1090505@v.loewis.de> <20060218005149.GE23859@xs4all.nl> Message-ID: <1f7befae0602171937s2e770577w94a9191887fb0e1d@mail.gmail.com> [Travis Oliphant] > Maybe I have the wrong version of code. In my pyport.h (checked out > from svn trunk) I have. > > #define PY_SSIZE_T_MAX ((Py_ssize_t)(((size_t)-1)>>1)) > > What is size_t? size_t is an unsigned integral type defined by, required by, and used all over the place in standard C. What exactly is the compiler message you get, and exactly which compiler are you using (note that nobody else is having problems with this, so there's something unique in your setup)? > Is this supposed to be sizeof(size_t)? No. (size_t)-1 casts -1 to the unsigned integral type size_t, which creates a "solid string of 1 bits" with the width of the size_t type. ">> 1" then shifts that right one bit, clearing the sign bit but leaving the rest of the integer "all 1s". Then that's cast to type Py_ssize_t, which is a signed integral type with the same width as the standard size_t. In the end, you get the largest positive signed integer with the same width as size_t, and that's the intent. 
> I get a syntax error when I actually use PY_SSIZE_T_MAX somewhere in the > code. Nobody else does (PY_SSIZE_T_MAX is referenced in a few places already), so you need to give more information. Is it simply that you neglected to include Python.h in some extension module? The definition of size_t must be (according to the C standard) supplied by stdlib.h, and Python.h includes stdlib.h. It's also possible that some core Python C code doesn't #include enough stuff to get the platform's size_t definition. From greg.ewing at canterbury.ac.nz Sat Feb 18 04:27:13 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 18 Feb 2006 16:27:13 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43F69411.1020807@canterbury.ac.nz> Stephen J. Turnbull wrote: >>>>>>"Guido" == Guido van Rossum writes: > Guido> - b = bytes(t, enc); t = text(b, enc) > > +1 The coding conversion operation has always felt like a constructor > to me, and in this particular usage that's exactly what it is. I > prefer the nomenclature to reflect that. This also has the advantage that it completely avoids using the verbs "encode" and "decode" and the attendant confusion about which direction they go in. e.g. s = text(b, "base64") makes it obvious that you're going from the binary side to the text side of the base64 conversion. Greg From ncoghlan at gmail.com Sat Feb 18 04:43:42 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Feb 2006 13:43:42 +1000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <43F697EE.2030709@gmail.com> Guido van Rossum wrote: > But there were > several suggestions that this would be fine functionality to add to > the standard dict type -- and then I really don't see any other way to > do this.
Given the constructor problem, and the issue with "this function expects a plain dictionary", I think your original instinct to use a subclass may have been correct. The constructor is much cleaner that way: # bag like behavior dd = collections.autodict(int) for elem in collection: dd[elem] += 1 # setdefault-like behavior dd = collections.autodict(list) for page_number, page in enumerate(book): for word in page.split(): dd[word].append(page_number) And it can be a simple fact that for an autodict, "if key in d" and "d[key]" may give different answers. Much cleaner than making the semantics of normal dicts dependent on: a. whether or not on_missing has been overridden b. whether or not default_factory has been set Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From pje at telecommunity.com Sat Feb 18 04:51:13 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 17 Feb 2006 22:51:13 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <5.1.1.6.0.20060217224236.02237ec0@mail.telecommunity.com> At 11:58 AM 02/17/2006 -0800, Guido van Rossum wrote: >I forgot to mention in my revised proposal that the API for setting >the default_factory is slightly odd: > > d = {} # or dict() > d.default_factory = list > >rather than > > d = dict(default_factory=list) > >This is of course because we cut off that way when we defined what >arbitrary keyword arguments to the dict constructor would do. My >original proposal solved this by creating a subclass. But there were >several suggestions that this would be fine functionality to add to >the standard dict type -- and then I really don't see any other way to >do this. (Yes, I could have a set_default_factory() method -- but a >simple settable attribute seems more pythonic!) Why not a classmethod constructor: d = dict.with_factory(list) Admittedly, the name's not that great.
Actually, it's almost as bad as setdefault in some ways. But I'd rather set the default and create the dictionary in one operation, since when reading it as two, you first think 'd is a dictionary', and then 'oh, but it has a default factory', as opposed to "d is a dict with a factory" in one thought. But maybe that's just me. :) From guido at python.org Sat Feb 18 05:14:28 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2006 20:14:28 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F695CB.3060100@gmail.com> References: <43F65744.7050102@v.loewis.de> <43F695CB.3060100@gmail.com> Message-ID: On 2/17/06, Nick Coghlan wrote: > And this is where the question of whether has_key/__contains__ return True or > False when default_factory is set is important. If they return False, then the > LBYL (if key in d:) and EAFP (try/except) approaches give *different answers*. > > More importantly, LBYL will never have side effects, whereas EAFP may. > > If the methods always return True (as Martin suggests), then we retain the > current behaviour where there is no real difference between the two > approaches. Given the amount of time spent in recent years explaining this > fact, I don't think it is an equivalence that should be broken lightly (IOW, > I've persuaded myself that I agree with Martin) > > The alternative would be to have an additional query API "will_default" that > reflects whether or not a given key is actually present in the dictionary ("if > key not in d.keys()" would serve a similar purpose, but requires building the > list of keys). Looking at it from the "which invariants hold" POV isn't always the right perspective. Reality is that some amount of code that takes a dict won't work if you give it a dict with a default_factory. Well, that's nothing new.
Some code also breaks if you pass it a dict containing key or value types it doesn't expect, or if you pass it an anydbm instance, or os.environ on Windows (which implements case-insensitive keys). From the POV of someone who decides to use a dict with a default_factory (or overriding on_missing()), having the 'in' operator always return True is d*mn annoying -- it means that any kind of introspection of the dict doesn't work. Take for example the multiset use case. Suppose you're aware that you're using a dict with this special behavior. Now you've built up your multiset and now you want to use it. Part of your app is interested in knowing the list of values associated with each key. But another part may be interested only in whether a particular key has *any* values associated. If "key in d" returns whether that key is currently present, you can write if key in d: print "whatever" But under Martin and your proposed semantics, you'd have to write if d.get(key): print "whatever" or (worse) if d[key]: # inserts an empty list into the dict! print "whatever" I'd much rather be able to write "if key in d" and get the result I want... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From oliphant.travis at ieee.org Sat Feb 18 05:17:00 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Fri, 17 Feb 2006 21:17:00 -0700 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: <1f7befae0602171937s2e770577w94a9191887fb0e1d@mail.gmail.com> References: <43F3A7E4.1090505@v.loewis.de> <20060218005149.GE23859@xs4all.nl> <1f7befae0602171937s2e770577w94a9191887fb0e1d@mail.gmail.com> Message-ID: Tim Peters wrote: > [Travis Oliphant] > >>Maybe I have the wrong version of code. In my pyport.h (checked out >>from svn trunk) I have. >> >>#define PY_SSIZE_T_MAX ((Py_ssize_t)(((size_t)-1)>>1)) >> >>What is size_t? > > > size_t is an unsigned integral type defined by, required by, and used > all over the place in standard C.
What exactly is the compiler > message you get, and exactly which compiler are you using (note that > nobody else is having problems with this, so there's something unique > in your setup)? I'm very sorry for my silliness. I do see the problem I was having now. Thank you for helping me out. I was assuming that PY_SSIZE_T_MAX could be used in a pre-processor statement like LONG_MAX and INT_MAX. In other words #if PY_SSIZE_T_MAX != INT_MAX This was giving me errors and I tried to understand the #define statement as an arithmetic operation (not a type-casting one). I did know about size_t but thought it strange that 1 was being subtracted from it. I would have written this as (size_t)(-1) to avoid that confusion. I do apologize for my error. Thank you for taking the time to explain it. I still think that PY_SSIZE_T_MAX ought to be usable in a pre-processor statement, but it's a nit. Best, -Travis > > No. (size_t)-1 casts -1 to the unsigned integral type size_t, That's what I was missing I saw this as subtraction not type-casting. My mistake -Travis From jcarlson at uci.edu Sat Feb 18 05:33:16 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 17 Feb 2006 20:33:16 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F69411.1020807@canterbury.ac.nz> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> Message-ID: <20060217202813.5FA2.JCARLSON@uci.edu> Greg Ewing wrote: > > Stephen J. Turnbull wrote: > >>>>>>"Guido" == Guido van Rossum writes: > > > Guido> - b = bytes(t, enc); t = text(b, enc) > > > > +1 The coding conversion operation has always felt like a constructor > > to me, and in this particular usage that's exactly what it is. I > > prefer the nomenclature to reflect that. > > This also has the advantage that it competely > avoids using the verbs "encode" and "decode" > and the attendant confusion about which direction > they go in. > > e.g. 
> > s = text(b, "base64") > > makes it obvious that you're going from the > binary side to the text side of the base64 > conversion. But you aren't always getting *unicode* text from the decoding of bytes, and you may be encoding bytes *to* bytes: b2 = bytes(b, "base64") b3 = bytes(b2, "base64") Which direction are we going again? - Josiah From bokr at oz.net Sat Feb 18 05:41:07 2006 From: bokr at oz.net (Bengt Richter) Date: Sat, 18 Feb 2006 04:41:07 GMT Subject: [Python-Dev] Proposal: defaultdict References: <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> Message-ID: <43f6a154.1088724012@news.gmane.org> On Sat, 18 Feb 2006 00:52:51 +0100, "Martin v. Löwis" wrote: >Adam Olsen wrote: >> Consider these two pieces of code: >> >> if key in d: >> dosomething(d[key]) >> else: >> dosomethingelse() >> >> try: >> dosomething(d[key]) >> except KeyError: >> dosomethingelse() >> >> Before they were the same (assuming dosomething() won't raise >> KeyError). Now they would behave differently. > >I personally think they should continue to do the same thing, >i.e. "in" should return True if there is a default; in the >current proposal, it should invoke the default factory. > >But that's beside the point: Where is the real example >where this difference would matter? (I'm not asking for >a realistic example, I'm asking for a real one) > My guess is that realistically default_factory will be used to make clean code for filling a dict, and then turning the factory off if it's to be passed into unknown contexts. Those contexts can then use old code to do as above, or if worth it can temporarily set a factory to do some work. Tightly coupled code I guess could pass factory-enabled dicts between each other. IOW, no code should break unless you pass a factory-enabled dict where you shouldn't ;-) That said, maybe enabling/disabling could be separate from d.default_factory (e.g., d.defaults_enabled) as that could allow e.g.
foo(**kw) more options in how to copy kw and what foo could do. Would total copy including defaulting state be best? What other copies must be sanitized? type('Foo',(), **{'this':'one?'}) It will be interesting to see what comes out of the woodwork ;-) Regards, Bengt Richter From bob at redivi.com Sat Feb 18 06:10:04 2006 From: bob at redivi.com (Bob Ippolito) Date: Fri, 17 Feb 2006 21:10:04 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060217202813.5FA2.JCARLSON@uci.edu> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> Message-ID: <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: > > Greg Ewing wrote: >> >> Stephen J. Turnbull wrote: >>>>>>>> "Guido" == Guido van Rossum writes: >> >>> Guido> - b = bytes(t, enc); t = text(b, enc) >>> >>> +1 The coding conversion operation has always felt like a >>> constructor >>> to me, and in this particular usage that's exactly what it is. I >>> prefer the nomenclature to reflect that. >> >> This also has the advantage that it competely >> avoids using the verbs "encode" and "decode" >> and the attendant confusion about which direction >> they go in. >> >> e.g. >> >> s = text(b, "base64") >> >> makes it obvious that you're going from the >> binary side to the text side of the base64 >> conversion. > > But you aren't always getting *unicode* text from the decoding of > bytes, > and you may be encoding bytes *to* bytes: > > b2 = bytes(b, "base64") > b3 = bytes(b2, "base64") > > Which direction are we going again? This is *exactly* why the current set of codecs are INSANE. unicode.encode and str.decode should be used *only* for unicode codecs. Byte transforms are entirely different semantically and should be some other method pair. 
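A sketch of the separation Bob argues for: byte-to-byte transforms as ordinary functions whose direction is explicit in the name, leaving the encode/decode pair for the text boundary. This is illustrative only, not the proposed method pair, and it uses the same base64/zlib module functions Ian mentioned earlier:

```python
import base64
import zlib

payload = b"data in, data out -- data in, data out"

# bytes -> bytes transforms: plain functions, direction named explicitly
wire = base64.b64encode(zlib.compress(payload))
assert zlib.decompress(base64.b64decode(wire)) == payload

# text <-> bytes: the encode/decode pair, direction fixed by the types
as_text = wire.decode("ascii")
assert as_text.encode("ascii") == wire
```

With plain functions there is never any doubt which way a byte transform runs; the ambiguity only arises when both directions are spelled through the same encode/decode methods.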
-bob From aahz at pythoncraft.com Sat Feb 18 06:13:44 2006 From: aahz at pythoncraft.com (Aahz) Date: Fri, 17 Feb 2006 21:13:44 -0800 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6539F.5040707@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> Message-ID: <20060218051344.GC28761@panix.com> On Fri, Feb 17, 2006, "Martin v. Löwis" wrote: > Josiah Carlson wrote: >> >> How are users confused? > > Users do > > py> "Martin v. Löwis".encode("utf-8") > Traceback (most recent call last): > File "", line 1, in ? > UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: > ordinal not in range(128) > > because they want to convert the string "to Unicode", and they have > found a text telling them that .encode("utf-8") is a reasonable > method. The problem is that they don't understand that "Martin v. Löwis" is not Unicode -- once all strings are Unicode, this is guaranteed to work. While it's not absolutely true, my experience of watching Unicode confusion is that the simplest approach for newbies is: encode FROM Unicode, decode TO Unicode. Most people when they start playing with Unicode think of it as just another text encoding rather than suddenly replacing "the universe" as the most base form of text. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing."
--Alan Perlis From murman at gmail.com Sat Feb 18 06:38:52 2006 From: murman at gmail.com (Michael Urman) Date: Fri, 17 Feb 2006 23:38:52 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F64A7A.3060400@v.loewis.de> Message-ID: On 2/17/06, Adam Olsen wrote:

> if key in d:
>     dosomething(d[key])
> else:
>     dosomethingelse()
>
> try:
>     dosomething(d[key])
> except KeyError:
>     dosomethingelse()

I agree with the gut feeling that these should still do the same thing. Could we modify d.get() instead?

>>> class ddict(dict):
...     default_value_factory = None
...     def get(self, k, d=None):
...         v = super(ddict, self).get(k, d)
...         if v is not None or d is not None or self.default_value_factory is None:
...             return v
...         return self.setdefault(k, self.default_value_factory())
...
>>> d = ddict()
>>> d.default_value_factory = list
>>> d.get('list', [])
[]
>>> d['list']
Traceback (most recent call last):
  File "", line 1, in ?
KeyError: 'list'
>>> d.get('list').append(5)
>>> d['list']
[5]

There was never an exception raised by d.get so this wouldn't change (assuming the C is implemented more carefully than the python above). What are the problems with this other than, like setdefault, it only works on values with mutator methods (i.e., no counting dicts)? Is the lack of counting dicts that d.__getitem__ supports a deal breaker?

>>> d.default_value_factory = int
>>> d.get('count') += 1
SyntaxError: can't assign to function call

How does the above either in dict or a subclass compare to five line or smaller custom subclasses using something like the following?
def append(self, k, val):
    self.setdefault(k, []).append(val)

or

def accumulate(self, k, val):
    try:
        self[k] += val
    except KeyError:
        self[k] = val

Michael -- Michael Urman http://www.tortall.net/mu/blog From talin at acm.org Sat Feb 18 07:47:44 2006 From: talin at acm.org (Talin) Date: Fri, 17 Feb 2006 22:47:44 -0800 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: References: Message-ID: <43F6C310.1050307@acm.org> All right, the patch is up on SF. Sorry for the delay, I accidentally left my powerbook about an hour's drive away from home, and had to drive to go get it this morning :) To those who were asking what advantage the new syntax has - well, from a technical perspective there are none, since the underlying implementation is identical. The only (minor) difference is in the syntactical ambiguity, which both forms have - with lambda you can't be certain when to stop parsing the result expression, whereas with 'given' you can't be certain when to stop parsing the argument list. I see the primary advantage of the inline syntax as pedagogical - given a choice, I would rather explain the "given" syntax to a novice programmer than try to explain lambda. This is especially true given the similarity in form to generator expressions - in other words, once you've gone through the effort of explaining generator expressions, you can re-use most of that explanation when explaining "function expressions"; whereas with lambda, which looks like nothing else in Python, you have to start from scratch.
-- Talin From nnorwitz at gmail.com Sat Feb 18 07:53:19 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Fri, 17 Feb 2006 22:53:19 -0800 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: <20060217173043.GA14607@code0.codespeak.net> References: <20060217173043.GA14607@code0.codespeak.net> Message-ID: On 2/17/06, Armin Rigo wrote: > Hi, > > On Tue, Feb 14, 2006 at 09:24:57PM -0800, Neal Norwitz wrote: > > http://www.python.org/peps/pep-0356.html > > There is at least one SF bug, namely "#1333982 Bugs of the new AST > compiler", that in my humble opinion absolutely needs to be fixed before > the release, even though I won't hide that I have no intention of fixing > it myself. Should I raise the issue here in python-dev, and see if we > agree that it is critical? I agree it's critical. > (Sorry if I should know about the procedure. Does it then go in the > PEP's Planned Features list?) I don't think it belongs in the PEP. I bumped the priority to 7 which is the standard protocol, though I don't know that it's really followed. I will enumerate the existing problems for Jeremy in the bug report. In the future, I would also prefer separate bug reports. Feel free to assign new bugs to Jeremy too. :-) n From nnorwitz at gmail.com Sat Feb 18 08:01:45 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Fri, 17 Feb 2006 23:01:45 -0800 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: References: <43F3A7E4.1090505@v.loewis.de> <20060218005149.GE23859@xs4all.nl> <1f7befae0602171937s2e770577w94a9191887fb0e1d@mail.gmail.com> Message-ID: On 2/17/06, Travis E. Oliphant wrote: > > I'm very sorry for my silliness. I do see the problem I was having now. > Thank you for helping me out. I was assuming that PY_SSIZE_T_MAX > could be used in a pre-processor statement like LONG_MAX and INT_MAX. > > In other words > > #if PY_SSIZE_T_MAX != INT_MAX I suppose that might be nice, but would require configure magic. I'm not sure how it could be done on Windows. 
There are much more important problems to address at this point IMO. Just review the recent fixes related to Py_BuildValue() on python-checkins to see what I mean. n From jcarlson at uci.edu Sat Feb 18 08:05:48 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 17 Feb 2006 23:05:48 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> References: <20060217202813.5FA2.JCARLSON@uci.edu> <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> Message-ID: <20060217221623.5FA5.JCARLSON@uci.edu> Bob Ippolito wrote: > > > On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: > > > > > Greg Ewing wrote: > >> > >> Stephen J. Turnbull wrote: > >>>>>>>> "Guido" == Guido van Rossum writes: > >> > >>> Guido> - b = bytes(t, enc); t = text(b, enc) > >>> > >>> +1 The coding conversion operation has always felt like a > >>> constructor > >>> to me, and in this particular usage that's exactly what it is. I > >>> prefer the nomenclature to reflect that. > >> > >> This also has the advantage that it competely > >> avoids using the verbs "encode" and "decode" > >> and the attendant confusion about which direction > >> they go in. > >> > >> e.g. > >> > >> s = text(b, "base64") > >> > >> makes it obvious that you're going from the > >> binary side to the text side of the base64 > >> conversion. > > > > But you aren't always getting *unicode* text from the decoding of > > bytes, > > and you may be encoding bytes *to* bytes: > > > > b2 = bytes(b, "base64") > > b3 = bytes(b2, "base64") > > > > Which direction are we going again? > > This is *exactly* why the current set of codecs are INSANE. > unicode.encode and str.decode should be used *only* for unicode > codecs. Byte transforms are entirely different semantically and > should be some other method pair. The problem is that we are overloading data types. Strings (and bytes) can contain both encoded text as well as data, or even encoded data. 
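The overloading Josiah describes can be made concrete with a small sketch (modern bytes spelling used purely for illustration): the same byte string is simultaneously raw data and latin-1-encoded text, and only context says which.

```python
# Hypothetical illustration: one byte string, two readings.
raw = b"\x00\x01\x02\x03\xe9"

# Read as data: just a hex dump.
assert raw.hex() == "00010203e9"

# Read as encoded text: meaningless until a codec is named.
assert raw[4:].decode("latin-1") == "\u00e9"   # e-acute in latin-1
```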
Unless the plan is to make bytes _only_ contain encoded unicode, or _only_ data, or _only_ encoded data, the confusion for users may continue. Me, I'm a fan of education. Educating your users is simple, and if you have good exceptions and documentation, it gets easier. Raise an exception when a user tries to use a codec which doesn't have a particular source ('...'.decode('utf-8') should raise an error like "Cannot use text as a source for 'utf-8' decoding", when unicode/text becomes the default format for string literals). Tossing out bytes.encode(), as well as decodings for bytes->bytes, also brings up the issue of text.decode() for pure text transformations. Are we going to push all of those transformations somewhere else? Look at what we've currently got going for data transformations in the standard library to see what these removals will do: base64 module, binascii module, binhex module, uu module, ... Do we want or need to add another top-level module for every future encoding/codec that comes out (or does everyone think that we're done seeing codecs)? Do we want to keep monkey-patching binascii with names like 'a2b_hqx'? While there is currently one text->text transform (rot13), do we add another module for text->text transforms? Would it start having names like t2e_rot13() and e2t_rot13()? Educate the users. Raise better exceptions telling people why their encoding or decoding failed, as Ian Bicking already pointed out. If bytes.encode() and the equivalent of text.decode() is going to disappear, Bengt Richter had a good idea with bytes.recode() for strictly bytes transformations (and the equivalent for text), though it is ambiguous as to the direction; are we encoding or decoding with bytes.recode()? In my opinion, this is why .encode() and .decode() makes sense to keep on both bytes and text, the direction is unambiguous, and if one has even a remote idea of what the heck the codec is, they know their result. 
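For comparison, the binascii names mentioned above do make the direction explicit, at the cost of one function pair per codec; a quick sketch of the existing base64 pair:

```python
import binascii

data = b"Which direction are we going again?"
encoded = binascii.b2a_base64(data)     # b2a: binary -> ascii characters (as bytes)
decoded = binascii.a2b_base64(encoded)  # a2b: ascii characters -> binary
assert decoded == data
assert encoded.endswith(b"\n")          # b2a_base64 appends a newline
```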
- Josiah From ilya at bluefir.net Sat Feb 18 08:03:42 2006 From: ilya at bluefir.net (Ilya Sandler) Date: Fri, 17 Feb 2006 23:03:42 -0800 (PST) Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <5.1.1.6.0.20060217224236.02237ec0@mail.telecommunity.com> References: <5.1.1.6.0.20060217224236.02237ec0@mail.telecommunity.com> Message-ID: On Fri, 17 Feb 2006, Phillip J. Eby wrote: > > d = {} # or dict() > > d.default_factory = list > > Why not a classmethod constructor: > > d = dict.with_factory(list) > > But I'd rather set the default and create the dictionary in one operation, since when reading it as two, you first think > 'd is a dictionary', and then 'oh, but it has a default factory', as opposed to "d is a dict with a factory" in one thought. Also, a class method would mean less typing (esp if the dictionary name happens to be longer than a couple of characters ;-) But I'd like to suggest a different name: d = dict.with_default(list) Ilya From ncoghlan at gmail.com Sat Feb 18 08:23:20 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 Feb 2006 17:23:20 +1000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F65744.7050102@v.loewis.de> <43F695CB.3060100@gmail.com> Message-ID: <43F6CB68.5080108@gmail.com> Guido van Rossum wrote: > I'd much rather be able to write "if key in d" and get the result I want... Somewhere else in this byzantine thread, I realised that what was really bothering me was the conditional semantics that dict ended up with (i.e., its behaviour changed significantly if the default factory was set). If we go back to your idea of collections.defaultdict (or Alex's name collections.autodict), then the change in semantics bothers me a lot less, and I'd be all in favour of the more usual variant (where "key in d" implies "key in d.keys()").
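A toy subclass makes the "more usual variant" concrete (autodict is Alex's proposed name; this sketch is hypothetical, not the actual proposal's implementation):

```python
class autodict(dict):
    """Sketch: __getitem__ supplies a default, while "key in d"
    keeps its usual meaning of "key in d.keys()"."""
    def __init__(self, default_factory):
        super().__init__()
        self.default_factory = default_factory

    def __getitem__(self, key):
        if key not in self:               # membership is unaffected
            self[key] = self.default_factory()
        return super().__getitem__(key)

d = autodict(list)
assert 'x' not in d          # "in" does not invoke the factory
d['x'].append(1)             # item access creates and stores the default
assert 'x' in d and d['x'] == [1]
```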
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From bokr at oz.net Sat Feb 18 08:24:31 2006 From: bokr at oz.net (Bengt Richter) Date: Sat, 18 Feb 2006 07:24:31 GMT Subject: [Python-Dev] bytes.from_hex() References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> Message-ID: <43f6abab.1091371449@news.gmane.org> On Fri, 17 Feb 2006 20:33:16 -0800, Josiah Carlson wrote: > >Greg Ewing wrote: >> >> Stephen J. Turnbull wrote: >> >>>>>>"Guido" == Guido van Rossum writes: >> >> > Guido> - b = bytes(t, enc); t = text(b, enc) >> > >> > +1 The coding conversion operation has always felt like a constructor >> > to me, and in this particular usage that's exactly what it is. I >> > prefer the nomenclature to reflect that. >> >> This also has the advantage that it competely >> avoids using the verbs "encode" and "decode" >> and the attendant confusion about which direction >> they go in. >> >> e.g. >> >> s = text(b, "base64") >> >> makes it obvious that you're going from the >> binary side to the text side of the base64 >> conversion. > >But you aren't always getting *unicode* text from the decoding of bytes, >and you may be encoding bytes *to* bytes: > > b2 = bytes(b, "base64") > b3 = bytes(b2, "base64") > >Which direction are we going again? Well, base64 is probably not your best example, because it necessarily involves characters ;-) If you are using "base64" you are looking at characters in your input to produce your bytes output. The only way you can see characters in bytes input is to decode them. So you are hiding your assumption about b's encoding. You can make useful rules of inference from type(b), but with bytes you really don't know. "base64" has to interpret b bytes as characters, because that's what it needs to recognize base64 characters, to produce the output bytes. 
The characters in b could be encoded in plain ascii, or utf16le, you have to know. So for utf16le it should be b2 = bytes(text(b, 'utf16le'), "base64") just because you assume an implicit b2 = bytes(text(b, 'ascii'), "base64") doesn't make it so in general. Even if you build that assumption in, it's not really true that you are going "bytes *to* bytes" without characters involved when you do bytes(b, "base64"). You have just left undocumented an API restriction (assert ) and an implementation optimization ;-) This is the trouble with str.encode and unicode.decode. They both hide implicit decodes and encodes respectively. They should be banned IMO. Let people spell it out and maybe understand what they are doing. OTOH, a bytes-to-bytes codec might be decompressing tgz into tar. For conceptual consistency, one might define a 'bytes' encoding that conceptually turns bytes into unicode byte characters and vice versa. Then "gunzip" can decode bytes, producing unicode characters which are then encoded back to bytes from the unicode ;-) The 'bytes' encoding would numerically be just like latin-1 except on the unicode side it would have wrapped-bytes internal representation. b_tar = bytes(text(b_tgz, 'gunzip'), 'bytes') of course, text(b_tgz, 'gunzip') would produce unicode text with a special internal representation that just wraps bytes though they are true unicode. The 'bytes' codec encode of course would just unwrap the internal bytes representation, but it would conceptually be an encoding into bytes. bytes(t, 'latin-1') would produce the same output from the wrapped bytes unicode. Sometimes conceptual purity can clarify things and sometimes it's just another confusing description. 
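Bengt's point that base64 is defined over *characters*, which themselves need a byte encoding, can be shown directly (modern spelling; 'ascii' and 'utf-16-le' stand in for the two candidate encodings from the example above):

```python
import base64

chars = base64.b64encode(b"\x00\xff").decode("ascii")  # the base64 characters
assert chars == "AP8="

# The same four characters serialize to different bytes depending on
# the (usually implicit) character encoding chosen for them.
assert chars.encode("ascii") == b"AP8="
assert chars.encode("utf-16-le") == b"A\x00P\x008\x00=\x00"
```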
Regards, Bengt Richter From martin at v.loewis.de Sat Feb 18 08:33:35 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 08:33:35 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F66B17.9080504@colorstudy.com> References: <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> <43F66B17.9080504@colorstudy.com> Message-ID: <43F6CDCF.8080309@v.loewis.de> Ian Bicking wrote: > Well, here's a kind of an example: WSGI specifies that the environment > must be a dictionary, and nothing but a dictionary. I think it would > have to be updated to say that it must be a dictionary with > default_factory not set, as default_factory would break the > predictability that was the reason WSGI specified exactly a dictionary > (and not a dictionary-like object or subclass). So there's something > that becomes brokenish. I don't understand. In the rationale of PEP 333, it says "The rationale for requiring a dictionary is to maximize portability between servers. The alternative would be to define some subset of a dictionary's methods as being the standard and portable interface." That rationale is not endangered: if the environment continues to be a dict exactly, servers continue to be guaranteed what precise set of operations is available on the environment. Of course, that may change from Python version to Python version, as new dict methods get defined. But that should have been clear when the PEP was written: the dict type itself may evolve, providing additional features that weren't present in earlier versions. Even now, some dict implementations have setdefault(), others don't. > KeyError is one of > those errors that you *expect* to happen (maybe the "Error" part is a > misnomer); having it disappear is a major change. Well, as you say: you get a KeyError if there is an error with the key. With a default_factory, there isn't normally an error with the key. 
> Also, I believe there's two ways to handle thread safety, both of which > are broken: > > 1) d[key] gets the GIL, and thus while default_factory is being called > the GIL is locked > > 2) d[key] doesn't get the GIL and so d[key].append(1) may not actually > lead to 1 being in d[key] if another thread is appending something to > the same key at the same time, and the key is not yet present in d. It's 1), primarily. If default_factory is written in Python, though (e.g. if it is *not* list()), the interpreter will give up the GIL every N byte code instructions (or when a blocking operation is executed). Notice the same issue already exists with __hash__ for the key. Also notice that the same issue already exists with any kind of manipulation of a dictionary in multiple threads, today: if you do

try:
    d[k].append(v)
except KeyError:
    d[k] = [v]

then two threads might interleavingly execute the except-suite. Regards, Martin From martin at v.loewis.de Sat Feb 18 09:08:00 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 09:08:00 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F65744.7050102@v.loewis.de> Message-ID: <43F6D5E0.1050006@v.loewis.de> Adam Olsen wrote: > Demo/metaclass/Meta.py:55 That wouldn't break. If you had actually read the code, you would have seen it is

try:
    ga = dict['__getattr__']
except KeyError:
    pass

How would it break if dict had a default factory? ga would get the __getattr__ value, and everything would be fine. The KeyError is ignored, after all. > Demo/tkinter/guido/AttrDialog.py:121 # Subclasses override self.classes Hmm

try:
    cl = self.classes[c]
except KeyError:
    cl = 'unknown'

So cl wouldn't be 'unknown'. Why would that be a problem? > Lib/ConfigParser.py:623

try:
    v = map[var]
except KeyError:
    raise InterpolationMissingOptionError(
        option, section, rest, var)

So there is no InterpolationMissingOptionError. *Of course not*.
The whole point would be to provide a value for all interpolation variables. > Lib/random.py:315 This entire function samples k elements with indices between 0 and len(population). Now, people "shouldn't" be passing dictionaries in, in the first place; that specific code tests whether there are valid values at indices 0, n//2, and n. If the dictionary isn't really a sequence (i.e. if it doesn't provide values at all indices), the function may later fail even if it passes that test. With a default-valued dictionary, the function would not fail, but a large number of samples might be the default value. > Lib/string.py:191 Same as ConfigParser: the interpolation will always succeed, interpolating all values (rather than leaving $identifier in the string). That would be precisely the expected behaviour. > Lib/weakref.py:56 # Currently uses UserDict but I assume it will > switch to dict eventually Or, rather, UserDict might grow the on_missing feature as well. That is irrelevant for this issue, though:

o = self.data[key]()
if o is None:
    raise KeyError, key   # line 56
else:
    return o

So we are looking for lookup failures in self.data, here: self.dict is initialized to {} in UserDict, with no default factory. So there cannot be a change in behaviour. > Perhaps the KeyError shouldn't ever get triggered in this case, I'm not sure. I think that's beside the point though. The programmer clearly expected it would. No. I now see your problem: An "except KeyError" does *not* mean that the programmer "clearly expects it will" raise a KeyError. Instead, the programmer expects it *might* raise a KeyError, and tries to deal with this situation. If the situation doesn't arise, the code continues just fine.
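The interleaving Martin mentions, and the usual single-operation workaround, can be sketched as runnable code (names hypothetical):

```python
d = {}

def add_racy(k, v):
    # Two threads can both take the except-suite; the later
    # assignment then discards the other thread's list.
    try:
        d[k].append(v)
    except KeyError:
        d[k] = [v]

def add_atomic(k, v):
    # setdefault performs the lookup-or-insert as one dict
    # operation, so no update is lost.
    d.setdefault(k, []).append(v)

for i in range(3):
    add_atomic("key", i)
assert d["key"] == [0, 1, 2]
```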
Regards, Martin From martin at v.loewis.de Sat Feb 18 09:21:04 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 09:21:04 +0100 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: References: <43F3A7E4.1090505@v.loewis.de> <20060218005149.GE23859@xs4all.nl> <1f7befae0602171937s2e770577w94a9191887fb0e1d@mail.gmail.com> Message-ID: <43F6D8F0.2060808@v.loewis.de> Neal Norwitz wrote: > I suppose that might be nice, but would require configure magic. I'm > not sure how it could be done on Windows. Contributions are welcome. On Windows, it can be hard-coded. Actually, something like #if SIZEOF_SIZE_T == SIZEOF_INT #define PY_SSIZE_T_MAX INT_MAX #elif SIZEOF_SIZE_T == SIZEOF_LONG #define PY_SSIZE_T_MAX LONG_MAX #else #error What is size_t equal to? #endif might work. > There are much more important problems to address at this point IMO. > Just review the recent fixes related to Py_BuildValue() on > python-checkins to see what I mean. Nevertheless, it would be desirable IMO if it expanded to a literal, so that the preprocessor could understand it. Regards, Martin From rrr at ronadam.com Sat Feb 18 09:35:24 2006 From: rrr at ronadam.com (Ron Adam) Date: Sat, 18 Feb 2006 02:35:24 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060217221623.5FA5.JCARLSON@uci.edu> References: <20060217202813.5FA2.JCARLSON@uci.edu> <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> <20060217221623.5FA5.JCARLSON@uci.edu> Message-ID: <43F6DC4C.1070100@ronadam.com> Josiah Carlson wrote: > Bob Ippolito wrote: >> >> On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: >> >>> Greg Ewing wrote: >>>> Stephen J. Turnbull wrote: >>>>>>>>>> "Guido" == Guido van Rossum writes: >>>>> Guido> - b = bytes(t, enc); t = text(b, enc) >>>>> >>>>> +1 The coding conversion operation has always felt like a >>>>> constructor >>>>> to me, and in this particular usage that's exactly what it is. I >>>>> prefer the nomenclature to reflect that. 
>>>> This also has the advantage that it competely >>>> avoids using the verbs "encode" and "decode" >>>> and the attendant confusion about which direction >>>> they go in. >>>> >>>> e.g. >>>> >>>> s = text(b, "base64") >>>> >>>> makes it obvious that you're going from the >>>> binary side to the text side of the base64 >>>> conversion. >>> But you aren't always getting *unicode* text from the decoding of >>> bytes, >>> and you may be encoding bytes *to* bytes: >>> >>> b2 = bytes(b, "base64") >>> b3 = bytes(b2, "base64") >>> >>> Which direction are we going again? >> This is *exactly* why the current set of codecs are INSANE. >> unicode.encode and str.decode should be used *only* for unicode >> codecs. Byte transforms are entirely different semantically and >> should be some other method pair. > > The problem is that we are overloading data types. Strings (and bytes) > can contain both encoded text as well as data, or even encoded data. Right > Educate the users. Raise better exceptions telling people why their > encoding or decoding failed, as Ian Bicking already pointed out. If > bytes.encode() and the equivalent of text.decode() is going to disappear, +1 on better documentation all around with regards to encodings and Unicode. So far the best explanation I've found (so far) is in PEP 100. The Python docs and built in help hardly explain more than the minimal argument list for the encoding and decoding methods, and the str and unicode type constructor arguments aren't explained any better. > Bengt Richter had a good idea with bytes.recode() for strictly bytes > transformations (and the equivalent for text), though it is ambiguous as > to the direction; are we encoding or decoding with bytes.recode()? In > my opinion, this is why .encode() and .decode() makes sense to keep on > both bytes and text, the direction is unambiguous, and if one has even a > remote idea of what the heck the codec is, they know their result. 
> > - Josiah I like the bytes.recode() idea a lot. +1 It seems to me it's a far more useful idea than encoding and decoding by overloading and could do both and more. It has a lot of potential to be an intermediate step for encoding as well as being used for many other translations to byte data. I think I would prefer that encode and decode be just functions with well defined names and arguments instead of being methods or arguments to string and Unicode types. I'm not sure on exactly how this would work. Maybe it would need two sets of encodings, i.e. decoders and encoders. An exception would be given if it wasn't found for the direction one was going in. Roughly... something or other like:

import encodings

def tostr(obj, encoding):   # encodings.tostr
    if encoding not in encoders:
        raise LookupError('encoding not found in encoders')
    # check if obj works with encoding to string
    # ...
    b = bytes(obj).recode(encoding)
    return str(b)

def tounicode(obj, decoding):   # encodings.tounicode
    if decoding not in decoders:
        raise LookupError('decoding not found in decoders')
    # check if obj works with decoding to unicode
    # ...
    b = bytes(obj).recode(decoding)
    return unicode(b)

Anyway... food for thought. Cheers, Ronald Adam From g.brandl at gmx.net Sat Feb 18 09:38:45 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 18 Feb 2006 09:38:45 +0100 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: <43F61442.6050003@colorstudy.com> <43F6655E.1040308@colorstudy.com> Message-ID: Guido van Rossum wrote: > WFM. Patch anyone? Done. http://python.org/sf/1434038 Georg > On 2/17/06, Ian Bicking wrote: >> Alex Martelli wrote: >> > Maybe we could fix that by having property(getfunc) use >> > getfunc.__doc__ as the __doc__ of the resulting property object >> > (easily overridable in more normal property usage by the doc= >> > argument, which, I feel, should almost invariably be there).
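The behaviour Alex asks for (and the SF patch above supplies) is that property falls back to the getter's docstring when no doc= argument is given; a minimal sketch:

```python
class Thermo:
    def _get_temp(self):
        "Temperature in celsius."
        return 21

    temp = property(_get_temp)                    # no doc= given
    temp2 = property(_get_temp, doc="Override.")  # doc= wins

assert Thermo.temp.__doc__ == "Temperature in celsius."
assert Thermo.temp2.__doc__ == "Override."
```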
From martin at v.loewis.de Sat Feb 18 09:59:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 09:59:38 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <20060218051344.GC28761@panix.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <20060218051344.GC28761@panix.com> Message-ID: <43F6E1FA.7080909@v.loewis.de> Aahz wrote: > The problem is that they don't understand that "Martin v. L?wis" is not > Unicode -- once all strings are Unicode, this is guaranteed to work. This specific call, yes. I don't think the problem will go away as long as both encode and decode are available for both strings and byte arrays. > While it's not absolutely true, my experience of watching Unicode > confusion is that the simplest approach for newbies is: encode FROM > Unicode, decode TO Unicode. I think this is what should be in-grained into the library, also. It shouldn't try to give additional meaning to these terms. Regards, Martin From jcarlson at uci.edu Sat Feb 18 10:16:07 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 18 Feb 2006 01:16:07 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F6DC4C.1070100@ronadam.com> References: <20060217221623.5FA5.JCARLSON@uci.edu> <43F6DC4C.1070100@ronadam.com> Message-ID: <20060218005534.5FA8.JCARLSON@uci.edu> Ron Adam wrote: > Josiah Carlson wrote: > > Bengt Richter had a good idea with bytes.recode() for strictly bytes > > transformations (and the equivalent for text), though it is ambiguous as > > to the direction; are we encoding or decoding with bytes.recode()? In > > my opinion, this is why .encode() and .decode() makes sense to keep on > > both bytes and text, the direction is unambiguous, and if one has even a > > remote idea of what the heck the codec is, they know their result. 
> > - Josiah > > I like the bytes.recode() idea a lot. +1 > > > > It seems to me it's a far more useful idea than encoding and decoding by > > overloading and could do both and more. It has a lot of potential to be > > an intermediate step for encoding as well as being used for many other > > translations to byte data. Indeed it does. > > I think I would prefer that encode and decode be just functions with > > well defined names and arguments instead of being methods or arguments > > to string and Unicode types. Attaching it to string and unicode objects is a useful convenience. Just like x.replace(y, z) is a convenience for string.replace(x, y, z). Tossing the encode/decode somewhere else, like encodings, or even string, I see as a backwards step. > > I'm not sure on exactly how this would work. Maybe it would need two > > sets of encodings, ie.. decoders, and encoders. An exception would be > > given if it wasn't found for the direction one was going in. > > > > Roughly... something or other like: > > > > import encodings > > > > encodings.tostr(obj, encoding): > > if encoding not in encoders: > > raise LookupError 'encoding not found in encoders' > > # check if obj works with encoding to string > > # ... > > b = bytes(obj).recode(encoding) > > return str(b) > > > > encodings.tounicode(obj, decoding): > > if decoding not in decoders: > > raise LookupError 'decoding not found in decoders' > > # check if obj works with decoding to unicode > > # ... > > b = bytes(obj).recode(decoding) > > return unicode(b) > > > > Anyway... food for thought. Again, the problem is ambiguity; what does bytes.recode(something) mean? Are we encoding _to_ something, or are we decoding _from_ something? Are we going to need to embed the direction in the encoding/decoding name (to_base64, from_base64, etc.)? That doesn't seem any better than binascii.b2a_base64. What about .reencode and .redecode?
It seems as though the 're' added as a prefix to .encode and .decode makes it clearer that you get the same type back as you put in, and it is also unambiguous to direction. The question remains: is str.decode() returning a string or unicode depending on the argument passed, when the argument quite literally names the codec involved, difficult to understand? I don't believe so; am I the only one? - Josiah From walter at livinglogic.de Sat Feb 18 10:44:15 2006 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Sat, 18 Feb 2006 10:44:15 +0100 (CET) Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F63A35.5080500@colorstudy.com> Message-ID: <61121.89.54.8.114.1140255855.squirrel@isar.livinglogic.de> Guido van Rossum wrote: > On 2/17/06, Ian Bicking wrote: >> Guido van Rossum wrote: >> > d = {} >> > d.default_factory = set >> > ... >> > d[key].add(value) >> >> Another option would be: >> >> d = {} >> d.default_factory = set >> d.get_default(key).add(value) >> >> Unlike .setdefault, this would use a factory associated with the dictionary, and no default value would get passed in. >> Unlike the proposal, this would not override __getitem__ (not overriding >> __getitem__ is really the only difference with the proposal). It would be clear reading the code that you were not >> implicitly asserting they "key in d" was true. >> >> "get_default" isn't the best name, but another name isn't jumping out at me at the moment. Of course, it is not a Pythonic >> argument to say that an existing method should be overridden, or functionality made nameless simply because we can't think >> of a name (looking to anonymous functions of course ;) > > I'm torn. 
While trying to implement this I came across some ugliness in PyDict_GetItem() -- it would make sense if this also > called > on_missing(), but it must return a value without incrementing its > refcount, and isn't supposed to raise exceptions -- so what to do if on_missing() returns a value that's not inserted in the > dict? > > If the __getattr__()-like operation that supplies and inserts a > dynamic default was a separate method, we wouldn't have this problem. > > OTOH most reviewers here seem to appreciate on_missing() as a way to do various other ways of altering a dict's > __getitem__() behavior behind a caller's back -- perhaps it could even be (ab)used to > implement case-insensitive lookup. I don't like the fact that on_missing()/default_factory can change the behaviour of __getitem__, which up to now has been something simple and understandable. Why don't we put the on_missing()/default_factory functionality into get() instead? d.get(key, default) does what it did before. d.get(key) invokes on_missing() (and dict would have default_factory == type(None)) Bye, Walter Dörwald From mal at egenix.com Sat Feb 18 12:06:37 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 18 Feb 2006 12:06:37 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6539F.5040707@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> Message-ID: <43F6FFBD.7080704@egenix.com> Martin v. Löwis wrote: >> How are users confused? > > Users do > > py> "Martin v. Löwis".encode("utf-8") > Traceback (most recent call last): > File "<stdin>", line 1, in ? > UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11: > ordinal not in range(128) > > because they want to convert the string "to Unicode", and they have > found a text telling them that .encode("utf-8") is a reasonable > method.
> > What it *should* tell them is > > py> "Martin v. Löwis".encode("utf-8") > Traceback (most recent call last): > File "<stdin>", line 1, in ? > AttributeError: 'str' object has no attribute 'encode' I've already explained why we have .encode() and .decode() methods on strings and Unicode many times. I've also explained the misunderstanding that codecs can only do Unicode-string conversions. And I've explained that the .encode() and .decode() methods *do* check the return types of the codecs and only allow strings or Unicode on return (no lists, instances, tuples or anything else). You seem to ignore this fact. If we were to follow your idea, we should remove .encode() and .decode() altogether and refer users to the codecs.encode() and codecs.decode() functions. However, I doubt that users will like this idea. >> bytes.encode CAN only produce bytes. > > I don't understand MAL's design, but I believe in that design, > bytes.encode could produce anything (say, a list). A codec > can convert anything to anything else. True. However, note that the .encode()/.decode() methods on strings and Unicode narrow down the possible return types. The corresponding .bytes methods should only allow bytes and Unicode. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From mwh at python.net Sat Feb 18 12:25:26 2006 From: mwh at python.net (Michael Hudson) Date: Sat, 18 Feb 2006 11:25:26 +0000 Subject: [Python-Dev] Serial function call composition syntax foo(x, y) -> bar() -> baz(z) In-Reply-To: (Guido van Rossum's message of "Fri, 17 Feb 2006 16:03:20 -0800") References: <43f61ed8.1055320110@news.gmane.org> <43f663bd.1072957381@news.gmane.org> Message-ID: <2mek20aoyh.fsf@starship.python.net> "Guido van Rossum" writes: > It's only me that's allowed to top-post. :-) At least you include attributions these days! Cheers, mwh -- SPIDER: 'Scuse me. [scuttles off] ZAPHOD: One huge spider. FORD: Polite though. -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From thomas at xs4all.net Sat Feb 18 12:33:58 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sat, 18 Feb 2006 12:33:58 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6FFBD.7080704@egenix.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> Message-ID: <20060218113358.GG23859@xs4all.nl> On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote: > I've already explained why we have .encode() and .decode() > methods on strings and Unicode many times. I've also > explained the misunderstanding that codecs can only do > Unicode-string conversions. And I've explained that > the .encode() and .decode() methods *do* check the return > types of the codecs and only allow strings or Unicode > on return (no lists, instances, tuples or anything else). > > You seem to ignore this fact. Actually, I think the problem is that while we all agree the bytestring/unicode methods are a useful way to convert from bytestring to unicode and back again, we disagree on their *general* usefulness. Sure, the codecs mechanism is powerful, and even more so because they can determine their own return type.
But it still smells and feels like a Perl attitude, for the reasons already explained numerous times, as well: - The return value for the non-unicode encodings depends on the value of the encoding argument. - The general case, by and large, especially in non-powerusers, is to encode unicode to bytestrings and to decode bytestrings to unicode. And that is a hard enough task for many of the non-powerusers. Being able to use the encode/decode methods for other tasks isn't helping them. That is why I disagree with the hypergeneralization of the encode/decode methods, regardless of the fact that it is a natural expansion of the implementation of codecs. Sure, it looks 'right' and 'natural' when you look at the implementation. It sure doesn't look natural, to me and to many others, when you look at the task of encoding and decoding bytestrings/unicode. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mwh at python.net Sat Feb 18 12:44:23 2006 From: mwh at python.net (Michael Hudson) Date: Sat, 18 Feb 2006 11:44:23 +0000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: (Guido van Rossum's message of "Fri, 17 Feb 2006 14:15:39 -0800") References: <43F63A35.5080500@colorstudy.com> Message-ID: <2maccoao2w.fsf@starship.python.net> "Guido van Rossum" writes: > I'm torn. While trying to implement this I came across some ugliness > in PyDict_GetItem() -- it would make sense if this also called > on_missing(), but it must return a value without incrementing its > refcount, and isn't supposed to raise exceptions This last bit has been a painful lie for quite some time. I don't know what can be done about it, though -- avoid the use of PyDict_GetItem() in situations where you don't expect string only dicts (so using it on globals and instance dicts would still be ok)? > -- so what to do if > on_missing() returns a value that's not inserted in the dict? 
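For concreteness, the default_factory/on_missing() behaviour being debated (a factory consulted only on lookup misses, with the result inserted into the dict) can be sketched as a dict subclass with a miss hook. The class and method names below are illustrative only; at this point in the thread nothing had been decided:

```python
class DefaultDict(dict):
    """Sketch of the default_factory idea as a dict subclass."""

    def __init__(self, default_factory=None, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Called by __getitem__ only when normal lookup fails.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value      # insert, so the next lookup is a plain hit
        return value

d = DefaultDict(set)
d["key"].add("value")          # no setdefault() dance needed
assert d["key"] == {"value"}
assert "other" not in d        # membership tests do not trigger the factory
```

Note that only __getitem__ grows the new behaviour; get(), in, and friends are untouched, which is exactly the boundary the discussion keeps circling.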
Well, like some others I am a bit uncomfortable with changing the semantics of such an important operation on such an important data structure. But then I'm also not that unhappy with setdefault, so I must be weird. > If the __getattr__()-like operation that supplies and inserts a > dynamic default was a separate method, we wouldn't have this problem. Yes. > OTOH most reviewers here seem to appreciate on_missing() as a way to > do various other ways of altering a dict's __getitem__() behavior > behind a caller's back -- perhaps it could even be (ab)used to > implement case-insensitive lookup. Well, I'm not sure I do. There seems to be quite a conceptual difference between being able to make a new kind of dictionary and mess with the behaviour of one that exists already, but I don't know if that matters in practice (the fact that you can currently do things like "import sys; sys.__dict__.clear()" doesn't seem to cause real problems). Finally, I'll just note that subclassing to modify the behaviour of a builtin type has generally been actively discouraged in python so far. If all dictionary lookups went through a method that you could override in Python (i.e. subclasses could replace ma_lookup, in effect) this would be easy to do in Python code. But they don't, and bug reports suggesting that they do have been rejected in the past (and I agree with the rejection, fwiw). So that rambled a bit. But in essence: I'd much prefer an addition of a method or a type to modification of existing behaviour. Cheers, mwh -- If you're talking "useful", I'm not your bot. -- Tim Peters, 08 Nov 2001 From mal at egenix.com Sat Feb 18 12:44:27 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 18 Feb 2006 12:44:27 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
In-Reply-To: <43F6338D.8050300@v.loewis.de> References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> <43F50BDD.4010106@v.loewis.de> <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> Message-ID: <43F7089B.8040707@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: >> Just because some codecs don't fit into the string.decode() >> or bytes.encode() scenario doesn't mean that these codecs are >> useless or that the methods should be banned. > > No. The reason to ban string.decode and bytes.encode is that > it confuses users. Instead of starting to ban everything that can potentially confuse a few users, we should educate those users and tell them what these methods mean and how they should be used. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Sat Feb 18 13:01:34 2006 From: mwh at python.net (Michael Hudson) Date: Sat, 18 Feb 2006 12:01:34 +0000 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F6539F.5040707@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Fri, 17 Feb 2006 23:52:15 +0100") References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> Message-ID: <2m64ncana9.fsf@starship.python.net> This posting is entirely tangential. Be warned. "Martin v. Löwis" writes: > It's worse than that. The return *type* depends on the *value* of > the argument.
> I think there is little precedent for that: There's one extremely significant example where the *value* of something impacts on the type of something else: functions. The types of everything involved in str([1]) and len([1]) are the same but the results are different. This shows up in PyPy's type annotation; most of the time we just track types indeed, but when something is called we need to have a pretty good idea of the potential values, too. Relevant to the point at hand? No. Apologies for wasting your time :) Cheers, mwh -- The ultimate laziness is not using Perl. That saves you so much work you wouldn't believe it if you had never tried it. -- Erik Naggum, comp.lang.lisp From rrr at ronadam.com Sat Feb 18 13:17:42 2006 From: rrr at ronadam.com (Ron Adam) Date: Sat, 18 Feb 2006 06:17:42 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060218005534.5FA8.JCARLSON@uci.edu> References: <20060217221623.5FA5.JCARLSON@uci.edu> <43F6DC4C.1070100@ronadam.com> <20060218005534.5FA8.JCARLSON@uci.edu> Message-ID: <43F71066.70000@ronadam.com> Josiah Carlson wrote: > Ron Adam wrote: >> Josiah Carlson wrote: >>> Bengt Richter had a good idea with bytes.recode() for strictly bytes >>> transformations (and the equivalent for text), though it is ambiguous as >>> to the direction; are we encoding or decoding with bytes.recode()? In >>> my opinion, this is why .encode() and .decode() makes sense to keep on >>> both bytes and text, the direction is unambiguous, and if one has even a >>> remote idea of what the heck the codec is, they know their result. >>> >>> - Josiah >> I like the bytes.recode() idea a lot. +1 >> >> It seems to me it's a far more useful idea than encoding and decoding by >> overloading and could do both and more. It has a lot of potential to be >> an intermediate step for encoding as well as being used for many other >> translations to byte data. > > Indeed it does.
> >> I think I would prefer that encode and decode be just functions with >> well defined names and arguments instead of being methods or arguments >> to string and Unicode types. > > Attaching it to string and unicode objects is a useful convenience. > Just like x.replace(y, z) is a convenience for string.replace(x, y, z) . > Tossing the encode/decode somewhere else, like encodings, or even string, > I see as a backwards step. > >> I'm not sure on exactly how this would work. Maybe it would need two >> sets of encodings, ie.. decoders, and encoders. An exception would be >> given if it wasn't found for the direction one was going in. >> >> Roughly... something or other like: >> >> import encodings >> >> encodings.tostr(obj, encoding): >> if encoding not in encoders: >> raise LookupError 'encoding not found in encoders' >> # check if obj works with encoding to string >> # ... >> b = bytes(obj).recode(encoding) >> return str(b) >> >> encodings.tounicode(obj, decoding): >> if decoding not in decoders: >> raise LookupError 'decoding not found in decoders' >> # check if obj works with decoding to unicode >> # ... >> b = bytes(obj).recode(decoding) >> return unicode(b) >> >> Anyway... food for thought. > > Again, the problem is ambiguity; what does bytes.recode(something) mean? > Are we encoding _to_ something, or are we decoding _from_ something? This was just an example of one way that might work, but here are my thoughts on why I think it might be good. In this case, the ambiguity is reduced as far as the encoding and decoding operations are concerned. somestring = encodings.tostr( someunicodestr, 'latin-1') It's pretty clear what is happening to me. It will encode the object named someunicodestr to a string with the 'latin-1' encoder. And it will also result in clear errors if the specified encoding is unavailable, and if it is, if it's not compatible with the given *someunicodestr* obj type. Further hints could be gained by:
help(encodings.tostr) Which could result in... something like... """ encoding.tostr(<obj>, <encoding name>) -> string Encode a unicode string using an encoder codec to a non-unicode string or transform a non-unicode string to another non-unicode string using an encoder codec. """ And if that's not enough, then help(encodings) could give more clues. These steps would be what I would do. And then the next thing would be to find the python docs entry on encodings. Placing them in encodings seems like a fairly good place to look for these functions if you are working with encodings. So I find that just as convenient as having them be string methods. There is no intermediate default encoding involved above, (the bytes object is used instead), so you wouldn't get some of the messages the present system results in when ascii is the default. (Yes, I know it won't when P3K is here also) > Are we going to need to embed the direction in the encoding/decoding > name (to_base64, from_base64, etc.)? That doesn't seem any better than > binascii.b2a_base64. No, that's why I suggested two separate lists (or dictionaries might be better). They can contain the same names, but the lists they are in determine the context and point to the needed codec. And that step is abstracted out by putting it inside the encodings.tostr() and encodings.tounicode() functions. So either function would call 'base64' from the correct codec list and get the correct encoding or decoding codec it needs. > What about .reencode and .redecode? It seems as > though the 're' added as a prefix to .encode and .decode makes it > clearer that you get the same type back as you put in, and it is also > unambiguous to direction. But then wouldn't we end up with a multitude of ways to do things?
s.encode(codec) == s.redecode(codec) s.decode(codec) == s.reencode(codec) unicode(s, codec) == s.decode(codec) str(u, codec) == u.encode(codec) str(s, codec) == s.encode(codec) unicode(s, codec) == s.reencode(codec) str(u, codec) == s.redecode(codec) str(s, codec) == s.redecode(codec) Umm .. did I miss any? Which ones would you remove? Which ones of those will succeed with which codecs? The method bytes.recode() always does a byte transformation which can be almost anything. It's the context bytes.recode() is used in that determines what's happening. In the above cases, it's using an encoding transformation, so what it's doing is precisely what you would expect by its context. There isn't a bytes.decode(), since that's just another transformation. So only the one method is needed. Which makes it easier to learn. > The question remains: is str.decode() returning a string or unicode > depending on the argument passed, when the argument quite literally > names the codec involved, difficult to understand? I don't believe so; > am I the only one? > > - Josiah Using help(str.decode) and help(str.encode) gives: S.decode([encoding[,errors]]) -> object S.encode([encoding[,errors]]) -> object These look an awful lot alike. The descriptions are nearly identical as well. The Python docs just reproduce (or close to) the doc strings with only a very small amount of additional words. Learning how the current system works comes awfully close to reverse engineering. Maybe I'm overstating it a bit, but I suspect many end up doing exactly that in order to learn how Python does it. Or they go with the first solution that seems to work and hope for the best. I believe that's what Martin said earlier in this thread. It's much too late (or early now) to think further on this. So until tomorrow. (please ignore typos) ;-) Cheers, Ronald Adam From mal at egenix.com Sat Feb 18 13:21:18 2006 From: mal at egenix.com (M.-A.
Lemburg) Date: Sat, 18 Feb 2006 13:21:18 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <20060218113358.GG23859@xs4all.nl> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <20060218113358.GG23859@xs4all.nl> Message-ID: <43F7113E.8090300@egenix.com> Thomas Wouters wrote: > On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote: > >> I've already explained why we have .encode() and .decode() >> methods on strings and Unicode many times. I've also >> explained the misunderstanding that codecs can only do >> Unicode-string conversions. And I've explained that >> the .encode() and .decode() methods *do* check the return >> types of the codecs and only allow strings or Unicode >> on return (no lists, instances, tuples or anything else). >> >> You seem to ignore this fact. > > Actually, I think the problem is that while we all agree the > bytestring/unicode methods are a useful way to convert from bytestring to > unicode and back again, we disagree on their *general* usefulness. Sure, the > codecs mechanism is powerful, and even more so because they can determine > their own return type. But it still smells and feels like a Perl attitude, > for the reasons already explained numerous times, as well: It's by no means a Perl attitude. The main reason is symmetry and the fact that strings and Unicode should be as similar as possible in order to simplify the task of moving from one to the other. > - The return value for the non-unicode encodings depends on the value of > the encoding argument. Not really: you'll always get a basestring instance. > - The general case, by and large, especially in non-powerusers, is to > encode unicode to bytestrings and to decode bytestrings to unicode. And > that is a hard enough task for many of the non-powerusers.
Being able to > use the encode/decode methods for other tasks isn't helping them. Agreed. Still, I believe that this is an educational problem. There are a couple of gotchas users will have to be aware of (and this is unrelated to the methods in question): * "encoding" always refers to transforming original data into a derived form * "decoding" always refers to transforming a derived form of data back into its original form * for Unicode codecs the original form is Unicode, the derived form is, in most cases, a string As a result, if you want to use a Unicode codec such as utf-8, you encode Unicode into a utf-8 string and decode a utf-8 string into Unicode. Encoding a string is only possible if the string itself is original data, e.g. some data that is supposed to be transformed into a base64 encoded form. Decoding Unicode is only possible if the Unicode string itself represents a derived form, e.g. a sequence of hex literals. > That is why I disagree with the hypergeneralization of the encode/decode > methods, regardless of the fact that it is a natural expansion of the > implementation of codecs. Sure, it looks 'right' and 'natural' when you look > at the implementation. It sure doesn't look natural, to me and to many > others, when you look at the task of encoding and decoding > bytestrings/unicode. That's because you only look at one specific task. Codecs also unify the various interfaces to common encodings such as base64, uu or zip which are not Unicode related. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From pierre.barbier at cirad.fr Sat Feb 18 12:53:25 2006 From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille) Date: Sat, 18 Feb 2006 12:53:25 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <17397.58935.8669.271616@montanaro.dyndns.org> References: <20060216212133.GB23859@xs4all.nl> <17397.58935.8669.271616@montanaro.dyndns.org> Message-ID: <20060218125325.pz5tgfem0qdcssc0@monbureau3.cirad.fr> An embedded and charset-unspecified text was scrubbed... Name: not available Url: http://mail.python.org/pipermail/python-dev/attachments/20060218/8025fdb4/attachment.asc From rhamph at gmail.com Sat Feb 18 14:19:26 2006 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 18 Feb 2006 06:19:26 -0700 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060217221623.5FA5.JCARLSON@uci.edu> References: <20060217202813.5FA2.JCARLSON@uci.edu> <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> <20060217221623.5FA5.JCARLSON@uci.edu> Message-ID: On 2/18/06, Josiah Carlson wrote: > Look at what we've currently got going for data transformations in the > standard library to see what these removals will do: base64 module, > binascii module, binhex module, uu module, ... Do we want or need to > add another top-level module for every future encoding/codec that comes > out (or does everyone think that we're done seeing codecs)? Do we want > to keep monkey-patching binascii with names like 'a2b_hqx'? While there > is currently one text->text transform (rot13), do we add another module > for text->text transforms? Would it start having names like t2e_rot13() > and e2t_rot13()? If top-level modules are the problem then why not make codecs into a package? from codecs import utf8, base64 utf8.encode(u) -> b utf8.decode(b) -> u base64.encode(b) -> b base64.decode(b) -> b -- Adam Olsen, aka Rhamphoryncus From mal at egenix.com Sat Feb 18 14:44:29 2006 From: mal at egenix.com (M.-A. 
Lemburg) Date: Sat, 18 Feb 2006 14:44:29 +0100 Subject: [Python-Dev] A codecs nit In-Reply-To: <1140040661.14818.42.camel@geddy.wooz.org> References: <43F397F6.4090402@egenix.com> <1140040661.14818.42.camel@geddy.wooz.org> Message-ID: <43F724BD.9060000@egenix.com> Barry Warsaw wrote: > On Wed, 2006-02-15 at 22:07 +0100, M.-A. Lemburg wrote: > >> Those are not pseudo-encodings, they are regular codecs. >> >> It's a common misunderstanding that codecs are only seen as serving >> the purpose of converting between Unicode and strings. >> >> The codec system is deliberately designed to be general enough >> to also work with many other types, e.g. it is easily possible to >> write a codec that converts between the hex literal sequence you >> have above to a list of ordinals: > > Slightly off-topic, but one thing that's always bothered me about the > current codecs implementation is that str.encode() (and friends) > implicitly treats its argument as module, and imports it, even if the > module doesn't live in the encodings package. That seems like a mistake > to me (and a potential security problem if the import has side-effects). It was a mistake, yes, and thanks for bringing this up. Codec packages should implement and register their own codec search functions. > I don't know whether at the very least restricting the imports to the > encodings package would make sense or would break things. > >>>> import sys >>>> sys.modules['smtplib'] > Traceback (most recent call last): > File "<stdin>", line 1, in ? > KeyError: 'smtplib' >>>> ''.encode('smtplib') > Traceback (most recent call last): > File "<stdin>", line 1, in ? > LookupError: unknown encoding: smtplib >>>> sys.modules['smtplib'] > > > I can't see any reason for allowing any randomly importable module to > act like an encoding. The encodings package search function will try to import the module and then check the module signature. If the module fails to export the codec registration API, then it raises the LookupError you see above. At the time, it was nice to be able to write codec packages as Python packages and have them readily usable by just putting the package on the sys.path. This was a side-effect of the way the encodings search function worked. The original design idea was to have all 3rd party codecs register themselves with the codec registry. However, this implies that the application using the codecs would have to run the registration code at least once.
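A sketch of that registration path: the codec name and the transform below are invented for illustration, and the example uses the present-day CodecInfo spelling, but the search-function protocol (return None for names you don't own, a codec entry for names you do) is the one being described:

```python
import codecs

def search(name):
    # Hypothetical third-party codec registering itself explicitly,
    # instead of relying on being importable via the encodings package.
    if name != "hexutf8":
        return None   # not ours; let other search functions try

    def encode(input, errors="strict"):
        data = input.encode("utf-8").hex().encode("ascii")
        return data, len(input)

    def decode(input, errors="strict"):
        text = bytes.fromhex(bytes(input).decode("ascii")).decode("utf-8")
        return text, len(input)

    return codecs.CodecInfo(encode, decode, name="hexutf8")

codecs.register(search)   # the one-time registration step discussed above

assert "caf\xe9".encode("hexutf8") == b"636166c3a9"
assert b"636166c3a9".decode("hexutf8") == "caf\xe9"
```

Once registered, the name resolves through the normal codec registry, with no import of a module named after the encoding.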
If the module fails to export the codec registration API, then it raises the LookupError you see above. At the time, it was nice to be able to write codec packages as Python packages and have them readily usable by just putting the package on the sys.path. This was a side-effect of the way the encodings search function worked. The original design idea was to have all 3rd party codecs register themselves with the codec registry. However, this implies that the application using the codecs would have to run the registration code at least ones. Since the encodings package search function provided a more convenient way, this was used by most codec package programmers. In Py 2.5 we'll change that. The encodings package search function will only allow codecs in that package to be imported. All other codec packages will have to provide their own search function and register this with the codecs registry. The big question is: what to do about 2.3 and 2.4 - adding the same patch will cause serious breakage, since popular codec packages such as Tamito's Japanese package rely on the existing behavior. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From skip at pobox.com Sat Feb 18 15:50:18 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 18 Feb 2006 08:50:18 -0600 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <43F6C310.1050307@acm.org> References: <43F6C310.1050307@acm.org> Message-ID: <17399.13354.22638.464024@montanaro.dyndns.org> talin> ... whereas with 'given' you can't be certain when to stop talin> parsing the argument list. 
So require parens around the arglist: (x*y given (x, y)) Skip From richard.m.tew at gmail.com Fri Feb 17 20:48:55 2006 From: richard.m.tew at gmail.com (Richard Tew) Date: Fri, 17 Feb 2006 19:48:55 +0000 Subject: [Python-Dev] Stackless Python sprint at PyCon 2006 Message-ID: <952d92df0602171148h791a5fa1gddf3e6a3f7c68ecd@mail.gmail.com> Hi, During the sprint period after PyCon, we are planning on sprinting to bring Stackless up to date and to make it more current and approachable. A key part of this is porting it and the recently completed 64 bit changes that have been made to it to the latest version of Python. At the end of the sprint we hope to have up to date working 32 and 64 bit versions. If anyone on this list who is attending PyCon has some time to spare during the sprint period and an interest in perhaps getting more familiar with Stackless, you would be more than welcome in joining us to help out. Familiarity with the Python source code and its workings would be a great help in the work we hope to get done. Especially participants with an interest in ensuring and testing that the ported code works on other platforms than those we will be developing on (Windows XP and Windows XP x64 edition). Obviously being the most familiar with the Stackless Python source code, Christian Tismer has kindly offered us guidance by acting as the coach for the sprint, taking time away from the PyPy sprint. In any case, if you have any questions, or are interested, please feel free to reply, whether here, to this email address or to richard at ccpgames.com. Thanks, Richard Tew Senior Programmer CCP Games You can read more about the sprint and the scheduled talk about how Stackless is used in the massively multiplayer game EVE Online we make, at PyCon at the following URL: http://www.stackless.com/Members/rmtew/pycon2006 And don't forget the Stackless website :) http://www.stackless.com/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060217/7dda4c13/attachment.htm From aleaxit at gmail.com Sat Feb 18 16:24:41 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Sat, 18 Feb 2006 07:24:41 -0800 Subject: [Python-Dev] The decorator(s) module In-Reply-To: References: <43F61442.6050003@colorstudy.com> <43F6655E.1040308@colorstudy.com> Message-ID: <2C516D2C-FFEE-4BD9-AF92-0D95A9B0229C@gmail.com> On Feb 18, 2006, at 12:38 AM, Georg Brandl wrote: > Guido van Rossum wrote: >> WFM. Patch anyone? > > Done. > http://python.org/sf/1434038 I reviewed the patch and added a comment on it, but since the point may be controversial I had better air it here for discussion: in 2.4, property(fset=acallable) does work (maybe silly, but it does make a write-only property) -- with the patch as given, it would stop working (due to attempts to get __doc__ from the None value of fget); I think we should ensure it keeps working (and add a unit test to that effect). Alex From aahz at pythoncraft.com Sat Feb 18 16:32:41 2006 From: aahz at pythoncraft.com (Aahz) Date: Sat, 18 Feb 2006 07:32:41 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F6DC4C.1070100@ronadam.com> References: <20060217202813.5FA2.JCARLSON@uci.edu> <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> <20060217221623.5FA5.JCARLSON@uci.edu> <43F6DC4C.1070100@ronadam.com> Message-ID: <20060218153241.GA14054@panix.com> On Sat, Feb 18, 2006, Ron Adam wrote: > > I like the bytes.recode() idea a lot. +1 > > It seems to me it's a far more useful idea than encoding and decoding by > overloading and could do both and more. It has a lot of potential to be > an intermediate step for encoding as well as being used for many other > translations to byte data. > > I think I would prefer that encode and decode be just functions with > well defined names and arguments instead of being methods or arguments > to string and Unicode types. > > I'm not sure on exactly how this would work. 
Maybe it would need two > sets of encodings, ie.. decoders, and encoders. An exception would be > given if it wasn't found for the direction one was going in. Here's an idea I don't think I've seen before: bytes.recode(b, src_encoding, dest_encoding) This requires the user to state up-front what the source encoding is. One of the big problems that I see with the whole encoding mess is that so much of it contains implicit assumptions about the source encoding; this gets away from that. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From mal at egenix.com Sat Feb 18 16:47:08 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 18 Feb 2006 16:47:08 +0100 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060218153241.GA14054@panix.com> References: <20060217202813.5FA2.JCARLSON@uci.edu> <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> <20060217221623.5FA5.JCARLSON@uci.edu> <43F6DC4C.1070100@ronadam.com> <20060218153241.GA14054@panix.com> Message-ID: <43F7417C.6070507@egenix.com> Aahz wrote: > On Sat, Feb 18, 2006, Ron Adam wrote: >> I like the bytes.recode() idea a lot. +1 >> >> It seems to me it's a far more useful idea than encoding and decoding by >> overloading and could do both and more. It has a lot of potential to be >> an intermediate step for encoding as well as being used for many other >> translations to byte data. >> >> I think I would prefer that encode and decode be just functions with >> well defined names and arguments instead of being methods or arguments >> to string and Unicode types. >> >> I'm not sure on exactly how this would work. Maybe it would need two >> sets of encodings, ie.. decoders, and encoders. An exception would be >> given if it wasn't found for the direction one was going in. 
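In today's Python 3 terms, Aahz's recode(b, src_encoding, dest_encoding) idea above amounts to a short helper; the function name and signature are the proposal's, not an existing API:

```python
import codecs

def recode(data, src_encoding, dest_encoding):
    # The caller states the source encoding up front; Unicode is only
    # the intermediate form, so no implicit default encoding is involved.
    text = codecs.decode(data, src_encoding)
    return codecs.encode(text, dest_encoding)

recode(b"caf\xe9", "latin-1", "utf-8")  # -> b'caf\xc3\xa9'
```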
> > Here's an idea I don't think I've seen before: > > bytes.recode(b, src_encoding, dest_encoding) > > This requires the user to state up-front what the source encoding is. > One of the big problems that I see with the whole encoding mess is that > so much of it contains implicit assumptions about the source encoding; > this gets away from that. You might want to look at the codecs.py module: it has all these things and a lot more. http://docs.python.org/lib/module-codecs.html http://svn.python.org/view/python/trunk/Lib/codecs.py?view=markup -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From walter at livinglogic.de Sat Feb 18 17:11:39 2006 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Sat, 18 Feb 2006 17:11:39 +0100 (CET) Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F63C07.9030901@egenix.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> <43F5FD88.8090605@livinglogic.de> <43F63C07.9030901@egenix.com> Message-ID: <61425.89.54.8.114.1140279099.squirrel@isar.livinglogic.de> M.-A. Lemburg wrote: > Walter D?rwald wrote: >>>>> I'd suggest we keep codecs.lookup() the way it is and >>>>> instead add new functions to the codecs module, e.g. >>>>> codecs.getencoderobject() and codecs.getdecoderobject(). 
>>>>> >>>>> Changing the codec registration is not much of a problem: >>>>> we could simply allow 6-tuples to be passed into the >>>>> registry. >>>> OK, so codecs.lookup() returns 4-tuples, but the registry stores 6-tuples and the search functions must return 6-tuples. >>>> And we add codecs.getencoderobject() and codecs.getdecoderobject() as well as new classes codecs.StatefulEncoder and >>>> codecs.StatefulDecoder. What about old search functions that return 4-tuples? >>> >>> The registry should then simply set the missing entries to None and the getencoderobject()/getdecoderobject() would then >>> have >>> to raise an error. >> >> Sounds simple enough and we don't lose backwards compatibility. >> >>> Perhaps we should also deprecate codecs.lookup() in Py 2.5 ?! >> >> +1, but I'd like to have a replacement for this, i.e. a function that returns all info the registry has about an encoding: >> >> 1. Name >> 2. Encoder function >> 3. Decoder function >> 4. Stateful encoder factory >> 5. Stateful decoder factory >> 6. Stream writer factory >> 7. Stream reader factory >> >> and if this is an object with attributes, we won't have any problems if we extend it in the future. > > Shouldn't be a problem: just expose the registry dictionary > via the _codecs module. > > The rest can then be done in a Python function defined in > codecs.py using a CodecInfo class. This would require the Python code to call codecs.lookup() and then look into the codecs dictionary (normalizing the encoding name again). Maybe we should make a version of _PyCodec_Lookup() that allows 4- and 6-tuples available to Python and use that? The official PyCodec_Lookup() would then have to downgrade the 6-tuples to 4-tuples. >> BTW, if we change the API, can we fix the return value of the stateless functions? As the stateless function always >> encodes/decodes the complete string, returning the length of the string doesn't make sense.
>> codecs.getencoder() and codecs.getdecoder() would have to continue to return the old variant of the functions, but >> codecs.getinfo("latin-1").encoder would be the new encoding function. > No: you can still write stateless encoders or decoders that do > not process the whole input string. Just because we don't have > any of those in Python, doesn't mean that they can't be written > and used. A stateless codec might want to leave the work > of buffering bytes at the end of the input data which cannot > be processed to the caller. But what would the caller do with that info? It can't retry encoding/decoding the rejected input, because the state of the codec has been thrown away already. > It is also possible to write > stateful codecs on top of such stateless encoding and decoding > functions. That's what the codec helper functions from Python/_codecs.c are for. Anyway, I've started implementing a patch that just adds codecs.StatefulEncoder/codecs.StatefulDecoder. UTF8, UTF8-Sig, UTF-16, UTF-16-LE and UTF-16-BE are already working. Bye, Walter D?rwald From martin at v.loewis.de Sat Feb 18 17:15:14 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 17:15:14 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F6FFBD.7080704@egenix.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> Message-ID: <43F74812.1080505@v.loewis.de> M.-A. Lemburg wrote: > I've already explained why we have .encode() and .decode() > methods on strings and Unicode many times. I've also > explained the misunderstanding that codecs can only do
And I've explained that > the .encode() and .decode() method *do* check the return > types of the codecs and only allow strings or Unicode > on return (no lists, instances, tuples or anything else). > > You seem to ignore this fact. I'm not ignoring the fact that you have explained this many times. I just fail to understand your explanations. For example, you said at some point that codecs are not restricted to Unicode. However, I don't recall any explanation what the restriction *is*, if any restriction exists. No such restriction seems to be documented. > True. However, note that the .encode()/.decode() methods on > strings and Unicode narrow down the possible return types. > The corresponding .bytes methods should only allow bytes and > Unicode. I forgot that: what is the rationale for that restriction? Regards, Martin From martin at v.loewis.de Sat Feb 18 17:22:01 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 17:22:01 +0100 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <2m64ncana9.fsf@starship.python.net> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <2m64ncana9.fsf@starship.python.net> Message-ID: <43F749A9.80301@v.loewis.de> Michael Hudson wrote: > There's one extremely significant example where the *value* of > something impacts on the type of something else: functions. The types > of everything involved in str([1]) and len([1]) are the same but the > results are different. This shows up in PyPy's type annotation; most > of the time we just track types indeed, but when something is called > we need to have a pretty good idea of the potential values, too. > > Relavent to the point at hand? No. Apologies for wasting your time > :) Actually, I think it is relevant. I never thought about it this way, but now that you mention it, you are right. 
This demonstrates that the string argument to .encode is actually a function name, at least the way it is implemented now. So .encode("uu") and .encode("rot13") are *two* different methods, instead of being a single method. This brings me back to my original point: "rot13" should be a function, not a parameter to some function. In essence, .encode reimplements apply(), with the added feature of not having to pass the function itself, but just its name. Maybe this design results from a really deep understanding of "Namespaces are one honking great idea -- let's do more of those!" Regards, Martin From martin at v.loewis.de Sat Feb 18 17:28:28 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 17:28:28 +0100 Subject: [Python-Dev] Stackless Python sprint at PyCon 2006 In-Reply-To: <952d92df0602171148h791a5fa1gddf3e6a3f7c68ecd@mail.gmail.com> References: <952d92df0602171148h791a5fa1gddf3e6a3f7c68ecd@mail.gmail.com> Message-ID: <43F74B2C.205@v.loewis.de> Richard Tew wrote: > If anyone on this list who is attending PyCon, has some time to spare > during the sprint period and an interest in perhaps getting more > familiar with Stackless, you would be more than welcome in joining us to > help out. Familiarity with the Python source code and its workings > would be a great help in the work we hope to get done. Especially > participants with an interest in ensuring and testing that the porting > done works on other platforms than those we will be developing on > (Windows XP and Windows XP x64 edition). If you are going to work on XP x64, make sure you have the latest platform SDK installed on these machines. I plan to build AMD64 binaries with the platform SDK, not with VS 2005.
Regards, Martin From talin at acm.org Sat Feb 18 17:49:29 2006 From: talin at acm.org (Talin) Date: Sat, 18 Feb 2006 08:49:29 -0800 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <17399.13354.22638.464024@montanaro.dyndns.org> References: <43F6C310.1050307@acm.org> <17399.13354.22638.464024@montanaro.dyndns.org> Message-ID: <43F75019.2050707@acm.org> skip at pobox.com wrote: > talin> ... whereas with 'given' you can't be certain when to stop > talin> parsing the argument list. > >So require parens around the arglist: > > (x*y given (x, y)) > >Skip > > I would not be opposed to mandating the parens, and its an easy enough change to make. The patch on SF lets you do it both ways, which will give people who are interested a chance to get a feel for the various alternatives. I realize of course that this is a moot point. But perhaps I can help to winnow down the dozens of rejected lambda replacement proposals to just a few rejected lamda proposals :) -- Talin From mal at egenix.com Sat Feb 18 18:10:14 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 18 Feb 2006 18:10:14 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F74812.1080505@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <43F74812.1080505@v.loewis.de> Message-ID: <43F754F6.9050204@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: >> I've already explained why we have .encode() and .decode() >> methods on strings and Unicode many times. I've also >> explained the misunderstanding that can codecs only do >> Unicode-string conversions. And I've explained that >> the .encode() and .decode() method *do* check the return >> types of the codecs and only allow strings or Unicode >> on return (no lists, instances, tuples or anything else). >> >> You seem to ignore this fact. 
> > I'm not ignoring the fact that you have explained this > many times. I just fail to understand your explanations. Feel free to ask questions. > For example, you said at some point that codecs are not > restricted to Unicode. However, I don't recall any > explanation what the restriction *is*, if any restriction > exists. No such restriction seems to be documented. The codecs are not restricted with respect to the data types they work on. It's up to the codecs to define which data types are valid and which they take on input and return. >> True. However, note that the .encode()/.decode() methods on >> strings and Unicode narrow down the possible return types. >> The corresponding .bytes methods should only allow bytes and >> Unicode. > > I forgot that: what is the rationale for that restriction? To assure that only those types can be returned from those methods, i.e. instances of basestring, which in turn permits type inference for those methods. The codecs functions encode() and decode() don't have these restrictions, and thus provide a generic interface to the codec's encode and decode functions. It's up to the caller to restrict the allowed encodings and as a result the possible input/output types. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Sat Feb 18 18:24:46 2006 From: mal at egenix.com (M.-A.
Lemburg) Date: Sat, 18 Feb 2006 18:24:46 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <61425.89.54.8.114.1140279099.squirrel@isar.livinglogic.de> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> <43F5FD88.8090605@livinglogic.de> <43F63C07.9030901@egenix.com> <61425.89.54.8.114.1140279099.squirrel@isar.livinglogic.de> Message-ID: <43F7585E.4080909@egenix.com> Walter D?rwald wrote: > M.-A. Lemburg wrote: >> Walter D?rwald wrote: >>>>>> I'd suggest we keep codecs.lookup() the way it is and >>>>>> instead add new functions to the codecs module, e.g. >>>>>> codecs.getencoderobject() and codecs.getdecoderobject(). >>>>>> >>>>>> Changing the codec registration is not much of a problem: >>>>>> we could simply allow 6-tuples to be passed into the >>>>>> registry. >>>>> OK, so codecs.lookup() returns 4-tuples, but the registry stores 6-tuples and the search functions must return 6-tuples. >>>>> And we add codecs.getencoderobject() and codecs.getdecoderobject() as well as new classes codecs.StatefulEncoder and >>>>> codecs.StatefulDecoder. What about old search functions that return 4-tuples? >>>> The registry should then simply set the missing entries to None and the getencoderobject()/getdecoderobject() would then >>>> have >>>> to raise an error. >>> Sounds simple enough and we don't loose backwards compatibility. >>> >>>> Perhaps we should also deprecate codecs.lookup() in Py 2.5 ?! >>> +1, but I'd like to have a replacement for this, i.e. a function that returns all info the registry has about an encoding: >>> >>> 1. Name >>> 2. Encoder function >>> 3. Decoder function >>> 4. Stateful encoder factory >>> 5. 
Stateful decoder factory >>> 6. Stream writer factory >>> 7. Stream reader factory >>> >>> and if this is an object with attributes, we won't have any problems if we extend it in the future. >> Shouldn't be a problem: just expose the registry dictionary >> via the _codecs module. >> >> The rest can then be done in a Python function defined in >> codecs.py using a CodecInfo class. > > This would require the Python code to call codecs.lookup() and then look into the codecs dictionary (normalizing the encoding > name again). Maybe we should make a version of __PyCodec_Lookup() that allows 4- and 6-tuples available to Python and use that? > The official PyCodec_Lookup() would then have to downgrade the 6-tuples to 4-tuples. Hmm, you're right: the dictionary may not have the requested codec info yet (it's only used as cache) and only a call to _PyCodec_Lookup() would fill it. >>> BTW, if we change the API, can we fix the return value of the stateless functions? As the stateless function always >>> encodes/decodes the complete string, returning the length of the string doesn't make sense. >>> codecs.getencoder() and codecs.getdecoder() would have to continue to return the old variant of the functions, but >>> codecs.getinfo("latin-1").encoder would be the new encoding function. >> No: you can still write stateless encoders or decoders that do >> not process the whole input string. Just because we don't have >> any of those in Python, doesn't mean that they can't be written >> and used. A stateless codec might want to leave the work >> of buffering bytes at the end of the input data which cannot >> be processed to the caller. > > But what would the call do with that info? It can't retry encoding/decoding the rejected input, because the state of the codec > has been thrown away already. This depends a lot on the nature of the codec. It may well be possible to work on chunks of input data in a stateless way, e.g. 
say you have a string of 4-byte hex values, then the decode function would be able to work on 4 bytes each and let the caller buffer any remaining bytes for the next call. There'd be no need for keeping state in the decoder function. >> It is also possible to write >> stateful codecs on top of such stateless encoding and decoding >> functions. > > That's what the codec helper functions from Python/_codecs.c are for. I'm not sure what you mean here. > Anyway, I've started implementing a patch that just adds codecs.StatefulEncoder/codecs.StatefulDecoder. UTF8, UTF8-Sig, UTF-16, > UTF-16-LE and UTF-16-BE are already working. Nice :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at gmail.com Sat Feb 18 19:11:43 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 19 Feb 2006 04:11:43 +1000 Subject: [Python-Dev] Adventures with ASTs - Inline Lambda In-Reply-To: <43F75019.2050707@acm.org> References: <43F6C310.1050307@acm.org> <17399.13354.22638.464024@montanaro.dyndns.org> <43F75019.2050707@acm.org> Message-ID: <43F7635F.6030402@gmail.com> Talin wrote: > skip at pobox.com wrote: > >> talin> ... whereas with 'given' you can't be certain when to stop >> talin> parsing the argument list. >> >> So require parens around the arglist: >> >> (x*y given (x, y)) >> >> Skip >> >> > I would not be opposed to mandating the parens, and its an easy enough > change to make. The patch on SF lets you do it both ways, which will > give people who are interested a chance to get a feel for the various > alternatives. 
Another ambiguity is that when they're optional it is unclear whether or not adding them means the callable now expects a tuple argument (i.e., doubled parens at the call site). If they're mandatory, then it is clear that only doubled parentheses at the definition point require doubled parentheses at the call site (this is, not coincidentally, exactly the same rule as applies for normal functions). > I realize of course that this is a moot point. But perhaps I can help to > winnow down the dozens of rejected lambda replacement proposals to just > a few rejected lambda proposals :) Heh. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Sat Feb 18 19:19:55 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 19:19:55 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] In-Reply-To: <43F754F6.9050204@egenix.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <43F74812.1080505@v.loewis.de> <43F754F6.9050204@egenix.com> Message-ID: <43F7654B.5030302@v.loewis.de> M.-A. Lemburg wrote: >>>True. However, note that the .encode()/.decode() methods on >>>strings and Unicode narrow down the possible return types. >>>The corresponding .bytes methods should only allow bytes and >>>Unicode. >> >>I forgot that: what is the rationale for that restriction? > > > To assure that only those types can be returned from those > methods, i.e. instances of basestring, which in turn permits > type inference for those methods. Hmm. So it is for type inference? Where is that documented? This looks pretty inconsistent.
Either codecs can give arbitrary return types, then .encode/.decode should also be allowed to give arbitrary return types, or codecs should be restricted. What's the point of first allowing a wide interface, and then narrowing it? Also, if type inference is the goal, what is the point in allowing two result types? Regards, Martin From foom at fuhm.net Sat Feb 18 19:44:01 2006 From: foom at fuhm.net (James Y Knight) Date: Sat, 18 Feb 2006 13:44:01 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F6CDCF.8080309@v.loewis.de> References: <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> <43F66B17.9080504@colorstudy.com> <43F6CDCF.8080309@v.loewis.de> Message-ID: On Feb 18, 2006, at 2:33 AM, Martin v. L?wis wrote: > I don't understand. In the rationale of PEP 333, it says > "The rationale for requiring a dictionary is to maximize portability > between servers. The alternative would be to define some subset of a > dictionary's methods as being the standard and portable interface." > > That rationale is not endangered: if the environment continues to > be a dict exactly, servers continue to be guaranteed what precise > set of operations is available on the environment. Yes it is endangered. > Well, as you say: you get a KeyError if there is an error with the > key. > With a default_factory, there isn't normally an error with the key. But there should be. Consider the case of two servers. One which takes all the items out of the dictionary (using items()) and puts them in some other data structure. Then it checks if the "Date" header has been set. It was not, so it adds it. Consider another similar server which checks if the "Date" header has been set on the dict passed in by the user. The default_factory then makes one up. Different behavior due to internal implementation details of how the server uses the dict object, which is what the restriction to _exactly_ dict prevents. 
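The divergence just described can be reproduced with the collections.defaultdict that eventually shipped in Python 2.5; the environ contents here are invented for illustration:

```python
from collections import defaultdict

# A hypothetical WSGI-style environ with a default factory attached.
environ = defaultdict(lambda: "fabricated", {"REQUEST_METHOD": "GET"})

# Server A copies the items out first, then checks for the header:
snapshot = dict(environ.items())
print("HTTP_DATE" in snapshot)  # False: the header was never set

# Server B probes the mapping directly; the factory invents a value,
# and as a side effect the key now exists:
print(environ["HTTP_DATE"])     # fabricated
print("HTTP_DATE" in environ)   # True: the two servers now disagree
```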
Consider another server which takes the dict instance and transports it across thread boundaries, from the wsgi-app's thread to the main server thread. Because WSGI specifies that you can only use 'dict', and the server checked that type(obj) == dict, it is guaranteed that using the dict won't run thread-unsafe code. That is now broken, since dict.__getitem__ can now invoke arbitrary user code. That is a major change. James From g.brandl at gmx.net Sat Feb 18 19:55:52 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 18 Feb 2006 19:55:52 +0100 Subject: [Python-Dev] The decorator(s) module In-Reply-To: <2C516D2C-FFEE-4BD9-AF92-0D95A9B0229C@gmail.com> References: <43F61442.6050003@colorstudy.com> <43F6655E.1040308@colorstudy.com> <2C516D2C-FFEE-4BD9-AF92-0D95A9B0229C@gmail.com> Message-ID: Alex Martelli wrote: > On Feb 18, 2006, at 12:38 AM, Georg Brandl wrote: > >> Guido van Rossum wrote: >>> WFM. Patch anyone? >> >> Done. >> http://python.org/sf/1434038 > > I reviewed the patch and added a comment on it, but since the point > may be controversial I had better air it here for discussion: in 2.4, > property(fset=acallable) does work (maybe silly, but it does make a > write-only property) -- with the patch as given, it would stop > working (due to attempts to get __doc__ from the None value of fget); > I think we should ensure it keeps working (and add a unit test to > that effect). Yes, of course. Thanks for pointing that out. I updated the patch and hope it's now bullet-proof when no fget argument is given. Georg From martin at v.loewis.de Sat Feb 18 20:06:55 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 18 Feb 2006 20:06:55 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> <43F66B17.9080504@colorstudy.com> <43F6CDCF.8080309@v.loewis.de> Message-ID: <43F7704F.9070104@v.loewis.de> James Y Knight wrote: > But there should be. 
Consider the case of two servers. One which takes > all the items out of the dictionary (using items()) and puts them in > some other data structure. Then it checks if the "Date" header has been > set. It was not, so it adds it. Consider another similar server which > checks if the "Date" header has been set on the dict passed in by the > user. The default_factory then makes one up. Different behavior due to > internal implementation details of how the server uses the dict object, > which is what the restriction to _exactly_ dict prevents. Right. I would claim that this is an artificial example: you can't provide a HTTP_DATE value in a default_factory implementation, since you don't know what the key is. However, you are now making up a different rationale from the one the PEP specifies: The PEP says that you need an "exact dict" so that everybody knows precisely how the dictionary behaves; instead of having to define which precise subset of the dict API is to be used. *That* goal is still achieved: everybody knows that the dict might have an on_missing/default_factory implementation. So to find out whether HTTP_DATE has a value (which might be defaulted), you need to invoke d['HTTP_DATE']. > Consider another server which takes the dict instance and transports it > across thread boundaries, from the wsgi-app's thread to the main server > thread. Because WSGI specifies that you can only use 'dict', and the > server checked that type(obj) == dict, it is guaranteed that using the > dict won't run thread-unsafe code. That is now broken, since > dict.__getitem__ can now invoke arbitrary user code. That is a major > change. Not at all. dict.__getitem__ could always invoke arbitrary user code, through __hash__. 
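Martin's closing point is easy to demonstrate: an ordinary dict lookup already runs user code whenever the key type defines __hash__. The class below is made up for illustration:

```python
class LoudKey:
    """A key whose __hash__ is user code that dict.__getitem__ must run."""
    calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        LoudKey.calls += 1  # arbitrary user code, run inside dict operations
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, LoudKey) and self.name == other.name

d = {LoudKey("a"): 1}      # hashing on insert: user code runs
value = d[LoudKey("a")]    # hashing on lookup: user code runs again
print(value, LoudKey.calls)  # 1 2
```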
Regards, Martin From rhamph at gmail.com Sat Feb 18 20:06:59 2006 From: rhamph at gmail.com (Adam Olsen) Date: Sat, 18 Feb 2006 12:06:59 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> <43F66B17.9080504@colorstudy.com> <43F6CDCF.8080309@v.loewis.de> Message-ID: On 2/18/06, James Y Knight wrote: > On Feb 18, 2006, at 2:33 AM, Martin v. L?wis wrote: > > Well, as you say: you get a KeyError if there is an error with the > > key. > > With a default_factory, there isn't normally an error with the key. > > But there should be. Consider the case of two servers. One which > takes all the items out of the dictionary (using items()) and puts > them in some other data structure. Then it checks if the "Date" > header has been set. It was not, so it adds it. Consider another > similar server which checks if the "Date" header has been set on the > dict passed in by the user. The default_factory then makes one up. > Different behavior due to internal implementation details of how the > server uses the dict object, which is what the restriction to > _exactly_ dict prevents. It just occured to me, what affect does this have on repr? Does it attempt to store the default_factory in the representation, or does it remove it? Is it even possible to store a reference to a builtin such as list and have eval restore it? -- Adam Olsen, aka Rhamphoryncus From mal at egenix.com Sat Feb 18 20:38:21 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 18 Feb 2006 20:38:21 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] 
In-Reply-To: <43F7654B.5030302@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <43F74812.1080505@v.loewis.de> <43F754F6.9050204@egenix.com> <43F7654B.5030302@v.loewis.de> Message-ID: <43F777AD.7090203@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: >>>> True. However, note that the .encode()/.decode() methods on >>>> strings and Unicode narrow down the possible return types. >>>> The corresponding .bytes methods should only allow bytes and >>>> Unicode. >>> I forgot that: what is the rationale for that restriction? >> >> To assure that only those types can be returned from those >> methods, ie. instances of basestring, which in return permits >> type inference for those methods. > > Hmm. So it for type inference???? > Where is that documented? Somewhere in the python-dev mailing list archives ;-) Seriously, we should probably add this to the documentation. > This looks pretty inconsistent. Either codecs can give arbitrary > return types, then .encode/.decode should also be allowed to > give arbitrary return types, or codecs should be restricted. No. As I've said before: the .encode() and .decode() methods are convenience methods to interface to codecs which take string/Unicode on input and create string/Unicode output. > What's the point of first allowing a wide interface, and then > narrowing it? The codec interface is an abstract interface. It is a flexible enough to allow codecs to define possible input and output types while being strict about the method names and signatures. Much like the file interface in Python, the copy protocol or the pickle interface. > Also, if type inference is the goal, what is the point in allowing > two result types? I'm not sure I understand the question: type inference is about being able to infer the types of (among other things) function return objects. 
This is what the restriction guarantees - much like int() guarantees that you get either an integer or a long. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From jcarlson at uci.edu Sat Feb 18 20:46:40 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 18 Feb 2006 11:46:40 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F71066.70000@ronadam.com> References: <20060218005534.5FA8.JCARLSON@uci.edu> <43F71066.70000@ronadam.com> Message-ID: <20060218111809.5FAB.JCARLSON@uci.edu> Ron Adam wrote: > Josiah Carlson wrote: [snip] > > Again, the problem is ambiguity; what does bytes.recode(something) mean? > > Are we encoding _to_ something, or are we decoding _from_ something? > > This was just an example of one way that might work, but here are my > thoughts on why I think it might be good. > > In this case, the ambiguity is reduced as far as the encoding and > decodings opperations are concerned.) > > somestring = encodings.tostr( someunicodestr, 'latin-1') > > It's pretty clear what is happening to me. > > It will encode to a string an object, named someunicodestr, with > the 'latin-1' encoder. But now how do you get it back? encodings.tounicode(..., 'latin-1')?, unicode(..., 'latin-1')? What about string transformations: somestring = encodings.tostr(somestr, 'base64') How do we get that back? encodings.tostr() again is completely ambiguous, str(somestring, 'base64') seems a bit awkward (switching namespaces)? 
> And also rusult in clear errors if the specified encoding is > unavailable, and if it is, if it's not compatible with the given > *someunicodestr* obj type. > > Further hints could be gained by. > > help(encodings.tostr) > > Which could result in... something like... > """ > encoding.tostr( , ) -> string > > Encode a unicode string using a encoder codec to a > non-unicode string or transform a non-unicode string > to another non-unicode string using an encoder codec. > """ > > And if that's not enough, then help(encodings) could give more clues. > These steps would be what I would do. And then the next thing would be > to find the python docs entry on encodings. > > Placing them in encodings seems like a fairly good place to look for > these functions if you are working with encodings. So I find that just > as convenient as having them be string methods. > > There is no intermediate default encoding involved above, (the bytes > object is used instead), so you wouldn't get some of the messages the > present system results in when ascii is the default. > > (Yes, I know it won't when P3K is here also) > > > Are we going to need to embed the direction in the encoding/decoding > > name (to_base64, from_base64, etc.)? That doesn't any better than > > binascii.b2a_base64 . > > No, that's why I suggested two separate lists (or dictionaries might be > better). They can contain the same names, but the lists they are in > determine the context and point to the needed codec. And that step is > abstracted out by putting it inside the encodings.tostr() and > encodings.tounicode() functions. > > So either function would call 'base64' from the correct codec list and > get the correct encoding or decoding codec it needs. Either the API you have described is incomplete, you haven't noticed the directional ambiguity you are describing, or I have completely lost it. > > What about .reencode and .redecode? 
It seems as > > though the 're' added as a prefix to .encode and .decode makes it > > clearer that you get the same type back as you put in, and it is also > > unambiguous to direction. > > But then wouldn't we end up with multitude of ways to do things? > > s.encode(codec) == s.redecode(codec) > s.decode(codec) == s.reencode(codec) > unicode(s, codec) == s.decode(codec) > str(u, codec) == u.encode(codec) > str(s, codec) == s.encode(codec) > unicode(s, codec) == s.reencode(codec) > str(u, codec) == s.redecode(codec) > str(s, codec) == s.redecode(codec) > > Umm .. did I miss any? Which ones would you remove? > > Which ones of those will succeed with which codecs? I must not be expressing myself very well. Right now: s.encode() -> s s.decode() -> s, u u.encode() -> s, u u.decode() -> u Martin et al's desired change to encode/decode: s.decode() -> u u.encode() -> s No others. What my thoughts on .reencode() and .redecode() would get you given Martin et al's desired change: s.reencode() -> s (you get encoded strings as strings) s.redecode() -> s (you get decoded strings as strings) u.reencode() -> u (you get encoded unicode as unicode) u.redecode() -> u (you get decoded unicode as unicode) If one wants to go from unicode to string, one uses .encode(). If one wants to go from string to unicode, one uses .decode(). If one wants to keep their type unchanged, but encode or decode the data/text, one would use .reencode() and .redecode(), depending on whether their source is an encoded block of data, or the original data they want to encode. The other bonus is that if given .reencode() and .redecode(), one can quite easily verify that the source is possible as a source, and that you would get back the proper type. How this would occur behind the scenes is beyond the scope of this discussion, but it seems to me to be easy, given what I've read about the current mechanism. Whether the constructors for the str and unicode do their own codec transformations is beside the point. 
> The method bytes.recode(), always does a byte transformation which can > be almost anything. It's the context bytes.recode() is used in that > determines what's happening. In the above cases, it's using an encoding > transformation, so what it's doing is precisely what you would expect by > it's context. Indeed, there is a translation going on, but it is not clear as to whether you are encoding _to_ something or _from_ something. What does s.recode('base64') mean? Are you encoding _to_ base64 or _from_ base64? That's where the ambiguity lies. > There isn't a bytes.decode(), since that's just another transformation. > So only the one method is needed. Which makes it easer to learn. But ambiguous. > > The question remains: is str.decode() returning a string or unicode > > depending on the argument passed, when the argument quite literally > > names the codec involved, difficult to understand? I don't believe so; > > am I the only one? > > Using help(str.decode) and help(str.encode) gives: > > S.decode([encoding[,errors]]) -> object > > S.encode([encoding[,errors]]) -> object > > These look an awful lot alike. The descriptions are nearly identical as > well. The Python docs just reproduce (or close to) the doc strings with > only a very small amount of additional words. > > Learning how the current system works comes awfully close to reverse > engineering. Maybe I'm overstating it a bit, but I suspect many end up > doing exactly that in order to learn how Python does it. Again, we _need_ better documentation, regardless of whether or when the removal of some or all .encode()/.decode() methods happen. 
- Josiah From walter at livinglogic.de Sat Feb 18 22:08:19 2006 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Sat, 18 Feb 2006 22:08:19 +0100 (CET) Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43F7585E.4080909@egenix.com> References: <43F2FDDD.3030200@gmail.com> <1140025909.13758.43.camel@geddy.wooz.org> <83B94970-3479-4C2B-A709-4C28E88EEA06@gmail.com> <43F4E582.1040201@egenix.com> <43F5C562.4040105@livinglogic.de> <43F5CB53.4000301@egenix.com> <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> <43F5FD88.8090605@livinglogic.de> <43F63C07.9030901@egenix.com> <61425.89.54.8.114.1140279099.squirrel@isar.livinglogic.de> <43F7585E.4080909@egenix.com> Message-ID: <61847.89.54.8.114.1140296899.squirrel@isar.livinglogic.de> M.-A. Lemburg wrote: > Walter D?rwald wrote: >> M.-A. Lemburg wrote: >>> Walter D?rwald wrote: >>>> [...] >>>>> Perhaps we should also deprecate codecs.lookup() in Py 2.5 ?! >>>> +1, but I'd like to have a replacement for this, i.e. a function that returns all info the registry has about an encoding: >>>> >>>> 1. Name >>>> 2. Encoder function >>>> 3. Decoder function >>>> 4. Stateful encoder factory >>>> 5. Stateful decoder factory >>>> 6. Stream writer factory >>>> 7. Stream reader factory >>>> >>>> and if this is an object with attributes, we won't have any problems if we extend it in the future. >>> Shouldn't be a problem: just expose the registry dictionary >>> via the _codecs module. >>> >>> The rest can then be done in a Python function defined in >>> codecs.py using a CodecInfo class. >> >> This would require the Python code to call codecs.lookup() and then look into the codecs dictionary (normalizing the >> encoding name again). Maybe we should make a version of __PyCodec_Lookup() that allows 4- and 6-tuples available to Python >> and use that? 
The official PyCodec_Lookup() would then have to downgrade the 6-tuples to 4-tuples. > > Hmm, you're right: the dictionary may not have the requested codec info yet (it's only used as cache) and only a call to > _PyCodec_Lookup() would fill it. I'm now trying a different approach: codecs.lookup() returns a subclass of tuple. We could deprecate calling __getitem__() in 2.5/2.6 and then remove the tuple subclassing later. >>>> BTW, if we change the API, can we fix the return value of the stateless functions? As the stateless function always >>>> encodes/decodes the complete string, returning the length of the string doesn't make sense. codecs.getencoder() and >>>> codecs.getdecoder() would have to continue to return the old variant of the functions, but >>>> codecs.getinfo("latin-1").encoder would be the new encoding function. >>> No: you can still write stateless encoders or decoders that do >>> not process the whole input string. Just because we don't have >>> any of those in Python, doesn't mean that they can't be written and used. A stateless codec might want to leave the work >>> of buffering bytes at the end of the input data which cannot >>> be processed to the caller. >> >> But what would the call do with that info? It can't retry encoding/decoding the rejected input, because the state of the >> codec has been thrown away already. > > This depends a lot on the nature of the codec. It may well be > possible to work on chunks of input data in a stateless way, > e.g. say you have a string of 4-byte hex values, then the decode > function would be able to work on 4 bytes each and let the caller > buffer any remaining bytes for the next call. There'd be no need for keeping state in the decoder function. So incomplete byte sequence would be silently ignored. >>> It is also possible to write >>> stateful codecs on top of such stateless encoding and decoding >>> functions. >> >> That's what the codec helper functions from Python/_codecs.c are for. 
> > I'm not sure what you mean here. _codecs.utf_8_decode() etc. use (result, count) tuples as the return value, because those functions are the building blocks of the codecs themselves. >> Anyway, I've started implementing a patch that just adds codecs.StatefulEncoder/codecs.StatefulDecoder. UTF8, UTF8-Sig, >> UTF-16, UTF-16-LE and UTF-16-BE are already working. > > Nice :-) gencodec.py is updated now too. The rest should be manageable too. I'll leave updating the CJKV codecs to Hye-Shik though. Bye, Walter Dörwald From bh at intevation.de Sat Feb 18 22:41:07 2006 From: bh at intevation.de (Bernhard Herzog) Date: Sat, 18 Feb 2006 22:41:07 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: (Guido van Rossum's message of "Fri, 17 Feb 2006 14:15:39 -0800") References: <43F63A35.5080500@colorstudy.com> Message-ID: "Guido van Rossum" writes: > If the __getattr__()-like operation that supplies and inserts a > dynamic default was a separate method, we wouldn't have this problem. Why implement it in the dictionary type at all? If, for instance, the default value functionality were provided as a decorator, it could be used with all kinds of mappings. I.e. you could have something along these lines:

    class defaultwrapper(object):
        def __init__(self, base, factory):
            self.__base = base
            self.__factory = factory

        def __getitem__(self, key):
            try:
                return self.__base[key]
            except KeyError:
                value = self.__factory()
                self.__base[key] = value
                return value

        def __getattr__(self, attr):
            return getattr(self.__base, attr)

    def test():
        dd = defaultwrapper({}, list)
        dd["abc"].append(1)
        dd["abc"].append(2)
        dd["def"].append(1)
        assert sorted(dd.keys()) == ["abc", "def"]
        assert sorted(dd.values()) == [[1], [1, 2]]
        assert sorted(dd.items()) == [("abc", [1, 2]), ("def", [1])]
        assert dd.has_key("abc")
        assert not dd.has_key("xyz")

The precise semantics would have to be determined yet, of course.
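For comparison, here is how collections.defaultdict — the form of this idea that eventually shipped in the standard library — behaves on the same test data. This is a modern-Python sketch added for illustration; it also answers Adam Olsen's repr question from earlier in the thread:

```python
from collections import defaultdict

dd = defaultdict(list)   # list is the default_factory
dd["abc"].append(1)
dd["abc"].append(2)
dd["def"].append(1)
assert sorted(dd.items()) == [("abc", [1, 2]), ("def", [1])]

# repr() does embed the factory, but as "<class 'list'>", so the
# representation cannot be round-tripped through eval():
assert repr(dd).startswith("defaultdict(<class 'list'>")
```

Unlike the wrapper above, defaultdict is a dict subclass, which is exactly the property the server example in this thread objects to.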
> OTOH most reviewers here seem to appreciate on_missing() as a way to > do various other ways of alterning a dict's __getitem__() behavior > behind a caller's back -- perhaps it could even be (ab)used to > implement case-insensitive lookup. case-insensitive lookup could be implemented with another wrapper/decorator. If you need both case-insitivity and a default value, you can easily stack the decorators. Bernhard -- Intevation GmbH http://intevation.de/ Skencil http://skencil.org/ Thuban http://thuban.intevation.org/ From rrr at ronadam.com Sat Feb 18 23:15:17 2006 From: rrr at ronadam.com (Ron Adam) Date: Sat, 18 Feb 2006 16:15:17 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060218153241.GA14054@panix.com> References: <20060217202813.5FA2.JCARLSON@uci.edu> <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> <20060217221623.5FA5.JCARLSON@uci.edu> <43F6DC4C.1070100@ronadam.com> <20060218153241.GA14054@panix.com> Message-ID: <43F79C75.5020707@ronadam.com> Aahz wrote: > On Sat, Feb 18, 2006, Ron Adam wrote: >> I like the bytes.recode() idea a lot. +1 >> >> It seems to me it's a far more useful idea than encoding and decoding by >> overloading and could do both and more. It has a lot of potential to be >> an intermediate step for encoding as well as being used for many other >> translations to byte data. >> >> I think I would prefer that encode and decode be just functions with >> well defined names and arguments instead of being methods or arguments >> to string and Unicode types. >> >> I'm not sure on exactly how this would work. Maybe it would need two >> sets of encodings, ie.. decoders, and encoders. An exception would be >> given if it wasn't found for the direction one was going in. > > Here's an idea I don't think I've seen before: > > bytes.recode(b, src_encoding, dest_encoding) > > This requires the user to state up-front what the source encoding is. 
> One of the big problems that I see with the whole encoding mess is that > so much of it contains implicit assumptions about the source encoding; > this gets away from that. Yes, but it's not just the encodings that are implicit, it is also the types. s.encode(codec) # explicit source type, ? dest type s.decode(codec) # explicit source type, ? dest type encodings.tostr(obj, codec) # implicit *known* source type # explicit dest type encodings.tounicode(obj, codec) # implicit *known* source type # explicit dest type In this case the source is implicit, but there can be a well defined check to validate the source type against the codec being used. It's my feeling the user *knows* what he already has, and so it's more important that the resulting object type is explicit. In your suggestion... bytes.recode(b, src_encoding, dest_incoding) Here the encodings are both explicit, but the both the source and the destinations of the bytes are not. Since it working on bytes, they could have come from anywhere, and after the translation they would then will be cast to the type the user *thinks* it should result in. A source of errors that would likely pass silently. The way I see it is the bytes type should be a lower level object that doesn't care what byte transformation it does. Ie.. they are all one way byte to byte transformations determined by context. And it should have the capability to read from and write to types without translating in the same step. Keep it simple. Then it could be used as a lower level byte translator to implement encodings and other translations in encoding methods or functions instead of trying to make it replace the higher level functionality. Cheers, Ron From thomas at xs4all.net Sat Feb 18 23:33:15 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sat, 18 Feb 2006 23:33:15 +0100 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?] 
In-Reply-To: <43F7113E.8090300@egenix.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <20060218113358.GG23859@xs4all.nl> <43F7113E.8090300@egenix.com> Message-ID: <20060218223315.GG23863@xs4all.nl> On Sat, Feb 18, 2006 at 01:21:18PM +0100, M.-A. Lemburg wrote: > It's by no means a Perl attitude. In your eyes, perhaps. It certainly feels that way to me (or I wouldn't have said it :). Perl happens to be full of general constructs that were added because they were easy to add, or they were useful in edgecases. The encode/decode methods remind me of that, even though I fully understand the reasoning behind it, and the elegance of the implementation. > The main reason is symmetry and the fact that strings and Unicode > should be as similar as possible in order to simplify the task of > moving from one to the other. Yes, and this is a design choice I don't agree with. They're different types. They do everything similarly, except when they are mixed together (unicode takes precedence, in general, encoding the bytestring from the default encoding.) Going from one to the other isn't symmetric, though. I understand that you disagree; the disagreement is on the fundamental choice of allowing 'encode' and 'decode' to do *more* than going from and to unicode. I regret that decision, not the decision to make encode and decode symmetric (which makes sense, after the decision to overgeneralize encode/decode is made.) > > - The return value for the non-unicode encodings depends on the value of > > the encoding argument. > Not really: you'll always get a basestring instance. Which is not a particularly useful distinction, since in any real world application, you have to be careful not to mix unicode with (non-ascii) bytestrings. 
The only way to reliably deal with unicode is to have it well-contained (when migrating an application from using bytestrings to using unicode) or to use unicode everywhere, decoding/encoding at entrypoints. Containment is hard to achieve. > Still, I believe that this is an educational problem. There are > a couple of gotchas users will have to be aware of (and this is > unrelated to the methods in question): > > * "encoding" always refers to transforming original data into > a derived form > > * "decoding" always refers to transforming a derived form of > data back into its original form > > * for Unicode codecs the original form is Unicode, the derived > form is, in most cases, a string > > As a result, if you want to use a Unicode codec such as utf-8, > you encode Unicode into a utf-8 string and decode a utf-8 string > into Unicode. > > Encoding a string is only possible if the string itself is > original data, e.g. some data that is supposed to be transformed > into a base64 encoded form. > > Decoding Unicode is only possible if the Unicode string itself > represents a derived form, e.g. a sequence of hex literals. Most of these gotchas would not have been gotchas had encode/decode only been usable for unicode encodings. > > That is why I disagree with the hypergeneralization of the encode/decode > > methods [..] > That's because you only look at one specific task. > Codecs also unify the various interfaces to common encodings > such as base64, uu or zip which are not Unicode related. No, I think you misunderstand. I object to the hypergeneralization of the *encode/decode methods*, not the codec system. I would have been fine with another set of methods for non-unicode transformations. Although I would have been even more fine if they got their encoding not as a string, but as, say, a module object, or something imported from a module. Not that I think any of this matters; we have what we have and I'll have to live with it ;) -- Thomas Wouters Hi! 
I'm a .signature virus! copy me into your .signature file to help me spread! From tjreedy at udel.edu Sat Feb 18 23:48:10 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 18 Feb 2006 17:48:10 -0500 Subject: [Python-Dev] bytes.from_hex() References: <20060217221623.5FA5.JCARLSON@uci.edu><43F6DC4C.1070100@ronadam.com> <20060218005534.5FA8.JCARLSON@uci.edu> Message-ID: "Josiah Carlson" wrote in message news:20060218005534.5FA8.JCARLSON at uci.edu... > Again, the problem is ambiguity; what does bytes.recode(something) mean? > Are we encoding _to_ something, or are we decoding _from_ something? > Are we going to need to embed the direction in the encoding/decoding > name (to_base64, from_base64, etc.)? To me, that seems simple and clear. b.recode('from_base64') obviously requires that b meet the restrictions of base64. Similarly for 'from_hex'. > That doesn't any better than binascii.b2a_base64 I think 'from_base64' is *much* better. I think there are now 4 string-to-string transform modules that do similar things. Not optimal to me. >What about .reencode and .redecode? It seems as > though the 're' added as a prefix to .encode and .decode makes it > clearer that you get the same type back as you put in, and it is also > unambiguous to direction. To me, the 're' prefix is awkward, confusing, and misleading. Terry J. Reedy From oliphant.travis at ieee.org Sun Feb 19 00:16:02 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Sat, 18 Feb 2006 16:16:02 -0700 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: <43F6D8F0.2060808@v.loewis.de> References: <43F3A7E4.1090505@v.loewis.de> <20060218005149.GE23859@xs4all.nl> <1f7befae0602171937s2e770577w94a9191887fb0e1d@mail.gmail.com> <43F6D8F0.2060808@v.loewis.de> Message-ID: <43F7AAB2.1030809@ieee.org> Martin v. L?wis wrote: > Neal Norwitz wrote: > >>I suppose that might be nice, but would require configure magic. I'm >>not sure how it could be done on Windows. > > > Contributions are welcome. 
On Windows, it can be hard-coded.
>
> Actually, something like
>
> #if SIZEOF_SIZE_T == SIZEOF_INT
> #define PY_SSIZE_T_MAX INT_MAX
> #elif SIZEOF_SIZE_T == SIZEOF_LONG
> #define PY_SSIZE_T_MAX LONG_MAX
> #else
> #error What is size_t equal to?
> #endif
>
> might work.

Why not just

    #if SIZEOF_SIZE_T == 2
    #define PY_SSIZE_T_MAX 0x7fff
    #elif SIZEOF_SIZE_T == 4
    #define PY_SSIZE_T_MAX 0x7fffffff
    #elif SIZEOF_SIZE_T == 8
    #define PY_SSIZE_T_MAX 0x7fffffffffffffff
    #elif SIZEOF_SIZE_T == 16
    #define PY_SSIZE_T_MAX 0x7fffffffffffffffffffffffffffffff
    #endif

? From pje at telecommunity.com Sun Feb 19 00:34:59 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 18 Feb 2006 18:34:59 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F6CDCF.8080309@v.loewis.de> <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> <43F66B17.9080504@colorstudy.com> <43F6CDCF.8080309@v.loewis.de> Message-ID: <5.1.1.6.0.20060218183329.021de050@mail.telecommunity.com> At 01:44 PM 02/18/2006 -0500, James Y Knight wrote: >On Feb 18, 2006, at 2:33 AM, Martin v. Löwis wrote: > > I don't understand. In the rationale of PEP 333, it says > > "The rationale for requiring a dictionary is to maximize portability > > between servers. The alternative would be to define some subset of a > > dictionary's methods as being the standard and portable interface." > > > > That rationale is not endangered: if the environment continues to > > be a dict exactly, servers continue to be guaranteed what precise > > set of operations is available on the environment. > >Yes it is endangered. So we'll update the spec to say you can't use a dict that has the default set. It's not reasonable to expect that language changes might not require updates to a PEP. Certainly, we don't have to worry about being backward compatible when it's only Python 2.5 that's affected by the change.
:) From rrr at ronadam.com Sun Feb 19 00:56:02 2006 From: rrr at ronadam.com (Ron Adam) Date: Sat, 18 Feb 2006 17:56:02 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060218111809.5FAB.JCARLSON@uci.edu> References: <20060218005534.5FA8.JCARLSON@uci.edu> <43F71066.70000@ronadam.com> <20060218111809.5FAB.JCARLSON@uci.edu> Message-ID: <43F7B412.8090205@ronadam.com> Josiah Carlson wrote: > Ron Adam wrote: >> Josiah Carlson wrote: > [snip] >>> Again, the problem is ambiguity; what does bytes.recode(something) mean? >>> Are we encoding _to_ something, or are we decoding _from_ something? >> This was just an example of one way that might work, but here are my >> thoughts on why I think it might be good. >> >> In this case, the ambiguity is reduced as far as the encoding and >> decoding operations are concerned. >> >> somestring = encodings.tostr( someunicodestr, 'latin-1') >> >> It's pretty clear what is happening to me. >> >> It will encode to a string an object, named someunicodestr, with >> the 'latin-1' encoder. > > But now how do you get it back? encodings.tounicode(..., 'latin-1')?, > unicode(..., 'latin-1')? Yes, just do:

    someunicodestr = encodings.tounicode(somestring, 'latin-1')

> What about string transformations: > somestring = encodings.tostr(somestr, 'base64') > > How do we get that back? encodings.tostr() again is completely > ambiguous, str(somestring, 'base64') seems a bit awkward (switching > namespaces)? In the case where a string is converted to another string, it would probably be best to have a requirement that they all get converted to unicode as an intermediate step. By doing that it becomes an explicit two-step operation.

    # string to string encoding
    u_string = encodings.tounicode(s_string, 'base64')
    s2_string = encodings.tostr(u_string, 'base64')

Or you could have a convenience function to do it in the encodings module also.

    def strtostr(s, sourcecodec, destcodec):
        u = tounicode(s, sourcecodec)
        return tostr(u, destcodec)

Then...
    s2 = encodings.strtostr(s, 'base64', 'base64')

Which would be kind of pointless in this example, but it would be a good way to test a codec:

    assert s == s2

>>> Are we going to need to embed the direction in the encoding/decoding >>> name (to_base64, from_base64, etc.)? That isn't any better than >>> binascii.b2a_base64. >> No, that's why I suggested two separate lists (or dictionaries might be >> better). They can contain the same names, but the lists they are in >> determine the context and point to the needed codec. And that step is >> abstracted out by putting it inside the encodings.tostr() and >> encodings.tounicode() functions. >> >> So either function would call 'base64' from the correct codec list and >> get the correct encoding or decoding codec it needs. > > Either the API you have described is incomplete, you haven't noticed the > directional ambiguity you are describing, or I have completely lost it. Most likely I gave an incomplete description of the API in this case because there are probably several ways to implement it. >>> What about .reencode and .redecode? It seems as >>> though the 're' added as a prefix to .encode and .decode makes it >>> clearer that you get the same type back as you put in, and it is also >>> unambiguous to direction. ...
> I must not be expressing myself very well.
>
> Right now:
>     s.encode() -> s
>     s.decode() -> s, u
>     u.encode() -> s, u
>     u.decode() -> u
>
> Martin et al's desired change to encode/decode:
>     s.decode() -> u
>     u.encode() -> s
>
> No others.

Which would be similar to the functions I suggested. The main difference is only whether it would be better to have them as methods or separate factory functions, and the spelling of the names. Both have their advantages I think.
In the above cases, it's using an encoding >> transformation, so what it's doing is precisely what you would expect by >> its context. > > Indeed, there is a translation going on, but it is not clear as to > whether you are encoding _to_ something or _from_ something. What does > s.recode('base64') mean? Are you encoding _to_ base64 or _from_ base64? > That's where the ambiguity lies. Bengt didn't propose adding .recode() to the string types, but only the bytes type. The byte type would "recode" the bytes using a specific transformation. I believe his view is it's a lower level API than strings that can be used to implement the higher level encoding API with, not replace the encoding API. Or that is the way I interpreted the suggestion. >> There isn't a bytes.decode(), since that's just another transformation. >> So only the one method is needed. Which makes it easier to learn. > > But ambiguous. What's ambiguous about it? It's no more ambiguous than any math operation where you can do it one way with one operation and get your original value back with the same operation by using an inverse value.

    n2 = n + 1;  n3 = n2 + (-1);  n == n3
    n2 = n * 2;  n3 = n2 * (.5);  n == n3

>> Learning how the current system works comes awfully close to reverse >> engineering. Maybe I'm overstating it a bit, but I suspect many end up >> doing exactly that in order to learn how Python does it. > > Again, we _need_ better documentation, regardless of whether or when the > removal of some or all .encode()/.decode() methods happen. Yes, in the short term some parts of PEP 100 could be moved to the python docs I think.
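Ron's inverse-value analogy has a concrete counterpart in the modern codecs module: the rot_13 transform is its own inverse, so one operation genuinely runs in both directions. This is an illustrative sketch added here, not something from the original thread:

```python
import codecs

# Like n2 = n + 1 followed by n3 = n2 + (-1), rot_13 undoes itself:
assert codecs.encode("spam", "rot_13") == "fcnz"
assert codecs.encode("fcnz", "rot_13") == "spam"  # same operation, inverse effect
```

For transforms that are not self-inverse (base64, zlib, hex), a single recode()-style entry point would still need the caller to state the direction somehow, which is the ambiguity Josiah keeps pointing at.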
Cheers, Ron From jcarlson at uci.edu Sun Feb 19 02:26:49 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 18 Feb 2006 17:26:49 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F7B412.8090205@ronadam.com> References: <20060218111809.5FAB.JCARLSON@uci.edu> <43F7B412.8090205@ronadam.com> Message-ID: <20060218160554.5FB4.JCARLSON@uci.edu> Ron Adam wrote: > Josiah Carlson wrote: > > Ron Adam wrote: > >> Josiah Carlson wrote: > > [snip] > >>> Again, the problem is ambiguity; what does bytes.recode(something) mean? > >>> Are we encoding _to_ something, or are we decoding _from_ something? > >> This was just an example of one way that might work, but here are my > >> thoughts on why I think it might be good. > >> > >> In this case, the ambiguity is reduced as far as the encoding and > >> decodings opperations are concerned.) > >> > >> somestring = encodings.tostr( someunicodestr, 'latin-1') > >> > >> It's pretty clear what is happening to me. > >> > >> It will encode to a string an object, named someunicodestr, with > >> the 'latin-1' encoder. > > > > But now how do you get it back? encodings.tounicode(..., 'latin-1')?, > > unicode(..., 'latin-1')? > > Yes, Just do. > > someunicodestr = encoding.tounicode( somestring, 'latin-1') > > > What about string transformations: > > somestring = encodings.tostr(somestr, 'base64') > > > > How do we get that back? encodings.tostr() again is completely > > ambiguous, str(somestring, 'base64') seems a bit awkward (switching > > namespaces)? > > In the case where a string is converted to another string. It would > probably be best to have a requirement that they all get converted to > unicode as an intermediate step. By doing that it becomes an explicit > two step opperation. > > # string to string encoding > u_string = encodings.tounicode(s_string, 'base64') > s2_string = encodings.tostr(u_string, 'base64') Except that ambiguates it even further. Is encodings.tounicode() encoding, or decoding? 
According to everything you have said so far, it would be decoding. But if I am decoding binary data, why should it be spending any time as a unicode string? What do I mean? x = f.read() #x contains base-64 encoded binary data y = encodings.to_unicode(x, 'base64') y now contains BINARY DATA, except that it is a unicode string z = encodings.to_str(y, 'latin-1') Later you define a str_to_str function, which I (or someone else) would use like: z = str_to_str(x, 'base64', 'latin-1') But the trick is that I don't want some unicode string encoded in latin-1, I want my binary data unencoded. They may happen to be the same in this particular example, but that doesn't mean that it makes any sense to the user. [...] > >>> What about .reencode and .redecode? It seems as > >>> though the 're' added as a prefix to .encode and .decode makes it > >>> clearer that you get the same type back as you put in, and it is also > >>> unambiguous to direction. > > ... > > > I must not be expressing myself very well. > > > > Right now: > > s.encode() -> s > > s.decode() -> s, u > > u.encode() -> s, u > > u.decode() -> u > > > > Martin et al's desired change to encode/decode: > > s.decode() -> u > > u.encode() -> s > > > > No others. > > Which would be similar to the functions I suggested. The main > difference is only weather it would be better to have them as methods or > separate factory functions and the spelling of the names. Both have > their advantages I think. While others would disagree, I personally am not a fan of to* or from* style namings, for either function names (especially in the encodings module) or methods. Just a personal preference. Of course, I don't find the current situation regarding str/unicode.encode/decode to be confusing either, but maybe it's because my unicode experience is strictly within the realm of GUI widgets, where compartmentalization can be easier. > >> The method bytes.recode(), always does a byte transformation which can > >> be almost anything. 
It's the context bytes.recode() is used in that > >> determines what's happening. In the above cases, it's using an encoding > >> transformation, so what it's doing is precisely what you would expect by > >> it's context. [THIS IS THE AMBIGUITY] > > Indeed, there is a translation going on, but it is not clear as to > > whether you are encoding _to_ something or _from_ something. What does > > s.recode('base64') mean? Are you encoding _to_ base64 or _from_ base64? > > That's where the ambiguity lies. > > Bengt didn't propose adding .recode() to the string types, but only the > bytes type. The byte type would "recode" the bytes using a specific > transformation. I believe his view is it's a lower level API than > strings that can be used to implement the higher level encoding API > with, not replace the encoding API. Or that is they way I interpreted > the suggestion. But again, what would the transformation be? To something? From something? 'to_base64', 'from_base64', 'to_rot13' (which happens to be identical to) 'from_rot13', ... Saying it would "recode ... using a specific transformation" is a cop-out, what would the translation be? How would it work? How would it be spelled? That smells quite a bit like .encode() and .decode(), just spelled differently, and without quite a clear path. That is why I was offering... > > > s.reencode() -> s (you get encoded strings as strings) > > > s.redecode() -> s (you get decoded strings as strings) > > > u.reencode() -> u (you get encoded unicode as unicode) > > > u.redecode() -> u (you get decoded unicode as unicode) You keep the encode and decode to be translating between types, you use reencode and redecode to keep the type, and define whether you are encoding or decoding your data/text. 
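The reencode/redecode names never made it into Python, but the four-way split Josiah describes can be illustrated with today's stdlib (a modern-Python comparison, not code that existed in 2006; "reencode" and "redecode" remain hypothetical names):

```python
import base64
import codecs

u = "abc"
s = u.encode("latin-1")                  # type-changing: str -> bytes
assert s.decode("latin-1") == u          # type-changing: bytes -> str

# Type-preserving transformations, the role the hypothetical
# reencode()/redecode() pair would play:
b64 = base64.b64encode(s)                # bytes -> bytes ("reencode")
assert base64.b64decode(b64) == s        # bytes -> bytes ("redecode")

r13 = codecs.encode(u, "rot13")          # str -> str ("reencode")
assert codecs.decode(r13, "rot13") == u  # str -> str ("redecode")
```

Keeping the type-changing and type-preserving operations in separate namespaces is essentially the design that eventually prevailed.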
While I have come to agree with Terry Reedy regarding the 're' prefix on the 'encode' and 'decode', I think that having the name of the method define the action and the argument of the method define the codec, is the way to go (essentially the status quo). It may make sense to differentiate the cases of what an encoding/decoding process may return (types change, types stay the same), but we then have a naming issue. So far, I've not seen _really_ good names for describing the encoding/decoding process, except for what we already have: encode and decode. What if instead of using encode/decode for the following transformations: > > Martin et al's desired change to encode/decode: > > s.decode() -> u > > u.encode() -> s We use some method name for inter-type transformations: s.transform() -> u u.transform() -> s ... or something better than 'transform', then we use the .encode()/.decode() for intra-type transformations... s.encode() -> s (you get encoded strings as strings) s.decode() -> s (you get decoded strings as strings) u.encode() -> u (you get encoded unicode as unicode) u.decode() -> u (you get decoded unicode as unicode) Probably DOA, but just a thought. > >> There isn't a bytes.decode(), since that's just another transformation. > >> So only the one method is needed. Which makes it easer to learn. > > > > But ambiguous. > > What's ambiguous about it? See the section above that I marked "[THIS IS THE AMBIGUITY]" . > It's no more ambiguous than any math > operation where you can do it one way with one operations and get your > original value back with the same operation by using an inverse value. > > n2=n+1; n3=n+(-1); n==n3 > n2=n*2; n3=n*(.5); n==n3 Ahh, so you are saying 'to_base64' and 'from_base64'. There is one major reason why I don't like that kind of a system: I can't just say encoding='base64' and use str.encode(encoding) and str.decode(encoding), I necessarily have to use, str.recode('to_'+encoding) and str.recode('from_'+encoding) . 
Seems a bit awkward. - Josiah From greg.ewing at canterbury.ac.nz Sun Feb 19 02:50:44 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 19 Feb 2006 14:50:44 +1300 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F65051.5000308@colorstudy.com> <43F656EC.6070107@v.loewis.de> <43F65A90.2010009@colorstudy.com> <43F66665.7000102@v.loewis.de> Message-ID: <43F7CEF4.9020609@canterbury.ac.nz> Would people perhaps feel better if defaultdict *wasn't* a subclass of dict, but a distinct mapping type of its own? That would make it clearer that it's not meant to be a drop-in replacement for a dict in arbitrary contexts. Greg From raymond.hettinger at verizon.net Sun Feb 19 03:10:42 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 18 Feb 2006 21:10:42 -0500 Subject: [Python-Dev] Proposal: defaultdict References: <43F65051.5000308@colorstudy.com> <43F656EC.6070107@v.loewis.de><43F65A90.2010009@colorstudy.com> <43F66665.7000102@v.loewis.de> <43F7CEF4.9020609@canterbury.ac.nz> Message-ID: <009501c634f9$ba0ef000$b83efea9@RaymondLaptop1> [Greg Ewing] > Would people perhaps feel better if defaultdict > *wasn't* a subclass of dict, but a distinct mapping > type of its own? That would make it clearer that it's > not meant to be a drop-in replacement for a dict > in arbitrary contexts. Absolutely. That's the right way to avoid Liskov violations from altered invariants and API changes. Besides, with Python's propensity for duck typing, there's no reason to subclass when we don't have to. 
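Greg's composition idea can be sketched in a few lines: a wrapper that supplies defaults while leaving an ordinary dict underneath (the class name and API here are hypothetical, for illustration only):

```python
class AutoDict:
    """Default-supplying wrapper around a plain dict (hypothetical)."""

    def __init__(self, default_factory):
        self.default_factory = default_factory
        self.data = {}                  # the ordinary dict underneath

    def __getitem__(self, key):
        # Fill in a default on first access, as in the proposal.
        if key not in self.data:
            self.data[key] = self.default_factory()
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def plain(self):
        """The untouched dict, safe to pass into unknown contexts."""
        return self.data


ad = AutoDict(list)
ad["k"].append(1)
assert ad.plain() == {"k": [1]}
assert isinstance(ad.plain(), dict)     # what you hand around *is* a dict
```

Because the wrapper is a distinct type rather than a dict subclass, none of the Liskov concerns about altered dict invariants arise.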
Raymond From greg.ewing at canterbury.ac.nz Sun Feb 19 03:11:53 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 19 Feb 2006 15:11:53 +1300 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43f6a154.1088724012@news.gmane.org> References: <43F64A7A.3060400@v.loewis.de> <43F661D3.3010108@v.loewis.de> <43f6a154.1088724012@news.gmane.org> Message-ID: <43F7D3E9.3050204@canterbury.ac.nz> Bengt Richter wrote: > My guess is that realistically default_factory will be used > to make clean code for filling a dict, and then turning the factory > off if it's to be passed into unknown contexts. This suggests that maybe the autodict behaviour shouldn't be part of the dict itself, but provided by a wrapper around the dict. Then you can fill the dict through the wrapper, and still have a normal dict underneath to use for other purposes. Greg From bokr at oz.net Sun Feb 19 03:47:10 2006 From: bokr at oz.net (Bengt Richter) Date: Sun, 19 Feb 2006 02:47:10 GMT Subject: [Python-Dev] Proposal: defaultdict References: <43F63A35.5080500@colorstudy.com> <61121.89.54.8.114.1140255855.squirrel@isar.livinglogic.de> Message-ID: <43f7d9e9.1168745306@news.gmane.org> On Sat, 18 Feb 2006 10:44:15 +0100 (CET), "Walter Dörwald" wrote: >Guido van Rossum wrote: >> On 2/17/06, Ian Bicking wrote: >>> Guido van Rossum wrote: >>> > d = {} >>> > d.default_factory = set >>> > ... >>> > d[key].add(value) >>> >>> Another option would be: >>> >>> d = {} >>> d.default_factory = set >>> d.get_default(key).add(value) >>> >>> Unlike .setdefault, this would use a factory associated with the dictionary, and no default value would get passed in. >>> Unlike the proposal, this would not override __getitem__ (not overriding >>> __getitem__ is really the only difference with the proposal). It would be clear reading the code that you were not >>> implicitly asserting that "key in d" was true. 
>>> >>> "get_default" isn't the best name, but another name isn't jumping out at me at the moment. Of course, it is not a Pythonic >>> argument to say that an existing method should be overridden, or functionality made nameless simply because we can't think >>> of a name (looking to anonymous functions of course ;) >> >> I'm torn. While trying to implement this I came across some ugliness in PyDict_GetItem() -- it would make sense if this also >> called >> on_missing(), but it must return a value without incrementing its >> refcount, and isn't supposed to raise exceptions -- so what to do if on_missing() returns a value that's not inserted in the >> dict? >> >> If the __getattr__()-like operation that supplies and inserts a >> dynamic default was a separate method, we wouldn't have this problem. >> >> OTOH most reviewers here seem to appreciate on_missing() as a way to do various other ways of altering a dict's >> __getitem__() behavior behind a caller's back -- perhaps it could even be (ab)used to >> implement case-insensitive lookup. > >I don't like the fact that on_missing()/default_factory can change the behaviour of __getitem__, which up to now has been >something simple and understandable. >Why don't we put the on_missing()/default_factory functionality into get() instead? > >d.get(key, default) does what it did before. d.get(key) invokes on_missing() (and dict would have default_factory == type(None)) > OTOH, I forgot why it was desirable in the first place to overload d[k] with defaulting logic. E.g., why wouldn't d.defaulting[k] be ok to write when you want the d.default_factory action? on_missing feels more like a tracing hook though, so maybe it could always act either way if defined. Also, for those wanting to avoid lambda:42 as factory, would a callable test cost a lot? Of course then the default_factory name might require revision. 
Regards, Bengt Richter From nnorwitz at gmail.com Sun Feb 19 04:15:07 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sat, 18 Feb 2006 19:15:07 -0800 Subject: [Python-Dev] buildbot is all green Message-ID: http://www.python.org/dev/buildbot/ Whoever is first to break the build, buys a round of drinks at PyCon! That's over 400 people and counting: http://www.python.org/pycon/2006/pycon-attendees.txt Remember to run the tests *before* checkin. :-) n From steve at holdenweb.com Sun Feb 19 04:38:39 2006 From: steve at holdenweb.com (Steve Holden) Date: Sat, 18 Feb 2006 22:38:39 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F576A3.1030604@v.loewis.de> References: <43F576A3.1030604@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Guido van Rossum wrote: > >>Feedback? > > > I would like this to be part of the standard dictionary type, > rather than being a subtype. > > d.setdefault([]) (one argument) should install a default value, > and d.cleardefault() should remove that setting; d.default > should be read-only. Alternatively, d.default could be assignable > and del-able. > The issue with setting the default this way is that a copy would have to be created if the behavior was to differ from the sometimes-confusing default argument behavior for functions. > Also, I think has_key/in should return True if there is a default. > It certainly seems desirable to see True where d[some_key] doesn't raise an exception, but one could argue either way. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From steve at holdenweb.com Sun Feb 19 04:44:37 2006 From: steve at holdenweb.com (Steve Holden) Date: Sat, 18 Feb 2006 22:44:37 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: Guido van Rossum wrote: > On 2/16/06, Guido van Rossum wrote: > >>Over lunch with Alex Martelli, he proposed that a subclass of dict >>with this behavior (but implemented in C) would be a good addition to >>the language. It looks like it wouldn't be hard to implement. It could >>be a builtin named defaultdict. The first, required, argument to the >>constructor should be the default value. Remaining arguments (even >>keyword args) are passed unchanged to the dict constructor. > > > Thanks for all the constructive feedback. Here are some responses and > a new proposal. > > - Yes, I'd like to kill setdefault() in 3.0 if not sooner. > > - It would indeed be nice if this was an optional feature of the > standard dict type. > > - I'm ignoring the request for other features (ordering, key > transforms). If you want one of these, write a PEP! > > - Many, many people suggested to use a factory function instead of a > default value. This is indeed a much better idea (although slightly > more cumbersome for the simplest cases). > One might think about calling it if it were callable, otherwise using it literally. Of course this would require jiggery-pokery in the cases where you actually *wanted* the default value to be a callable (you'd have to provide a callable to return the callable as a default). > - Some people seem to think that a subclass constructor signature must > match the base class constructor signature. That's not so. The > subclass constructor must just be careful to call the base class > constructor with the correct arguments. Think of the subclass > constructor as a factory function. 
> True, but then this does get in the way of treating the base dict and its defaulting subtype polymorphically. That might not be a big issue. > - There's a fundamental difference between associating the default > value with the dict object, and associating it with the call. So > proposals to invent a better name/signature for setdefault() don't > compete. (As to one specific such proposal, adding an optional bool as > the 3rd argument to get(), I believe I've explained enough times in > the past that flag-like arguments that always get a constant passed in > at the call site are a bad idea and should usually be refactored into > two separate methods.) > > - The inconsistency introduced by __getitem__() returning a value for > keys while get(), __contains__(), and keys() etc. don't show it, > cannot be resolved usefully. You'll just have to live with it. > Modifying get() to do the same thing as __getitem__() doesn't seem > useful -- it just takes away a potentially useful operation. > > So here's a new proposal. > > Let's add a generic missing-key handling method to the dict class, as > well as a default_factory slot initialized to None. The implementation > is like this (but in C): > > def on_missing(self, key): > if self.default_factory is not None: > value = self.default_factory() > self[key] = value > return value > raise KeyError(key) > > When __getitem__() (and *only* __getitem__()) finds that the requested > key is not present in the dict, it calls self.on_missing(key) and > returns whatever it returns -- or raises whatever it raises. > __getitem__() doesn't need to raise KeyError any more, that's done by > on_missing(). > > The on_missing() method can be overridden to implement any semantics > you want when the key isn't found: return a value without inserting > it, insert a value without copying it, only do it for certain key > types/values, make the default incorporate the key, etc. 
> > But the default implementation is designed so that we can write > > d = {} > d.default_factory = list > > to create a dict that inserts a new list whenever a key is not found > in __getitem__(), which is most useful in the original use case: > implementing a multiset so that one can write > > d[key].append(value) > > to add a new key/value to the multiset without having to handle the > case separately where the key isn't in the dict yet. This also works > for sets instead of lists: > > d = {} > d.default_factory = set > ... > d[key].add(value) > This seems like a very good compromise. [non-functional alternatives ...] > regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From jcarlson at uci.edu Sun Feb 19 04:50:07 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 18 Feb 2006 19:50:07 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F7D3E9.3050204@canterbury.ac.nz> References: <43f6a154.1088724012@news.gmane.org> <43F7D3E9.3050204@canterbury.ac.nz> Message-ID: <20060218194442.5FC0.JCARLSON@uci.edu> Greg Ewing wrote: > Bengt Richter wrote: > > > My guess is that realistically default_factory will be used > > to make clean code for filling a dict, and then turning the factory > > off if it's to be passed into unknown contexts. > > This suggests that maybe the autodict behaviour shouldn't > be part of the dict itself, but provided by a wrapper > around the dict. > > The you can fill the dict through the wrapper, and still > have a normal dict underneath to use for other purposes. I prefer this to changing dictionaries directly. The actual wrapper could sit in the collections module, ready for subclassing/replacement of the on_missing method. 
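Guido's proposal, as quoted above, can be turned into a runnable pure-Python sketch (a sketch of the proposal under discussion, not an actual stdlib implementation):

```python
class DefaultDict(dict):
    """Pure-Python sketch of the proposed on_missing() hook."""
    default_factory = None                # the proposed slot, initially None

    def on_missing(self, key):
        if self.default_factory is not None:
            value = self.default_factory()
            self[key] = value             # insert, then return
            return value
        raise KeyError(key)

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Only __getitem__ consults on_missing(); get(), `in`,
            # keys() etc. keep their plain-dict behaviour.
            return self.on_missing(key)


d = DefaultDict()
d.default_factory = list
d["key"].append("value")                  # the multiset use case
assert d == {"key": ["value"]}
assert "other" not in d                   # `in` is unaffected
```

Subclassing on_missing() then gives the other behaviours mentioned in the thread (return without inserting, key-dependent defaults, and so on).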
- Josiah From python at rcn.com Sun Feb 19 04:53:35 2006 From: python at rcn.com (Raymond Hettinger) Date: Sat, 18 Feb 2006 22:53:35 -0500 Subject: [Python-Dev] Proposal: defaultdict References: <43F576A3.1030604@v.loewis.de> Message-ID: <007a01c63508$0f00f7d0$b83efea9@RaymondLaptop1> > > Also, I think has_key/in should return True if there is a default. > It certainly seems desirable to see True where d[some_key] > doesn't raise an exception, but one could argue either way. Some things can be agreed by everyone: * if __contains__ always returns True, then it is a useless feature (since scripts containing a line such as "if k in dd" can always eliminate that line without affecting the algorithm). * if defaultdicts are supposed to be drop-in dict substitutes, then having __contains__ always return True will violate basic dict invariants: del d[some_key] assert some_key not in d Raymond From rrr at ronadam.com Sun Feb 19 04:54:44 2006 From: rrr at ronadam.com (Ron Adam) Date: Sat, 18 Feb 2006 21:54:44 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060218160554.5FB4.JCARLSON@uci.edu> References: <20060218111809.5FAB.JCARLSON@uci.edu> <43F7B412.8090205@ronadam.com> <20060218160554.5FB4.JCARLSON@uci.edu> Message-ID: <43F7EC04.6090501@ronadam.com> Josiah Carlson wrote: > Ron Adam wrote: > Except that makes it even more ambiguous. > > Is encodings.tounicode() encoding, or decoding? Encoding and decoding are relative concepts. It's all encoding from one thing to another. Whether it's "decoding" or "encoding" depends on the relationship of the current encoding to a standard encoding. The confusion introduced by "decode" is when the 'default_encoding' changes, will change, or is unknown. 
> x = f.read() #x contains base-64 encoded binary data > y = encodings.to_unicode(x, 'base64') > > y now contains BINARY DATA, except that it is a unicode string No, that wasn't what I was describing. You get a Unicode string object as the result, not a bytes object with binary data. See the toy example at the bottom. > z = encodings.to_str(y, 'latin-1') > > Later you define a str_to_str function, which I (or someone else) would > use like: > > z = str_to_str(x, 'base64', 'latin-1') > > But the trick is that I don't want some unicode string encoded in > latin-1, I want my binary data unencoded. They may happen to be the > same in this particular example, but that doesn't mean that it makes any > sense to the user. If you want bytes then you would use the bytes() type to get bytes directly. Not encode or decode. binary_unicode = bytes(unicode_string) The exact byte order and representation would need to be decided by the python developers in this case. The internal representation 'unicode-internal', is UCS-2 I believed. >> It's no more ambiguous than any math >> operation where you can do it one way with one operations and get your >> original value back with the same operation by using an inverse value. >> >> n2=n+1; n3=n+(-1); n==n3 >> n2=n*2; n3=n*(.5); n==n3 > > Ahh, so you are saying 'to_base64' and 'from_base64'. There is one > major reason why I don't like that kind of a system: I can't just say > encoding='base64' and use str.encode(encoding) and str.decode(encoding), > I necessarily have to use, str.recode('to_'+encoding) and > str.recode('from_'+encoding) . Seems a bit awkward. Yes, but the encodings API could abstract out the 'to_base64' and 'from_base64' so you can just say 'base64' and have it work either way. Maybe a toy "incomplete" example might help. # in module bytes.py or someplace else. class bytes(list): """ bytes methods defined here """ # in module encodings.py # using a dict of lists, but other solutions would # work just as well. 
unicode_codecs = { 'base64': ('from_base64', 'to_base64'), } def tounicode(obj, from_codec): b = bytes(obj) b = b.recode(unicode_codecs[from_codec][0]) return unicode(b) def tostr(obj, to_codec): b = bytes(obj) b = b.recode(unicode_codecs[to_codec][1]) return str(b) # in your application import encodings ... a bunch of code ... u = encodings.tounicode(s, 'base64') # or if going the other way s = encodings.tostr(u, 'base64') Does this help? Is the relationship between the bytes object and the encodings API clearer here? If not maybe we should discuss it further off line. Cheers, Ronald Adam From steve at holdenweb.com Sun Feb 19 04:57:35 2006 From: steve at holdenweb.com (Steve Holden) Date: Sat, 18 Feb 2006 22:57:35 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <43F64A7A.3060400@v.loewis.de> References: <43F64A7A.3060400@v.loewis.de> Message-ID: Martin v. L?wis wrote: > Adam Olsen wrote: > >>Still -1. It's better, but it violates the principle of encapsulation >>by mixing how-you-use-it state with what-it-stores state. In doing >>that it has the potential to break an API documented as accepting a >>dict. Code that expects d[key] to raise an exception (and catches the >>resulting KeyError) will now silently "succeed". > > > Of course it will, and without quotes. That's the whole point. > > >>I believe that necessitates a PEP to document it. > > > You are missing the rationale of the PEP process. The point is > *not* documentation. The point of the PEP process is to channel > and collect discussion, so that the BDFL can make a decision. > The BDFL is not bound at all to the PEP process. > > To document things, we use (or should use) documentation. > > One could wish this ideal had been the case for the import extensions defined in PEP 302. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From benji at benjiyork.com Sun Feb 19 05:11:32 2006 From: benji at benjiyork.com (Benji York) Date: Sat, 18 Feb 2006 23:11:32 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: Message-ID: <43F7EFF4.6010903@benjiyork.com> Neal Norwitz wrote: > http://www.python.org/dev/buildbot/ If there's interest in slightly nicer buildbot CSS (something like http://buildbot.zope.org/) I'd be glad to contribute. -- Benji York From tjreedy at udel.edu Sun Feb 19 06:13:20 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 19 Feb 2006 00:13:20 -0500 Subject: [Python-Dev] Proposal: defaultdict References: <20060216212133.GB23859@xs4all.nl><17397.58935.8669.271616@montanaro.dyndns.org> <20060218125325.pz5tgfem0qdcssc0@monbureau3.cirad.fr> Message-ID: > Quoting skip at pobox.com: >> The only question in my mind is whether or not getting a non-existent >> value >> under the influence of a given default value should stick that value in >> the >> dictionary or not. It seems to me that there are at least two types of default dicts, which have opposite answers to that question. One is a 'universal dict' that maps every key to something -- the default if nothing else. That should not have the default ever explicitly entered. Udict.keys() should only give the keys *not* mapped to the universal value. Another is the accumlator dict. The default value is the identity (0, [], or whatever) for the type of accumulation. An adict must have the identity added, even though that null will usually be immedially incremented by +=1 or .append(ob) or whatever. Guido's last proposal was for the default default_dict to cater to the second type (and others needing the same behavior) while catering to the first by making the default fill-in method over-rideable. 
If we go with, for instance, wrappers in the collections module instead of modification of dict, then perhaps there should be at least two wrappers included, with each of these two behaviors. Terry Jan Reedy From martin at v.loewis.de Sun Feb 19 06:46:40 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Sun, 19 Feb 2006 06:46:40 +0100 Subject: [Python-Dev] ssize_t branch merged In-Reply-To: <43F7AAB2.1030809@ieee.org> References: <43F3A7E4.1090505@v.loewis.de> <20060218005149.GE23859@xs4all.nl> <1f7befae0602171937s2e770577w94a9191887fb0e1d@mail.gmail.com> <43F6D8F0.2060808@v.loewis.de> <43F7AAB2.1030809@ieee.org> Message-ID: <43F80640.1090104@v.loewis.de> Travis E. Oliphant wrote: > Why not just > > #if SIZEOF_SIZE_T == 2 > #define PY_SSIZE_T_MAX 0x7fff > #elif SIZEOF_SIZE_T == 4 > #define PY_SSIZE_T_MAX 0x7fffffff > #elif SIZEOF_SIZE_T == 8 > #define PY_SSIZE_T_MAX 0x7fffffffffffffff > #elif SIZEOF_SIZE_T == 16 > #define PY_SSIZE_T_MAX 0x7fffffffffffffffffffffffffffffff > #endif That would not work: 0x7fffffffffffffff is not a valid integer literal. 0x7fffffffffffffffL might work, or 0x7fffffffffffffffLL, or 0x7fffffffffffffffi64. Which of these is correct depends on the compiler. How to spell 128-bit integral constants, I don't know; it appears that MS foresees an i128 suffix for them. Regards, Martin From martin at v.loewis.de Sun Feb 19 07:05:11 2006 From: martin at v.loewis.de (Martin v. Löwis) Date: Sun, 19 Feb 2006 07:05:11 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <007a01c63508$0f00f7d0$b83efea9@RaymondLaptop1> References: <43F576A3.1030604@v.loewis.de> <007a01c63508$0f00f7d0$b83efea9@RaymondLaptop1> Message-ID: <43F80A97.2030505@v.loewis.de> Raymond Hettinger wrote: >>>Also, I think has_key/in should return True if there is a default. 
> * if __contains__ always returns True, then it is a useless feature (since > scripts containing a line such as "if k in dd" can always eliminate that line > without affecting the algorithm). If you mean "if __contains__ always returns True for a default dict, then it is a useless feature", I disagree. The code using "if k in dd" cannot be eliminated if you don't know that you have a default dict. > * if defaultdicts are supposed to be drop-in dict substitutes, then having > __contains__ always return True will violate basic dict invariants: > del d[some_key] > assert some_key not in d If you have a default value, you cannot ultimately del a key. This sequence is *not* a basic mapping invariant. If it was, then it would be also an invariant that, after del d[some_key], d[some_key] will raise a KeyError. This kind of invariant doesn't take into account that there might be a default value. Regards, Martin From ncoghlan at gmail.com Sun Feb 19 07:12:56 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 19 Feb 2006 16:12:56 +1000 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: Message-ID: <43F80C68.1010204@gmail.com> Neal Norwitz wrote: > http://www.python.org/dev/buildbot/ > > Whoever is first to break the build, buys a round of drinks at PyCon! > That's over 400 people and counting: > http://www.python.org/pycon/2006/pycon-attendees.txt > > Remember to run the tests *before* checkin. :-) I don't think we can blame Tim's recent checkins for test_logging subsequently breaking on Solaris though ;) There still seems to be something a bit temperamental in that test. . . Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Sun Feb 19 07:17:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 07:17:53 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F7EFF4.6010903@benjiyork.com> References: <43F7EFF4.6010903@benjiyork.com> Message-ID: <43F80D91.7050004@v.loewis.de> Benji York wrote: >>http://www.python.org/dev/buildbot/ > > > If there's interest in slightly nicer buildbot CSS (something like > http://buildbot.zope.org/) I'd be glad to contribute. I personally don't care much about the visual look of web pages. However, people have commented that the buildbot page is ugly, so yes, please do contribute something. Bonus points for visually separating the "trunk" columns from the "2.4" columns. Would a vertical line be appropriate? Bigger spacing? Regards, Martin From martin at v.loewis.de Sun Feb 19 07:19:40 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 07:19:40 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: Message-ID: <43F80DFC.6000307@v.loewis.de> Neal Norwitz wrote: > http://www.python.org/dev/buildbot/ Unfortunately, test_logging still fails sporadically on Solaris. Regards, Martin From raymond.hettinger at verizon.net Sun Feb 19 07:33:42 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 19 Feb 2006 01:33:42 -0500 Subject: [Python-Dev] Proposal: defaultdict References: <43F576A3.1030604@v.loewis.de> <007a01c63508$0f00f7d0$b83efea9@RaymondLaptop1> <43F80A97.2030505@v.loewis.de> Message-ID: <00d301c6351e$6e02b500$b83efea9@RaymondLaptop1> [Martin v. L?wis] > If you have a default value, you cannot ultimately del a key. This > sequence is *not* a basic mapping invariant. You believe that key deletion is not basic to mappings? 
> This kind of invariant doesn't take into account > that there might be a default value. Precisely. Therefore, a defaultdict subclass violates the Liskov Substitution Principle. Of course, the __del__ followed __contains__ sequence is not the only invariant that is thrown-off. There are plenty of examples. Here's one that is absolutely basic to the method's contract: k, v = dd.popitem() assert k not in dd Any code that was expecting a dictionary and uses popitem() as a means of looping over and consuming entries will fail. No one should kid themselves that a default dictionary is a drop-in substitute. Much of the dict's API has an ambiguous meaning when applied to defaultdicts. If all keys are in-theory predefined, what is the meaning of len(dd)? Should dd.items() include any entries where the value is equal to the default or should the collection never store those? If the former, then how do you access the entries without looping over the whole contents? If the latter, then do you worry that "dd[v]=k" does not imply "(k,v) in dd.items()"? Raymond From g.brandl at gmx.net Sun Feb 19 07:49:12 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 19 Feb 2006 07:49:12 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: Message-ID: Neal Norwitz wrote: > http://www.python.org/dev/buildbot/ > > Whoever is first to break the build, buys a round of drinks at PyCon! > That's over 400 people and counting: > http://www.python.org/pycon/2006/pycon-attendees.txt > > Remember to run the tests *before* checkin. :-) Don't we have a Windows slave yet? 
Georg From martin at v.loewis.de Sun Feb 19 07:59:58 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 07:59:58 +0100 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <00d301c6351e$6e02b500$b83efea9@RaymondLaptop1> References: <43F576A3.1030604@v.loewis.de> <007a01c63508$0f00f7d0$b83efea9@RaymondLaptop1> <43F80A97.2030505@v.loewis.de> <00d301c6351e$6e02b500$b83efea9@RaymondLaptop1> Message-ID: <43F8176E.5050302@v.loewis.de> Raymond Hettinger wrote: >> If you have a default value, you cannot ultimately del a key. This >> sequence is *not* a basic mapping invariant. > > > You believe that key deletion is not basic to mappings? No, not in the sense that the key will go away through deletion. I view a mapping as a modifiable partial function. There is some initial key/value association (in a classic mapping, it is initially empty), and then there are modifications. Key deletion means to reset the key to the initial association. > Of course, the __del__ followed __contains__ sequence is not the only > invariant that is thrown-off. There are plenty of examples. Here's one > that is absolutely basic to the method's contract: > > k, v = dd.popitem() > assert k not in dd > > Any code that was expecting a dictionary and uses popitem() as a means > of looping over and consuming entries will fail. Well, code that loops over a dictionary using popitem typically terminates when the dictionary becomes false (or its length becomes zero). That code wouldn't be affected by the behaviour of "in". > No one should kid themselves that a default dictionary is a drop-in > substitute. Much of the dict's API has an ambiguous meaning when applied > to defaultdicts. Right. But it is only ambiguous until specified. Of course, in the face of ambiguity, refuse the temptation to guess. > If all keys are in-theory predefined, what is the meaning of len(dd)? 
Taking my definition from the beginning of the message, it is the number of keys that have been modified from the initial mapping. > Should dd.items() include any entries where the value is equal to the > default or should the collection never store those? It should include all modified items, and none of the unmodified ones. Explicitly assigning the default value still makes the entry modified; you need to del it to set it back to "unmodified". > If the former, then > how do you access the entries without looping over the whole contents? Not sure I understand the question. You use d[k] to access an entry. Regards, Martin From raymond.hettinger at verizon.net Sun Feb 19 08:11:33 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 19 Feb 2006 02:11:33 -0500 Subject: [Python-Dev] Proposal: defaultdict References: <20060216212133.GB23859@xs4all.nl><17397.58935.8669.271616@montanaro.dyndns.org><20060218125325.pz5tgfem0qdcssc0@monbureau3.cirad.fr> Message-ID: <005301c63523$b751b2b0$b83efea9@RaymondLaptop1> [Terry Reedy] > One is a 'universal dict' that maps every key to something -- the default if > nothing else. That should not have the default ever explicitly entered. > Udict.keys() should only give the keys *not* mapped to the universal value. Would you consider it a mapping invariant that "k in dd" implies "k in dd.keys()"? Is the notion of __contains__ at odds with the notion of universality? Raymond From jcarlson at uci.edu Sun Feb 19 08:42:56 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 18 Feb 2006 23:42:56 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <00d301c6351e$6e02b500$b83efea9@RaymondLaptop1> References: <43F80A97.2030505@v.loewis.de> <00d301c6351e$6e02b500$b83efea9@RaymondLaptop1> Message-ID: <20060218225135.5FCA.JCARLSON@uci.edu> "Raymond Hettinger" wrote: > [Martin v. Löwis] > > This kind of invariant doesn't take into account > > that there might be a default value. > > Precisely.
Therefore, a defaultdict subclass violates the Liskov Substitution > Principle.

class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.on_missing(key)
    def on_missing(self, key):
        if not hasattr(self, 'default') or not callable(self.default):
            raise KeyError, key
        r = self[key] = self.default()
        return r

In my opinion, the above implementation as a subclass "does the right thing" in regards to __del__, __contains__, get, pop, popitem, __len__, has_key, and anything else I can think of. Does it violate the Liskov Substitution Principle? Yes, but only if user code relies on dd[key] raising a KeyError on a lack of a key. This can be easily remedied by removing the default when it is unneeded, at which point, you get your Liskov Substitution. > Of course, the __del__ followed by __contains__ sequence is not the only invariant > that is thrown off. There are plenty of examples. Here's one that is > absolutely basic to the method's contract: > > k, v = dd.popitem() > assert k not in dd > > Any code that was expecting a dictionary and uses popitem() as a means of > looping over and consuming entries will fail.

>>> a = defaultdict()
>>> a.default = list
>>> a['hello']
[]
>>> k, v = a.popitem()
>>> assert k not in a
>>>

Seems to work for the above implementation. > No one should kid themselves that a default dictionary is a drop-in substitute. > Much of the dict's API has an ambiguous meaning when applied to defaultdicts. Actually, if one is careful, the dict's API is completely unchanged, except for direct access to the object via b = a[i].

>>> del a['hello']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'hello'
>>> 'hello' in a
False
>>> a.get('hello')
>>> a.pop('hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'pop(): dictionary is empty'
>>> a.popitem()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'popitem(): dictionary is empty'
>>> len(a)
0
>>> a.has_key('hello')
False

> If all keys are in-theory predefined, what is the meaning of len(dd)? It depends on the sequence of actions. Play around with the above defaultdict implementation. From what I understood of Guido's original post, this is essentially what he was proposing, only implemented in C. > Should dd.items() include any entries where the value is equal to the default or > should the collection never store those? Yes, it should store any value which was stored via 'dd[k]=v', or any default value created via access by 'v=dd[k]'. > If the former, then how do you access > the entries without looping over the whole contents? Presumably one is looking for a single kind of default (empty list, 0, etc.) because one wanted to accumulate into them, similar to one of the following...

for item, value in input:
    try:
        d[item] += value        # or d[item].append(value)
    except KeyError:
        d[item] = value         # or d[item] = [value]

which becomes

for item, value in input:
    dd[item] += value           # or dd[item].append(value)

Once accumulation has occurred, iteration over them via .iteritems(), .items(), .popitem(), etc., would progress exactly the same way as with a regular dictionary. If the code which is using the accumulated data does things like...

for key in wanted_keys:
    try:
        value = dd[key]
    except KeyError:
        continue
    # do something nontrivial with value

rather than...

for key in wanted_keys:
    if key not in dd:
        continue
    value = dd[key]
    # do something nontrivial with value

Then the user has at least three options to make it 'work right':

1. User can change to using 'in' to iterate rather than relying on a KeyError.
2. User could remember to remove the default.
3. User can create a copy of the default dictionary via dict(dd) and pass it into the code which relies on the non-defaulting dictionary.

> If the latter, then do you > worry that "dd[k]=v" does not imply "(k,v) in dd.items()"? I personally wouldn't want the latter.
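The accumulation idiom above maps directly onto the defaultdict that later shipped in Python 2.5's collections module. A minimal sketch in modern Python, offered for illustration, showing both the convenience and the lookup-inserts-a-key wrinkle debated in this thread:

```python
from collections import defaultdict

# Accumulate values per key without the try/except dance.
dd = defaultdict(list)
for item, value in [("a", 1), ("b", 2), ("a", 3)]:
    dd[item].append(value)

assert dict(dd) == {"a": [1, 3], "b": [2]}

# Raymond's popitem() invariant holds once accumulation is done:
k, v = dd.popitem()
assert k not in dd

# But a mere lookup now inserts the key, which is the Liskov wrinkle at issue:
dd["never-assigned"]
assert "never-assigned" in dd
```

The stdlib class raises KeyError as usual once default_factory is set to None, matching option 2 ("remove the default") above.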
My post probably hasn't convinced you, but much of the confusion, I believe, is based on Martin's original belief that 'k in dd' should always return true if there is a default. One can argue that way, but then you end up on the circular train of thought that gets you to "you can't do anything useful if that is the case, .popitem() doesn't work, len() is undefined, ...". Keep it simple, keep it sane. - Josiah From martin at v.loewis.de Sun Feb 19 09:03:38 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 09:03:38 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: Message-ID: <43F8265A.2020209@v.loewis.de> Georg Brandl wrote: > Don't we have a Windows slave yet? No; nobody volunteered a machine yet (plus the hand-holding that is always necessary with Windows). Regards, Martin From mwh at python.net Sun Feb 19 11:18:35 2006 From: mwh at python.net (Michael Hudson) Date: Sun, 19 Feb 2006 10:18:35 +0000 Subject: [Python-Dev] buildbot is all green In-Reply-To: (Neal Norwitz's message of "Sat, 18 Feb 2006 19:15:07 -0800") References: Message-ID: <2m1wxzabyc.fsf@starship.python.net> "Neal Norwitz" writes: > http://www.python.org/dev/buildbot/ Wow, that's very cool! Cheers, mwh -- this "I hate c++" is so old it's as old as C++, yes -- from Twisted.Quotes From mwh at python.net Sun Feb 19 11:36:09 2006 From: mwh at python.net (Michael Hudson) Date: Sun, 19 Feb 2006 10:36:09 +0000 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F777AD.7090203@egenix.com> (M.'s message of "Sat, 18 Feb 2006 20:38:21 +0100") References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <43F74812.1080505@v.loewis.de> <43F754F6.9050204@egenix.com> <43F7654B.5030302@v.loewis.de> <43F777AD.7090203@egenix.com> Message-ID: <2mwtfr8wkm.fsf@starship.python.net> "M.-A. Lemburg" writes: > Martin v. 
Löwis wrote: >> M.-A. Lemburg wrote: >>>>> True. However, note that the .encode()/.decode() methods on >>>>> strings and Unicode narrow down the possible return types. >>>>> The corresponding .bytes methods should only allow bytes and >>>>> Unicode. >>>> I forgot that: what is the rationale for that restriction? >>> >>> To assure that only those types can be returned from those >>> methods, ie. instances of basestring, which in turn permits >>> type inference for those methods. >> >> Hmm. So it is for type inference???? >> Where is that documented? > > Somewhere in the python-dev mailing list archives ;-) > > Seriously, we should probably add this to the documentation. Err.................. I don't think this is a good argument, for quite a few reasons. There certainly aren't many other features in Python designed to aid type inference and the knowledge that something returns "unicode or str" isn't especially useful... Cheers, mwh -- ROOSTA: Ever since you arrived on this planet last night you've been going round telling people that you're Zaphod Beeblebrox, but that they're not to tell anyone else. -- The Hitch-Hikers Guide to the Galaxy, Episode 7 From mal at egenix.com Sun Feb 19 11:56:28 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 19 Feb 2006 11:56:28 +0100 Subject: [Python-Dev] [Python-checkins] r42490 - in python/branches/release24-maint: Lib/fileinput.py Lib/test/test_fileinput.py Misc/NEWS In-Reply-To: <20060219095133.E1C131E4004@bag.python.org> References: <20060219095133.E1C131E4004@bag.python.org> Message-ID: <43F84EDC.90104@egenix.com> Why are these new features being backported to 2.4? georg.brandl wrote: > Author: georg.brandl > Date: Sun Feb 19 10:51:33 2006 > New Revision: 42490 > > Modified: > python/branches/release24-maint/Lib/fileinput.py > python/branches/release24-maint/Lib/test/test_fileinput.py > python/branches/release24-maint/Misc/NEWS > Log: > Patch #1337756: fileinput now accepts Unicode filenames.
>
>
> Modified: python/branches/release24-maint/Lib/fileinput.py
> ==============================================================================
> --- python/branches/release24-maint/Lib/fileinput.py (original)
> +++ python/branches/release24-maint/Lib/fileinput.py Sun Feb 19 10:51:33 2006
> @@ -184,7 +184,7 @@
>      """
>
>      def __init__(self, files=None, inplace=0, backup="", bufsize=0):
> -        if type(files) == type(''):
> +        if isinstance(files, basestring):
>              files = (files,)
>          else:
>              if files is None:
>
> Modified: python/branches/release24-maint/Lib/test/test_fileinput.py
> ==============================================================================
> --- python/branches/release24-maint/Lib/test/test_fileinput.py (original)
> +++ python/branches/release24-maint/Lib/test/test_fileinput.py Sun Feb 19 10:51:33 2006
> @@ -157,3 +157,13 @@
>      verify(fi.lineno() == 6)
>  finally:
>      remove_tempfiles(t1, t2)
> +
> +if verbose:
> +    print "15. Unicode filenames"
> +try:
> +    t1 = writeTmp(1, ["A\nB"])
> +    fi = FileInput(files=unicode(t1, sys.getfilesystemencoding()))
> +    lines = list(fi)
> +    verify(lines == ["A\n", "B"])
> +finally:
> +    remove_tempfiles(t1)
>
> Modified: python/branches/release24-maint/Misc/NEWS
> ==============================================================================
> --- python/branches/release24-maint/Misc/NEWS (original)
> +++ python/branches/release24-maint/Misc/NEWS Sun Feb 19 10:51:33 2006
> @@ -74,6 +74,8 @@
>  Library
>  -------
>
> +- Patch #1337756: fileinput now accepts Unicode filenames.
> +
> - Patch #1373643: The chunk module can now read chunks larger than
>   two gigabytes.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 19 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From hyeshik at gmail.com Sun Feb 19 12:17:43 2006 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Sun, 19 Feb 2006 20:17:43 +0900 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <61847.89.54.8.114.1140296899.squirrel@isar.livinglogic.de> References: <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> <43F5FD88.8090605@livinglogic.de> <43F63C07.9030901@egenix.com> <61425.89.54.8.114.1140279099.squirrel@isar.livinglogic.de> <43F7585E.4080909@egenix.com> <61847.89.54.8.114.1140296899.squirrel@isar.livinglogic.de> Message-ID: <4f0b69dc0602190317g2b98df5cw2fcf42f948540b01@mail.gmail.com> On 2/19/06, Walter Dörwald wrote: > M.-A. Lemburg wrote: > > Walter Dörwald wrote: > >> Anyway, I've started implementing a patch that just adds codecs.StatefulEncoder/codecs.StatefulDecoder. UTF8, UTF8-Sig, > >> UTF-16, UTF-16-LE and UTF-16-BE are already working. > > > > Nice :-) > > gencodec.py is updated now too. The rest should be manageable too. I'll leave updating the CJKV codecs to Hye-Shik though. > Okay. I'll look at how CJK codecs can be improved by the new protocol soon. I guess it won't be so difficult because CJK codecs have their own common stateful framework already. BTW, CJK codecs don't have V yet. :-) Hye-Shik From stephen at xemacs.org Sun Feb 19 13:38:44 2006 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Sun, 19 Feb 2006 21:38:44 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F664F5.5040504@colorstudy.com> (Ian Bicking's message of "Fri, 17 Feb 2006 18:06:13 -0600") References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F658AF.8070801@colorstudy.com> <43F6632B.6000600@v.loewis.de> <43F664F5.5040504@colorstudy.com> Message-ID: <87slqfseuj.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Ian" == Ian Bicking writes: Ian> Encodings cover up eclectic interfaces, where those Ian> interfaces fit a basic pattern -- data in, data out. Isn't "filter" the word you're looking for? I think you've just made a very strong case that this is a slippery slope that we should avoid. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From mal at egenix.com Sun Feb 19 14:12:11 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 19 Feb 2006 14:12:11 +0100 Subject: [Python-Dev] [Python-checkins] r42490 - in python/branches/release24-maint: Lib/fileinput.py Lib/test/test_fileinput.py Misc/NEWS In-Reply-To: References: <20060219095133.E1C131E4004@bag.python.org> <43F84EDC.90104@egenix.com> Message-ID: <43F86EAB.9040704@egenix.com> Georg Brandl wrote: > M.-A. Lemburg wrote: >> Why are these new features being backported to 2.4 ? >> >> georg.brandl wrote: >>> Author: georg.brandl >>> Date: Sun Feb 19 10:51:33 2006 >>> New Revision: 42490 >>> >>> Modified: >>> python/branches/release24-maint/Lib/fileinput.py >>> python/branches/release24-maint/Lib/test/test_fileinput.py >>> python/branches/release24-maint/Misc/NEWS >>> Log: >>> Patch #1337756: fileinput now accepts Unicode filenames. > > Is that a new feature? I thought that wherever a filename is accepted, > it can be unicode too. 
> > The previous behavior was a bug in any case, since it treated the > unicode string as a sequence of filenames. Would you fix that by > raising a ValueError? No, but from the text in the NEWS file things sounded a lot like a feature rather than a bug fix. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 19 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From stephen at xemacs.org Sun Feb 19 14:30:54 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 19 Feb 2006 22:30:54 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F7089B.8040707@egenix.com> (M.'s message of "Sat, 18 Feb 2006 12:44:27 +0100") References: <43F3ED97.40901@canterbury.ac.nz> <20060215212629.5F6D.JCARLSON@uci.edu> <43F50BDD.4010106@v.loewis.de> <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <43F7089B.8040707@egenix.com> Message-ID: <87lkw7scfl.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "M" == "M.-A. Lemburg" writes: M> Martin v. L?wis wrote: >> No. The reason to ban string.decode and bytes.encode is that it >> confuses users. M> Instead of starting to ban everything that can potentially M> confuse a few users, we should educate those users and tell M> them what these methods mean and how they should be used. ISTM it's neither "potential" nor "a few". As Aahz pointed out, for the common use of text I/O it requires only a single clue ("Unicode is The One True Plain Text, everything else must be decoded to Unicode before use.") and you don't need any "education" about "how to use" codecs under Martin's restrictions; you just need to know which ones to use. This is not a benefit to be given up lightly. 
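The single-clue discipline described here ("Unicode is The One True Plain Text, everything else must be decoded to Unicode before use") is, as it happens, the model Python 3 eventually adopted. A minimal sketch of the resulting rules, in modern Python and offered purely for illustration:

```python
# Unicode is the plain text; bytes are an encoded form of it.
text = "naïve café"                   # str: the "original" form
data = text.encode("utf-8")           # encoding: str -> bytes
assert isinstance(data, bytes)
assert data.decode("utf-8") == text   # decoding: bytes -> str

# The cross-type methods this thread argues about simply do not exist:
assert not hasattr(data, "encode")    # no bytes.encode
assert not hasattr(text, "decode")    # no str.decode
```

With only one method per direction, the "which way am I going?" ambiguity discussed below cannot arise for charset codecs.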
Would it be reasonable to put those restrictions in the codecs? Ie, so that bytes().encode('gzip') is allowed for the "generic" codec 'gzip', but bytes().encode('utf-8') is an error for the "charset" codec 'utf-8'? -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From arigo at tunes.org Sun Feb 19 15:03:40 2006 From: arigo at tunes.org (Armin Rigo) Date: Sun, 19 Feb 2006 15:03:40 +0100 Subject: [Python-Dev] 2.5 release schedule In-Reply-To: References: <20060217173043.GA14607@code0.codespeak.net> Message-ID: <20060219140339.GA31961@code0.codespeak.net> Hi Neal & Jeremy, On Fri, Feb 17, 2006 at 10:53:19PM -0800, Neal Norwitz wrote: > I don't think it belongs in the PEP. I bumped the priority to 7 which > is the standard protocol, though I don't know that it's really > followed. Ok. > I will enumerate the existing problems for Jeremy in the > bug report. > > In the future, I would also prefer separate bug reports. Feel free > to assign new bugs to Jeremy too. :-) Thanks :-) A bientot, Armin From stephen at xemacs.org Sun Feb 19 15:30:02 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 19 Feb 2006 23:30:02 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F7113E.8090300@egenix.com> (M.'s message of "Sat, 18 Feb 2006 13:21:18 +0100") References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <20060218113358.GG23859@xs4all.nl> <43F7113E.8090300@egenix.com> Message-ID: <87hd6vs9p1.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "M" == "M.-A. Lemburg" writes: M> The main reason is symmetry and the fact that strings and M> Unicode should be as similar as possible in order to simplify M> the task of moving from one to the other. 
Those are perfectly compatible with Martin's suggestion. M> Still, I believe that this is an educational problem. There are M> a couple of gotchas users will have to be aware of (and this is M> unrelated to the methods in question): But IMO that's wrong, both in attitude and in fact. As for attitude, users should not have to be aware of these gotchas. Codec writers, on the other hand, should be required to avoid presenting users with those gotchas. Martin's draconian restriction is in the right direction, but you can argue it goes way too far. In fact, of course it's related to the methods in question. "Original" vs "derived" data can only be defined in terms of some notion of the "usual semantics" of the streams, and that is going to be strongly reflected in the semantics of the methods. M> * "encoding" always refers to transforming original data into a M> derived form M> * "decoding" always refers to transforming a derived form of M> data back into its original form Users *already* know that; it's a very strong connotation of the English words. The problem is that users typically have their own concept of what's original and what's derived. For example: M> * for Unicode codecs the original form is Unicode, the derived M> form is, in most cases, a string First of all, that's Martin's point! Second, almost all Americans, a large majority of Japanese, and I would bet most Western Europeans would say you have that backwards. That's the problem, and it's the Unicode advocates' problem (ie, ours), not the users'. Even if we're right: education will require lots of effort. Rather, we should just make it as easy as possible to do it right, and hard to do it wrong. BTW, what use cases do you have in mind for Unicode -> Unicode decoding? Maximally decomposed forms and/or eliminating compatibility characters etc? Very specialized. M> Codecs also unify the various interfaces to common encodings M> such as base64, uu or zip which are not Unicode related. 
Now this is useful and has use cases I've run into, for example in email, where you would like to use the same interface for base64 as for shift_jis and you'd like to be able to write

def encode_mime_body(string, codec_list):
    if codec_list[0] not in charset_codec_list:
        raise NotCharsetCodecException
    if len(codec_list) > 1 and codec_list[-1] not in transfer_codec_list:
        raise NotTransferCodecException
    for codec in codec_list:
        string = string.encode(codec)
    return string

mime_body = encode_mime_body("This is a pen.", ['shift_jis', 'zip', 'base64'])

I guess I have to admit I'm backtracking from my earlier hardline support for Martin's position, but I'm still sympathetic: (a) that's the direct way to "make it easy to do it right", and (b) I still think the use cases for non-Unicode codecs are YAGNI very often. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From g.brandl at gmx.net Sun Feb 19 16:00:47 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 19 Feb 2006 16:00:47 +0100 Subject: [Python-Dev] Enhancements to the fileinput module Message-ID: I've just checked in some enhancements to the fileinput module.

* fileno() to check the current file descriptor
* mode argument to allow opening in universal newline mode
* openhook argument to allow transparent opening of compressed or encoded files.

Please feel free to comment. Cheers, Georg From fredrik at pythonware.com Sun Feb 19 16:05:36 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun, 19 Feb 2006 16:05:36 +0100 Subject: [Python-Dev] Enhancements to the fileinput module References: Message-ID: Georg Brandl wrote: > I've just checked in some enhancements to the fileinput module.
> * fileno() to check the current file descriptor
> * mode argument to allow opening in universal newline mode
> * openhook argument to allow transparent opening of compressed
>   or encoded files.
>
> Please feel free to comment.

hey, where's the PEP, the endless thread where the same arguments are repeated over and over again, the -1 vetos from the peanut gallery, and the mandatory off-topic subthreads? (looks good to me. it might be an idea to mention that hook_compressed uses the extension instead of the file signature to determine what decompressor to use, though...) From g.brandl at gmx.net Sun Feb 19 16:06:16 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 19 Feb 2006 16:06:16 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F7EFF4.6010903@benjiyork.com> References: <43F7EFF4.6010903@benjiyork.com> Message-ID: Benji York wrote: > Neal Norwitz wrote: >> http://www.python.org/dev/buildbot/ > > If there's interest in slightly nicer buildbot CSS (something like > http://buildbot.zope.org/) I'd be glad to contribute. +1. Looks nice! Georg From stephen at xemacs.org Sun Feb 19 16:14:14 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 20 Feb 2006 00:14:14 +0900 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: (Guido van Rossum's message of "Fri, 17 Feb 2006 13:26:30 -0800") References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <87d5hjs7nd.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Guido" == Guido van Rossum writes: Guido> On 2/16/06, Stephen J. Turnbull wrote: >> /usr/share often is on a different mount; that's the whole >> rationale for /usr/share. Guido> I don't think I've worked at a place where something like Guido> that was done for at least 10 years. Isn't this argument Guido> outdated? I don't know. It may be obsolete in practice. I just know that I do it, and so do several of the people on the Coda list.
In my case, I don't do it because I'm short of disk space. I do it because my preferred distributed file system is Coda, which doesn't support exporting a local file system. You use a specialized server instead. Because Coda is designed for disconnected use, the files I actually am using are in the cache (200MB, so cache misses when disconnected are fairly rare). But if the host whose files I'm browsing gets an update and I'm connected, Coda automatically refreshes the cache. Coda is still not really production quality, and development on Coda and similar (eg Intermezzo) seems pretty slow, so this use case may never be of practical importance. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From stephen at xemacs.org Sun Feb 19 17:21:59 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 20 Feb 2006 01:21:59 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060218005534.5FA8.JCARLSON@uci.edu> (Josiah Carlson's message of "Sat, 18 Feb 2006 01:16:07 -0800") References: <20060217221623.5FA5.JCARLSON@uci.edu> <43F6DC4C.1070100@ronadam.com> <20060218005534.5FA8.JCARLSON@uci.edu> Message-ID: <878xs7s4ig.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Josiah" == Josiah Carlson writes: Josiah> The question remains: is str.decode() returning a string Josiah> or unicode depending on the argument passed, when the Josiah> argument quite literally names the codec involved, Josiah> difficult to understand? I don't believe so; am I the Josiah> only one? Do you do any of the user education *about codec use* that you recommend? The people I try to teach about coding invariably find it difficult to understand. The problem is that the near-universal intuition is that, for "human-usable text", pretty much anything *but Unicode* will do.
This is a really hard block to get them past. There is very good reason why Unicode is plain text ("original" in MAL's terms) and everything else is encoded ("derived"), but students new to the concept often take a while to "get" it. Maybe it's just me, but whether it's the teacher or the students, I am *not* excited about the education route. Martin's simple rule *is* simple, and the exceptions for using a "nonexistent" method mean I don't have to reinforce---the students will be able to teach each other. The exceptions also directly help reinforce the notion that text == Unicode. I grant the point that .decode('base64') is useful, but I also believe that "education" is a lot more easily said than done in this case. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From murman at gmail.com Sun Feb 19 17:23:15 2006 From: murman at gmail.com (Michael Urman) Date: Sun, 19 Feb 2006 10:23:15 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <20060218225135.5FCA.JCARLSON@uci.edu> References: <43F80A97.2030505@v.loewis.de> <00d301c6351e$6e02b500$b83efea9@RaymondLaptop1> <20060218225135.5FCA.JCARLSON@uci.edu> Message-ID: On 2/19/06, Josiah Carlson wrote: > My post probably hasn't convinced you, but much of the confusion, I > believe, is based on Martin's original belief that 'k in dd' should > always return true if there is a default. One can argue that way, but > then you end up on the circular train of thought that gets you to "you > can't do anything useful if that is the case, .popitem() doesn't work, > len() is undefined, ...". Keep it simple, keep it sane. A default factory implementation fundamentally modifies the behavior of the mapping. 
There is no single answer to the question "what is the right behavior for contains, len, popitem" as that depends on what the code that consumes the mapping is written like, what it is attempting to do, and what you are attempting to override it to do. Or, simply, on why you are providing a default value. Resisting the temptation to guess the why and just leaving the methods as is seems the best choice; overriding __contains__ to return true is much easier than reversing that behavior would be. An example when it could theoretically be used, if not particularly useful. The gettext.install() function was just updated to take a names parameter which controls which gettext accessor functions it adds to the builtin namespace. Its implementation looks for "method in names" to decide. Passing a default-true dict would allow the future behavior to be bind all checked names, but only if __contains__ returns True. Even though it would make a poor base implementation, and these effects aren't a good candidate for it, the code style that could best leverage such a __contains__ exists. Michael -- Michael Urman http://www.tortall.net/mu/blog From benji at benjiyork.com Sun Feb 19 18:06:13 2006 From: benji at benjiyork.com (Benji York) Date: Sun, 19 Feb 2006 12:06:13 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F80D91.7050004@v.loewis.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> Message-ID: <43F8A585.7090906@benjiyork.com> Martin v. L?wis wrote: > I personally don't care much about the visual look of web pages. > However, people have commented that the buildbot page is ugly, > so yes, please do contribute something. See http://www.benjiyork.com/pybb. It doesn't look quite as good in IE because of the limited HTML the buildbot waterfall display generates and the limitations of IE's CSS support. > Bonus points for visually separating the "trunk" columns from > the "2.4" columns. 
The best I could do without hacking buildbot was to highlight the trunk "builder" links. This only works in Firefox, also because of IE's limited CSS2 support. More could be done if the HTML generation was modified, but that didn't seem prudent. -- Benji York From stephen at xemacs.org Sun Feb 19 18:26:39 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 20 Feb 2006 02:26:39 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> (Bob Ippolito's message of "Fri, 17 Feb 2006 21:10:04 -0800") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <6008E233-E723-48FA-ADA9-48AA321321B3@redivi.com> Message-ID: <874q2vs1io.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Bob" == Bob Ippolito writes: Bob> On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: >> But you aren't always getting *unicode* text from the decoding >> of bytes, and you may be encoding bytes *to* bytes: Please note that I presumed that you can indeed assume that decoding of bytes always results in unicode, and encoding of unicode always results in bytes. I believe Guido made the proposal relying on that assumption too. The constructor notation makes no sense for making an object of the same type as the original unless it's a copy constructor. You could argue that the base64 language is indeed a different language from the bytes language, and I'd agree. But since there's no way in Python to determine whether a string that conforms to base64 is supposed to be base64 or bytes, it would be a very bad idea to interpret the distinction as one of type. >> b2 = bytes(b, "base64") >> b3 = bytes(b2, "base64") >> Which direction are we going again? Bob> This is *exactly* why the current set of codecs are INSANE. Bob> unicode.encode and str.decode should be used *only* for Bob> unicode codecs. 
Byte transforms are entirely different Bob> semantically and should be some other method pair. General filters are semantically different, I agree. But "encode" and "decode" in English are certainly far more general than character coding conversion. The use of those methods for any stream conversion that is invertible (eg, compression or encryption) is not insane. It's just pedagogically inconvenient given the existing confusion (outside of python-dev, of course) about character coding issues. I'd like to rephrase your statement as "*only* unicode.encode and str.decode should be used for unicode codecs". Ie, str.encode(codec) and unicode.decode(codec) should raise errors if codec is a "unicode codec". The question in my mind is whether we should allow other kinds of codecs or not. I could live with "not", but if we're going to have other kinds of codecs, I think they should have concrete signatures. Ie, basestring -> basestring shouldn't be allowed. Content transfer encodings like BASE64 and quoted-printable, compression, encryption, etc IMO should be bytes -> bytes. Overloading to unicode -> unicode is sorta plausible for BASE64 or QP, but YAGNI. OTOH, the Unicode standard does define a number of unicode -> unicode transformations, and it might make sense to generalize to case conversions etc. (Note that these conversions are pseudo-invertible, so you can think of them as generalized .encode/.decode pairs. The inverse is usually the identity, which seems weird, but from the pedagogical standpoint you could handle that weirdness by raising an error if the .encode method were invoked.) To be concrete, I could imagine writing s2 = s1.decode('upcase') if s2 == s1: print "Why are you shouting at me?" else: print "I like calm, well-spoken snakes." s3 = s2.encode('upcase') if s3 == s2: print "Never fails!" else: print "See a vet; your Python is *very* sick." 
I chose the decode method to do the non-trivial transformation because .decode()'s value is supposed to be "original" text in MAL's terms. And that's true of uppercase-only text; you're still supposed to be able to read it, so I guess it's not "encoded". That's pretty pedantic; I think it's better to raise on .encode('upcase'). -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From stephen at xemacs.org Sun Feb 19 19:04:24 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 20 Feb 2006 03:04:24 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43f6abab.1091371449@news.gmane.org> (Bengt Richter's message of "Sat, 18 Feb 2006 07:24:31 GMT") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> Message-ID: <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Bengt" == Bengt Richter writes: Bengt> The characters in b could be encoded in plain ascii, or Bengt> utf16le, you have to know. Which base64 are you thinking about? Both RFC 3548 and RFC 2045 (MIME) specify subsets of US-ASCII explicitly. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
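[As an aside on the bytes->bytes view of content transfer encodings argued above: the stdlib base64 module already works exactly this way, entirely outside the codec machinery. A minimal sketch, written with b'' bytes-literal syntax for clarity even though the thread itself predates it:]

```python
import base64

# Content transfer encodings such as base64 map bytes to bytes:
# the encoded form happens to use a subset of ASCII, but it is
# still a byte serialization, not text.
raw = b"hello"
encoded = base64.b64encode(raw)
assert encoded == b"aGVsbG8="

# Decoding inverts the transform exactly, even for arbitrary
# binary input that no text codec could round-trip.
binary = b"\x00\xff hello"
assert base64.b64decode(base64.b64encode(binary)) == binary
```

[This keeps the transfer encoding's signature concrete (bytes -> bytes), leaving unicode.encode/str.decode free for character codecs, as proposed in the thread.]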
From walter at livinglogic.de Sun Feb 19 19:33:37 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Sun, 19 Feb 2006 19:33:37 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8A585.7090906@benjiyork.com> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> Message-ID: <43F8BA01.9060506@livinglogic.de> Benji York wrote: > Martin v. Löwis wrote: >> I personally don't care much about the visual look of web pages. >> However, people have commented that the buildbot page is ugly, >> so yes, please do contribute something. > > See http://www.benjiyork.com/pybb. > > It doesn't look quite as good in IE because of the limited HTML the > buildbot waterfall display generates and the limitations of IE's CSS > support. > >> Bonus points for visually separating the "trunk" columns from >> the "2.4" columns. > > The best I could do without hacking buildbot was to highlight the trunk > "builder" links. This only works in Firefox, also because of IE's > limited CSS2 support. > > More could be done if the HTML generation was modified, but that didn't > seem prudent. I'd like to see vertical lines between the columns. Why is everything bold? Bye, Walter Dörwald From martin at v.loewis.de Sun Feb 19 19:55:49 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 19:55:49 +0100 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87hd6vs9p1.fsf@tleepslib.sk.tsukuba.ac.jp> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <20060218113358.GG23859@xs4all.nl> <43F7113E.8090300@egenix.com> <87hd6vs9p1.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43F8BF35.5060503@v.loewis.de> Stephen J. Turnbull wrote: > BTW, what use cases do you have in mind for Unicode -> Unicode > decoding?
I think "rot13" falls into that category: it is a transformation on text, not on bytes. For other "odd" cases: "base64" goes Unicode->bytes in the *decode* direction, not in the encode direction. Some may argue that base64 is bytes, not text, but in many applications, you can combine base64 (or uuencode) with arbitrary other text in a single stream. Of course, it could be required that you go u.encode("ascii").decode("base64"). > def encode-mime-body (string, codec-list): > if codec-list[0] not in charset-codec-list: > raise NotCharsetCodecException > if len (codec-list) > 1 and codec-list[-1] not in transfer-codec-list: > raise NotTransferCodecException > for codec in codec-list: > string = string.encode (codec) > return string > > mime-body = encode-mime-body ("This is a pen.", > [ 'shift_jis', 'zip', 'base64' ]) I think this is an example where you *should* use the codec API, as designed. As that apparently requires streams for stacking (ie. no support for codec stacking), you would have to write def encode_mime_body(string, codec_list): stack = output = cStringIO.StringIO() for codec in reversed(codec_list): stack = codecs.getwriter(codec)(stack) stack.write(string) stack.reset() return output.getvalue() Notice that you have to start the stacking with the last codec, and you have to keep a reference to the StringIO object where the actual bytes end up. Regards, Martin P.S. some LISP shows through in your Python code :-) From martin at v.loewis.de Sun Feb 19 20:04:41 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 20:04:41 +0100 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <878xs7s4ig.fsf@tleepslib.sk.tsukuba.ac.jp> References: <20060217221623.5FA5.JCARLSON@uci.edu> <43F6DC4C.1070100@ronadam.com> <20060218005534.5FA8.JCARLSON@uci.edu> <878xs7s4ig.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43F8C149.90509@v.loewis.de> Stephen J.
Turnbull wrote: > Do you do any of the user education *about codec use* that you > recommend? The people I try to teach about coding invariably find it > difficult to understand. The problem is that the near-universal > intuition is that for "human-usable text" is pretty much anything *but > Unicode* will do. It really is a matter of education. For the first time in my career, I have been teaching the first-semester programming course, and I was happy to see that the text book already has a section on text and Unicode (actually, I selected the text book also based on whether there was good discussion of that aspect). So I spent quite some time with data representation (integrals, floats, characters), and I hope that the students now "got it". If they didn't learn it that way in the first semester (or already got mis-educated in highschool), it will be very hard for them to relearn. So I expect that it will take a decade or two until this all is common knowledge. Regards, Martin From martin at v.loewis.de Sun Feb 19 20:14:25 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 20:14:25 +0100 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43F8C391.2070405@v.loewis.de> Stephen J. Turnbull wrote: > Bengt> The characters in b could be encoded in plain ascii, or > Bengt> utf16le, you have to know. > > Which base64 are you thinking about? Both RFC 3548 and RFC 2045 > (MIME) specify subsets of US-ASCII explicitly. Unfortunately, it is ambiguous as to whether they refer to US-ASCII, the character set, or US-ASCII, the encoding. 
It appears that RFC 3548 talks about the character set only: - section 2.4 talks about "choosing an alphabet", and how it should be possible for humans to handle such data. - section 2.3 talks about non-alphabet characters So it appears that RFC 3548 defines a conversion bytes->text. To transmit this, you then also need encoding. MIME appears to also use the US-ASCII *encoding* ("charset", in IETF speak), for the "base64" Content-Transfer-Encoding. For an example where base64 is *not* necessarily ASCII-encoded, see the "binary" data type in XML Schema. There, base64 is embedded into an XML document, and uses the encoding of the entire XML document. As a result, you may get base64 data in utf16le. Regards, Martin From martin at v.loewis.de Sun Feb 19 20:18:00 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 19 Feb 2006 20:18:00 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8BA01.9060506@livinglogic.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> Message-ID: <43F8C468.6010705@v.loewis.de> Walter D?rwald wrote: > I'd like to see vertical lines between the column. Can you please elaborate? Between which columns? > Why is everything bold? Not sure. Regards, Martin From walter at livinglogic.de Sun Feb 19 20:37:29 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Sun, 19 Feb 2006 20:37:29 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8C468.6010705@v.loewis.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> <43F8C468.6010705@v.loewis.de> Message-ID: <43F8C8F9.6030705@livinglogic.de> Martin v. L?wis wrote: > Walter D?rwald wrote: >> I'd like to see vertical lines between the column. > > Can you please elaborate? Between which columns? 
Something like this: http://styx.livinglogic.de/~walter/python/buildbot.gif >> Why is everything bold? > > Not sure. Bye, Walter Dörwald From nnorwitz at gmail.com Sun Feb 19 20:42:51 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 19 Feb 2006 11:42:51 -0800 Subject: [Python-Dev] [Python-checkins] r42396 - peps/trunk/pep-0011.txt In-Reply-To: <43F5D222.2030402@egenix.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> Message-ID: On 2/17/06, M.-A. Lemburg wrote: > Neal Norwitz wrote: > > > > I don't have a strong opinion. Anyone else have an opinion about > > removing --with-wctype-functions from configure? > > FWIW, I announced this plan in Dec 2004: > > http://mail.python.org/pipermail/python-dev/2004-December/050193.html > > I didn't get any replies back then, so assumed that no-one > would object, but forgot to add this to the PEP 11. > > The reason I'd like to get this removed early rather than > later is that some Linux distros happen to use the config > switch, causing the Python Unicode implementation on those > distros to behave inconsistently with regular Python > builds. Like I said, I don't have a strong opinion. At least update PEP 11 now. It would be good to ask on c.l.p. I suspect that no one cares enough about this flag to complain, so it's probably OK to remove it. But we should at least give people the opportunity to object. > Another candidate for removal is the --disable-unicode > switch. > > We should probably add a deprecation warning for that in > Py 2.5 and then remove the hundreds of > #ifdef Py_USING_UNICODE > from the source code in time for Py 2.6. I've heard of a bunch of people using --disable-unicode. I'm not sure if it's curiosity or if there are really production builds without unicode. Ask this on c.l.p too. We can update configure to add the warning and add a note to PEP 11. If we don't hear any complaints, remove it for 2.6.
If there are complaints, we can always back off. n From ianb at colorstudy.com Sun Feb 19 21:15:38 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Sun, 19 Feb 2006 14:15:38 -0600 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F80A97.2030505@v.loewis.de> <00d301c6351e$6e02b500$b83efea9@RaymondLaptop1> <20060218225135.5FCA.JCARLSON@uci.edu> Message-ID: <43F8D1EA.8080702@colorstudy.com> Michael Urman wrote: > On 2/19/06, Josiah Carlson wrote: > >>My post probably hasn't convinced you, but much of the confusion, I >>believe, is based on Martin's original belief that 'k in dd' should >>always return true if there is a default. One can argue that way, but >>then you end up on the circular train of thought that gets you to "you >>can't do anything useful if that is the case, .popitem() doesn't work, >>len() is undefined, ...". Keep it simple, keep it sane. > > > A default factory implementation fundamentally modifies the behavior > of the mapping. There is no single answer to the question "what is the > right behavior for contains, len, popitem" as that depends on what the > code that consumes the mapping is written like, what it is attempting > to do, and what you are attempting to override it to do. Or, simply, > on why you are providing a default value. Resisting the temptation to > guess the why and just leaving the methods as is seems the best > choice; overriding __contains__ to return true is much easier than > reversing that behavior would be. I agree that there is simply no universally correct answer for the various uses of default_factory. I think ambiguity on points like this is a sign that something is overly general. In many of the concrete cases it is fairly clear how these methods should work. In the most obvious case (default_factory=list) what seems to be to be the correct implementation is one that no one is proposing, that is, "x in d" means "d.get(x)". 
But that uses the fact that the return value of default_factory() is a false value, which we cannot assume in general. And it effects .keys() -- which I would propose overriding for multidict (so it only returns keys with non-empty lists for values), but I don't see how it could be made correct for default_factory. I just don't see why we should cram all these potential features into dict by using a vague feature like default_factory. Why can't we just add a half-dozen new types of collections (to the module of the same name)? Each one will get its own page of documentation, a name, a proper __repr__, and well defined meaning for all of these methods that it shares with dict only insofar as it makes sense to share. Note that even if we use defaultdict or autodict or something besides changing dict itself, we still won't get a good __contains__, a good repr, or any of the other features that specific collection implementations will give us. Isn't there anyone else who sees the various dict-like objects being passed around as recipes, and thinks that maybe that's a sign they should go in the stdlib? The best of those recipes aren't all-encompassing, they just do one kind of container well. -- Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org From benji at benjiyork.com Sun Feb 19 22:06:41 2006 From: benji at benjiyork.com (Benji York) Date: Sun, 19 Feb 2006 16:06:41 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8BA01.9060506@livinglogic.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> Message-ID: <43F8DDE1.8010305@benjiyork.com> Walter D?rwald wrote: > I'd like to see vertical lines between the column. I've done a version like that (still at http://www.benjiyork.com/pybb). > Why is everything bold? I was trying to increase the legibility of the smaller type (a result of trying to fit more in the horizontal space). 
The current version is bold-free with slightly larger text. -- Benji York From tim.peters at gmail.com Sun Feb 19 22:07:47 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 19 Feb 2006 16:07:47 -0500 Subject: [Python-Dev] test_fileinput failing on Windows Message-ID: <1f7befae0602191307g41f7c41bh633a1459fea351cc@mail.gmail.com> This started failing since last night: C:\Code\python\PCbuild>python ..\lib\test\test_fileinput.py 1. Simple iteration (bs=0) 2. Status variables (bs=0) 3. Nextfile (bs=0) 4. Stdin (bs=0) 5. Boundary conditions (bs=0) 6. Inplace (bs=0) 7. Simple iteration (bs=30) 8. Status variables (bs=30) 9. Nextfile (bs=30) 10. Stdin (bs=30) 11. Boundary conditions (bs=30) 12. Inplace (bs=30) 13. 0-byte files 14. Files that don't end with newline 15. Unicode filenames 16. fileno() 17. Specify opening mode Traceback (most recent call last): File "..\lib\test\test_fileinput.py", line 201, in verify(lines == ["A\n", "B\n", "C\n", "D"]) File "C:\Code\python\lib\test\test_support.py", line 204, in verify raise TestFailed(reason) test.test_support.TestFailed: test failed `lines` at that point is ['A\n', 'B\n', '\n', 'C\n', 'D'] which indeed doesn't equal ["A\n", "B\n", "C\n", "D"] From nnorwitz at gmail.com Sun Feb 19 22:13:12 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Sun, 19 Feb 2006 13:13:12 -0800 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8DDE1.8010305@benjiyork.com> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> <43F8DDE1.8010305@benjiyork.com> Message-ID: On 2/19/06, Benji York wrote: > Walter D?rwald wrote: > > I'd like to see vertical lines between the column. > > I've done a version like that (still at http://www.benjiyork.com/pybb). I liked your current version better so I installed it. 
n From benji at benjiyork.com Sun Feb 19 22:14:11 2006 From: benji at benjiyork.com (Benji York) Date: Sun, 19 Feb 2006 16:14:11 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8BA45.1030101@v.loewis.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA45.1030101@v.loewis.de> Message-ID: <43F8DFA3.6080702@benjiyork.com> Martin v. Löwis wrote: > Benji York wrote: > >>See http://www.benjiyork.com/pybb. > > > Great! you haven't explicitly stated that: may I copy this on > python.org? (I did, but I need confirmation) Sure! Feel free to use it as you wish. I replied to Walter Dörwald's suggestions and made a few changes, but don't know which I like better. If you prefer the new one at http://www.benjiyork.com/pybb you can use it as well. (copying python-dev as a permanent record of permission) -- Benji York From tim.peters at gmail.com Sun Feb 19 22:22:29 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 19 Feb 2006 16:22:29 -0500 Subject: [Python-Dev] test_fileinput failing on Windows In-Reply-To: <1f7befae0602191307g41f7c41bh633a1459fea351cc@mail.gmail.com> References: <1f7befae0602191307g41f7c41bh633a1459fea351cc@mail.gmail.com> Message-ID: <1f7befae0602191322s39f69cfex8cf7c0d7df5496db@mail.gmail.com> Never mind -- repaired it. From crutcher at gmail.com Sun Feb 19 22:52:38 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 19 Feb 2006 13:52:38 -0800 Subject: [Python-Dev] New Module: CommandLoop Message-ID: This is something I've been working on for a bit, and I think it is more or less ready to bring up on this list. I'd like to add a module (though probably not for 2.5). Before you ask, this module is _not_ compatible with cmd.py, as it is command oriented, whereas cmd.py is line oriented. Anyway, I'm looking for feedback, feature requests before starting the submission process.
Code available here: http://littlelanguages.com/web/software/python/modules/cmdloop.py Base class for writing simple interactive command loop environments. CommandLoop provides a base class for writing simple interactive user environments. It is designed around sub-classing, has a simple command parser, and is trivial to initialize. Here is a trivial little environment written using CommandLoop: import cmdloop class Hello(cmdloop.commandLoop): PS1='hello>' @cmdloop.aliases('hello', 'hi', 'hola') @cmdloop.shorthelp('say hello') @cmdloop.usage('hello TARGET') def helloCmd(self, flags, args): ''' Say hello to TARGET, which defaults to 'world' ''' if flags or len(args) != 1: raise cmdloop.InvalidArguments print 'Hello %s!' % args[0] @cmdloop.aliases('quit') def quitCmd(self, flags, args): ''' Quit the environment. ''' raise cmdloop.HaltLoop Hello().runLoop() Here's a more complex example: import cmdloop class HelloGoodbye(cmdloop.CommandLoop): PS1='hello>' def __init__(self, default_target = 'world'): self.default_target = default_target self.target_list = [] @cmdloop.aliases('hello', 'hi', 'hola') @cmdloop.shorthelp('say hello') @cmdloop.usage('hello [TARGET]') def helloCmd(self, flags, args): ''' Say hello to TARGET, which defaults to 'world' ''' if flags or len(args) > 1: raise cmdloop.InvalidArguments if args: target = args[0] else: target = self.default_target if target not in self.target_list: self.target_list.append(target) print 'Hello %s!' % target @cmdloop.aliases('goodbye') @cmdloop.shorthelp('say goodbye') @cmdloop.usage('goodbye TARGET') def goodbyeCmd(self, flags, args): ''' Say goodbye to TARGET. ''' if flags or len(args) != 1: raise cmdloop.InvalidArguments target = args[0] if target in self.target_list: print 'Goodbye %s!' % target self.target_list.remove(target) else: print "I haven't said hello to %s." % target @cmdloop.aliases('quit') def quitCmd(self, flags, args): ''' Quit the environment. 
''' raise cmdloop.HaltLoop def _onLoopExit(self): if len(self.target_list): self.pushCommands(('quit',)) for target in self.target_list: self.pushCommands(('goodbye', target)) else: raise cmdloop.HaltLoop HelloGoodbye().runLoop() -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From crutcher at gmail.com Sun Feb 19 23:02:20 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 19 Feb 2006 14:02:20 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: References: Message-ID: oops, error in the example: s/commandLoop/CommandLoop/g On 2/19/06, Crutcher Dunnavant wrote: > This is something I've been working on for a bit, and I think it is > more or less ready to bring up on this list. I'd like to add a module > (though probably not for 2.5). > > Before you ask, this module is _not_ compatible with cmd.py, as it is > command oriented, whereas cmd.py is line oriented. > > Anyway, I'm looking for feedback, feature requests before starting the > submission process. > > Code available here: > http://littlelanguages.com/web/software/python/modules/cmdloop.py > > Base class for writing simple interactive command loop environments. > > CommandLoop provides a base class for writing simple interactive user > environments. It is designed around sub-classing, has a simple command > parser, and is trivial to initialize. > > Here is a trivial little environment written using CommandLoop: > > import cmdloop > > class Hello(cmdloop.commandLoop): > PS1='hello>' > > @cmdloop.aliases('hello', 'hi', 'hola') > @cmdloop.shorthelp('say hello') > @cmdloop.usage('hello TARGET') > def helloCmd(self, flags, args): > ''' > Say hello to TARGET, which defaults to 'world' > ''' > if flags or len(args) != 1: > raise cmdloop.InvalidArguments > print 'Hello %s!' % args[0] > > @cmdloop.aliases('quit') > def quitCmd(self, flags, args): > ''' > Quit the environment. 
> ''' > raise cmdloop.HaltLoop > > Hello().runLoop() > > Here's a more complex example: > > import cmdloop > > class HelloGoodbye(cmdloop.CommandLoop): > PS1='hello>' > > def __init__(self, default_target = 'world'): > self.default_target = default_target > self.target_list = [] > > @cmdloop.aliases('hello', 'hi', 'hola') > @cmdloop.shorthelp('say hello') > @cmdloop.usage('hello [TARGET]') > def helloCmd(self, flags, args): > ''' > Say hello to TARGET, which defaults to 'world' > ''' > if flags or len(args) > 1: > raise cmdloop.InvalidArguments > if args: > target = args[0] > else: > target = self.default_target > if target not in self.target_list: > self.target_list.append(target) > print 'Hello %s!' % target > > @cmdloop.aliases('goodbye') > @cmdloop.shorthelp('say goodbye') > @cmdloop.usage('goodbye TARGET') > def goodbyeCmd(self, flags, args): > ''' > Say goodbye to TARGET. > ''' > if flags or len(args) != 1: > raise cmdloop.InvalidArguments > target = args[0] > if target in self.target_list: > print 'Goodbye %s!' % target > self.target_list.remove(target) > else: > print "I haven't said hello to %s." % target > > @cmdloop.aliases('quit') > def quitCmd(self, flags, args): > ''' > Quit the environment. > ''' > raise cmdloop.HaltLoop > > def _onLoopExit(self): > if len(self.target_list): > self.pushCommands(('quit',)) > for target in self.target_list: > self.pushCommands(('goodbye', target)) > else: > raise cmdloop.HaltLoop > > HelloGoodbye().runLoop() > > -- > Crutcher Dunnavant > littlelanguages.com > monket.samedi-studios.com > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From brett at python.org Sun Feb 19 23:09:31 2006 From: brett at python.org (Brett Cannon) Date: Sun, 19 Feb 2006 14:09:31 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: References: Message-ID: On 2/19/06, Crutcher Dunnavant wrote: > This is something I've been working on for a bit, and I think it is > more or less ready to bring up on this list. 
I'd like to add a module > (though probably not for 2.5). > > Before you ask, this module is _not_ compatible with cmd.py, as it is > command oriented, whereas cmd.py is line oriented. > > Anyway, I'm looking for feedback, feature requests before starting the > submission process. > > Code available here: > http://littlelanguages.com/web/software/python/modules/cmdloop.py Just so you know, there is a basic rule that all new modules need to have been used in the wild and generally accepted and used by the Python community before being accepted. While this is not a hard rule, it is mostly followed so if there is no visible use of the module by the rest of the world it will be difficult to get it accepted. -Brett From crutcher at gmail.com Sun Feb 19 23:15:56 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 19 Feb 2006 14:15:56 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: References: Message-ID: Yes, I know. Hence this not being a patch. This is really meant to be a compelling alternative to cmd.Cmd, and as such I'm trying to get some discussion about it. On 2/19/06, Brett Cannon wrote: > On 2/19/06, Crutcher Dunnavant wrote: > > This is something I've been working on for a bit, and I think it is > > more or less ready to bring up on this list. I'd like to add a module > > (though probably not for 2.5). > > > > Before you ask, this module is _not_ compatible with cmd.py, as it is > > command oriented, whereas cmd.py is line oriented. > > > > Anyway, I'm looking for feedback, feature requests before starting the > > submission process. > > > > Code available here: > > http://littlelanguages.com/web/software/python/modules/cmdloop.py > > Just so you know, there is a basic rule that all new modules need to > have been used in the wild and generally accepted and used by the > Python community before being accepted. While this is not a hard
While this is not a hard > rule, it is mostly followed so if there is no visible use of the > module by the rest of the world it will be difficult to get it > accepted. > > -Brett > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From bob at redivi.com Sun Feb 19 23:49:48 2006 From: bob at redivi.com (Bob Ippolito) Date: Sun, 19 Feb 2006 14:49:48 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43F8BF35.5060503@v.loewis.de> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <20060218113358.GG23859@xs4all.nl> <43F7113E.8090300@egenix.com> <87hd6vs9p1.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8BF35.5060503@v.loewis.de> Message-ID: <6F604032-F59D-4AE8-8CD0-1CFC67E4A7E7@redivi.com> On Feb 19, 2006, at 10:55 AM, Martin v. L?wis wrote: > Stephen J. Turnbull wrote: >> BTW, what use cases do you have in mind for Unicode -> Unicode >> decoding? > > I think "rot13" falls into that category: it is a transformation > on text, not on bytes. The current implementation is a transformation on bytes, not text. Conceptually though, it's a text->text transform. > For other "odd" cases: "base64" goes Unicode->bytes in the *decode* > direction, not in the encode direction. Some may argue that base64 > is bytes, not text, but in many applications, you can combine base64 > (or uuencode) with abitrary other text in a single stream. Of course, > it could be required that you go u.encode("ascii").decode("base64"). I would say that base64 is bytes->bytes. Just because those bytes happen to be in a subset of ASCII, it's still a serialization meant for wire transmission. Sometimes it ends up in unicode (e.g. in XML), but that's the exception not the rule. 
-bob From mail at manuzhai.nl Sun Feb 19 23:32:23 2006 From: mail at manuzhai.nl (Manuzhai) Date: Sun, 19 Feb 2006 23:32:23 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8265A.2020209@v.loewis.de> References: <43F8265A.2020209@v.loewis.de> Message-ID: > No; nobody volunteered a machine yet (plus the hand-holding that > is always necessary with Windows). What exactly is needed for this? Does it need to be a machine dedicated to this stuff, or could I just run the tests once every day or so when I feel like it and have them submitted to buildbot? Regards, Manuzhai From walter at livinglogic.de Mon Feb 20 00:07:40 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Mon, 20 Feb 2006 00:07:40 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> <43F8DDE1.8010305@benjiyork.com> Message-ID: <43F8FA3C.30405@livinglogic.de> Neal Norwitz wrote: > On 2/19/06, Benji York wrote: >> Walter Dörwald wrote: >>> I'd like to see vertical lines between the columns. >> I've done a version like that (still at http://www.benjiyork.com/pybb). > > I liked your current version better so I installed it. How about this one: http://styx.livinglogic.de/~walter/python/BuildBot_%20Python.html Bye, Walter Dörwald From python at rcn.com Mon Feb 20 00:14:35 2006 From: python at rcn.com (Raymond Hettinger) Date: Sun, 19 Feb 2006 18:14:35 -0500 Subject: [Python-Dev] New Module: CommandLoop References: Message-ID: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> [Crutcher Dunnavant] > Anyway, I'm looking for feedback, feature requests before starting the > submission process. With respect to the API, the examples tend to be visually dominated by the series of decorators. The three decorators do nothing more than add a function attribute, so they are essentially doing the same type of action.
Consider combining those into a single decorator and using keywords for the various arguments. For example, change: @cmdloop.aliases('goodbye') @cmdloop.shorthelp('say goodbye') @cmdloop.usage('goodbye TARGET') to just: @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye', usage='goodbye TARGET') leaving the possibility of multiple decorators when one line gets too long: @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye') @cmdloop.addspec(usage='goodbye TARGET # where TARGET is a filename in the current directory') Another thought on the API is to consider adding another decorator option to make commands case-insensitive so that 'help', 'HELP', and 'Help' will all work: @cmdloop.addspec(case_sensitive=False) Also, in the absence of readline(), consider adding support for "!" style repeats of previous commands. The exception hierarchy looks to be well-designed. I'm not clear on whether it is internal or part of the API. If the latter, is there an example of how to trap and handle specific exceptions in specific contexts? If you're interested, here are a few code comments based on my first read-through: 1) The "chars" variable can be eliminated and the "while chars" and "c=chars.pop(0)" sequence simplified to just: for c in reversed(str): . . . 2) Can the reformatDocString() function be replaced by textwrap.dedent() or do they do something different?
3) In _mapCommands(), the sort can be simplified from:
self._cmds.sort(lambda a, b: cmp(a.aliases[0], b.aliases[0]))
to:
self._cmds.sort(key=operator.itemgetter(0))
or if you want to avoid module dependencies:
self._cmds.sort(key=lambda a: a[0])
4) In _preCommand, the sort simplifies from:
names = self.aliasdict.keys()
names.sort()
for name in names:
to:
for name in sorted(self.aliasdict):
Raymond From brett at python.org Mon Feb 20 00:23:18 2006 From: brett at python.org (Brett Cannon) Date: Sun, 19 Feb 2006 15:23:18 -0800 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8FA3C.30405@livinglogic.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> <43F8DDE1.8010305@benjiyork.com> <43F8FA3C.30405@livinglogic.de> Message-ID: On 2/19/06, Walter Dörwald wrote: > Neal Norwitz wrote: > > On 2/19/06, Benji York wrote: > >> Walter Dörwald wrote: > >>> I'd like to see vertical lines between the columns. > >> I've done a version like that (still at http://www.benjiyork.com/pybb). > > > > I liked your current version better so I installed it. > > How about this one: > http://styx.livinglogic.de/~walter/python/BuildBot_%20Python.html > I like it. It's really nice to be able to fit it all on a single screen (at least for me). Seems slightly crisper to me as well. -Brett From fdrake at acm.org Mon Feb 20 00:32:13 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 19 Feb 2006 18:32:13 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8DFA3.6080702@benjiyork.com> References: <43F8BA45.1030101@v.loewis.de> <43F8DFA3.6080702@benjiyork.com> Message-ID: <200602191832.14566.fdrake@acm.org> On Sunday 19 February 2006 16:14, Benji York wrote: > I replied to Walter Dörwald's suggestions and made a few changes, but > don't know which I like better. If you prefer the new one at > http://www.benjiyork.com/pybb you can use it as well.
I like the new one better; any chance we can switch to that on buildbot.zope.org as well? ;-) The improved use of horizontal space is good. -Fred -- Fred L. Drake, Jr. From fdrake at acm.org Mon Feb 20 00:34:49 2006 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 19 Feb 2006 18:34:49 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8FA3C.30405@livinglogic.de> References: <43F8FA3C.30405@livinglogic.de> Message-ID: <200602191834.49422.fdrake@acm.org> On Sunday 19 February 2006 18:07, Walter Dörwald wrote: > How about this one: > http://styx.livinglogic.de/~walter/python/BuildBot_%20Python.html Sigh. This is nice too. Now I'm not sure which I'd rather see on zope.org. ;-) -Fred -- Fred L. Drake, Jr. From raymond.hettinger at verizon.net Mon Feb 20 00:59:07 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 19 Feb 2006 18:59:07 -0500 Subject: [Python-Dev] New Module: CommandLoop References: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> Message-ID: <000801c635b0$78a78ac0$b83efea9@RaymondLaptop1> [Raymond Hettinger] > 1) The "chars" variable can be eliminated and the "while chars" and > "c=chars.pop(0)" sequence simplified to just: > for c in reversed(str): Actually, that should have been just: for c in str: . . . Raymond From crutcher at gmail.com Mon Feb 20 01:26:25 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 19 Feb 2006 16:26:25 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> References: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> Message-ID: Whoa, thanks. Incorporated the suggestions to the code. On 2/19/06, Raymond Hettinger wrote: > [Crutcher Dunnavant] > > Anyway, I'm looking for feedback, feature requests before starting the > > submission process. > > With respect to the API, the examples tend to be visually dominated by > the series of decorators.
The three decorators do nothing more than add a > function attribute, so they are essentially doing the same type of action. > Consider combining those into a single decorator and using keywords for the > various arguments. For example, change: > > @cmdloop.aliases('goodbye') > @cmdloop.shorthelp('say goodbye') > @cmdloop.usage('goodbye TARGET') > > to just: > > @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye', > usage='goodbye TARGET') > > leaving the possibility of multiple decorators when one line gets too long: > > @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye') > @cmdloop.addspec(usage='goodbye TARGET # where TARGET is a filename in > the current directory') Well, why not support both, and leave it up to the user? > Another thought on the API is to consider adding another decorator option to > make commands case-insensitive so that 'help', 'HELP', and 'Help' will all work: > @cmdloop.addspec(case_sensitive=False) shouldn't this be a property of the shell, and not the individual commands? Perhaps a CASE_SENSITIVE=False attribute on the shell? > Also, in the absence of readline(), consider adding support for "!" style > repeats of previous commands. How would this work? Would it be a simple replay of the previous command? Would it search a command history? How do we make it interact with the _preCommand code? I'm not sure how to make this work. > The exception hierarchy looks to be well-designed. I'm not clear on whether it > is internal or part of the API. If the latter, is there an example of how to > trap and handle specific exceptions in specific contexts? The exceptions are part of the API, but are only meant to be thrown by user code, and handled by the module code. There aren't any situations when user code needs to catch modules that I know of.
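To sketch that division of labor (the exception names here are invented for illustration, not cmdloop's actual ones): handlers raise, and only the loop machinery catches:

```python
class ShellError(Exception):
    """Base class for exceptions raised by command handlers."""

class HaltShell(ShellError):
    """Raised inside a handler to tell the loop to stop cleanly."""

def cmd_hello():
    return "hi"

def cmd_quit():
    raise HaltShell()

HANDLERS = {"hello": cmd_hello, "quit": cmd_quit}

def run(lines):
    """Dispatch each input line; only the loop catches the exceptions."""
    results = []
    for line in lines:
        try:
            results.append(HANDLERS[line]())
        except HaltShell:
            break
    return results

# run(["hello", "quit", "hello"]) stops at "quit"; user code never
# needs its own try/except around the handlers.
```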
> If you're interested, here are a few code comments based on my first > read-through: > > 1) The "chars" variable can be eliminated and the "while chars" and > "c=chars.pop(0)" sequence simplified to just: > for c in reversed(str): > . . . chars is something of a navel. The parser went through some evolution, and for a time, it _didn't_ consume a character every time around. However, the chars are not reversed, so given s/str/cmdline/g: for c in cmdline: ... > 2) Can the reformatDocString() function be replaced by textwrap.dedent() or do > they do something different? I guess so, they seem to do the same thing. > 3) In _mapCommands(), the sort can be simplified from: > self._cmds.sort(lambda a, b: cmp(a.aliases[0], b.aliases[0])) > to: > self._cmds.sort(key=operator.itemgetter(0)) > or if you want to avoid module dependencies: > self._cmds.sort(key=lambda a: a[0]) well, almost. we are sorting on the aliases, so self._cmds.sort(key=lambda a: a.aliases[0]) > 4) In _preCommand, the sort simplifies from: > names = self.aliasdict.keys() > names.sort() > for name in names: > to: > for name in sorted(self.aliasdict): > cool. > Raymond > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From crutcher at gmail.com Mon Feb 20 01:28:39 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 19 Feb 2006 16:28:39 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: References: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> Message-ID: s/catch modules/catch exceptions/g On 2/19/06, Crutcher Dunnavant wrote: > Whoa, thanks. Incorporated the suggestions to the code. > > On 2/19/06, Raymond Hettinger wrote: > > [Crutcher Dunnavant] > > > Anyway, I'm looking for feedback, feature requests before starting the > > > submission process. > > > > With respect to the API, the examples tend to be visually dominated by > > the series of decorators.
The three decorators do nothing more than add a > > function attribute, so they are essentially doing the same type of action. > > Consider combining those into a single decorator and using keywords for the > > various arguments. For example, change: > > > > @cmdloop.aliases('goodbye') > > @cmdloop.shorthelp('say goodbye') > > @cmdloop.usage('goodbye TARGET') > > > > to just: > > > > @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye', > > usage='goodbye TARGET') > > > > leaving the possibility of multiple decorators when one line gets too long: > > > > @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye') > > @cmdloop.addspec(usage='goodbye TARGET # where TARGET is a filename in > > the current directory') > > Well, why not support both, and leave it up to the user? > > > Another thought on the API is to consider adding another decorator option to > > make commands case-insensitive so that 'help', 'HELP', and 'Help' will all work: > > @cmdloop.addspec(case_sensitive=False) > > shouldn't this be a property of the shell, and not the individual commands? > Perhaps a CASE_SENSITIVE=False attribute on the shell? > > > Also, in the absence of readline(), consider adding support for "!" style > > repeats of previous commands. > > How would this work? Would it be a simple replay of the previous > command? Would it search a command history? How do we make it interact > with the _preCommand code? I'm not sure how to make this work. > > > The exception hierarchy looks to be well-designed. I'm not clear on whether it > > is internal or part of the API. If the latter, is there an example of how to > > trap and handle specific exceptions in specific contexts? > > The exceptions are part of the API, but are only meant to be thrown by > user code, and handled by the module code. There aren't any situations > when user code needs to catch modules that I know of.
> > > If you're interested, here are a few code comments based on my first > > > read-through: > > > > > > 1) The "chars" variable can be eliminated and the "while chars" and > > > "c=chars.pop(0)" sequence simplified to just: > > > for c in reversed(str): > > > . . . > > chars is something of a navel. The parser went through some evolution, > and for a time, it _didn't_ consume a character every time around. > However, the chars are not reversed, so given s/str/cmdline/g: > > for c in cmdline: > ... > > > 2) Can the reformatDocString() function be replaced by textwrap.dedent() or do > > they do something different? > > I guess so, they seem to do the same thing. > > > 3) In _mapCommands(), the sort can be simplified from: > > self._cmds.sort(lambda a, b: cmp(a.aliases[0], b.aliases[0])) > > to: > > self._cmds.sort(key=operator.itemgetter(0)) > > or if you want to avoid module dependencies: > > self._cmds.sort(key=lambda a: a[0]) > > well, almost. we are sorting on the aliases, so > self._cmds.sort(key=lambda a: a.aliases[0]) > > > 4) In _preCommand, the sort simplifies from: > > names = self.aliasdict.keys() > > names.sort() > > for name in names: > > to: > > for name in sorted(self.aliasdict): > > > > cool. > > > Raymond > > > > > -- > Crutcher Dunnavant > littlelanguages.com > monket.samedi-studios.com > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From jeff at taupro.com Mon Feb 20 01:34:38 2006 From: jeff at taupro.com (Jeff Rush) Date: Sun, 19 Feb 2006 18:34:38 -0600 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> Message-ID: <43F90E9E.8060603@taupro.com> Neal Norwitz wrote: > On 2/17/06, M.-A. Lemburg wrote: > >>Neal Norwitz wrote: > > >>Another candidate for removal is the --disable-unicode >>switch.
>> >>We should probably add a deprecation warning for that in >>Py 2.5 and then remove the hundreds of >>#ifdef Py_USING_UNICODE >>from the source code in time for Py 2.6. > > I've heard of a bunch of people using --disable-unicode. I'm not sure > if it's curiosity or if there are really production builds without > unicode. Ask this on c.l.p too. Such a switch quite likely is useful to those creating Python interpreters for small hand-held devices, where space is at a premium. I would hesitate to remove switches to drop features in general, for that reason. Although I have played with reducing the footprint of Python, I am not currently doing so. I could never get the footprint down sufficiently to make it usable, unfortunately. But I would like to see the Python developers maintain an awareness of memory consumption and not assume that Python is always run on modern fully-loaded desktops. We are seeing increasing use of Python in embedded systems these days. -Jeff From raymond.hettinger at verizon.net Mon Feb 20 02:03:15 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 19 Feb 2006 20:03:15 -0500 Subject: [Python-Dev] New Module: CommandLoop References: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> Message-ID: <003101c635b9$6e93a6f0$b83efea9@RaymondLaptop1> >> @cmdloop.aliases('goodbye') >> @cmdloop.shorthelp('say goodbye') >> @cmdloop.usage('goodbye TARGET') >> >> to just: >> >> @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye', >> usage='goodbye TARGET') >> >> leaving the possibility of multiple decorators when one line gets too long: >> >> @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye') >> @cmdloop.addspec(usage='goodbye TARGET # where TARGET is a filename >> in >> the current directory') > Well, why not support both, and leave it up to the user? Having only one method keeps the API simple. Also, the addspec() approach allows the user to choose between single and multiple lines.
BTW, addspec() could be made completely general by supporting all possible keywords at once:
def addspec(**kwds):
    def decorator(func):
        func.__dict__.update(kwds)
        return func
    return decorator
With an open definition like that, users can specify new attributes with less effort. Raymond From martin at v.loewis.de Mon Feb 20 02:10:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 02:10:53 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: <43F8265A.2020209@v.loewis.de> Message-ID: <43F9171D.1040208@v.loewis.de> Manuzhai wrote: >>No; nobody volunteered a machine yet (plus the hand-holding that >>is always necessary with Windows). > > > What exactly is needed for this? Does it need to be a machine dedicated > to this stuff, or could I just run the tests once every day or so when I > feel like it and have them submitted to buildbot? "The point" of buildbot (at least the way we use it) is to see immediately what check-in broke the tests on some platform. So yes, permanent availability would be desirable. However, buildbot runs in the background (at least on Unix), and gets triggered whenever a checkin occurs. So the machine doesn't have to be *dedicated*; any machine that is always on might do.
Regards, Martin From crutcher at gmail.com Mon Feb 20 02:12:44 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 19 Feb 2006 17:12:44 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: <003101c635b9$6e93a6f0$b83efea9@RaymondLaptop1> References: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> <003101c635b9$6e93a6f0$b83efea9@RaymondLaptop1> Message-ID: On 2/19/06, Raymond Hettinger wrote: > >> @cmdloop.aliases('goodbye') > >> @cmdloop.shorthelp('say goodbye') > >> @cmdloop.usage('goodbye TARGET') > >> > >> to just: > >> > >> @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye', > >> usage='goodbye TARGET') > >> > >> leaving the possibility of multiple decorators when one line gets to long: > >> > >> @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say goodbye') > >> @cmdloop.addspec(usage='goodbye TARGET # where TARGET is a filename > >> in > >> the current directory') > > > Well, why not support both, and leave it up to the user? > > Having only one method keeps the API simple. Also, the addspec() approach > allows the user to choose between single and multiple lines. > > BTW, addspec() could be made completely general by supporting all possible > keywords at once:
> def addspec(**kwds):
>     def decorator(func):
>         func.__dict__.update(kwds)
>         return func
>     return decorator
> With an open definition like that, users can specify new attributes with less > effort. Well, yes it could. But as it currently stands, there is no mechanism for user code to manipulate commands, so that would be of limited utility, versus throwing errors if the user specified something unsupported.
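For reference, Raymond's open-ended addspec() applied to the goodbye example from earlier in the thread (the attribute names are just the ones used in the example, nothing official):

```python
def addspec(**kwds):
    def decorator(func):
        func.__dict__.update(kwds)   # attach every keyword as an attribute
        return func
    return decorator

@addspec(aliases=['goodbye'], shorthelp='say goodbye')
@addspec(usage='goodbye TARGET')
def goodbye(target):
    return 'goodbye, %s' % target

# goodbye.aliases, goodbye.shorthelp, and goodbye.usage are all set now,
# whether the keywords arrived through one decorator line or several.
```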
> > Raymond > > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From martin at v.loewis.de Mon Feb 20 02:14:09 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 02:14:09 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8A585.7090906@benjiyork.com> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> Message-ID: <43F917E1.1080006@v.loewis.de> Benji York wrote: > See http://www.benjiyork.com/pybb. > > It doesn't look quite as good in IE because of the limited HTML the > buildbot waterfall display generates and the limitations of IE's CSS > support. Thanks again for the contribution! > The best I could do without hacking buildbot was to highlight the trunk > "builder" links. This only works in Firefox, also because of IE's > limited CSS2 support. > > More could be done if the HTML generation was modified, but that didn't > seem prudent. I looked at it, and it would require quite a lot of changes to the buildbot, so I abstain from wanting such a thing (at least for the moment). Your regex-matching (or whatever the mechanism is) works quite well for me.
Regards, Martin From bob at redivi.com Mon Feb 20 02:24:04 2006 From: bob at redivi.com (Bob Ippolito) Date: Sun, 19 Feb 2006 17:24:04 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: <003101c635b9$6e93a6f0$b83efea9@RaymondLaptop1> References: <001301c635aa$40288e70$b83efea9@RaymondLaptop1> <003101c635b9$6e93a6f0$b83efea9@RaymondLaptop1> Message-ID: <0AB4CB4E-E404-4D28-9310-E08B337E17CD@redivi.com> On Feb 19, 2006, at 5:03 PM, Raymond Hettinger wrote: >>> @cmdloop.aliases('goodbye') >>> @cmdloop.shorthelp('say goodbye') >>> @cmdloop.usage('goodbye TARGET') >>> >>> to just: >>> >>> @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say >>> goodbye', >>> usage='goodbye TARGET') >>> >>> leaving the possibility of multiple decorators when one line gets >>> too long: >>> >>> @cmdloop.addspec(aliases=['goodbye'], shorthelp ='say >>> goodbye') >>> @cmdloop.addspec(usage='goodbye TARGET # where TARGET is >>> a filename >>> in >>> the current directory') > >> Well, why not support both, and leave it up to the user? > > Having only one method keeps the API simple. Also, the addspec() > approach > allows the user to choose between single and multiple lines. > > BTW, addspec() could be made completely general by supporting all > possible > keywords at once:
> def addspec(**kwds):
>     def decorator(func):
>         func.__dict__.update(kwds)
>         return func
>     return decorator
> With an open definition like that, users can specify new attributes > with less > effort. Doesn't this discussion belong on c.l.p / python-list? -bob From tjreedy at udel.edu Mon Feb 20 02:39:13 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 19 Feb 2006 20:39:13 -0500 Subject: [Python-Dev] New Module: CommandLoop References: Message-ID: I know it is tempting and perhaps ok in your own private code, but casually masking builtins like 'str' in public library code sets a bad example ;-).
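A tiny illustration of the hazard (the function here is invented just for the example):

```python
def shout(str):              # parameter shadows the builtin str type
    # Inside this function, str is the argument; calling str(...) here
    # would invoke the argument, so the builtin is unreachable.
    return str.upper()

def shout_fixed(text):       # no shadowing: the builtin stays available
    return str(text).upper()
```

The first version works until someone later needs the real str() inside the function body; the second never paints itself into that corner.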
tjr From tjreedy at udel.edu Mon Feb 20 02:27:10 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 19 Feb 2006 20:27:10 -0500 Subject: [Python-Dev] buildbot is all green References: <43F8265A.2020209@v.loewis.de> <43F9171D.1040208@v.loewis.de> Message-ID: >>>is always necessary with Windows). With a couple more machines added, should there be two separate pages for trunk and 2.4 builds? Or do most checkins affect both? From mhammond at skippinet.com.au Mon Feb 20 03:07:06 2006 From: mhammond at skippinet.com.au (Mark Hammond) Date: Mon, 20 Feb 2006 13:07:06 +1100 Subject: [Python-Dev] javascript "standing on Python's shoulders" as it moves forward. Message-ID: Sorry for the slightly off-topic post, but I thought it of interest that Brendan Eich (the "father" of javascript) has blogged about the future of js, and specifically how he will "borrow from Python for iteration, generators, and comprehensions" and more generally why he is "standing on Python's shoulders" when appropriate. http://weblogs.mozillazine.org/roadmap/archives/2006/02/ The fact my name appears there is a happy coincidence related to the fact I am working with Mozilla on making their DOM "language agnostic" and supporting Python - but the general reasons why Python is seen as important are interesting... Mark From crutcher at gmail.com Mon Feb 20 03:21:39 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Sun, 19 Feb 2006 18:21:39 -0800 Subject: [Python-Dev] New Module: CommandLoop In-Reply-To: References: Message-ID: totally agree, removed them. On 2/19/06, Terry Reedy wrote: > I know it is tempting and perhaps ok in your own private code, but casually > masking builtins like 'str' in public library code sets a bad example ;-).
> > tjr > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/crutcher%40gmail.com > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From fredrik at pythonware.com Mon Feb 20 03:22:54 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 20 Feb 2006 03:22:54 +0100 Subject: [Python-Dev] New Module: CommandLoop References: <001301c635aa$40288e70$b83efea9@RaymondLaptop1><003101c635b9$6e93a6f0$b83efea9@RaymondLaptop1> <0AB4CB4E-E404-4D28-9310-E08B337E17CD@redivi.com> Message-ID: Bob Ippolito wrote: > Doesn't this discussion belong on c.l.p / python-list? yes, please. From jcarlson at uci.edu Mon Feb 20 04:52:43 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 19 Feb 2006 19:52:43 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <20060218225135.5FCA.JCARLSON@uci.edu> Message-ID: <20060219194015.5FCD.JCARLSON@uci.edu> "Michael Urman" wrote: > > On 2/19/06, Josiah Carlson wrote: > > My post probably hasn't convinced you, but much of the confusion, I > > believe, is based on Martin's original belief that 'k in dd' should > > always return true if there is a default. One can argue that way, but > > then you end up on the circular train of thought that gets you to "you > > can't do anything useful if that is the case, .popitem() doesn't work, > > len() is undefined, ...". Keep it simple, keep it sane. > > A default factory implementation fundamentally modifies the behavior > of the mapping. There is no single answer to the question "what is the > right behavior for contains, len, popitem" as that depends on what the > code that consumes the mapping is written like, what it is attempting > to do, and what you are attempting to override it to do. Or, simply, > on why you are providing a default value. 
Resisting the temptation to > guess the why and just leaving the methods as is seems the best > choice; overriding __contains__ to return true is much easier than > reversing that behavior would be. I agree, there is nothing perfect. But at least in all of my use-cases, and the majority of the ones I've seen 'in the wild', my previous post provided an implementation that worked precisely as desired, and precisely like a regular dictionary, except when accessing a non-existent key via: value = dd[key] . __contains__, etc., all work exactly like they do with a non-defaulting dictionary. Iteration via popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the way you would expect them to. The only nit is that code which iterates like:
for key in keys:
    try:
        value = dd[key]
    except KeyError:
        continue
(where 'keys' has nothing to do with dd.keys(), it is merely a listing of keys which are desired at this particular point) However, the following works like it always did:
for key in keys:
    if key not in dd:
        continue
    value = dd[key]
> An example when it could theoretically be used, if not particularly > useful. The gettext.install() function was just updated to take a > names parameter which controls which gettext accessor functions it > adds to the builtin namespace. Its implementation looks for "method in > names" to decide. Passing a default-true dict would allow the future > behavior to be bind all checked names, but only if __contains__ > returns True. > > Even though it would make a poor base implementation, and these > effects aren't a good candidate for it, the code style that could > best leverage such a __contains__ exists. Indeed, there are cases where an always-true __contains__ exists, and the pure-Python implementation I previously posted can be easily modified to offer such a feature. However, because there are also use cases for the not-always-true __contains__, picking either as the "one true way" seems a bit unnecessary.
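A rough sketch of the kind of shared implementation being discussed (pure Python; the class and flag names here are invented, and real implementations differ):

```python
class DefaultDict(dict):
    """dict subclass: missing keys are filled in by default_factory().

    contains_all=True selects the always-true __contains__ variant;
    everything else is shared between the two flavors.
    """

    def __init__(self, default_factory, contains_all=False):
        dict.__init__(self)
        self.default_factory = default_factory
        self.contains_all = contains_all

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            value = self.default_factory()
            self[key] = value        # insert, so later iteration sees it
            return value

    def __contains__(self, key):
        if self.contains_all:
            return True
        return dict.__contains__(self, key)

dd = DefaultDict(list)
dd['missing'].append(1)      # no KeyError: the entry is created on access
```

With the default flag, len(), popitem(), and "key in dd" all behave exactly as on a plain dict; only dd[key] on a missing key differs.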
Presumably, if one goes into the collections module, the other will too. Actually, they could share all of their code except for a simple flag which determines the always-true __contains__. With minor work, that 'flag', or really the single bit it would require, may even be embeddable into the type object. Arguably, there should be a handful of these defaulting dictionary-like objects, and for each variant, it should be documented what their use-cases are, and any gotchas that will inevitably come up. - Josiah From jcarlson at uci.edu Mon Feb 20 05:28:41 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 19 Feb 2006 20:28:41 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <878xs7s4ig.fsf@tleepslib.sk.tsukuba.ac.jp> References: <20060218005534.5FA8.JCARLSON@uci.edu> <878xs7s4ig.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <20060219195258.5FD0.JCARLSON@uci.edu> "Stephen J. Turnbull" wrote: > > >>>>> "Josiah" == Josiah Carlson writes: > > Josiah> The question remains: is str.decode() returning a string > Josiah> or unicode depending on the argument passed, when the > Josiah> argument quite literally names the codec involved, > Josiah> difficult to understand? I don't believe so; am I the > Josiah> only one? > > Do you do any of the user education *about codec use* that you > recommend? The people I try to teach about coding invariably find it > difficult to understand. The problem is that the near-universal > intuition is that for "human-usable text", pretty much anything *but > Unicode* will do. This is a really hard block to get them past. > There is very good reason why Unicode is plain text ("original" in > MAL's terms) and everything else is encoded ("derived"), but students > new to the concept often take a while to "get" it. I've not been teaching Python; when I was still a TA, it was strictly algorithms and data structures.
Of those people who I have had the opportunity to entice into Python, I've not followed up on their progress to know if they had any issues. I try to internalize it by not thinking of strings as encoded data, but as binary data, and unicode as text. I then remind myself that unicode isn't native on-disk or cross-network (which stores and transports bytes, not characters), so one needs to encode it as binary data. It's a subtle difference, but it has worked so far for me. In my experience, at least for only-English speaking users, most people don't even get to unicode. I didn't even touch it until I had been well versed with the encoding and decoding of all different kinds of binary data, when a half-dozen international users (China, Japan, Russia, ...) requested its support in my source editor; so I added it. Supporting it properly hasn't been very difficult, and the only real nit I have experienced is supporting the encoding line just after the #! line for arbitrary codecs (sometimes saving a file in a particular encoding dies). I notice that you seem to be in Japan, so teaching unicode is a must. If you are using the "unicode is text" and "strings are data" approach and they aren't getting it, then I don't know. > Maybe it's just me, but whether it's the teacher or the students, I am > *not* excited about the education route. Martin's simple rule *is* > simple, and the exceptions for using a "nonexistent" method mean I > don't have to reinforce---the students will be able to teach each > other. The exceptions also directly help reinforce the notion that > text == Unicode. Are you sure that they would help? If .encode() and .decode() are dropped from strings and unicode (respectively), users get an AttributeError. That's almost useless. Raising a better exception (with more information) would be better in that case, but losing the functionality that either would offer seems unnecessary; which is why I had suggested some of the other method names.
Perhaps a "This method was removed because it confused users. Use help(str.encode) (or unicode.decode) to find out how you can do the equivalent, or do what you *really* wanted to do." > I grant the point that .decode('base64') is useful, but I also believe > that "education" is a lot more easily said than done in this case. What I meant by "education" is 'better documentation' and 'better exception messages'. I didn't learn Python by sitting in a class; I learned it by going through the tutorial over a weekend as a 2nd year undergrad and writing software which could do what I wanted/needed. Compared to the compiler messages I'd been seeing from Codewarrior and MSVC 6, Python exceptions were like an oracle. I can understand how first-time programmers can have issues with *some* Python exception messages, which is why I think that we could use better ones. There is also the other issue that sometimes people fail to actually read the messages. Again, I don't believe that an AttributeError is any better than an "ordinal not in range(128)", but "You are trying to encode/decode to/from incompatible types. expected: a->b got: x->y" is better. Some of those can be done *very soon*, given the capabilities of the encodings module, and they could likely be easily migrated, regardless of the decisions with .encode()/.decode() . - Josiah From guido at python.org Mon Feb 20 06:16:37 2006 From: guido at python.org (Guido van Rossum) Date: Sun, 19 Feb 2006 21:16:37 -0800 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: <43F90E9E.8060603@taupro.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> Message-ID: On 2/19/06, Jeff Rush wrote: [Quoting Neal Norwitz] > > I've heard of a bunch of people using --disable-unicode. I'm not sure > > if it's curiosity or if there are really production builds without > > unicode. Ask this on c.l.p too. 
> > Such a switch quite likely is useful to those creating Python interpreters > for small hand-held devices, where space is at a premium. I would hesitate > to remove switches to drop features in general, for that reason. > > Although I have played with reducing the footprint of Python, I am not > currently doing so. I could never get the footprint down sufficiently to > make it usable, unfortunately. But I would like to see the Python > developers maintain an awareness of memory consumption and not assume that > Python is always run on modern fully-loaded desktops. We are seeing > increasing use of Python in embedded systems these days. Do you know of any embedded platform that doesn't have unicode support as a requirement? Python runs fine on Nokia phones running Symbian, where *everything* is a Unicode string. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 20 06:22:02 2006 From: guido at python.org (Guido van Rossum) Date: Sun, 19 Feb 2006 21:22:02 -0800 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: <43F8265A.2020209@v.loewis.de> <43F9171D.1040208@v.loewis.de> Message-ID: On 2/19/06, Terry Reedy wrote: > With a couple of more machines added, should there be two separate pages > for trunk and 2.4 builds? Or do most checkins affect both? They don't; I think a separate page would be a fine idea. FWIW, it looks like all the sample templates are still wasting a lot of horizontal space in the first two columns; the second is almost always empty. Perhaps the author of the change could be placed *below* the timestamp instead of next to it? Also for all practical purposes we can probably get rid of the seconds in the timestamp.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at holdenweb.com Mon Feb 20 07:35:59 2006 From: steve at holdenweb.com (Steve Holden) Date: Mon, 20 Feb 2006 01:35:59 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F8FA3C.30405@livinglogic.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> <43F8DDE1.8010305@benjiyork.com> <43F8FA3C.30405@livinglogic.de> Message-ID: <43F9634F.3020000@holdenweb.com> Walter Dörwald wrote: > Neal Norwitz wrote: > >>On 2/19/06, Benji York wrote: >> >>>Walter Dörwald wrote: >>> >>>>I'd like to see vertical lines between the columns. >>> >>>I've done a version like that (still at http://www.benjiyork.com/pybb). >> >>I liked your current version better so I installed it. > > > How about this one: > http://styx.livinglogic.de/~walter/python/BuildBot_%20Python.html > All formats would be improved if the headers could be made to float at the top of the page as scrolling took place. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From martin at v.loewis.de Mon Feb 20 08:09:44 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 08:09:44 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: <43F8265A.2020209@v.loewis.de> <43F9171D.1040208@v.loewis.de> Message-ID: <43F96B38.4040004@v.loewis.de> Terry Reedy wrote: >>>>is always necessary with Windows). > > > With a couple of more machines added, should there be two separate pages > for trunk and 2.4 builds? Or do most checkins affect both? I'd like to avoid this, assuming that people only look at the "main" page. An individual checkin affects either the trunk or 2.4, but never both; many check-ins come in pairs.
Regards, Martin From jeff at taupro.com Mon Feb 20 10:36:09 2006 From: jeff at taupro.com (Jeff Rush) Date: Mon, 20 Feb 2006 03:36:09 -0600 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> Message-ID: <43F98D89.2060102@taupro.com> Guido van Rossum wrote: > On 2/19/06, Jeff Rush wrote: > [Quoting Neal Norwitz] > >>>I've heard of a bunch of people using --disable-unicode. I'm not sure >>>if it's curiosity or if there are really production builds without >>>unicode. Ask this on c.l.p too. >> > Do you know of any embedded platform that doesn't have unicode support > as a requirement? Python runs fine on Nokia phones running Symbian, > where *everything* is a Unicode string. 1. PalmOS, at least the last time I was involved with it. Python on a Palm is a very tight fit. 2. "GM862 Cellular Module with Python Interpreter" http://tinyurl.com/jgxz These may be diminishing markets as memory capacity increases and I wouldn't argue adding compile flags for such at this late date, but if the flags are already there, perhaps the slight inconvenience to Python-internal developers is worth it. Hey, perhaps dropping out Unicode support is not a big win - I just know it is useful at times to have a collection of flags to drop out floating point, complex arithmetic, language parsing and such for memory-constrained cases.
-Jeff From bokr at oz.net Mon Feb 20 12:52:22 2006 From: bokr at oz.net (Bengt Richter) Date: Mon, 20 Feb 2006 11:52:22 GMT Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]] References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <20060218051344.GC28761@panix.com> <43F6E1FA.7080909@v.loewis.de> Message-ID: <43f9424b.1261003927@news.gmane.org> On Sat, 18 Feb 2006 09:59:38 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: >Aahz wrote: >> The problem is that they don't understand that "Martin v. Löwis" is not >> Unicode -- once all strings are Unicode, this is guaranteed to work. Well, after all the "string" literal escapes that were being used to define byte values are all rewritten, yes, I'll believe the guarantee ;-) (BTW, are there plans for migration tools?) Ok, now back to the s/bytes/octet/ topic: > >This specific call, yes. I don't think the problem will go away as long >as both encode and decode are available for both strings and byte >arrays. > >> While it's not absolutely true, my experience of watching Unicode >> confusion is that the simplest approach for newbies is: encode FROM >> Unicode, decode TO Unicode. > >I think this is what should be in-grained into the library, also. It >shouldn't try to give additional meaning to these terms. > Thinking about bytes recently, it occurs to me that bytes are really not intrinsically numeric in nature. They don't necessarily represent uint8's. E.g., a binary file is really a sequence of bit octets in its most primitive and abstract sense. So I'm wondering if we shouldn't have an octet type analogous to unicode, and instances of octet would be vectors of octets as abstract 8-bit bit vectors, like instances of unicode are vectors of abstract characters. If you wanted integers you could map ord for integers guaranteed to be in range(256).
The constructor would naturally take any suitable integer sequence so octet([65,66,67]) would work. In general, all encode methods would produce an octet instance, e.g. unicode.encode. octet.decode(octet_instance, 'src_encoding') or octet_instance.decode('src_encoding') would do all the familiar character code sequence decoding, e.g., octet.decode(oseq, 'utf-8') or oseq.decode('utf-8') to make a unicode instance. Going from unicode, unicode.encode(uinst, 'utf-8') or uinst.encode('utf-8') would produce an octet instance. I think this is conceptually purer than the current bytes idea, since the result really has no arithmetic significance. Also, ord would work on a length-one octet instance, and produce the unsigned integer value you'd expect, but would fail if not length-one, like ord on unicode (or current str). Thus octet would replace bytes as the binary info container, and would not have any presumed arithmetic significance, either as integer or as character-of-current-source-encoding-inferred-from-integer-value-as-ord. To get a text representation of octets, hex is natural, e.g., octet('6162 6380') # spaces ignored so repr(octet('a deaf bee')) => "octet('adeafbee')" and octet('616263').decode('ascii') => u'abc' and back: u'abc'.encode('ascii') => octet('616263'). The base64 codec looks conceptually cleaner too, so long as you keep in mind base64 as a character subset of unicode and the name of the transformation function pair. octet('616263').decode('base64') => u'YWJj\n' # octets -> characters u'YWJj\n'.encode('base64') => octet('616263') # characters -> octets If you wanted integer-nature bytes, you could have octet codecs for uint8 and int8, e.g., octseq.decode('int8') could produce a list of signed integers all in range(-128,128). Or maybe map(dec_int8, octseq). The array module could easily be a target for octet.decode, e.g., octseq.decode('array_B') or octet.decode(octseq, 'array_B'), and octet(array_instance) the other way.
Likewise, other types could be destination for octet.decode. E.g., if you had an abstraction for a display image one could have 'gif' and 'png' and 'bmp' etc be like 'cp437', 'latin-1', and 'utf-8' etc are for decoding octets to unicode, and write stuff like o_seq = open('pic.gif','rb') # makes octet instance img = o_seq.decode('gif89') # => img is abstract, internally represented suitably but hidden, like unicode. open('pic.png', 'wb').write(img.encode('png')) UIAM PIL has this functionality, if not as encode/decode methods. Similarly, there could be an abstract archive container, and you could have arch = open('tree.tgz','rb').decode('tgz') # => might do lazy things waiting for encode egg_octets = arch.encode('python_egg') # convert to egg format?? (just hand-waving ;-) Probably all it would take is to wrap some things in abstract-container (AC) types, to enforce the protocol. Image(octet_seq, 'gif') might produce an AC that only saved a (octet_seq, 'gif') internally, or it might do eager conversion per optional additional args. Certainly .bmp without rle can be hugely wasteful. For flexibility like eager vs not, or perhaps returning an iterator instead of a byte sequence, I guess the encode/decode signatures should be (enc, *args, **kw) and pass those things on to the worker functions? An abstract container could have a "pack" codec to do serial composition/decomposition.
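[Editorial aside: the hex and base64 round trips Bengt sketches above map closely onto the bytes/str split that later Python versions adopted. The sketch below restates those round trips with the modern bytes type standing in for the proposed octet; the octet type itself, and names like oseq, remain hypothetical.]

```python
import binascii

# bytes.fromhex() ignores spaces, matching the proposed
# octet('6162 6380') notation for constructing octets from hex text.
seq = bytes.fromhex("61 62 63")
assert seq == b"abc"

# octets -> characters: decode with a character codec
text = seq.decode("ascii")
assert text == "abc"

# characters -> octets: encode back to binary
assert text.encode("ascii") == seq

# indexing yields the unsigned integer value of a single octet,
# like ord() on a length-one octet instance in the proposal
assert seq[0] == 0x61

# the base64 pair likewise maps octets to base64 *text* and back
b64_text = binascii.b2a_base64(seq).decode("ascii")
assert b64_text == "YWJj\n"
assert binascii.a2b_base64(b64_text) == seq
```

Note that the "encode FROM Unicode, decode TO Unicode" rule quoted earlier in the thread holds throughout: every decode produces text, every encode produces octets.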
I'm sure Mal has all this stuff one way or another, but I wanted the conceptual purity of AC instances ac in ac = octet_seq.decode('src_enc'); octet_seq = ac.encode('dst_enc') ;-) Bottom line thought: binary octets aren't numeric ;-) Regards, Bengt Richter From p.f.moore at gmail.com Mon Feb 20 13:01:02 2006 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 20 Feb 2006 12:01:02 +0000 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <43F64A7A.3060400@v.loewis.de> Message-ID: <79990c6b0602200401u143f3d7at59e9323465db8300@mail.gmail.com> On 2/19/06, Steve Holden wrote: > > You are missing the rationale of the PEP process. The point is > > *not* documentation. The point of the PEP process is to channel > > and collect discussion, so that the BDFL can make a decision. > > The BDFL is not bound at all to the PEP process. > > > > To document things, we use (or should use) documentation. > > > > > One could wish this ideal had been the case for the import extensions > defined in PEP 302. (A bit off-topic, but that hit home, so I'll reply...) Agreed, and it's my fault they weren't, to some extent. I did try to find a suitable place, but the import docs are generally fairly scattered, and there wasn't a particularly good place to put the changes. Any suggestions would be gratefully accepted... Paul. From mal at egenix.com Mon Feb 20 14:23:22 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 20 Feb 2006 14:23:22 +0100 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: <43F98D89.2060102@taupro.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> <43F98D89.2060102@taupro.com> Message-ID: <43F9C2CA.4010808@egenix.com> Jeff Rush wrote: > Guido van Rossum wrote: >> On 2/19/06, Jeff Rush wrote: >> [Quoting Neal Norwitz] >> >>>> I've heard of a bunch of people using --disable-unicode. 
I'm not sure >>>> if it's curiosity or if there are really production builds without >>>> unicode. Ask this on c.l.p too. >>> >> Do you know of any embedded platform that doesn't have unicode support >> as a requirement? Python runs fine on Nokia phones running Symbian, >> where *everything* is a Unicode string. > > 1. PalmOS, at least the last time I was involved with it. Python on a > Palm is a very tight fit. > > > 2. "GM862 Cellular Module with Python Interpreter" > http://tinyurl.com/jgxz > > These may be diminishing markets as memory capacity increases and I > wouldn't argue adding compile flags for such at this late date, but if > the flags are already there, perhaps the slight inconvenience to > Python-internal developers is worth it. > > Hey, perhaps dropping out Unicode support is not a big win - I just know > it is useful at times to have a collection of flags to drop out floating > point, complex arithmetic, language parsing and such for > memory-constrained cases. These switches make the code less maintainable. I'm not even talking about the testing overhead. I'd say that the parties interested in non-Unicode versions of Python should maintain these branches of Python. Ditto for other stripped down versions. Note that this does not mean that we should forget about memory consumption issues. It's just that if there's only marginal interest in certain special builds of Python, I don't see the requirement for the Python core developers to maintain them. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 20 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From guido at python.org Mon Feb 20 14:41:43 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 05:41:43 -0800 Subject: [Python-Dev] defaultdict proposal round three Message-ID: I'm withdrawing the last proposal. I'm not convinced by the argument that __contains__ should always return True (perhaps it should also insert the value?), nor by the complaint that a holy invariant would be violated (so what?). But the amount of discussion and the number of different viewpoints present makes it clear that the feature as I last proposed would be forever divisive. I see two alternatives. These will cause a different kind of philosophical discussion; so be it. I'll describe them relative to the last proposal; for those who wisely skipped the last thread, here's a link to the proposal: http://mail.python.org/pipermail/python-dev/2006-February/061261.html. Alternative A: add a new method to the dict type with the semantics of __getattr__ from the last proposal, using default_factory if not None (except on_missing is inlined). This avoids the discussion about broken invariants, but one could argue that it adds to an already overly broad API. Alternative B: provide a dict subclass that implements the __getattr__ semantics from the last proposal. It could be an unrelated type for all I care, but I do care about implementation inheritance since it should perform just as well as an unmodified dict object, and that's hard to do without sharing implementation (copying would be worse). Parting shots: - Even if the default_factory were passed to the constructor, it still ought to be a writable attribute so it can be introspected and modified. A defaultdict that can't change its default factory after its creation is less useful. - It would be unwise to have a default value that would be called if it was callable: what if I wanted the default to be a class instance that happens to have a __call__ method for unrelated reasons? 
Callability is an elusive property; APIs should not attempt to dynamically decide whether an argument is callable or not. - A third alternative would be to have a new method that takes an explicit default factory argument. This differs from setdefault() only in the type of the second argument. I'm not keen on this; the original use case came from an example where the readability of d.setdefault(key, []).append(value) was questioned, and I'm not sure that d.something(key, list).append(value) is any more readable. IOW I like (and I believe few have questioned) associating the default factory with the dict object instead of with the call site. Let the third round of the games begin! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From benji at benjiyork.com Mon Feb 20 15:10:31 2006 From: benji at benjiyork.com (Benji York) Date: Mon, 20 Feb 2006 09:10:31 -0500 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: <43F8265A.2020209@v.loewis.de> <43F9171D.1040208@v.loewis.de> Message-ID: <43F9CDD7.3050406@benjiyork.com> Guido van Rossum wrote: > FWIW, it looks like all the sample templates are still wasting a lot > of horizontal space in the first two columns; the second is almost > always empty. Perhaps the author of the change could be placed *below* > the timestamp instead of next to it? Also for all practical purposes > we can probably get rid of the seconds in the timestamp. So far the cosmetic changes have been done purely in CSS, implementing the above would (AFAICT) require modifying the buildbot waterfall display HTML generation. Something that's been shied away from thus far. -- Benji York From jonathan.barbero at gmail.com Mon Feb 20 16:16:51 2006 From: jonathan.barbero at gmail.com (Jonathan Barbero) Date: Mon, 20 Feb 2006 12:16:51 -0300 Subject: [Python-Dev] (-1)**(1/2)==1? Message-ID: Hello! My name is Jonathan, i'm new with Python.
I try this in the command line: >>> (-1)**(1/2) 1 This is wrong, i think it must throw an exception. What do you think? Bye. Jonathan. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060220/3902d76d/attachment-0001.htm From aahz at pythoncraft.com Mon Feb 20 16:18:08 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 20 Feb 2006 07:18:08 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <20060219194015.5FCD.JCARLSON@uci.edu> References: <20060218225135.5FCA.JCARLSON@uci.edu> <20060219194015.5FCD.JCARLSON@uci.edu> Message-ID: <20060220151808.GA503@panix.com> On Sun, Feb 19, 2006, Josiah Carlson wrote: > > I agree, there is nothing perfect. But at least in all of my use-cases, > and the majority of the ones I've seen 'in the wild', my previous post > provided an implementation that worked precisely like desired, and > precisely like a regular dictionary, except when accessing a > non-existant key via: value = dd[key] . __contains__, etc., all work > exactly like they do with a non-defaulting dictionary. Iteration via > popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the > way you would expect them. This is the telling point, IMO. My company makes heavy use of a "default dict" (actually, it's a "default class" because using constants as the lookup keys is mostly what we do and the convenience of foo.bar is compelling over foo['bar']). Anyway, our semantics are as Josiah outlines, and I can't see much use case for the alternatives. Those of you arguing something different: do you have a real use case (that you've implemented in real code)? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." 
--Alan Perlis From John.Marshall at ec.gc.ca Mon Feb 20 16:19:17 2006 From: John.Marshall at ec.gc.ca (John Marshall) Date: Mon, 20 Feb 2006 15:19:17 +0000 Subject: [Python-Dev] Does eval() leak? In-Reply-To: <43F57780.5050300@v.loewis.de> References: <43F4A88A.7050100@ec.gc.ca> <43F57780.5050300@v.loewis.de> Message-ID: <43F9DDF5.4060405@ec.gc.ca> Martin v. Löwis wrote: > John Marshall wrote: > >>Should I expect the virtual memory allocation >>to go up if I do the following? > > > python-dev is a list for discussing development of Python, > not the development with Python. Please post this question > to python-list at python.org. > > For python-dev, a message explaining where the memory leak > is and how to correct it would be more appropriate. Most > likely, there is no memory leak in eval. My question was not a "development with Python" question. However, I posted to python-list as you said. Only one person responded to a request to test the provided code (~10 lines) which demonstrates a problem with eval()--he confirmed my observations. As the problem _does exist_ for 2.3.5 which is the last 2.3 version still available at python.org, I would suggest people avoid using it if they do eval()s. Unfortunately I, myself, cannot check into it more. John From g.brandl at gmx.net Mon Feb 20 16:19:45 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 20 Feb 2006 16:19:45 +0100 Subject: [Python-Dev] (-1)**(1/2)==1? In-Reply-To: References: Message-ID: Jonathan Barbero wrote: > Hello! > My name is Jonathan, i'm new with Python. > > I try this in the command line: > > >>> (-1)**(1/2) > 1 > > This is wrong, i think it must throw an exception. > What do you think?

>>> 1/2
0
>>> (-1)**0
1

It's fine.
If you want to get a floating point result from dividing, make one of the two numbers a float: >>> 1.0/2 0.5 >>> Georg From aahz at pythoncraft.com Mon Feb 20 16:25:37 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 20 Feb 2006 07:25:37 -0800 Subject: [Python-Dev] (-1)**(1/2)==1? In-Reply-To: References: Message-ID: <20060220152537.GD503@panix.com> Georg, Please do not respond to off-topic posts on python-dev without redirecting them to comp.lang.python (or other suitable place). Thanks! On Mon, Feb 20, 2006, Georg Brandl wrote: > > Jonathan Barbero wrote: >> Hello! >> My name is Jonathan, i?m new with Python. >> >> I try this in the command line: >> >> >>> (-1)**(1/2) >> 1 >> >> This is wrong, i think it must throw an exception. >> What do you think? > >>>> 1/2 > 0 >>>> (-1)**0 > 1 > > It's fine. > > If you want to get a floating point result from dividing, > make one of the two numbers a float: > >>>> 1.0/2 > 0.5 >>>> -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From facundobatista at gmail.com Mon Feb 20 16:50:35 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Mon, 20 Feb 2006 12:50:35 -0300 Subject: [Python-Dev] (-1)**(1/2)==1? In-Reply-To: References: Message-ID: 2006/2/20, Jonathan Barbero : > Hello! > My name is Jonathan, i?m new with Python. Hello Jonathan. This list is only for developing Python itself, not for developing in Python. You should address this kind of question in comp.lang.python (available as a newsgroup and a mailing list), see here for instructions: http://www.python.org/community/lists.html > I try this in the command line: > > >>> (-1)**(1/2) > 1 > > This is wrong, i think it must throw an exception. > What do you think? It's OK, because (1/2) is zero, not 0.5. >>> 1/2 0 Regards, . 
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From aleaxit at gmail.com Mon Feb 20 17:05:11 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 20 Feb 2006 08:05:11 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: Message-ID: <5862C4B0-124C-419A-A333-F0D7E01EBAA0@gmail.com> On Feb 20, 2006, at 5:41 AM, Guido van Rossum wrote: ... > Alternative A: add a new method to the dict type with the semantics of > __getattr__ from the last proposal, using default_factory if not None > (except on_missing is inlined). This avoids the discussion about > broken invariants, but one could argue that it adds to an already > overly broad API. > > Alternative B: provide a dict subclass that implements the __getattr__ > semantics from the last proposal. It could be an unrelated type for > all I care, but I do care about implementation inheritance since it > should perform just as well as an unmodified dict object, and that's > hard to do without sharing implementation (copying would be worse). "Let's do both!"...;-). Add a method X to dict as per A _and_ provide in collections a subclass of dict that sets __getattr__ to X and also takes the value of default_dict as the first mandatory argument to __init__. Yes, mapping is a "fat interface", chock full of convenience methods, but that's what makes it OK to add another, when it's really convenient; and nearly nobody's been arguing against defaultdict, only about the details of its architecture, so the convenience of this X can be taken as established. As long as DictMixin changes accordingly, the downsides are small. Also having a collections.defaultdict as well as method X would be my preference, for even more convenience. 
From my POV, either or both of these additions would be an improvement wrt 2.4 (as would most of the other alternatives debated here), but I'm keen to have _some_ alternative get in, rather than all being blocked out of 2.5 by "analysis paralysis". Alex From guido at python.org Mon Feb 20 17:31:44 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 08:31:44 -0800 Subject: [Python-Dev] Does eval() leak? In-Reply-To: <43F4A88A.7050100@ec.gc.ca> References: <43F4A88A.7050100@ec.gc.ca> Message-ID: On 2/16/06, John Marshall wrote: > Hi, > > Should I expect the virtual memory allocation > to go up if I do the following? > ----- > raw = open("data").read() > while True: > d = eval(raw) > ----- > > I would have expected the memory allocated to the > object referenced by d to be deallocated, garbage > collected, and reallocated for the new eval(raw) > results, assigned to d. > > The file contains a large, SIMPLE (no self refs; all > native python types/objects) dictionary (>300K). You're probably running into the problem that the concrete parse tree built up by the parser is rather large. While the memory used for that tree is freed to Python's malloc pool, thus making it available for other allocations by the same process, it is likely that the VM allocation for the process will permanently go up. When I try something like this (*) I see the virtual memory size go up indefinitely with Python 2.3.5, but not with Python 2.4.1 or 2.5(head). Even so, the problem may be fragmentation instead of a memory leak; fragmentation problems are even harder to debug than leaks (since they depend on the heuristics applied by the platform's malloc implementation). You can file a bug for 2.3 but unless you also provide a patch it's unlikely to be fixed; the memory allocation code was revamped significantly for 2.4 so there's no simple backport of the fix available.
(*)

d = {}
for i in range(100000):
    d[repr(i)] = i
s = str(d)
while 1:
    x = eval(s); print 'x'

-- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Mon Feb 20 17:35:35 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 20 Feb 2006 11:35:35 -0500 Subject: [Python-Dev] defaultdict proposal round three References: Message-ID: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> [GvR] > I'm not convinced by the argument > that __contains__ should always return True Me either. I cannot think of a more useless behavior or one more likely to have unexpected consequences. Besides, as Josiah pointed out, it is much easier for a subclass override to substitute always True return values than vice-versa. > Alternative A: add a new method to the dict type with the semantics of > __getattr__ from the last proposal Did you mean __getitem__? If not, then I'm missing what the current proposal is. >, using default_factory if not None > (except on_missing is inlined). This avoids the discussion about > broken invariants, but one could argue that it adds to an already > overly broad API. +1 I prefer this approach over subclassing. The mental load from an additional method is less than the load from a separate type (even a subclass). Also, avoidance of invariant issues is a big plus. Besides, if this allows setdefault() to be deprecated, it becomes an all-around win. > - Even if the default_factory were passed to the constructor, it still > ought to be a writable attribute so it can be introspected and > modified. A defaultdict that can't change its default factory after > its creation is less useful. Right! My preference is to have default_factory not passed to the constructor, so we are left with just one way to do it. But that is a nit.
> - It would be unwise to have a default value that would be called if > it was callable: what if I wanted the default to be a class instance > that happens to have a __call__ method for unrelated reasons? > Callability is an elusive propperty; APIs should not attempt to > dynamically decide whether an argument is callable or not. That makes sense, though it seems over-the-top to need a zero-factory for a multiset. An alternative is to have two possible attributes: d.default_factory = list or d.default_value = 0 with an exception being raised when both are defined (the test is done when the attribute is created, not when the lookup is performed). Raymond From michael.walter at gmail.com Mon Feb 20 17:52:26 2006 From: michael.walter at gmail.com (Michael Walter) Date: Mon, 20 Feb 2006 17:52:26 +0100 Subject: [Python-Dev] (-1)**(1/2)==1? In-Reply-To: References: Message-ID: <877e9a170602200852h55e1can6813a6f38e9bc337@mail.gmail.com> >>> 1/2 0 >>> (-1) ** (1./2) Traceback (most recent call last): File "", line 1, in ? ValueError: negative number cannot be raised to a fractional power Regards, Michael On 2/20/06, Jonathan Barbero wrote: > Hello! > My name is Jonathan, i?m new with Python. > > I try this in the command line: > > >>> (-1)**(1/2) > 1 > > This is wrong, i think it must throw an exception. > What do you think? > > Bye. > Jonathan. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com > > > From bokr at oz.net Mon Feb 20 18:10:13 2006 From: bokr at oz.net (Bengt Richter) Date: Mon, 20 Feb 2006 17:10:13 GMT Subject: [Python-Dev] defaultdict proposal round three References: Message-ID: <43f9e355.1302229717@news.gmane.org> On Mon, 20 Feb 2006 05:41:43 -0800, "Guido van Rossum" wrote: >I'm withdrawing the last proposal. 
>I'm not convinced by the argument
>that __contains__ should always return True (perhaps it should also
>insert the value?), nor by the complaint that a holy invariant would
>be violated (so what?).
>
>But the amount of discussion and the number of different viewpoints
>present makes it clear that the feature as I last proposed would be
>forever divisive.
>
>I see two alternatives. These will cause a different kind of
>philosophical discussion; so be it. I'll describe them relative to the
>last proposal; for those who wisely skipped the last thread, here's a
>link to the proposal:
>http://mail.python.org/pipermail/python-dev/2006-February/061261.html.
>
>Alternative A: add a new method to the dict type with the semantics of
>__getattr__ from the last proposal, using default_factory if not None
>(except on_missing is inlined). This avoids the discussion about
>broken invariants, but one could argue that it adds to an already
>overly broad API.
>
>Alternative B: provide a dict subclass that implements the __getattr__
>semantics from the last proposal. It could be an unrelated type for
>all I care, but I do care about implementation inheritance since it
>should perform just as well as an unmodified dict object, and that's
>hard to do without sharing implementation (copying would be worse).
>
>Parting shots:
>
>- Even if the default_factory were passed to the constructor, it still
>ought to be a writable attribute so it can be introspected and
>modified. A defaultdict that can't change its default factory after
>its creation is less useful.
>
>- It would be unwise to have a default value that would be called if
>it was callable: what if I wanted the default to be a class instance
>that happens to have a __call__ method for unrelated reasons?

You'd have to put it in a lambda: thing_with_unrelated__call__method

>Callability is an elusive property; APIs should not attempt to
>dynamically decide whether an argument is callable or not.
>
>- A third alternative would be to have a new method that takes an
>explicit default factory argument. This differs from setdefault() only
>in the type of the second argument. I'm not keen on this; the original
>use case came from an example where the readability of
>
>    d.setdefault(key, []).append(value)
>
>was questioned, and I'm not sure that
>
>    d.something(key, list).append(value)
>
>is any more readable. IOW I like (and I believe few have questioned)
>associating the default factory with the dict object instead of with
>the call site.
>
>Let the third round of the games begin!
>
Sorry if I missed it, but is it established that defaulting lookup will be spelled the same as traditional lookup, i.e. d[k] or d.__getitem__(k)? IOW, are default-enabled dicts really going to be passed into unknown contexts for use as a dict workalike?

I can see using on_missing for external side effects like logging etc., or _maybe_ modifying the dict with a known separate set of keys that wouldn't be used for the normal purposes of the dict. ISTM a defaulting dict could only reasonably be passed into contexts that expected it, but that could still be useful also.

How about d = dict() for a totally normal dict, and d.defaulting to get a view that uses d.default_factory if present? E.g.,

    d = dict()
    d.default_factory = list
    for i, name in enumerate('Eeny Meeny Miny Moe'.split()):
        # prefix insert order
        d.defaulting[name].append(i)  # or hoist d.defaulting => dd[name].append(i)

Maybe d.defaulting could be a descriptor?

If the above were done, could d.on_missing be independent and always active if present? E.g.,

    d.on_missing = lambda self, key: self.__setitem__(key, 0) or 0

would be allowed to work on its own first, irrespective of whether default_factory was set. If it created d[key] it would effectively override default_factory if active, and if not active, it would still act, letting you instrument a "normal" dict with special effects.
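Bengt's idea of a passive, always-active on_missing hook that merely observes misses can be approximated with the __missing__ hook that dict subclasses eventually grew in Python 2.5; a sketch (the class name and the misses attribute are illustrative, not part of any proposal):

```python
class InstrumentedDict(dict):
    """A 'normal' dict that records missing-key lookups as a side effect."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.misses = []

    def __missing__(self, key):     # called by dict.__getitem__ on a miss
        self.misses.append(key)     # passive effect (could be logging)
        raise KeyError(key)         # still behaves like a plain dict

d = InstrumentedDict(a=1)
assert d['a'] == 1                  # present keys are untouched
try:
    d['b']
except KeyError:
    pass
assert d.misses == ['b']            # the miss was observed anyway
```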
Of course, if you wanted to write an on_missing handler to use default_factory like your original example, you could. So on_missing would always trigger if present, for missing keys, but d.defaulting[k] would only call d.default_factory if the latter was set and the key was missing even after on_missing (if present) did something (e.g., it could be logging passively). Regards, Bengt Richter From stephen at xemacs.org Mon Feb 20 18:31:21 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 21 Feb 2006 02:31:21 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060219195258.5FD0.JCARLSON@uci.edu> (Josiah Carlson's message of "Sun, 19 Feb 2006 20:28:41 -0800") References: <20060218005534.5FA8.JCARLSON@uci.edu> <878xs7s4ig.fsf@tleepslib.sk.tsukuba.ac.jp> <20060219195258.5FD0.JCARLSON@uci.edu> Message-ID: <87y806os2e.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Josiah" == Josiah Carlson writes: Josiah> I try to internalize it by not thinking of strings as Josiah> encoded data, but as binary data, and unicode as text. I Josiah> then remind myself that unicode isn't native on-disk or Josiah> cross-network (which stores and transports bytes, not Josiah> characters), so one needs to encode it as binary data. Josiah> It's a subtle difference, but it has worked so far for me. Seems like a lot of work for something that for monolingual usage should "Just Work" almost all of the time. Josiah> I notice that you seem to be in Japan, so teaching unicode Josiah> is a must. Yes. Japan is more complicated than that, but in Python unicode is a must. Josiah> If you are using the "unicode is text" and "strings are Josiah> data", and they aren't getting it; then I don't know. Well, I can tell you that they don't get it. One problem is PEP 263. It makes it very easy to write programs that do line-oriented I/O with input() and print, and the students come to think it should always be that easy. 
Since Japan has at least 6 common encodings that students encounter on a daily basis while browsing the web, plus a couple more that live inside of MSFT Word and Java, they're used to huge amounts of magic. The normal response of novice programmers is to mandate that users of their programs use the encoding of choice and put it in ordinary strings so that it just works. Ie, the average student just "eats" the F on the codecs assignment, and writes the rest of her programs without them. >> simple, and the exceptions for using a "nonexistent" method >> mean I don't have to reinforce---the students will be able to >> teach each other. The exceptions also directly help reinforce >> the notion that text == Unicode. Josiah> Are you sure that they would help? If .encode() and Josiah> .decode() drop from strings and unicode (respectively), Josiah> they get an AttributeError. That's almost useless. Well, I'm not _sure_, but this is the kind of thing that you can learn by rote. And it will happen on a sufficiently regular basis that a large fraction of students will experience it. They'll ask each other, and usually they'll find a classmate who knows what happened. I haven't tried this with codecs, but that's been my experience with statistical packages where some routines understand non-linear equations but others insist on linear equations.[1] The error messages ("Equation is non-linear! Aaugh!") are not much more specific than AttributeError. Josiah> Raising a better exception (with more information) would Josiah> be better in that case, but losing the functionality that Josiah> either would offer seems unnecessary; Well, the point is that for the "usual suspects" (ie, Unicode codecs) there is no functionality that would be lost. As MAL pointed out, for these codecs the "original" text is always Unicode; that's the role Unicode is designed for, and by and large it fits the bill very well. 
With few exceptions (such as rot13) the "derived" text will be bytes that peripherals such as keyboards and terminals can generate and display. Josiah> "You are trying to encode/decode to/from incompatible Josiah> types. expected: a->b got: x->y" is better. Some of those Josiah> can be done *very soon*, given the capabilities of the Josiah> encodings module, That's probably the way to go. If we can have a derived "Unicode codec" class that does this, that would pretty much entirely serve the need I perceive. Beginning students could learn to write iconv.py, more advanced students could learn to create codec stacks to generate MIME bodies, which could include base64 or quoted-printable bytes -> bytes codecs. Footnotes: [1] If you're not familiar with regression analysis, the problem is that the equation "z = a*log(x) + b*log(y)" where a and b are to be estimated is _linear_ in the sense that x, y, and z are data series, and X = log(x) and Y = log(y) can be precomputed so that the equation actually computed is "z = a*X + b*Y". On the other hand "z = a*(x + b*y)" is _nonlinear_ because of the coefficient on y being a*b. Students find this hard to grasp in the classroom, but they learn quickly in the lab. I believe the parameter/variable inversion that my students have trouble with in statistics is similar to the "original"/"derived" inversion that happens with "text you can see" (derived, string) and "abstract text inside the program" (original, Unicode). -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
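The asymmetric API Stephen is arguing for, where the "wrong-direction" method simply does not exist on text or binary types, is essentially what Python 3 later adopted; a sketch in modern Python:

```python
text = 'caf\u00e9'              # str: abstract text (the "original")
data = text.encode('utf-8')     # bytes: what disks and networks carry
assert data == b'caf\xc3\xa9'
assert data.decode('utf-8') == text

# The "nonexistent method" teaching aid: str has no .decode and
# bytes has no .encode, so mixing the directions fails immediately.
assert not hasattr(text, 'decode')
assert not hasattr(data, 'encode')
```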
From bokr at oz.net  Mon Feb 20 18:38:54 2006
From: bokr at oz.net (Bengt Richter)
Date: Mon, 20 Feb 2006 17:38:54 GMT
Subject: [Python-Dev] documenting things [Was: Re: Proposal: defaultdict]
References: <43F64A7A.3060400@v.loewis.de> <79990c6b0602200401u143f3d7at59e9323465db8300@mail.gmail.com>
Message-ID: <43f9fa06.1308038560@news.gmane.org>

On Mon, 20 Feb 2006 12:01:02 +0000, "Paul Moore" wrote:
>On 2/19/06, Steve Holden wrote:
>> > You are missing the rationale of the PEP process. The point is
>> > *not* documentation. The point of the PEP process is to channel
>> > and collect discussion, so that the BDFL can make a decision.
>> > The BDFL is not bound at all to the PEP process.
>> >
>> > To document things, we use (or should use) documentation.
>> >
>> One could wish this ideal had been the case for the import extensions
>> defined in PEP 302.
>
>(A bit off-topic, but that hit home, so I'll reply...)
>
>Agreed, and it's my fault they weren't, to some extent. I did try to
>find a suitable place, but the import docs are generally fairly
>scattered, and there wasn't a particularly good place to put the
>changes.
>
>Any suggestions would be gratefully accepted...

I've always thought we could leverage google to find good doc information if we would just tag it in some consistent way. E.g., if you wanted to post a partial draft of some pep doc, you could post it here and/or c.l.p with

    PEP 302 docs version 2 <<--ENDMARK--
    text here
    =========
    (use REST if ambitious)
    ...
    --ENDMARK--

If we had some standard tag lines, we could make an urllib tool to harvest the material and merge the most recent version paragraphs and auto-post it as html in one place for draft docs on python.org

The same tagged section technique could be used re any documentation, so long as update and/or addition text can be associated with where it should be tagged in as references. I think mouseover popup hints with clickable js popups for the additional material would be cool.
It would mean automatically editing the doc to insert the hints though. Well, nothing there is rocket science, but neither is a wall of bricks so long and high you can't live long enough to complete it ;-)

Regards,
Bengt Richter

From rhamph at gmail.com  Mon Feb 20 19:22:30 2006
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 20 Feb 2006 11:22:30 -0700
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To: <20060220151808.GA503@panix.com>
References: <20060218225135.5FCA.JCARLSON@uci.edu> <20060219194015.5FCD.JCARLSON@uci.edu> <20060220151808.GA503@panix.com>
Message-ID: 

On 2/20/06, Aahz wrote:
> On Sun, Feb 19, 2006, Josiah Carlson wrote:
> >
> > I agree, there is nothing perfect. But at least in all of my use-cases,
> > and the majority of the ones I've seen 'in the wild', my previous post
> > provided an implementation that worked precisely like desired, and
> > precisely like a regular dictionary, except when accessing a
> > non-existent key via: value = dd[key] . __contains__, etc., all work
> > exactly like they do with a non-defaulting dictionary. Iteration via
> > popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the
> > way you would expect them.
>
> This is the telling point, IMO. My company makes heavy use of a "default
> dict" (actually, it's a "default class" because using constants as the
> lookup keys is mostly what we do and the convenience of foo.bar is
> compelling over foo['bar']). Anyway, our semantics are as Josiah
> outlines, and I can't see much use case for the alternatives.

Can you say, for the record (since nobody else seems to care), if d.getorset(key, func) would work in your use cases?

> Those of you arguing something different: do you have a real use case
> (that you've implemented in real code)?

(again, for the record) getorset provides the minimum needed functionality in a clean and intuitive way. Why go for a complicated solution when you simply don't need it?
--
Adam Olsen, aka Rhamphoryncus

From lists at janc.be  Mon Feb 20 19:32:48 2006
From: lists at janc.be (Jan Claeys)
Date: Mon, 20 Feb 2006 19:32:48 +0100
Subject: [Python-Dev] bdist_* to stdlib?
In-Reply-To: <43F64CAD.5020407@v.loewis.de>
References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> <43F64CAD.5020407@v.loewis.de>
Message-ID: <1140460369.13739.117.camel@localhost.localdomain>

On Fri, 17-02-2006 at 23:22 +0100, "Martin v. Löwis" wrote:
> That, in turn, is because nobody is so short of disk space that
> you really *have* to share /usr/share across architectures,

I can see diskless thin clients that boot from flash memory doing things like that? (E.g. having documentation and header files and other less-important stuff on an NFS mount?)

--
Jan Claeys

From guido at python.org  Mon Feb 20 19:53:30 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Feb 2006 10:53:30 -0800
Subject: [Python-Dev] defaultdict proposal round three
In-Reply-To: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1>
References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1>
Message-ID: 

On 2/20/06, Raymond Hettinger wrote:
> [GvR]
> > Alternative A: add a new method to the dict type with the semantics of
> > __getattr__ from the last proposal
>
> Did you mean __getitem__?

Yes, sorry, I meant __getitem__.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Mon Feb 20 20:02:02 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 20 Feb 2006 14:02:02 -0500
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To:
References:
Message-ID: 

Adam Olsen asked:
> ... d.getorset(key, func) would work in your use
> cases?

It is an improvement over setdefault, because it doesn't always evaluate the expensive func. (But why should every call have to pass in the function, when it is a property of the dictionary?)
It doesn't actually *solve* the problem because it doesn't compose well. This makes it hard to use for configuration. (Use case from plucker web reader, where the config is arguably overdesigned, but ... the version here is simplified) There is a system-wide default config. Users have config files. A config file can be specified for this program run. In each of these, settings can be either general settings or channel-specific. The end result is that the value should be pulled from the first of about half a dozen dictionaries to have an answer. Because most settings are never used in most channels, and several channels are typically run at once, it feels wrong to pre-build the whole "anything they might ask" settings dictionary for each of them. On the other hand, I certainly don't want to write userchannelconfig.getorset(key, systemchannelconfig.getorset(key, ...) even once, let alone every time I get a config value. In other words, the program would work correctly if I passed in a normal but huge dictionary; I want to avoid that for reasons of efficiency. This isn't the only use for a mapping, but it is the only one I've seen where KeyError is "expected" by the program's normal flow. -jJ From aleaxit at gmail.com Mon Feb 20 20:09:48 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 20 Feb 2006 11:09:48 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> Message-ID: <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> On Feb 20, 2006, at 8:35 AM, Raymond Hettinger wrote: > [GvR] >> I'm not convinced by the argument >> that __contains__ should always return True > > Me either. I cannot think of a more useless behavior or one more > likely to have > unexpected consequences. Besides, as Josiah pointed out, it is > much easier for > a subclass override to substitute always True return values than > vice-versa. Agreed on all counts. 
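Jim's cascading-config lookup (system defaults, user config, per-channel overrides, first hit wins, no merged dict ever built) is what collections.ChainMap, added later in Python 3.3, provides; a sketch with hypothetical config layers:

```python
from collections import ChainMap

# Hypothetical config layers, highest priority first.
channel_cfg = {}                          # per-channel overrides, usually sparse
user_cfg    = {'depth': 4}
system_cfg  = {'depth': 2, 'font': 'mono'}

cfg = ChainMap(channel_cfg, user_cfg, system_cfg)
assert cfg['depth'] == 4        # found in user_cfg, search stops there
assert cfg['font'] == 'mono'    # falls through to system_cfg
# Lookups walk the chain lazily, so unused settings cost nothing.
```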
> I prefer this approach over subclassing. The mental load from an > additional > method is less than the load from a separate type (even a > subclass). Also, > avoidance of invariant issues is a big plus. Besides, if this allows > setdefault() to be deprecated, it becomes an all-around win. I'd love to remove setdefault in 3.0 -- but I don't think it can be done before that: default_factory won't cover the occasional use cases where setdefault is called with different defaults at different locations, and, rare as those cases may be, any 2.* should not break any existing code that uses that approach. >> - Even if the default_factory were passed to the constructor, it >> still >> ought to be a writable attribute so it can be introspected and >> modified. A defaultdict that can't change its default factory after >> its creation is less useful. > > Right! My preference is to have default_factory not passed to the > constructor, > so we are left with just one way to do it. But that is a nit. No big deal either way, but I see "passing the default factory to the ctor" as the "one obvious way to do it", so I'd rather have it (be it with a subclass or a classmethod-alternate constructor). I won't weep bitter tears if this drops out, though. >> - It would be unwise to have a default value that would be called if >> it was callable: what if I wanted the default to be a class instance >> that happens to have a __call__ method for unrelated reasons? >> Callability is an elusive propperty; APIs should not attempt to >> dynamically decide whether an argument is callable or not. > > That makes sense, though it seems over-the-top to need a zero- > factory for a > multiset. But int is a convenient zero-factory. > > An alternative is to have two possible attributes: > d.default_factory = list > or > d.default_value = 0 > with an exception being raised when both are defined (the test is > done when the > attribute is created, not when the lookup is performed). 
I see default_value as a way to get exactly the same beginner's error we already have with function defaults: a mutable object will not work as beginners expect, and we can confidently predict (based on the function defaults case) that python-list and python-help and python-tutor and a bazillion other venues will see an unending stream of confused beginners (in addition to those confused by mutable objects as default values for function arguments, but those can't be avoided). I presume you consider the "one obvious way" is to use default_value for immutables and default_factory for mutables, but based on a lot of experience teaching Python I feel certain that this won't be obvious to many, MANY users (and not just non-Dutch ones, either). Alex From steven.bethard at gmail.com Mon Feb 20 20:24:09 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 20 Feb 2006 12:24:09 -0700 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: Message-ID: Guido van Rossum wrote: > Alternative A: add a new method to the dict type with the semantics of > __getattr__ from the last proposal, using default_factory if not None > (except on_missing is inlined). I'm not certain I understood this right but (after s/__getattr__/__getitem__) this seems to suggest that for keeping a dict of counts the code wouldn't really improve much: dd = {} dd.default_factory = int for item in items: # I want to do ``dd[item] += 1`` but with a regular method instead # of __getitem__, this is not possible dd[item] = dd.somenewmethod(item) + 1 I don't think that's much better than just calling ``dd.get(item, 0)``. Did I misunderstand Alternative A? > Alternative B: provide a dict subclass that implements the __getattr__ > semantics from the last proposal. If I didn't misinterpret Alternative A, I'd definitely prefer Alternative B. A dict of counts is by far my most common use case... STeVe -- Grammar am for people who can't think for myself. 
--- Bucky Katt, Get Fuzzy

From dw at botanicus.net  Mon Feb 20 20:58:08 2006
From: dw at botanicus.net (David Wilson)
Date: Mon, 20 Feb 2006 19:58:08 +0000
Subject: [Python-Dev] Simple CPython stack overflow.
Message-ID: <20060220195808.GA59552@thailand.botanicus.net>

Just noticed this and wondered if it came under the Python should never crash mantra. Should sys.getrecursionlimit() perhaps be taken into account somewhere?

>>> D = {'a': None}
>>> for i in xrange(150000):
...     D = {'a': D}
...
>>> D
{'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': .... ': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a' .... Bus error
bash$

Cheers,

David.

--
'tis better to be silent and be thought a fool,
than to speak and remove all doubt.
-- Lincoln

From jcarlson at uci.edu  Mon Feb 20 21:24:18 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 20 Feb 2006 12:24:18 -0800
Subject: [Python-Dev] Proposal: defaultdict
In-Reply-To:
References: <20060220151808.GA503@panix.com>
Message-ID: <20060220121906.5FEC.JCARLSON@uci.edu>

"Adam Olsen" wrote:
> Can you say, for the record (since nobody else seems to care), if
> d.getorset(key, func) would work in your use cases?

It doesn't work for the multiset/accumulation case:

    dd[key] += 1

- Josiah

From guido at python.org  Mon Feb 20 21:25:24 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 20 Feb 2006 12:25:24 -0800
Subject: [Python-Dev] Simple CPython stack overflow.
In-Reply-To: <20060220195808.GA59552@thailand.botanicus.net>
References: <20060220195808.GA59552@thailand.botanicus.net>
Message-ID: 

Yes, this is the type of thing we've been struggling with for years. There used to be way more of these. I can't guarantee it'll be fixed with priority (it's mostly of the "then don't do that" type) but please do file a bug so someone with inclination can fix it. The same happens for deeply recursive tuples and lists BTW.
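A safe way to reproduce the shape of David's example without the crash (small depth; the 2006 interpreter died because repr()/str() recurse once per nesting level inside the C implementation, which sys.getrecursionlimit() does not bound):

```python
def nest(depth):
    """Build {'a': {'a': ... None}} iteratively, depth levels deep."""
    d = None
    for _ in range(depth):
        d = {'a': d}
    return d

D = nest(50)                    # harmless at small depths
assert repr(D).count("{'a':") == 50
# At depth 150000 the C-level recursion in repr() overflows the
# machine stack -- the bus error above -- rather than raising a
# clean Python-level exception.
```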
--Guido On 2/20/06, David Wilson wrote: > Just noticed this and wondered if it came under the Python should never > crash mantra. Should sys.getrecursionlimit() perhaps be taken into > account somewhere? > > >>> D = {'a': None} > >>> for i in xrange(150000): > ... D = {'a': D} > ... > >>> D > {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': {'a': > {'a': {'a': {'a': {'a': {'a': {'a': {'a': .... ': {'a': {'a': {'a': > {'a': {'a': {'a': {'a': {[+]'a': {'a': {'a': {'a': {'a': {'a': {'a': > {'a': {'a' .... Bus error > bash$ > > Cheers, > > > David. > > -- > 'tis better to be silent and be thought a fool, > than to speak and remove all doubt. > -- Lincoln > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 20 21:33:04 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 12:33:04 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: Message-ID: On 2/20/06, Steven Bethard wrote: > Guido van Rossum wrote: > > Alternative A: add a new method to the dict type with the semantics of > > [__getitem__] from the last proposal, using default_factory if not None > > (except on_missing is inlined). > > I'm not certain I understood this right but [...] > this seems to suggest that for keeping a > dict of counts the code wouldn't really improve much: You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 is perfectly fine there and hard to improve upon. 
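The counting idiom Guido endorses here, spelled out:

```python
items = ['spam', 'egg', 'spam']
d = {}
for k in items:
    d[k] = d.get(k, 0) + 1    # no new machinery needed for counts
assert d == {'spam': 2, 'egg': 1}
```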
It's the slightly more esoteric use case where the default is a list and you want to append to that list that we're trying to improve: currently the shortest version is d.setdefault(k, []).append(v) but that lacks legibility and creates an empty list that is thrown away most of the time. We're trying to obtain the minimal form d.foo(k).append(v) where the new list is created by implicitly calling d.default_factory if d[k] doesn't yet exist, and d.default_factory is set to the list constructor. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From crutcher at gmail.com Mon Feb 20 21:34:44 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Mon, 20 Feb 2006 12:34:44 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: Message-ID: Sorry to chime in so late, but why are we setting a value when the key isn't defined? It seems there are many situations where you want: a) default values, and b) the ability to determine if a value was defined. There are many times that I want d[key] to give me a value even when it isn't defined, but that doesn't always mean I want to _save_ that value in the dict. Sometimes I do, sometimes I don't. We should have some means of describing this in any defaultdict implementation On 2/20/06, Guido van Rossum wrote: > I'm withdrawing the last proposal. I'm not convinced by the argument > that __contains__ should always return True (perhaps it should also > insert the value?), nor by the complaint that a holy invariant would > be violated (so what?). > > But the amount of discussion and the number of different viewpoints > present makes it clear that the feature as I last proposed would be > forever divisive. > > I see two alternatives. These will cause a different kind of > philosophical discussion; so be it. 
> I'll describe them relative to the
> last proposal; for those who wisely skipped the last thread, here's a
> link to the proposal:
> http://mail.python.org/pipermail/python-dev/2006-February/061261.html.
>
> Alternative A: add a new method to the dict type with the semantics of
> __getattr__ from the last proposal, using default_factory if not None
> (except on_missing is inlined). This avoids the discussion about
> broken invariants, but one could argue that it adds to an already
> overly broad API.
>
> Alternative B: provide a dict subclass that implements the __getattr__
> semantics from the last proposal. It could be an unrelated type for
> all I care, but I do care about implementation inheritance since it
> should perform just as well as an unmodified dict object, and that's
> hard to do without sharing implementation (copying would be worse).
>
> Parting shots:
>
> - Even if the default_factory were passed to the constructor, it still
> ought to be a writable attribute so it can be introspected and
> modified. A defaultdict that can't change its default factory after
> its creation is less useful.
>
> - It would be unwise to have a default value that would be called if
> it was callable: what if I wanted the default to be a class instance
> that happens to have a __call__ method for unrelated reasons?
> Callability is an elusive property; APIs should not attempt to
> dynamically decide whether an argument is callable or not.
>
> - A third alternative would be to have a new method that takes an
> explicit default factory argument. This differs from setdefault() only
> in the type of the second argument. I'm not keen on this; the original
> use case came from an example where the readability of
>
> d.setdefault(key, []).append(value)
>
> was questioned, and I'm not sure that
>
> d.something(key, list).append(value)
>
> is any more readable.
> IOW I like (and I believe few have questioned)
> associating the default factory with the dict object instead of with
> the call site.
>
> Let the third round of the games begin!
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/crutcher%40gmail.com

--
Crutcher Dunnavant
littlelanguages.com
monket.samedi-studios.com

From crutcher at gmail.com  Mon Feb 20 21:37:30 2006
From: crutcher at gmail.com (Crutcher Dunnavant)
Date: Mon, 20 Feb 2006 12:37:30 -0800
Subject: [Python-Dev] defaultdict proposal round three
In-Reply-To:
References:
Message-ID: 

I'm thinking something much closer to this (note default_factory gets the key):

    def on_missing(self, key):
        if self.default_factory is not None:
            value = self.default_factory(key)
            if self.on_missing_define_key:
                self[key] = value
            return value
        raise KeyError(key)

On 2/20/06, Crutcher Dunnavant wrote:
> Sorry to chime in so late, but why are we setting a value when the key
> isn't defined?
>
> It seems there are many situations where you want:
> a) default values, and
> b) the ability to determine if a value was defined.
>
> There are many times that I want d[key] to give me a value even when
> it isn't defined, but that doesn't always mean I want to _save_ that
> value in the dict. Sometimes I do, sometimes I don't. We should have
> some means of describing this in any defaultdict implementation
>
> On 2/20/06, Guido van Rossum wrote:
> > I'm withdrawing the last proposal. I'm not convinced by the argument
> > that __contains__ should always return True (perhaps it should also
> > insert the value?), nor by the complaint that a holy invariant would
> > be violated (so what?).
> > > > But the amount of discussion and the number of different viewpoints > > present makes it clear that the feature as I last proposed would be > > forever divisive. > > > > I see two alternatives. These will cause a different kind of > > philosophical discussion; so be it. I'll describe them relative to the > > last proposal; for those who wisely skipped the last thread, here's a > > link to the proposal: > > http://mail.python.org/pipermail/python-dev/2006-February/061261.html. > > > > Alternative A: add a new method to the dict type with the semantics of > > __getattr__ from the last proposal, using default_factory if not None > > (except on_missing is inlined). This avoids the discussion about > > broken invariants, but one could argue that it adds to an already > > overly broad API. > > > > Alternative B: provide a dict subclass that implements the __getattr__ > > semantics from the last proposal. It could be an unrelated type for > > all I care, but I do care about implementation inheritance since it > > should perform just as well as an unmodified dict object, and that's > > hard to do without sharing implementation (copying would be worse). > > > > Parting shots: > > > > - Even if the default_factory were passed to the constructor, it still > > ought to be a writable attribute so it can be introspected and > > modified. A defaultdict that can't change its default factory after > > its creation is less useful. > > > > - It would be unwise to have a default value that would be called if > > it was callable: what if I wanted the default to be a class instance > > that happens to have a __call__ method for unrelated reasons? > > Callability is an elusive propperty; APIs should not attempt to > > dynamically decide whether an argument is callable or not. > > > > - A third alternative would be to have a new method that takes an > > explicit defaut factory argument. This differs from setdefault() only > > in the type of the second argument. 
I'm not keen on this; the original > > use case came from an example where the readability of > > d.setdefault(key, []).append(value) > > was questioned, and I'm not sure that > > d.something(key, list).append(value) > > is any more readable. IOW I like (and I believe few have questioned) > > associating the default factory with the dict object instead of with > > the call site. > > > > Let the third round of the games begin! > > > > -- > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/crutcher%40gmail.com > > > > > -- > Crutcher Dunnavant > littlelanguages.com > monket.samedi-studios.com > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From aahz at pythoncraft.com Mon Feb 20 21:38:52 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 20 Feb 2006 12:38:52 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <20060218225135.5FCA.JCARLSON@uci.edu> <20060219194015.5FCD.JCARLSON@uci.edu> <20060220151808.GA503@panix.com> Message-ID: <20060220203852.GA22115@panix.com> On Mon, Feb 20, 2006, Adam Olsen wrote: > On 2/20/06, Aahz wrote: >> On Sun, Feb 19, 2006, Josiah Carlson wrote: >>> >>> I agree, there is nothing perfect. But at least in all of my use-cases, >>> and the majority of the ones I've seen 'in the wild', my previous post >>> provided an implementation that worked precisely like desired, and >>> precisely like a regular dictionary, except when accessing a >>> non-existent key via: value = dd[key] . __contains__, etc., all work >>> exactly like they do with a non-defaulting dictionary. Iteration via >>> popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the >>> way you would expect them. >> >> This is the telling point, IMO.
My company makes heavy use of a "default >> dict" (actually, it's a "default class" because using constants as the >> lookup keys is mostly what we do and the convenience of foo.bar is >> compelling over foo['bar']). Anyway, our semantics are as Josiah >> outlines, and I can't see much use case for the alternatives. > > Can you say, for the record (since nobody else seems to care), if > d.getorset(key, func) would work in your use cases? Because I haven't been reading this thread all that closely, you'll have to remind me what this means. >> Those of you arguing something different: do you have a real use case >> (that you've implemented in real code)? > > (again, for the record) getorset provides the minimum needed > functionality in a clean and intuitive way. Why go for a complicated > solution when you simply don't need it? Ditto above. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From guido at python.org Mon Feb 20 21:54:42 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 12:54:42 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <20060220121906.5FEC.JCARLSON@uci.edu> References: <20060220151808.GA503@panix.com> <20060220121906.5FEC.JCARLSON@uci.edu> Message-ID: On 2/20/06, Josiah Carlson wrote: > "Adam Olsen" wrote: > > Can you say, for the record (since nobody else seems to care), if > > d.getorset(key, func) would work in your use cases? > > It doesn't work for the multiset/accumulation case: > > dd[key] += 1 This is actually a fairly powerful argument for a subclass that redefines __getitem__ in favor of a new dict method. (Not to mention that it's much easier to pick a name for the subclass than for the method. :-) See the new thread I started. 
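Guido's "powerful argument" can be made concrete: augmented assignment expands ``dd[key] += 1`` into a lookup followed by a store, so the default has to come from the lookup itself -- a seam a subclass can hook but a new method cannot. A rough sketch (Python 2.5 later grew exactly such a hook, ``__missing__``, for dict subclasses; the class name below is made up for illustration):

```python
# Sketch: a dict subclass whose missing-key lookup supplies a default,
# so the accumulation idiom works directly.  The __missing__ hook is
# called by dict.__getitem__ when a key is absent; the value returned
# here is not stored until the += assigns it back.

class CountingDict(dict):
    def __missing__(self, key):
        return 0   # default for absent keys; += then stores the key

counts = CountingDict()
for word in ["spam", "eggs", "spam"]:
    counts[word] += 1   # no KeyError even on the first occurrence

print(counts["spam"])  # -> 2
```

By contrast, a hypothetical ``getorset(key, func)`` method can only appear on the right-hand side, forcing the increment back into the longhand ``d[k] = ... + 1`` form.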
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fuzzyman at voidspace.org.uk Mon Feb 20 21:57:49 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 20 Feb 2006 20:57:49 +0000 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: <43F8265A.2020209@v.loewis.de> Message-ID: <43FA2D4D.9000202@voidspace.org.uk> Manuzhai wrote: >> No; nobody volunteered a machine yet (plus the hand-holding that >> is always necessary with Windows). >> > > What exactly is needed for this? Does it need to be a machine dedicated > to this stuff, or could I just run the tests once every day or so when I > feel like it and have them submitted to buildbot? > > Has a machine been volunteered ? I have a spare machine and an always on connection. Would the 'right' development tools be needed ? (In the case of Microsoft they are a touch expensive I believe.) All the best, Michael Foord > Regards, > > Manuzhai > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > From ianb at colorstudy.com Mon Feb 20 22:13:23 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 20 Feb 2006 15:13:23 -0600 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> Message-ID: <43FA30F3.1040900@colorstudy.com> Alex Martelli wrote: >>I prefer this approach over subclassing. The mental load from an >>additional >>method is less than the load from a separate type (even a >>subclass). Also, >>avoidance of invariant issues is a big plus. Besides, if this allows >>setdefault() to be deprecated, it becomes an all-around win. 
> I'd love to remove setdefault in 3.0 -- but I don't think it can be > done before that: default_factory won't cover the occasional use > cases where setdefault is called with different defaults at different > locations, and, rare as those cases may be, any 2.* should not break > any existing code that uses that approach.

Would it be deprecated in 2.*, or start deprecating in 3.0?

Also, is default_factory=list threadsafe in the same way .setdefault is? That is, you can safely do this from multiple threads:

    d.setdefault(key, []).append(value)

I believe this is safe with very few caveats -- setdefault itself is atomic (or else I'm writing some bad code ;). My impression is that default_factory will not generally be threadsafe in the way setdefault is. For instance:

    def make_list():
        return []

    d = dict()
    d.default_factory = make_list
    # from multiple threads:
    d.getdef(key).append(value)

This would not be correct (a value can be lost if two threads concurrently enter make_list for the same key). In the case of default_factory=list (using the list builtin) is the story different? Will this work on Jython, IronPython, or PyPy? Will this be a documented guarantee? Or alternately, are we just creating a new way to punish people who use threads? And if we push threadsafety up to user code, are we trading a very small speed issue (creating lists that are thrown away) for a much larger speed issue (acquiring a lock)?

I tried to make a test for this threadsafety, actually -- using a technique besides setdefault which I knew was bad (try:except KeyError:). And (except using time.sleep(), which is cheating), I wasn't actually able to trigger the bug. Which is frustrating, because I know the bug is there. So apparently threadsafety is hard to test in this case. (If anyone is interested in trying it, I can email what I have.)
Note that multidict -- among other possible concrete collection patterns (like Bag, OrderedDict, or others) -- can be readily implemented with threading guarantees. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From ianb at colorstudy.com Mon Feb 20 22:13:27 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 20 Feb 2006 15:13:27 -0600 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: Message-ID: <43FA30F7.5060104@colorstudy.com> Steven Bethard wrote: >>Alternative A: add a new method to the dict type with the semantics of >>__getattr__ from the last proposal, using default_factory if not None >>(except on_missing is inlined). > > > I'm not certain I understood this right but (after > s/__getattr__/__getitem__) this seems to suggest that for keeping a > dict of counts the code wouldn't really improve much:
>
> dd = {}
> dd.default_factory = int
> for item in items:
>     # I want to do ``dd[item] += 1`` but with a regular method instead
>     # of __getitem__, this is not possible
>     dd[item] = dd.somenewmethod(item) + 1

This would be better done with a bag (a set that can contain multiple instances of the same item):

    dd = collections.Bag()
    for item in items:
        dd.add(item)

Then to see how many there are of an item, perhaps something like:

    dd.count(item)

No collections.Bag exists, but of course one should. It has nice properties -- inclusion is done with __contains__ (with dicts it probably has to be done with get), you can't accidentally go below zero, the methods express intent, and presumably it will implement only a meaningful set of methods.
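The Bag sketched above can be written in a few lines; the interface below is guessed from the description in the post, not an actual stdlib class at the time. (The stdlib much later gained collections.Counter, which fills the same niche.)

```python
# A sketch of the Bag type described above -- interface guessed from the
# post.  Inclusion goes through __contains__, counts can't go below
# zero, and the methods express intent.

class Bag(object):
    def __init__(self):
        self._counts = {}

    def add(self, item):
        self._counts[item] = self._counts.get(item, 0) + 1

    def count(self, item):
        return self._counts.get(item, 0)   # absent items count as zero

    def __contains__(self, item):
        return item in self._counts

bag = Bag()
for item in ["spam", "eggs", "spam"]:
    bag.add(item)

print(bag.count("spam"))  # -> 2
```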
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Mon Feb 20 22:18:14 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 13:18:14 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FA30F3.1040900@colorstudy.com> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> <43FA30F3.1040900@colorstudy.com> Message-ID: On 2/20/06, Ian Bicking wrote: > Would it be deprecated in 2.*, or start deprecating in 3.0? 3.0 will have no backwards compatibility allowances. Whenever someone says "remove this in 3.0" they mean exactly that. There will be too many incompatibilities in 3.0 to be bothered with deprecating them all; most likely we'll have to have some kind of (semi-)automatic conversion tool. Deprecation in 2.x is generally done to indicate that a feature will be removed in 2.y for y >= x+1. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at gmail.com Mon Feb 20 22:20:46 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 20 Feb 2006 13:20:46 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: Message-ID: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> On Feb 20, 2006, at 12:33 PM, Guido van Rossum wrote: ... > You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 > is perfectly fine there and hard to improve upon. I see d[k]+=1 as a substantial improvement -- conceptually more direct, "I've now seen one more k than I had seen before". 
Alex From aleaxit at gmail.com Mon Feb 20 22:24:24 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 20 Feb 2006 13:24:24 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <20060220203852.GA22115@panix.com> References: <20060218225135.5FCA.JCARLSON@uci.edu> <20060219194015.5FCD.JCARLSON@uci.edu> <20060220151808.GA503@panix.com> <20060220203852.GA22115@panix.com> Message-ID: On Feb 20, 2006, at 12:38 PM, Aahz wrote: ... >> Can you say, for the record (since nobody else seems to care), if >> d.getorset(key, func) would work in your use cases? > > Because I haven't been reading this thread all that closely, you'll > have > to remind me what this means.

Roughly the same (save for method/function difference) as:

    def getorset(d, key, func):
        if key not in d:
            d[key] = func()
        return d[key]

Alex From guido at python.org Mon Feb 20 22:28:46 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 13:28:46 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FA30F3.1040900@colorstudy.com> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> <43FA30F3.1040900@colorstudy.com> Message-ID: On 2/20/06, Ian Bicking wrote: > Also, is default_factory=list threadsafe in the same way .setdefault is? > That is, you can safely do this from multiple threads: > > d.setdefault(key, []).append(value) > > I believe this is safe with very few caveats -- setdefault itself is > atomic (or else I'm writing some bad code ;). Only if the key is a string and all values in the dict are also strings (or other builtins). And I don't think that Jython or IronPython promise anything here. Here's a sketch of a situation that isn't thread-safe:

    class C:
        def __eq__(self, other):
            return False
        def __hash__(self):
            return hash("abc")

    d = {C(): 42}
    print d["abc"]

Because "abc" and C() have the same hash value, the lookup will compare "abc" to C() which will invoke C.__eq__().
Why are you so keen on using a dictionary to share data between threads that may both modify it? IMO this is asking for trouble -- the advice about sharing data between threads is always to use the Queue module. [...] > Note that multidict -- among other possible concrete collection patterns > (like Bag, OrderedDict, or others) -- can be readily implemented with > threading guarantees. I don't believe that this is as easy as you think. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Feb 20 22:32:08 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 13:32:08 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: On 2/20/06, Alex Martelli wrote: > > On Feb 20, 2006, at 12:33 PM, Guido van Rossum wrote: > ... > > You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 > > is perfectly fine there and hard to improve upon. > > I see d[k]+=1 as a substantial improvement -- conceptually more > direct, "I've now seen one more k than I had seen before". Yes, I now agree. This means that I'm withdrawing proposal A (new method) and championing only B (a subclass that implements __getitem__() calling on_missing() and on_missing() defined in that subclass as before, calling default_factory unless it's None). I don't think this crisis is big enough to need *two* solutions, and this example shows B's superiority over A. 
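In rough outline, proposal B reads as the sketch below -- using the names from the thread (``on_missing``, ``default_factory``), not an actual implementation:

```python
# Sketch of proposal B: a dict subclass whose __getitem__ falls back to
# on_missing(), which consults default_factory unless it is None.
# Names follow the thread; this is not the eventual implementation.

class defaultdict(dict):
    def __init__(self, default_factory=None):
        dict.__init__(self)
        self.default_factory = default_factory  # writable, introspectable

    def on_missing(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value
        return value

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.on_missing(key)

d = defaultdict(list)
d["k"].append(1)     # a missing key gets a fresh list

counts = defaultdict(int)
counts["k"] += 1     # the accumulation case that sank proposal A
```

Subclasses that want key-dependent defaults can override on_missing() rather than changing the factory signature.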
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Mon Feb 20 22:43:20 2006 From: python at rcn.com (Raymond Hettinger) Date: Mon, 20 Feb 2006 16:43:20 -0500 Subject: [Python-Dev] defaultdict proposal round three References: Message-ID: <007701c63666$aaf98080$7600a8c0@RaymondLaptop1> [Crutcher Dunnavant ] >> There are many times that I want d[key] to give me a value even when >> it isn't defined, but that doesn't always mean I want to _save_ that >> value in the dict. How does that differ from the existing dict.get method? Raymond From martin at v.loewis.de Mon Feb 20 22:44:00 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 22:44:00 +0100 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43FA3820.3070607@v.loewis.de> Stephen J. Turnbull wrote: > Martin> For an example where base64 is *not* necessarily > Martin> ASCII-encoded, see the "binary" data type in XML > Martin> Schema. There, base64 is embedded into an XML document, > Martin> and uses the encoding of the entire XML document. As a > Martin> result, you may get base64 data in utf16le. > > I'll have to take a look. It depends on whether base64 is specified > as an octet-stream to Unicode stream transformation or as an embedding > of an intermediate representation into Unicode. Granted, defining the > base64 alphabet as a subset of Unicode seems like the logical way to > do it in the context of XML. Please do take a look. 
It is the only way: If you were to embed base64 *bytes* into character data content of an XML element, the resulting XML file might not be well-formed anymore (if the encoding of the XML file is not an ASCII superencoding). Regards, Martin From ianb at colorstudy.com Mon Feb 20 22:47:45 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 20 Feb 2006 15:47:45 -0600 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> <43FA30F3.1040900@colorstudy.com> Message-ID: <43FA3901.3040304@colorstudy.com> Guido van Rossum wrote: > Why are you so keen on using a dictionary to share data between > threads that may both modify it? IMO this is asking for trouble -- > the advice about sharing data between threads is always to use the > Queue module. I use them often for shared caches. But yeah, it's harder than I thought at first -- I think the actual cases I'm using work, since they use simple keys (ints, strings), but yeah, thread guarantees are too difficult to handle in general. Damn threads. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From aahz at pythoncraft.com Mon Feb 20 23:10:16 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 20 Feb 2006 14:10:16 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: <20060218225135.5FCA.JCARLSON@uci.edu> <20060219194015.5FCD.JCARLSON@uci.edu> <20060220151808.GA503@panix.com> <20060220203852.GA22115@panix.com> Message-ID: <20060220221015.GB27299@panix.com> On Mon, Feb 20, 2006, Alex Martelli wrote: > On Feb 20, 2006, at 12:38 PM, Aahz wrote: > ... >>> Can you say, for the record (since nobody else seems to care), if >>> d.getorset(key, func) would work in your use cases? >> >> Because I haven't been reading this thread all that closely, you'll >> have >> to remind me what this means.
> Roughly the same (save for method/function difference) as:
>
>     def getorset(d, key, func):
>         if key not in d: d[key] = func()
>         return d[key]

That has the problem of looking clumsy, and doubly so for our use case where it's an attribute-based dict. Our style relies on the clean look of code like this:

    if order.street: ...

Even as a dict, that doesn't look horrible:

    if order['street']: ...

OTOH, this starts looking ugly:

    if order.get('street'): ...

And this is just plain bad:

    if getattr(order, 'street'): ...

Note that because we have to deal with *both* the possibility that the attribute/key may not be there *and* that it might be blank -- but both are semantically equivalent for our application -- there's no other clean coding style. Now, I realize this is different from the "primary use case" for needing mutable values, but any proposed default dict solution that doesn't cleanly support my use case is less interesting to me. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From martin at v.loewis.de Mon Feb 20 23:22:53 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 23:22:53 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: References: <43F8265A.2020209@v.loewis.de> <43F9171D.1040208@v.loewis.de> Message-ID: <43FA413D.8090001@v.loewis.de> Guido van Rossum wrote: > They don't; I think a separate page would be a fine idea. Ok, I have now split this into three pages. > FWIW, it looks like all the sample templates are still wasting a lot > of horizontal space in the first two columns; the second is almost > always empty. Perhaps the author of the change could be placed *below* > the timestamp instead of next to it? Also for all practical purposes > we can probably get rid of the seconds in the timestamp. The latter was easy to do, so I did it.
The former is tricky; contributions are welcome. Regards, Martin From martin at v.loewis.de Mon Feb 20 23:25:16 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 23:25:16 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43F9634F.3020000@holdenweb.com> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> <43F8DDE1.8010305@benjiyork.com> <43F8FA3C.30405@livinglogic.de> <43F9634F.3020000@holdenweb.com> Message-ID: <43FA41CC.1020901@v.loewis.de> Steve Holden wrote: > All formats would be improved if the headers could be made to float at > the top of the page as scrolling took place. Can this be done in CSS? If so, contributions are welcome. If not, can somebody prepare a modified page with the necessary changes (preferably only additional classes for the header or some such); I can then try to edit buildbot to add these changes into the page. Regards, Martin From g.brandl at gmx.net Mon Feb 20 23:35:02 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 20 Feb 2006 23:35:02 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43FA41CC.1020901@v.loewis.de> References: <43F7EFF4.6010903@benjiyork.com> <43F80D91.7050004@v.loewis.de> <43F8A585.7090906@benjiyork.com> <43F8BA01.9060506@livinglogic.de> <43F8DDE1.8010305@benjiyork.com> <43F8FA3C.30405@livinglogic.de> <43F9634F.3020000@holdenweb.com> <43FA41CC.1020901@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Steve Holden wrote: >> All formats would be improved if the headers could be made to float at >> the top of the page as scrolling took place. > > Can this be done in CSS? If so, contributions are welcome. Not as it is. The big table would have to be split so that there is one table with the heading and one with the rest. But that would make the columns independent, so the header's column widths would differ from the content's.
Even then, I don't know if there's a working solution for the headers to stay on top since

* floats are only left or right aligned
* the header's height is variable
* position:absolute doesn't work in MSIE.

regards, Georg From martin at v.loewis.de Mon Feb 20 23:38:27 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 23:38:27 +0100 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: <43F9C2CA.4010808@egenix.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> <43F98D89.2060102@taupro.com> <43F9C2CA.4010808@egenix.com> Message-ID: <43FA44E3.9090206@v.loewis.de> M.-A. Lemburg wrote: > Note that this does not mean that we should forget about memory > consumption issues. It's just that if there's only marginal > interest in certain special builds of Python, I don't see the > requirement for the Python core developers to maintain them. Well, the cost of Unicode support is not so much in the algorithmic part, but in the tables that come along with it. AFAICT, everything but unicodectype is optional; that is 5KiB of code and 20KiB of data on x86. Actually, the size of the code *does* matter, at a second glance.
Here are the largest object files in the Python code base on my system (not counting dynamic modules):

   text    data   bss    dec    hex filename
   4845   19968     0  24813   60ed Objects/unicodectype.o
  22633    2432   352  25417   6349 Objects/listobject.o
  29259    1412   152  30823   7867 Objects/classobject.o
  20696   11488     4  32188   7dbc Python/bltinmodule.o
  33579     740     0  34319   860f Objects/longobject.o
  34119      16   288  34423   8677 Python/ceval.o
  35179    2796     0  37975   9457 Modules/_sre.o
  26539   15820   416  42775   a717 Modules/posixmodule.o
  35283    8800  1056  45139   b053 Objects/stringobject.o
  50360       0    28  50388   c4d4 Python/compile.o
  68455    4624   440  73519  11f2f Objects/typeobject.o
  69993    9316  1196  80505  13a79 Objects/unicodeobject.o

So it appears that dropping Unicode support can indeed provide some savings. For reference, we also have an option to drop complex numbers:

   9654     692     4  10350   286e Objects/complexobject.o

Regards, Martin From martin at v.loewis.de Mon Feb 20 23:49:59 2006 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 23:49:59 +0100 Subject: [Python-Dev] bdist_* to stdlib? In-Reply-To: <1140460369.13739.117.camel@localhost.localdomain> References: <43F27D25.7070208@canterbury.ac.nz> <1140007745.13739.7.camel@localhost.localdomain> <87slqiv61n.fsf@tleepslib.sk.tsukuba.ac.jp> <43F64CAD.5020407@v.loewis.de> <1140460369.13739.117.camel@localhost.localdomain> Message-ID: <43FA4797.4080606@v.loewis.de> Jan Claeys wrote: >>That, in turn, is because nobody is so short of disk space that >>you really *have* to share /usr/share across architectures, > > > I can see diskless thin clients that boot from flash memory doing things > like that? (E.g. having documentation and header files and other > less-important stuff on an nfs mount?) Having parts of the file system on NFS: sure, even have root on NFS: all the time.
But if you have two classes of machines (say, diskless SPARC and diskless x86 PCs) for which you have to provide different sets of binaries on NFS: why do you have to share /usr/share across architectures? It will only save you a small percentage of disk space, at the cost of additional hassles. Regards, Martin From martin at v.loewis.de Mon Feb 20 23:53:47 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 20 Feb 2006 23:53:47 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43FA2D4D.9000202@voidspace.org.uk> References: <43F8265A.2020209@v.loewis.de> <43FA2D4D.9000202@voidspace.org.uk> Message-ID: <43FA487B.4070909@v.loewis.de> Michael Foord wrote: > Has a machine been volunteered ? Not yet. > I have a spare machine and an always on connection. Would the 'right' > development tools be needed ? (In the case of Microsoft they are a touch > expensive I believe.) Any build process would do. I would prefer to see the official tools on the buildbot (i.e. VS.NET 2003), but anything else that can build Python and pass the test suite could do as well. One issue is that you also have to work with me on defining the build steps: what sequence of commands to send in what order. For Unix, that is easy; for Windows, not so. Regards, Martin From steven.bethard at gmail.com Mon Feb 20 23:58:09 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 20 Feb 2006 15:58:09 -0700 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: I wrote: > # I want to do ``dd[item] += 1`` Guido van Rossum wrote: > You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 > is perfectly fine there and hard to improve upon. Alex Martelli wrote: > I see d[k]+=1 as a substantial improvement -- conceptually more > direct, "I've now seen one more k than I had seen before". Guido van Rossum wrote: > Yes, I now agree.
This means that I'm withdrawing proposal A (new > method) and championing only B (a subclass that implements > __getitem__() calling on_missing() and on_missing() defined in that > subclass as before, calling default_factory unless it's None). Probably already obvious from my previous post, but FWIW, +1. Two unaddressed issues: * What module should hold the type? I hope the collections module isn't too controversial. * Should default_factory be an argument to the constructor? The three answers I see: - "No." I'm not a big fan of this answer. Since the whole point of creating a defaultdict type is to provide a default, requiring two statements (the constructor call and the default_factory assignment) to initialize such a dictionary seems a little inconvenient. - "Yes and it should be followed by all the normal dict constructor arguments." This is okay, but a few errors, like ``defaultdict({1:2})`` will pass silently (until you try to use the dict, of course). - "Yes and it should be the only constructor argument." This is my favorite mainly because I think it's simple, and I couldn't think of good examples where I really wanted to do ``defaultdict(list, some_dict_or_iterable)`` or ``defaultdict(list, **some_keyword_args)``. It's also forward compatible if we need to add some of the dict constructor args in later. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From martin at v.loewis.de Tue Feb 21 00:00:43 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 00:00:43 +0100 Subject: [Python-Dev] Win64 AMD64 (aka x64) binaries available Message-ID: <43FA4A1B.3030209@v.loewis.de> I have now produced a snapshot of a Win64 build for AMD64 processors (also known as EM64T or x64); this is different from IA-64 (which is also known as Itanium)... Anyway, the binaries are http://www.dcl.hpi.uni-potsdam.de/home/loewis/python-2.5.13199.amd64.msi This is from today's trunk.
If you have general remarks/discussion, please post to python-dev. If you have specific bug reports, file them on SF. Bug fixes are particularly welcome.

Known issues:
- _ssl.pyd is not built (I get linker errors)
- some of the tests fail (in some cases, due to bugs in the test suite)

If you want to build extensions for this build using distutils, you need to
1. install the platform SDK (2003 SP1 should work)
2. open an AMD64 retail shell
3. run the included distutils

It might be possible to drop 2) some day, but finding the SDK from the registry is really tricky. Regards, Martin From brett at python.org Tue Feb 21 00:04:57 2006 From: brett at python.org (Brett Cannon) Date: Mon, 20 Feb 2006 15:04:57 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: On 2/20/06, Steven Bethard wrote: > I wrote: > > # I want to do ``dd[item] += 1`` > > Guido van Rossum wrote: > > You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 > > is perfectly fine there and hard to improve upon. > > Alex Martelli wrote: > > I see d[k]+=1 as a substantial improvement -- conceptually more > > direct, "I've now seen one more k than I had seen before". > > Guido van Rossum wrote: > > Yes, I now agree. This means that I'm withdrawing proposal A (new > > method) and championing only B (a subclass that implements > > __getitem__() calling on_missing() and on_missing() defined in that > > subclass as before, calling default_factory unless it's None). > > Probably already obvious from my previous post, but FWIW, +1. > > Two unaddressed issues: > > * What module should hold the type? I hope the collections module > isn't too controversial. > > * Should default_factory be an argument to the constructor? The three > answers I see: > > - "No." I'm not a big fan of
Since the whole point of > creating a defaultdict type is to provide a default, requiring two > statements (the constructor call and the default_factory assignment) > to initialize such a dictionary seems a little inconvenient. > - "Yes and it should be followed by all the normal dict constructor > arguments." This is okay, but a few errors, like > ``defaultdict({1:2})`` will pass silently (until you try to use the > dict, of course). > - "Yes and it should be the only constructor argument." This is my > favorite mainly because I think it's simple, and I couldn't think of > good examples where I really wanted to do ``defaultdict(list, > some_dict_or_iterable)`` or ``defaultdict(list, > **some_keyword_args)``. It's also forward compatible if we need to > add some of the dict constructor args in later. > While #3 is my preferred solution as well, it does pose a Liskov violation if this is a direct dict subclass instead of storing a dict internally (can't remember the name of the design pattern that does this). But I think it is good to have the constructor be different since it does also help drive home the point that this is not a standard dict. -Brett From dan.gass at gmail.com Tue Feb 21 00:08:07 2006 From: dan.gass at gmail.com (Dan Gass) Date: Mon, 20 Feb 2006 17:08:07 -0600 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> Message-ID: On 2/20/06, Raymond Hettinger wrote: > > > An alternative is to have two possible attributes: > d.default_factory = list > or > d.default_value = 0 > with an exception being raised when both are defined (the test is done > when the > attribute is created, not when the lookup is performed). > > Why not have the factory function take the key being looked up as an argument? Seems like there would be uses to customize the default based on the key. 
It also forces list factory functions and constant factory functions (amongst others) to be handled the same way: d.default_factory = lambda k : list() d.default_factory = lambda k : 0 Dan Gass -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060220/2f99b73a/attachment.htm From python at rcn.com Tue Feb 21 00:14:13 2006 From: python at rcn.com (Raymond Hettinger) Date: Mon, 20 Feb 2006 18:14:13 -0500 Subject: [Python-Dev] defaultdict proposal round three References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: <00c901c63673$631c8430$7600a8c0@RaymondLaptop1> [Steven Bethard] > * Should default_factory be an argument to the constructor? The three > answers I see: > > - "No." I'm not a big fan of this answer. Since the whole point of > creating a defaultdict type is to provide a default, requiring two > statements (the constructor call and the default_factory assignment) > to initialize such a dictionary seems a little inconvenient. You still have to allow assignments to the default_factory attribute to allow the factory to be changed: dd.default_factory = SomeFactory If it's too much effort to do the initial setup in two lines, a classmethod could serve as an alternate constructor (leaving the regular constructor fully interchangeable with dicts): dd = defaultdict.setup(list, {'k1':'v1', 'k2':'v2'}) or when there are no initial values: dd = defaultdict.setup(list) Raymond From steven.bethard at gmail.com Tue Feb 21 00:14:27 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 20 Feb 2006 16:14:27 -0700 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> Message-ID: On 2/20/06, Dan Gass wrote: > Why not have the factory function take the key being looked up as an > argument? Seems like there would be uses to customize the default based on > the key.
It also forces list factory > functions and constant > factory functions (amongst others) to be handled the same way: > > d.default_factory = lambda k : list() > d.default_factory = lambda k : 0 Guido's currently backing "a subclass that implements __getitem__() calling on_missing() and on_missing() ... calling default_factory unless it's None". I think for 90% of the use-cases, you don't need a key argument. If you do, you should subclass defaultdict and override the on_missing() method. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From guido at python.org Tue Feb 21 00:17:56 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 15:17:56 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: On 2/20/06, Brett Cannon wrote: > While #3 is my preferred solution as well, it does pose a Liskov > violation if this is a direct dict subclass instead of storing a dict > internally (can't remember the name of the design pattern that does > this). But I think it is good to have the constructor be different > since it does also help drive home the point that this is not a > standard dict. I've heard this argument a few times now from different folks and I'm tired of it. It's wrong. It's not true. It's a dead argument. It's pushing up the daisies, so to speak. Please stop abusing Barbara Liskov's name and remember that the constructor signature is *not* part of the interface to an instance! Changing the constructor signature in a subclass does *not* cause *any* "Liskov" violations because the constructor is not called by *users* of the object -- it is only called to *create* an object. As the *user* of an object you're not allowed to *create* another instance (unless the object provides an explicit API to do so, of course, in which case you deal with that API's signature, not with the constructor).
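[Editorial note: a runnable sketch of Guido's point above — the class names here are illustrative, not code from the thread. A subclass is free to change its constructor signature, because any program written in terms of the base type's *instances* never calls the constructor itself.]

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm2(self):
        return self.x * self.x + self.y * self.y

class Origin(Point):
    def __init__(self):              # deliberately "incompatible" signature
        super().__init__(0, 0)

def program(p):
    # a "program defined in terms of" Point: it only uses instances,
    # and never calls the constructor itself
    return p.norm2() >= 0

assert program(Point(3, 4))
assert program(Origin())             # substitution holds regardless of ctor
```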
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 21 00:23:36 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 15:23:36 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> Message-ID: On 2/20/06, Dan Gass wrote: > Why not have the factory function take the key being looked up as an > argument? This was considered and rejected already. You can already customize based on the key by overriding on_missing() [*]. If the factory were to take a key argument, we couldn't use list or int as the factory function; we'd have to write lambda key: list(). There aren't that many use cases for having the factory function depend on the key anyway; it's mostly on_missing() that needs the key so it can insert the new value into the dict. [*] Earlier in this thread I wrote that on_missing() could be inlined. I take that back; I think it's better to have it be available explicitly so you can override it without having to override __getitem__(). This is faster, assuming most __getitem__() calls find the key already inserted, and reduces the amount of code you have to write to customize the behavior; it also reduces worries about how to call the superclass __getitem__ method (catching KeyError *might* catch an unrelated KeyError caused by a bug in the key's __hash__ or __eq__ method). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Feb 21 00:19:36 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 12:19:36 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: Message-ID: <43FA4E88.4090507@canterbury.ac.nz> Guido van Rossum wrote: > I see two alternatives. Have you considered the third alternative that's been mentioned -- a wrapper? The issue of __contains__ etc. 
could be sidestepped by not giving the wrapper a __contains__ method at all. If you want to do an 'in' test you do it on the underlying dict, and then the semantics are clear. Greg From fuzzyman at voidspace.org.uk Tue Feb 21 00:37:44 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 20 Feb 2006 23:37:44 +0000 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43FA487B.4070909@v.loewis.de> References: <43F8265A.2020209@v.loewis.de> <43FA2D4D.9000202@voidspace.org.uk> <43FA487B.4070909@v.loewis.de> Message-ID: <43FA52C8.70201@voidspace.org.uk> Martin v. L?wis wrote: > Michael Foord wrote: > >> Has a machine been volunteered ? >> > > Not yet. > > >> I have a spare machine and an always on connection. Would the 'right' >> development tools be needed ? (In the case of Microsoft they are a touch >> expensive I believe.) >> > > Any build process would do. I would prefer to see the official tools on > the buildbot (i.e. VS.NET 2003), Man, that's a difficult (and expensive) piece of software to obtain, unless you're a student. I couldn't find a legal non-academic version for less than £100. I might hunt around though. Shame. I suspect that hacking the free compilers to work would require more knowledge than I possess. Sorry. > but anything else that can build Python > and pass the test suite could do as well. > > One issue is that you also have to to work with me on defining the build > steps: what sequence of commands to send in what order. For Unix, that > is easy; for Windows, not so. > > Working with you wouldn't be a problem. Looks like the idea is currently a bit of a dead dog though. All the best, Michael Foord > Regards, > Martin > > From rrr at ronadam.com Tue Feb 21 00:40:19 2006 From: rrr at ronadam.com (Ron Adam) Date: Mon, 20 Feb 2006 17:40:19 -0600 Subject: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
In-Reply-To: <43f9af36.1288886430@news.gmane.org> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <20060218113358.GG23859@xs4all.nl> <43F7113E.8090300@egenix.com> <20060218223315.GG23863@xs4all.nl> <43f9af36.1288886430@news.gmane.org> Message-ID: <43FA5363.7030602@ronadam.com> Bengt Richter wrote: > On Sat, 18 Feb 2006 23:33:15 +0100, Thomas Wouters wrote: > note what base64 really is for. It's essence is to create a _character_ sequence > which can succeed in being encoded as ascii. The concept of base64 going str->str > is really a mental shortcut for s_str.decode('base64').encode('ascii'), where > 3 octets are decoded as code for 4 characters modulo padding logic. Wouldn't it be... obj.encode('base64').encode('ascii') This would probably also work... obj.encode('base64').decode('ascii') -> ascii alphabet in unicode Where the underlying sequence might be ... obj -> bytes -> bytes:base64 -> base64 ascii character set The point is to have the data in a safe to transmit form that can survive being encoded and decoded into different forms along the transmission path and still be restored at the final destination. base64 ascii character set -> bytes:base64 -> original bytes -> obj * a related note, derived from this and your other post in this thread. If the str type constructor had an encode argument like the unicode type does, along with a str.encoded_with attribute, then it might be possible to deprecate the .decode() and .encode() methods and remove them from P3k entirely, or use them as data coders/decoders instead of char type encoders. It could also create a clear separation between character encodings and data coding. The following should give an exception. str(str, 'rot13') Rot13 isn't a character encoding, but a data coding method. data_str.encode('rot13') # could be ok But this wouldn't...
new_str = data_str.encode('latin_1') # could cause an exception We'd have to use... new_str = str(data_str, 'latin_1') # New string sub type... Cheers, Ronald Adam From jcarlson at uci.edu Tue Feb 21 00:51:00 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 20 Feb 2006 15:51:00 -0800 Subject: [Python-Dev] buildbot is all green In-Reply-To: <43FA52C8.70201@voidspace.org.uk> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> Message-ID: <20060220154659.5FF5.JCARLSON@uci.edu> Michael Foord wrote: > Martin v. L?wis wrote: > > Any build process would do. I would prefer to see the official tools on > > the buildbot (i.e. VS.NET 2003), I can get a free academic license for VS.NET 2003 professional with my university (MSDNAA), and I've also got a Windows machine sitting in my office with a few spare cycles. > > One issue is that you also have to to work with me on defining the build > > steps: what sequence of commands to send in what order. For Unix, that > > is easy; for Windows, not so. If you're up for it, I'm up for it. It'll take me a bit to get the software on the machine. Want me to ping you when I get the toolset installed? - Josiah From seojiwon at gmail.com Tue Feb 21 00:55:19 2006 From: seojiwon at gmail.com (Jiwon Seo) Date: Mon, 20 Feb 2006 15:55:19 -0800 Subject: [Python-Dev] problem with genexp In-Reply-To: References: Message-ID: Regarding this Grammar change; (last October) from argument: [test '=' ] test [gen_for] to argument: test [gen_for] | test '=' test ['(' gen_for ')'] - to raise error for "bar(a = i for i in range(10)) )" I think we should change it to argument: test [gen_for] | test '=' test instead of argument: test [gen_for] | test '=' test ['(' gen_for ')'] that is, without ['(' gen_for ')'] . We don't need that extra term, because "test" itself includes generator expressions - with all those parentheses.
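[Editorial note: the ambiguity being discussed can be checked directly with compile(); this sketch reflects current CPython behavior, where the unparenthesized form is rejected — in 2.4.2 it was still accepted, which is the compatibility concern raised in the quoted thread.]

```python
# Is "a = i" a keyword argument, or part of the generator expression?
# The Grammar change makes this ambiguous form a SyntaxError:
try:
    compile("bar(a = i for i in range(10))", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
assert rejected

# Parenthesized, the intent is unambiguous and the call compiles:
compile("bar(a=(i for i in range(10)))", "<example>", "eval")

# A bare generator expression as the sole argument is still fine:
compile("bar(i for i in range(10))", "<example>", "eval")
```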
Actually with that extra ['(' gen_for ')'] , foo(a= 10 (for y in 'a')) is grammatically correct; although that error seems to be checked elsewhere. I tested without ['(' gen_for ')'], and it worked fine, passing Lib/test/test_genexps.py -Jiwon On 10/20/05, Neal Norwitz wrote: > On 10/16/05, Neal Norwitz wrote: > > On 10/10/05, Neal Norwitz wrote: > > > There's a problem with genexp's that I think really needs to get > > > fixed. See http://python.org/sf/1167751 the details are below. This > > > code: > > > > > > >>> foo(a = i for i in range(10)) > > > > > > I agree with the bug report that the code should either raise a > > > SyntaxError or do the right thing. > > > > The change to Grammar/Grammar below seems to fix the problem and all > > the tests pass. Can anyone comment on whether this fix is > > correct/appropriate? Is there a better way to fix the problem? > > Since no one responded other than Jiwon, I checked in this change. I > did *not* backport it since what was syntactically correct in 2.4.2 > would raise an error in 2.4.3. I'm not sure which is worse. I'll > leave it up to Anthony whether this should be backported. > > BTW, the change was the same regardless of old code vs. new AST code. > > n > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/seojiwon%40gmail.com > From greg.ewing at canterbury.ac.nz Tue Feb 21 01:00:32 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 13:00:32 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: <43FA5820.8040705@canterbury.ac.nz> Brett Cannon wrote: > While #3 is my preferred solution as well, it does pose a Liskov > violation if this is a direct dict subclass I'm not sure we should be too worried about that.
Inheritance in Python has always been more about implementation than interface, so Liskov doesn't really apply in the same way it does in statically typed languages. In other words, just because A inherits from B in Python isn't meant to imply that an A is a drop-in replacement for a B. Greg From trentm at ActiveState.com Tue Feb 21 01:17:45 2006 From: trentm at ActiveState.com (Trent Mick) Date: Mon, 20 Feb 2006 16:17:45 -0800 Subject: [Python-Dev] Win64 AMD64 (aka x64) binaries available64 In-Reply-To: <43FA4A1B.3030209@v.loewis.de> References: <43FA4A1B.3030209@v.loewis.de> Message-ID: <20060221001745.GC13482@activestate.com> [Martin v. Loewis wrote] > If you want to build extensions for this build using distutils, you > need to > ... > 2. open an AMD64 retail shell > ... > > It might be possible to drop 2) some day, but finding the SDK from > the registry is really tricky. Look for: def find_platform_sdk_dir() here: http://cvs.sourceforge.net/viewcvs.py/pywin32/pywin32/setup.py?view=markup That is the best code I know for doing that. Trent -- Trent Mick TrentM at ActiveState.com From tdelaney at avaya.com Tue Feb 21 01:19:24 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 21 Feb 2006 11:19:24 +1100 Subject: [Python-Dev] defaultdict proposal round three Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> Greg Ewing wrote: > In other words, just because A inherits from B in > Python isn't meant to imply that an A is a drop-in > replacement for a B. Hmm - this is interesting. I'm not arguing Liskov violations or anything ... However, *because* Python uses duck typing, I tend to feel that subclasses in Python *should* be drop-in replacements. If it's not a drop-in replacement, then it should probably not subclass, but just use duck typing (probably by wrapping). Subclassing implies a stronger relationship to me. Which is why I think I prefer using a wrapper for a default dict, rather than a subclass. 
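[Editorial note: a minimal sketch of the wrapper alternative Greg and Tim describe — the class name and API here are illustrative, not part of any proposal. The wrapper holds a plain dict instead of subclassing, and deliberately omits __contains__, so 'in' tests must go to the exposed underlying dict, where the semantics are unambiguous.]

```python
class DefaultWrapper:
    """Illustrative only: wraps a plain dict rather than subclassing."""
    def __init__(self, default_factory, data=None):
        self.data = dict(data) if data is not None else {}
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return self.data[key]
        except KeyError:
            value = self.data[key] = self.default_factory()
            return value

    def __setitem__(self, key, value):
        self.data[key] = value

    # no __contains__ on purpose: membership tests use .data directly

w = DefaultWrapper(list)
w['x'].append(1)             # miss: a default list is created and stored
assert w.data == {'x': [1]}
assert 'y' not in w.data     # 'in' goes to the underlying dict
```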
Tim Delaney From martin at v.loewis.de Tue Feb 21 01:36:57 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 01:36:57 +0100 Subject: [Python-Dev] buildbot is all green In-Reply-To: <20060220154659.5FF5.JCARLSON@uci.edu> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> Message-ID: <43FA60A9.4030209@v.loewis.de> Josiah Carlson wrote: > If you're up for it, I'm up for it. It'll take me a bit to get the > software on the machine. Want me to ping you when I get the toolset > installed? Sure! That should work fine. It would be best if the buildbot would run with the environment variables all set up, so that both svn.exe and devenv.exe can be found in the path. Then I would need the sequence of commands that the buildbot master should issue (svn update, build, run tests, clean). Regards, Martin From martin at v.loewis.de Tue Feb 21 01:41:50 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 01:41:50 +0100 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> Message-ID: <43FA61CE.2020501@v.loewis.de> Delaney, Timothy (Tim) wrote: > However, *because* Python uses duck typing, I tend to feel that > subclasses in Python *should* be drop-in replacements. If it's not a > drop-in replacement, then it should probably not subclass, but just use > duck typing (probably by wrapping). Inheritance is more about code reuse than about polymorphism. 
Regards, Martin From rrr at ronadam.com Tue Feb 21 01:40:45 2006 From: rrr at ronadam.com (Ron Adam) Date: Mon, 20 Feb 2006 18:40:45 -0600 Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]] In-Reply-To: <43f9424b.1261003927@news.gmane.org> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <20060218051344.GC28761@panix.com> <43F6E1FA.7080909@v.loewis.de> <43f9424b.1261003927@news.gmane.org> Message-ID: <43FA618D.7070804@ronadam.com> Bengt Richter wrote: > On Sat, 18 Feb 2006 09:59:38 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > Thinking about bytes recently, it occurs to me that bytes are really not intrinsically > numeric in nature. They don't necessarily represent uint8's. E.g., a binary file is > really a sequence of bit octets in its most primitive and abstract sense. In that you would want to do different types of operations on a single byte (an octet) than you would on a str or integer, I agree. Storing byte information as 16- or 32-bit ints could take up rather a lot of memory in some cases. I don't think it's been clarified yet whether the bytes() type would be implemented in C, where it could be a single object with access to its individual bytes via indexing, or as a Python list-type object which stores integers, chars, or some other byte-length object like octets. My first impression is that it would be done in C with a way to access and change the actual bytes. So a Python octet type wouldn't be needed. But if it is implemented as a Python subclass of list or array, then an octet type would probably also be desired.
> Bottom line thought: binary octets aren't numeric ;-) +1 Cheers, Ronald Adam From martin at v.loewis.de Tue Feb 21 01:44:18 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 01:44:18 +0100 Subject: [Python-Dev] Win64 AMD64 (aka x64) binaries available64 In-Reply-To: <20060221001745.GC13482@activestate.com> References: <43FA4A1B.3030209@v.loewis.de> <20060221001745.GC13482@activestate.com> Message-ID: <43FA6262.2030606@v.loewis.de> Trent Mick wrote: > Look for: > def find_platform_sdk_dir() > here: > http://cvs.sourceforge.net/viewcvs.py/pywin32/pywin32/setup.py?view=markup > > That is the best code I know for doing that. Right; I was planning something similar (although I would probably hard-code the 2003 SP1 registry key - it is not at all certain that future SDK releases will use the same registry scheme, and Microsoft has tricked users often enough in thinking they understood the scheme, just to change it with the next release entirely). Regards, Martin From aleaxit at gmail.com Tue Feb 21 01:55:34 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 20 Feb 2006 16:55:34 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: On Feb 20, 2006, at 3:04 PM, Brett Cannon wrote: ... >> - "Yes and it should be the only constructor argument." This is my ... > While #3 is my preferred solution as well, it does pose a Liskov > violation if this is a direct dict subclass instead of storing a dict How so? Liskov's principle is (in her own words): If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2 then S is a subtype of T. How can this ever be broken by the mere presence of incompatible signatures for T's and S's ctors? 
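[Editorial note: a runnable sketch of the introspection argument Alex develops in the rest of this message — hypothetical classes T and S, not code from the thread. Merely adding a method to a subtype is enough for a suitably contrived program P to "change behavior", while constructor signatures never enter into it.]

```python
import math

class T:
    pass

class S(T):
    def extra(self):          # a perfectly legal subtype extension
        return 42

o1, o2 = S(), T()
N = len(dir(o2))              # the constant N Alex picks for type T
assert len(dir(o1)) > N       # the added method is visible to dir()

assert math.sqrt(N - len(dir(o2))) == 0.0   # fine for a T instance
try:
    math.sqrt(N - len(dir(o1)))             # negative for an S instance
    changed = False
except ValueError:
    changed = True            # the "changed behavior" of program P
assert changed
```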
I believe the principle, as stated above, was imperfectly stated, btw (it WAS preceded by "something like the following substitution property", indicating that Liskov was groping towards a good formulation), but that's an aside -- the point is that the principle is about substitution of _objects_, i.e., _instances_ of the types S and T, not about substitution of the _types_ themselves for each other. Instances exist and are supposed to satisfy their invariants _after_ ctors are done executing; ctors' signatures don't matter. In Python, of course, you _could_ call type(o2)(...) and possibly get different behavior if that was changed into type(o1)(...) -- the curse of powerful introspection;-). But then, isn't it trivial to obtain cases in which the behavior is NOT unchanged? If it was always unchanged, what would be the point of ever subclassing?-) Say that o2 is an int and o1 is a bool -- just a "print o2" already breaks the principle as stated (it's harder to get a simpler P than this...). Unless you have explicitly documented invariants (such as "any 'print o' must emit 1+ digits followed by a newline" for integers), you cannot say that some alleged subclass is breaking Liskov's property, in general. Mere "change of behavior" in the most general case cannot qualify, if method overriding is to be any use; such change IS traditionally allowed as long as preconditions are looser and postconditions are stricter; and I believe that in any real-world subclassing, with sufficient introspection you'll always find a violation. E.g., a subtype IS allowed to add methods, by Liskov's specific example; but then, len(dir(o1)) cannot fail to be a higher number than len(dir(o2)), from which you can easily construct a P which "changes behavior" for any definition you care to choose. E.g., pick constant N as the len(dir(...)) for instances of type T, and say that M>N is the len(dir(...)) for instances of S.
Well, then, math.sqrt(N-len(dir(o2))) is well defined -- but change o2 into o1, and since N-M is <0, you'll get an exception. If you can give an introspection-free example showing how Liskov substitution would be broken by a mere change to incompatible signature in the ctor, I'll be grateful; but I don't think it can be done. Alex From python at rcn.com Tue Feb 21 02:05:33 2006 From: python at rcn.com (Raymond Hettinger) Date: Mon, 20 Feb 2006 20:05:33 -0500 Subject: [Python-Dev] defaultdict proposal round three References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> Message-ID: <013801c63682$eb39a5f0$7600a8c0@RaymondLaptop1> [Alex] >> I see d[k]+=1 as a substantial improvement -- conceptually more >> direct, "I've now seen one more k than I had seen before". [Guido] > Yes, I now agree. This means that I'm withdrawing proposal A (new > method) and championing only B (a subclass that implements > __getitem__() calling on_missing() and on_missing() defined in that > subclass as before, calling default_factory unless it's None). I don't > think this crisis is big enough to need *two* solutions, and this > example shows B's superiority over A. FWIW, I'm happy with the proposal and think it is a nice addition to Py2.5. Raymond From aahz at pythoncraft.com Tue Feb 21 02:11:48 2006 From: aahz at pythoncraft.com (Aahz) Date: Mon, 20 Feb 2006 17:11:48 -0800 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <43FA60A9.4030209@v.loewis.de> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> Message-ID: <20060221011148.GA20714@panix.com> If you're willing to commit to running a buildbot, and the only thing preventing you is shelling out $$$ to Microsoft, send me e-mail. I'll compile a list to send to the PSF and we'll either poke Microsoft to provide some more free licenses or pay for it ourselves. This is what the PSF is for! Note the emphasis on the word "commit", please. 
I'm setting an arbitrary deadline of Saturday Feb 25 so I don't have to monitor indefinitely. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From aleaxit at gmail.com Tue Feb 21 02:46:06 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 20 Feb 2006 17:46:06 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <013801c63682$eb39a5f0$7600a8c0@RaymondLaptop1> References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> <013801c63682$eb39a5f0$7600a8c0@RaymondLaptop1> Message-ID: <5A1C75B0-FFAC-4B70-B27C-FE58F287A2F7@gmail.com> On Feb 20, 2006, at 5:05 PM, Raymond Hettinger wrote: > [Alex] >>> I see d[k]+=1 as a substantial improvement -- conceptually more >>> direct, "I've now seen one more k than I had seen before". > > [Guido] >> Yes, I now agree. This means that I'm withdrawing proposal A (new >> method) and championing only B (a subclass that implements >> __getitem__() calling on_missing() and on_missing() defined in that >> subclass as before, calling default_factory unless it's None). I >> don't >> think this crisis is big enough to need *two* solutions, and this >> example shows B's superiority over A. > > FWIW, I'm happy with the proposal and think it is a nice addition > to Py2.5. OK, sounds great to me. collections.defaultdict, then? Alex From crutcher at gmail.com Tue Feb 21 02:57:30 2006 From: crutcher at gmail.com (Crutcher Dunnavant) Date: Mon, 20 Feb 2006 17:57:30 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <007701c63666$aaf98080$7600a8c0@RaymondLaptop1> References: <007701c63666$aaf98080$7600a8c0@RaymondLaptop1> Message-ID: in two ways: 1) dict.get doesn't work for object dicts or in exec/eval contexts, and 2) dict.get requires me to generate the default value even if I'm not going to use it, a process which may be expensive. 
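[Editorial note: Crutcher's second point is easy to demonstrate — a sketch, not code from the thread. dict.get evaluates its default argument unconditionally, while a factory function is only called on an actual miss.]

```python
calls = []

def expensive_default():
    calls.append(1)          # stand-in for an expensive computation
    return []

d = {'k': [1, 2]}

# dict.get builds the default even though 'k' is present:
assert d.get('k', expensive_default()) == [1, 2]
assert len(calls) == 1       # one wasted construction

# a factory defers the work until the key is actually missing:
factory = expensive_default
value = d['k'] if 'k' in d else factory()
assert value == [1, 2] and len(calls) == 1   # hit: factory never called

value = d.get('j') or factory()
assert value == [] and len(calls) == 2       # miss: built exactly once
```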
On 2/20/06, Raymond Hettinger wrote: > [Crutcher Dunnavant ] > >> There are many times that I want d[key] to give me a value even when > >> it isn't defined, but that doesn't always mean I want to _save_ that > >> value in the dict. > > How does that differ from the existing dict.get method? > > > Raymond > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From guido at python.org Tue Feb 21 03:03:34 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 18:03:34 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <5A1C75B0-FFAC-4B70-B27C-FE58F287A2F7@gmail.com> References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> <013801c63682$eb39a5f0$7600a8c0@RaymondLaptop1> <5A1C75B0-FFAC-4B70-B27C-FE58F287A2F7@gmail.com> Message-ID: On 2/20/06, Alex Martelli wrote: > > [Alex] > >>> I see d[k]+=1 as a substantial improvement -- conceptually more > >>> direct, "I've now seen one more k than I had seen before". > > > > [Guido] > >> Yes, I now agree. This means that I'm withdrawing proposal A (new > >> method) and championing only B (a subclass that implements > >> __getitem__() calling on_missing() and on_missing() defined in that > >> subclass as before, calling default_factory unless it's None). I don't > >> think this crisis is big enough to need *two* solutions, and this > >> example shows B's superiority over A. [Raymond] > > FWIW, I'm happy with the proposal and think it is a nice addition > > to Py2.5. [Alex] > OK, sounds great to me. collections.defaultdict, then? I have a patch ready that implements this. I've assigned it to Raymond for review. I'm just reusing the same SF patch as before: python.org/sf/1433928. One subtlety: for maximul flexibility and speed, the standard dict type now defines an on_missing(key) method; however this version *just* raises KeyError and the implementation actually doesn't call it unless the class is a subtype (with the possibility of overriding on_missing()). 
collections.defaultdict overrides on_missing(key) to insert and return self.fefault_factory() if it is not empty; otherwise it raises KeyError. (It should really call the base class on_missing() but I figured I'd just in-line it which is easier to code in C than a super-call.) The defaultdict signature takes an optional positional argument which is the default_factory, defaulting to None. The remaining positional and all keyword arguments are passed to the dict constructor. IOW: d = defaultdict(list, [(1, 2)]) is equivalent to: d = defaultdict() d.default_factory = list d.update([(1, 2)]) At this point, repr(d) will be: defaultdict(, {1: 2}) Once Raymond approves the patch I'll check it in. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 21 03:06:13 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 18:06:13 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> <013801c63682$eb39a5f0$7600a8c0@RaymondLaptop1> <5A1C75B0-FFAC-4B70-B27C-FE58F287A2F7@gmail.com> Message-ID: On 2/20/06, Guido van Rossum wrote: > [stuff with typos] Here's the proofread version: I have a patch ready that implements this. I've assigned it to Raymond for review. I'm just reusing the same SF patch as before: http://python.org/sf/1433928 . One subtlety: for maximal flexibility and speed, the standard dict type now defines an on_missing(key) method; however this version *just* raises KeyError and the implementation actually doesn't call it unless the class is a subtype (with the possibility of overriding on_missing()). collections.defaultdict overrides on_missing(key) to insert and return self.default_factory() if it is not None; otherwise it raises KeyError. (It should really call the base class on_missing() but I figured I'd just in-line it which is easier to code in C than a super-call.) 
The defaultdict signature takes an optional positional argument which is the default_factory, defaulting to None. The remaining positional and all keyword arguments are passed to the dict constructor. IOW: d = defaultdict(list, [(1, 2)]) is equivalent to: d = defaultdict() d.default_factory = list d.update([(1, 2)]) At this point, repr(d) will be: defaultdict(<type 'list'>, {1: 2}) Once Raymond approves the patch I'll check it in. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Tue Feb 21 03:10:21 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 21 Feb 2006 13:10:21 +1100 Subject: [Python-Dev] defaultdict proposal round three Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB972@au3010avexu1.global.avaya.com> "Martin v. L?wis" wrote: > Delaney, Timothy (Tim) wrote: >> However, *because* Python uses duck typing, I tend to feel that >> subclasses in Python *should* be drop-in replacements. If it's not a >> drop-in replacement, then it should probably not subclass, but just >> use duck typing (probably by wrapping). > > Inheritance is more about code reuse than about polymorphism. Oh - it's definitely no hard-and-fast rule. However, I have found that *usually* people (including myself) only subclass when they want an is-a relationship, whereas duck typing is behaves-like. In any case, Guido has produced a patch, and the tone of his message sounded like a Pronouncement ... Tim Delaney From guido at python.org Tue Feb 21 03:12:50 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 18:12:50 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FA4E88.4090507@canterbury.ac.nz> References: <43FA4E88.4090507@canterbury.ac.nz> Message-ID: On 2/20/06, Greg Ewing wrote: > Have you considered the third alternative that's been > mentioned -- a wrapper? I don't like that at all.
It's quite tricky to implement a fully transparent wrapper that supports all the special methods (__setitem__ etc.). It will be slower. And it will be more cumbersome to use. > The issue of __contains__ etc. could be sidestepped by > not giving the wrapper a __contains__ method at all. > If you want to do an 'in' test you do it on the > underlying dict, and then the semantics are clear. The semantics of defaultdict are crystal clear. __contains__(), keys() and friends represent the *actual*, *current* keys. Only __getitem__() calls on_missing() when the key is not present; being a "hook", on_missing() can do whatever it wants. What's the practical use case for not wanting __contains__() to function? All I hear is fear of theoretical bugs. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 21 03:48:19 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 18:48:19 -0800 Subject: [Python-Dev] readline compilarion fails on OSX Message-ID: On OSX (10.4.4) the readline module in the svn HEAD fails compilation as follows. This is particularly strange since the buildbot is green for OSX... What could be up with this? building 'readline' extension gcc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I/Users/guido/projects/python/trunk/./Include -I/Users/guido/projects/python/trunk/./Mac/Include -I../Include -I. 
-I/usr/local/include -I/Users/guido/projects/python/trunk/Include -I/Users/guido/projects/python/trunk/osx -c /Users/guido/projects/python/trunk/Modules/readline.c -o build/temp.darwin-8.4.0-Power_Macintosh-2.5/readline.o /Users/guido/projects/python/trunk/Modules/readline.c: In function 'write_history_file': /Users/guido/projects/python/trunk/Modules/readline.c:112: warning: implicit declaration of function 'history_truncate_file' /Users/guido/projects/python/trunk/Modules/readline.c: In function 'py_remove_history': /Users/guido/projects/python/trunk/Modules/readline.c:301: warning: implicit declaration of function 'remove_history' /Users/guido/projects/python/trunk/Modules/readline.c:301: warning: assignment makes pointer from integer without a cast /Users/guido/projects/python/trunk/Modules/readline.c:310: warning: passing argument 1 of 'free' discards qualifiers from pointer target type /Users/guido/projects/python/trunk/Modules/readline.c:312: warning: passing argument 1 of 'free' discards qualifiers from pointer target type /Users/guido/projects/python/trunk/Modules/readline.c: In function 'py_replace_history': /Users/guido/projects/python/trunk/Modules/readline.c:338: warning: implicit declaration of function 'replace_history_entry' /Users/guido/projects/python/trunk/Modules/readline.c:338: warning: assignment makes pointer from integer without a cast /Users/guido/projects/python/trunk/Modules/readline.c:347: warning: passing argument 1 of 'free' discards qualifiers from pointer target type /Users/guido/projects/python/trunk/Modules/readline.c:349: warning: passing argument 1 of 'free' discards qualifiers from pointer target type /Users/guido/projects/python/trunk/Modules/readline.c: In function 'get_current_history_length': /Users/guido/projects/python/trunk/Modules/readline.c:453: error: 'HISTORY_STATE' undeclared (first use in this function) /Users/guido/projects/python/trunk/Modules/readline.c:453: error: (Each undeclared identifier is reported only 
once /Users/guido/projects/python/trunk/Modules/readline.c:453: error: for each function it appears in.) /Users/guido/projects/python/trunk/Modules/readline.c:453: error: 'hist_st' undeclared (first use in this function) /Users/guido/projects/python/trunk/Modules/readline.c:455: warning: implicit declaration of function 'history_get_history_state' /Users/guido/projects/python/trunk/Modules/readline.c: In function 'insert_text': /Users/guido/projects/python/trunk/Modules/readline.c:503: warning: implicit declaration of function 'rl_insert_text' /Users/guido/projects/python/trunk/Modules/readline.c: In function 'on_completion': /Users/guido/projects/python/trunk/Modules/readline.c:637: error: 'rl_attempted_completion_over' undeclared (first use in this function) /Users/guido/projects/python/trunk/Modules/readline.c: In function 'flex_complete': /Users/guido/projects/python/trunk/Modules/readline.c:675: warning: passing argument 2 of 'completion_matches' from incompatible pointer type /Users/guido/projects/python/trunk/Modules/readline.c: In function 'setup_readline': /Users/guido/projects/python/trunk/Modules/readline.c:700: warning: passing argument 2 of 'rl_bind_key_in_map' from incompatible pointer type /Users/guido/projects/python/trunk/Modules/readline.c:701: warning: passing argument 2 of 'rl_bind_key_in_map' from incompatible pointer type /Users/guido/projects/python/trunk/Modules/readline.c: In function 'readline_until_enter_or_signal': /Users/guido/projects/python/trunk/Modules/readline.c:758: warning: passing argument 2 of 'rl_callback_handler_install' from incompatible pointer type /Users/guido/projects/python/trunk/Modules/readline.c:788: warning: implicit declaration of function 'rl_free_line_state' /Users/guido/projects/python/trunk/Modules/readline.c:789: warning: implicit declaration of function 'rl_cleanup_after_signal' /Users/guido/projects/python/trunk/Modules/readline.c: In function 'call_readline': 
/Users/guido/projects/python/trunk/Modules/readline.c:883: error: 'HISTORY_STATE' undeclared (first use in this function) /Users/guido/projects/python/trunk/Modules/readline.c:883: error: 'state' undeclared (first use in this function) /Users/guido/projects/python/trunk/Modules/readline.c:885: warning: assignment discards qualifiers from pointer target type (Yes, the keynote slides are coming along just fine... :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Tue Feb 21 04:04:08 2006 From: bob at redivi.com (Bob Ippolito) Date: Mon, 20 Feb 2006 19:04:08 -0800 Subject: [Python-Dev] readline compilarion fails on OSX In-Reply-To: References: Message-ID: <2A98FC2D-2D3F-4DEB-B0DC-63E174EAEABE@redivi.com> On Feb 20, 2006, at 6:48 PM, Guido van Rossum wrote: > On OSX (10.4.4) the readline module in the svn HEAD fails compilation > as follows. This is particularly strange since the buildbot is green > for OSX... What could be up with this? > > building 'readline' extension -lots of build junk- In Apple's quest to make our lives harder, they installed BSD libedit and symlinked it to readline. Python doesn't like that. The buildbot might have a real readline installation, or maybe the buildbot is skipping those tests. You'll need to install a real libreadline if you want it to work. I've also put together a little tarball that'll build readline.so statically, and there's pre-built eggs for OS X so the easy_install should be quick: http://python.org/pypi/readline -bob From guido at python.org Tue Feb 21 04:18:24 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2006 19:18:24 -0800 Subject: [Python-Dev] readline compilarion fails on OSX In-Reply-To: <2A98FC2D-2D3F-4DEB-B0DC-63E174EAEABE@redivi.com> References: <2A98FC2D-2D3F-4DEB-B0DC-63E174EAEABE@redivi.com> Message-ID: Thanks! That worked. But shouldn't we try to fix setup.py to detect this situation instead of making loud clattering noises? 
--Guido On 2/20/06, Bob Ippolito wrote: > > On Feb 20, 2006, at 6:48 PM, Guido van Rossum wrote: > > > On OSX (10.4.4) the readline module in the svn HEAD fails compilation > > as follows. This is particularly strange since the buildbot is green > > for OSX... What could be up with this? > > > > building 'readline' extension > -lots of build junk- > > In Apple's quest to make our lives harder, they installed BSD libedit > and symlinked it to readline. Python doesn't like that. The > buildbot might have a real readline installation, or maybe the > buildbot is skipping those tests. > > You'll need to install a real libreadline if you want it to work. > > I've also put together a little tarball that'll build readline.so > statically, and there's pre-built eggs for OS X so the easy_install > should be quick: > http://python.org/pypi/readline > > -bob > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.peters at gmail.com Tue Feb 21 04:24:59 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 20 Feb 2006 22:24:59 -0500 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <20060221011148.GA20714@panix.com> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> Message-ID: <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> [Aahz] > If you're willing to commit to running a buildbot, and the only thing > preventing you is shelling out $$$ to Microsoft, send me e-mail. I'll > compile a list to send to the PSF and we'll either poke Microsoft to > provide some more free licenses or pay for it ourselves. This is what > the PSF is for! Speaking as a PSF director, I might not vote for that :-) Fact is I've been keeping the build & tests 100% healthy on WinXP Pro, and that requires more than just running the tests (it also requires repairing compiler warnings and Unixisms). 
Speaking of which, a number of test failures over the past few weeks were provoked here only under -r (run tests in random order) or under a debug build, and didn't look like those were specific to Windows. Adding -r to the buildbot test recipe is a decent idea. Getting _some_ debug-build test runs would also be good (or do we do that already?). Anyway, since XP Pro is effectively covered, I'd be keener to see a Windows buildbot running under a different flavor of Windows. I expect I'll eventually volunteer my home box to run an XP buildbot, but am in no hurry (and probably won't leave any machine here turned on 24/7 regardless). From rhamph at gmail.com Tue Feb 21 04:55:22 2006 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 20 Feb 2006 20:55:22 -0700 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: On 2/20/06, Jim Jewett wrote: > Adam Olsen asked: > > ... d.getorset(key, func) would work in your use cases? > > It is an improvement over setdefault, because it doesn't > always evaluate the expensive func. (But why should every > call have to pass in the function, when it is a property of > the dictionary?) Because usually it's a property of how you use it, not a property of the dictionary. The dictionary is just a generic storage mechanism. > [snip] > In other words, the program would work correctly if I passed > in a normal but huge dictionary; I want to avoid that for reasons > of efficiency. This isn't the only use for a mapping, but it is > the only one I've seen where KeyError is "expected" by the > program's normal flow. Looking at your explanation, I agree, getorset is useless for that use case. However, I'm beginning to think we shouldn't be comparing them. defaultdict is a powerful but heavyweight option, intended for complicated behavior. getorset and setdefault are intended to be very lightweight, even lighter than the "try/except KeyError" and "if key not in X: X[key] = default" memes we have right now. 
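[Editorial note: the getorset() being discussed was never added to dict -- setdefault() survived -- but the lazier behavior under debate can be sketched as a plain helper function; the name and signature below are simply the ones from this thread, not an actual API:]

```python
# Hypothetical getorset(): like dict.setdefault(), but the default is
# computed lazily, so an expensive factory runs only when the key is
# actually missing.
def getorset(d, key, func):
    try:
        return d[key]
    except KeyError:
        value = d[key] = func()
        return value

calls = []
def expensive():
    calls.append(None)
    return []

d = {'a': 1}
assert getorset(d, 'a', expensive) == 1
assert calls == []  # factory was not evaluated for a present key
assert getorset(d, 'b', expensive) == []
assert d['b'] == [] and len(calls) == 1  # evaluated once for a missing key
```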
getorset's factory function is only appropriate for preexisting functions, not user defined ones. Essentially, I believe getorset should be discussed on its own merits, independent of the addition of a defaultdict class. Perhaps discussion of it (and the deprecation of setdefault) should wait until after defaultdict has been completed? -- Adam Olsen, aka Rhamphoryncus From barry at python.org Tue Feb 21 05:11:26 2006 From: barry at python.org (Barry Warsaw) Date: Mon, 20 Feb 2006 23:11:26 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: <20060218125325.pz5tgfem0qdcssc0@monbureau3.cirad.fr> References: <20060216212133.GB23859@xs4all.nl> <17397.58935.8669.271616@montanaro.dyndns.org> <20060218125325.pz5tgfem0qdcssc0@monbureau3.cirad.fr> Message-ID: <1140495086.17689.74.camel@geddy.wooz.org> On Sat, 2006-02-18 at 12:53 +0100, Pierre Barbier de Reuille wrote:
> > Guido> Over lunch with Alex Martelli, he proposed that a subclass of
> > Guido> dict with this behavior (but implemented in C) would be a good
> > Guido> addition to the language.

I agree that .setdefault() is a well-intentioned failure, although I'm much less concerned about any potential performance impact than the fact that it's completely unreadable. And while I like the basic idea, I also agree that deriving from dict is problematic, both because the constructor signature is tough to forward, and also because dict is such a fundamental type that APIs that return dicts may have to be changed to allow passing in a factory type. I'd rather like to see what Pierre proposes, with a few minor differences.

> Well, first not to break the current interface, and second because I think it
> reads better I would prefer :
>
> d = {'a': 1}
> d['b']       # raises KeyError
> d.get('c')   # evaluates to None
> d.default = 42
> d['b']       # evaluates to 42
> d.get('c')   # evaluates to 42

So far so good.

> And to undo the default, you can simply do :
>
> del d.default

Although this I'm not crazy about.
If you let .default be a callable, you could also write this as

    def keyerror():
        raise KeyError
    d.default = keyerror

or possibly just this as a shortcut:

    d.default = KeyError

> > The only question in my mind is whether or not getting a non-existent value
> > under the influence of a given default value should stick that value in the
> > dictionary or not.

Agreed. I'm not sure whether .get(onearg) should return None or .default. I /think/ I want the latter, but I'd have to play with some real code to know for sure. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060220/3adbcbd6/attachment.pgp From bob at redivi.com Tue Feb 21 05:27:22 2006 From: bob at redivi.com (Bob Ippolito) Date: Mon, 20 Feb 2006 20:27:22 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <7C98F41D-2857-4EE4-AD70-3BB63978417F@redivi.com> On Feb 20, 2006, at 7:25 PM, Stephen J. Turnbull wrote:
>>>>>> "Martin" == Martin v Löwis writes:
>
> Martin> Please do take a look. It is the only way: If you were to
> Martin> embed base64 *bytes* into character data content of an XML
> Martin> element, the resulting XML file might not be well-formed
> Martin> anymore (if the encoding of the XML file is not an ASCII
> Martin> superencoding).
>
> Excuse me, I've been doing category theory recently.
> By "embedding" I
> mean a map from an intermediate object which is a stream of bytes to
> the corresponding stream of characters. In the case of UTF-16-coded
> characters, this would necessarily imply a representation change, as
> you say.
>
> What I advocate for Python is to require that the standard base64
> codec be defined only on bytes, and always produce bytes. Any
> representation change should be done explicitly. This is surely
> conformant with RFC 2045's definition and with RFC 3548.

+1 -bob From almann.goo at gmail.com Tue Feb 21 05:29:53 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Mon, 20 Feb 2006 23:29:53 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes Message-ID: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> I am considering developing a PEP for enabling a mechanism to assign to free variables in a closure (nested function). My rationale is that with the advent of PEP 227, Python has proper nested lexical scopes, but can have undesirable behavior (especially with new developers) when a user wants to make an assignment to a free variable within a nested function. Furthermore, the numerous kludges I have seen that "solve" the problem with a mutable object, like a list, as the free variable do not seem "Pythonic." I have also seen mention that the use of classes can mitigate this, but that seems, IMHO, heavy handed in cases when an elegant solution using a closure would suffice and be more appropriate--especially when Python already has nested lexical scopes. I propose two possible approaches to solve this issue:

1. Adding a keyword such as "use" that would follow similar semantics as "global" does today. A nested scope could declare names with this keyword to enable assignment to such names to change the closest parent's binding.
The semantic would be to keep the behavior we experience today but tell the compiler/interpreter that a name declared with the "use" keyword would explicitly use an enclosing scope. I personally like this approach the most since it would seem to be in keeping with the current way the language works and would probably be the most backwards compatible. The semantics for how this interacts with the global scope would also need to be defined (should "use" be equivalent to a global when no name exists in any parent scope, etc.)

    def incgen( inc = 1 ) :
        a = 6
        def incrementer() :
            use a    #use a, inc <-- list of names okay too
            a += inc
            return a
        return incrementer

Of course, this approach suffers from a downside that every nested scope that wanted to assign to a parent scope's name would need to have the "use" keyword for those names--but one could argue that this is in keeping with one of Python's philosophies that "Explicit is better than implicit" (PEP 20). This approach also has to deal with a user declaring a name with "use" that is a named parameter--this would be a semantic error that could be handled like "global" does today with a SyntaxError.

2. Adding a keyword such as "scope" that would behave similarly to JavaScript's "var" keyword. A name could be declared with such a keyword optionally and all nested scopes would use the declaring scope's binding when accessing or assigning to a particular name. This approach has similar benefits to my first approach, but is clearly more top-down than the first approach. Subsequent "scope" declarations would create a new binding at the declaring scope for the declaring and child scopes to use. This could potentially be a gotcha for users expecting the binding semantics in place today. Also the scope keyword would have to be allowed to be used on parameters to allow such parameter names to be used in a similar fashion in a child scope.
    def incgen( inc = 1 ) :
        #scope inc <-- allow scope declaration for bound parameters (not a big fan of this)
        scope a = 6
        def incrementer() :
            a += inc
            return a
        return incrementer

This approach would be similar to languages like JavaScript that allow for explicit scope binding with the use of "var" or more static languages that allow re-declaring names at lower scopes. I am less in favor of this, because I don't think it feels very "Pythonic". As a point of reference, some languages such as Ruby will only bind a new name to a scope on assignment when an enclosing scope does not have the name bound. I do believe the Python name binding semantics have issues (for which the "global" keyword was born), but I feel that "fixing" the Python semantic to a more "Ruby-like" one adds as many problems as it solves, since the "Ruby-like" one is just as implicit in nature. Not to mention the backwards compatibility impact is probably much larger. I would like the community's opinion if there are enough out there who think this would be a worthwhile endeavour--or if there is already an initiative that I missed. Please let me know your questions, comments. Best Regards, Almann -- Almann T. Goo almann.goo at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060220/6d910c2f/attachment.html From barry at python.org Tue Feb 21 05:29:59 2006 From: barry at python.org (Barry Warsaw) Date: Mon, 20 Feb 2006 23:29:59 -0500 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <1140496199.17689.78.camel@geddy.wooz.org> On Fri, 2006-02-17 at 11:09 -0800, Guido van Rossum wrote:
> Thanks for all the constructive feedback. Here are some responses and
> a new proposal.
>
> - Yes, I'd like to kill setdefault() in 3.0 if not sooner.

A worthy goal, but not possible unless you want to break existing code. I don't think it's worth a DeprecationWarning either.
Slating it for removal in 3.0 seems fine. Everything else about your proposal seems great. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060220/064a9aab/attachment.pgp From skip at pobox.com Tue Feb 21 05:34:04 2006 From: skip at pobox.com (skip at pobox.com) Date: Mon, 20 Feb 2006 22:34:04 -0600 Subject: [Python-Dev] readline compilarion fails on OSX In-Reply-To: References: <2A98FC2D-2D3F-4DEB-B0DC-63E174EAEABE@redivi.com> Message-ID: <17402.38972.847502.11936@montanaro.dyndns.org> Guido> But shouldn't we try to fix setup.py to detect this situation Guido> instead of making loud clattering noises? Here's a first-cut try at a setup.py patch: http://python.org/sf/1435651 Unfortunately, I don't think distutils provides a clean way to detect symbols the way configure does, so it's a bit clumsy... Skip From jcarlson at uci.edu Tue Feb 21 05:44:01 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 20 Feb 2006 20:44:01 -0800 Subject: [Python-Dev] Proposal: defaultdict In-Reply-To: References: Message-ID: <20060220203401.5FFE.JCARLSON@uci.edu> "Adam Olsen" wrote: > However, I'm beginning to think we shouldn't be comparing them. > defaultdict is a powerful but heavyweight option, intended for > complicated behavior. Check out Guido's patch. It's not that "heavyweight", and its intended behavior is to make some operations *more* intuitive, if not a bit faster in some cases. Whether or not getorset is introduced, I don't much care, as defaultdict will cover every use case I've been using setdefault for, as well as most of my use cases for get. 
- Josiah From tim.peters at gmail.com Tue Feb 21 05:44:20 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 20 Feb 2006 23:44:20 -0500 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <43FA4E88.4090507@canterbury.ac.nz> Message-ID: <1f7befae0602202044p27f32caatd78dc759e49e753b@mail.gmail.com> [Guido]
> ...
> What's the practical use case for not wanting __contains__() to
> function?

I don't know. I have practical use cases for wanting __contains__() to function, but there's been no call for those. For an example, think of any real use ;-) For example, I often use dicts to represent multisets, where a key maps to a strictly positive count of the number of times that key appears in the multiset. A default of 0 is the right thing to return for a key not in the multiset, so that M[k] += 1 works to add another k to multiset M regardless of whether k was already present. I sure hope I can implement multiset intersection as, e.g.,

    def minter(a, b):
        if len(b) < len(a):  # make `a` the smaller, and iterate over it
            a, b = b, a
        result = defaultdict defaulting to 0, however that's spelled
        for k in a:
            if k in b:
                result[k] = min(a[k], b[k])
        return result

Replacing the loop nest with:

    for k in a:
        result[k] = min(a[k], b[k])

would be semantically correct so far as it goes, but pragmatically wrong: I maintain my "strictly positive count" invariant because consuming RAM to hold elements "that aren't there" can be a pragmatic disaster. (When `k` is in `a` but not in `b`, I don't want `k` to be stored in `result`) I have other examples, but they come so easily it's better to leave that an exercise for the reader.
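[Editorial note: filled out with collections.defaultdict(int) -- the spelling that "defaultdict defaulting to 0" eventually got -- the minter() sketch above runs as follows. The sample multisets are illustrative only:]

```python
from collections import defaultdict

def minter(a, b):
    """Multiset intersection; keys map to strictly positive counts."""
    if len(b) < len(a):  # make `a` the smaller, and iterate over it
        a, b = b, a
    result = defaultdict(int)
    for k in a:
        if k in b:  # plain containment test: no default value is inserted
            result[k] = min(a[k], b[k])
    return result

m1 = defaultdict(int)
for k in 'aabbbc':
    m1[k] += 1   # the M[k] += 1 idiom from the message above
m2 = defaultdict(int)
for k in 'abbd':
    m2[k] += 1

inter = minter(m1, m2)
assert dict(inter) == {'a': 1, 'b': 2}
# The "strictly positive count" invariant holds: no zero-count keys stored.
assert 'c' not in inter and 'd' not in inter
```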
From jcarlson at uci.edu Tue Feb 21 06:01:01 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 20 Feb 2006 21:01:01 -0800 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> Message-ID: <20060220204850.6004.JCARLSON@uci.edu> "Almann T. Goo" wrote:
> I would like the community's opinion if there are enough out there who think
> this would be a worthwhile endeavour--or if there is already an initiative
> that I missed. Please let me know your questions, comments.

-1

Mechanisms which rely on manipulating variables within closures or nested scopes to function properly can be elegant, but I've not yet seen one that *really* is. You state that using classes can be "heavy handed", but one of the major uses of classes is as a *namespace*. Many desired uses of closures (including the various uses you have outlined) are to hide a *namespace*, and combining closures with classes can offer that to you, without requiring a language change. Of course using classes directly with a bit of work can offer you everything you want from a closure, with all of the explicitness that you could ever want. As an aside, you mention both 'use' and 'scope' as possible keyword additions for various uses of nested scopes. In my experience, when one goes beyond 3 or so levels of nested scopes (methods of a class defined within a class namespace, or perhaps methods of a class defined within a method of a class), it starts getting to the point where the programmer is trying to be too clever. - Josiah From almann.goo at gmail.com Tue Feb 21 07:09:38 2006 From: almann.goo at gmail.com (Almann T.
Goo) Date: Tue, 21 Feb 2006 01:09:38 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <20060220204850.6004.JCARLSON@uci.edu> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <20060220204850.6004.JCARLSON@uci.edu> Message-ID: <7e9b97090602202209j534d9addp3a1f797981ebed18@mail.gmail.com>
> Mechanisms which rely on manipulating variables within closures or
> nested scopes to function properly can be elegant, but I've not yet seen
> one that *really* is.

This really isn't a case for or against what I'm proposing since we can already do this in today's Python with mutable variables in an enclosing scope (see below). I am proposing a language change to help make closures more orthogonal to the scoping constructs that are already in place for the global scope.

> You state that using classes can be "heavy handed",
> but one of the major uses of classes is as a *namespace*. Many desired
> uses of closures (including the various uses you have outlined)
> are to hide a *namespace*, and combining closures with classes can offer
> that to you, without requiring a language change.

Closures are also used in more functional styles of programming for defining customized control structures (those Ruby folks like them for this purpose). Granted, you can do this with classes/objects and defined interfaces, but the end result can be somewhat un-natural for some problems--still, I don't want to get into an argument between closures vs. objects, since that is not what my proposal is aimed at and Python already has both.

> Of course using
> classes directly with a bit of work can offer you everything you want
> from a closure, with all of the explicitness that you could ever want.

Really, the easiest way to emulate what I want in today's Python is to create a mutable object (like a dict or list) in the enclosing scope to work around the semantic that the first assignment in a local scope binds a new name.
Doing this seems rather un-natural, and forcing the use of classes doesn't seem more natural:

    def incgen( inc = 1 ) :
        env = [ 6 ]
        def incrementor() :
            env[ 0 ] += inc
            return env[ 0 ]
        return incrementor

This is a work around for something a developer cannot do more naturally today. I do not think using some combination of classes and closures makes things clearer--it is still working around what I would construe as the non-orthogonal nature of nested lexical scopes in Python since the language provides a construct to deal with the problem for global variables.

    a = 6
    def incgen( inc = 1 ) :
        def incrementor() :
            global a
            a += inc
            return a
        return incrementor

Granted this is a somewhat trivial example, but I think it demonstrates my point about how nested lexical scopes are second class (since the language has no equivalent construct for them) and don't behave like the global scope.

> As an aside, you mention both 'use' and 'scope' as possible keyword
> additions for various uses of nested scopes. In my experience, when one
> goes beyond 3 or so levels of nested scopes (methods of a class defined
> within a class namespace, or perhaps methods of a class defined within a
> method of a class), it starts getting to the point where the programmer
> is trying to be too clever.

Even though I may agree with you on this, your argument is more of an argument against PEP 227 than what I am proposing. Again, today's Python already allows a developer to have deep nested scopes. -Almann -- Almann T.
Goo almann.goo at gmail.com From bokr at oz.net Tue Feb 21 07:43:00 2006 From: bokr at oz.net (Bengt Richter) Date: Tue, 21 Feb 2006 06:43:00 GMT Subject: [Python-Dev] defaultdict proposal round three References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> Message-ID: <43fa60a7.1334311178@news.gmane.org> On Mon, 20 Feb 2006 11:09:48 -0800, Alex Martelli wrote:
>
>On Feb 20, 2006, at 8:35 AM, Raymond Hettinger wrote:
>
>> [GvR]
>>> I'm not convinced by the argument
>>> that __contains__ should always return True
>>
>> Me either. I cannot think of a more useless behavior or one more
>> likely to have
>> unexpected consequences. Besides, as Josiah pointed out, it is
>> much easier for
>> a subclass override to substitute always True return values than
>> vice-versa.
>
>Agreed on all counts.
>
>> I prefer this approach over subclassing. The mental load from an
>> additional
>> method is less than the load from a separate type (even a
>> subclass). Also,
>> avoidance of invariant issues is a big plus. Besides, if this allows
>> setdefault() to be deprecated, it becomes an all-around win.
>
>I'd love to remove setdefault in 3.0 -- but I don't think it can be
>done before that: default_factory won't cover the occasional use
>cases where setdefault is called with different defaults at different
>locations, and, rare as those cases may be, any 2.* should not break
>any existing code that uses that approach.
>
>>> - Even if the default_factory were passed to the constructor, it
>>> still
>>> ought to be a writable attribute so it can be introspected and
>>> modified. A defaultdict that can't change its default factory after
>>> its creation is less useful.
>>
>> Right! My preference is to have default_factory not passed to the
>> constructor,
>> so we are left with just one way to do it. But that is a nit.
>
How about doing it as an expression, empowering ( ;-) the dict just after creation?
E.g., for

    d = dict()
    d.default_factory = list

you could write

    d = dict()**list

I made a hack to illustrate functionality (code at end). DD simulates the new dict without defaults.

    >>> d = DD(a=1)
    >>> d
    {'a': 1}

So d is the plain dict with no default action enabled

    >>> ddl = DD()**list
    >>> ddl
    DD({} <= list)

This is a new dict with list default factory

    >>> ddl[42]
    []

Beats the heck out of ddl.setdefault(42, [])

    >>> ddl[42].append(1)
    >>> ddl[42].append(2)
    >>> ddl
    DD({42: [1, 2]} <= list)

Now take the non-default dict d and make an int default wrapper

    >>> ddi = d**int
    >>> ddi
    DD({'a': 1} <= int)

Show there's no default on the orig:

    >>> d['b']+=1
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    KeyError: 'b'

But use the wrapper proxy:

    >>> ddi['b']+=1
    >>> ddi
    DD({'a': 1, 'b': 1} <= int)
    >>> ddi['b']+=1
    >>> ddi
    DD({'a': 1, 'b': 2} <= int)

Note that augassign works. And info is visible in d:

    >>> d
    {'a': 1, 'b': 2}

probably unusual use, but a one-off d.setdefault('S', set()).add(42) can be written

    >>> (d**set)['S'].add(42)
    >>> d
    {'a': 1, 'S': set([42]), 'b': 2}

i.e., d**different_factory_value creates a temporary d-accessing proxy with default_factory set to different_factory_value, without affecting other bindings of d unless you rebind them with the expression result. I haven't implemented a check for compatible types on mixed defaults. e.g. the integer-default proxy will show 'S', but note:

    >>> ddi['S']
    set([42])
    >>> ddi['S'] += 5
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
 TypeError: unsupported operand type(s) for +=: 'set' and 'int'

I guess the programmer deserves it ;-)

You can get a new defaulting proxy from an existing one, as it will use the same base plain dict:

 >>> ddd = ddi**dict
 >>> ddd
 DD({'a': 1, 'S': set([42]), 'b': 2, 'd': 0} <= dict)
 >>> ddd['adict'].update(check=1, this=2)
 >>> ddd
 DD({'a': 1, 'S': set([42]), 'b': 2, 'adict': {'this': 2, 'check': 1}, 'd': 0} <= dict)
 >>> d
 {'a': 1, 'S': set([42]), 'b': 2, 'adict': {'this': 2, 'check': 1}, 'd': 0}

Not sure what the C implementation ramifications would be, but it makes setdefault easy to spell. And using both modes interchangeably is easy. And stuff like

 >>> d = DD()**int
 >>> for c in open('dd.py').read(): d[c]+=1
 ...
 >>> print sorted(d.items(), key=lambda t:t[1])[-5:]
 [('f', 50), ('t', 52), ('_', 71), ('e', 74), (' ', 499)]

Is nice ;-)

 >>> len(d)
 Traceback (most recent call last):
   File "", line 1, in ?
 TypeError: len() of unsized object

Oops.

 >>> len(d.keys())
 40
 >>> len(open('dd.py').read())
 1185
 >>> sum(d.values())
 1185

>No big deal either way, but I see "passing the default factory to the
>ctor" as the "one obvious way to do it", so I'd rather have it (be it
>with a subclass or a classmethod-alternate constructor). I won't weep
>bitter tears if this drops out, though.
>
>>> - It would be unwise to have a default value that would be called if
>>> it was callable: what if I wanted the default to be a class instance
>>> that happens to have a __call__ method for unrelated reasons?
>>> Callability is an elusive property; APIs should not attempt to
>>> dynamically decide whether an argument is callable or not.
>>
>> That makes sense, though it seems over-the-top to need a zero-
>> factory for a
>> multiset.
>
>But int is a convenient zero-factory.

Aha, good one.
I didn't think of that one^H^H^Hzero ;-) I used it in the examples above ;-) Here is the code (be kind ;-)

----< dd.py >-----------------------------------------------
class DD(dict):
    def __pow__(self, factory):
        class proxy(object):
            def __init__(self, dct, factory):
                self._d = dct
                self._f = factory
            def __getattribute__(self, attr):
                if attr in ('_d', '_f'):
                    return object.__getattribute__(self, attr)
                else:
                    _d = object.__getattribute__(self, '_d')
                    return object.__getattribute__(_d, attr)
            def __getitem__(self, k):
                if k in self._d:
                    v = self._d[k]
                elif self._f:
                    v = self._d[k] = self._f()
                else:
                    raise KeyError(repr(k))
                return v
            def __setitem__(self, i, v):
                self._d[i] = v
            def __delitem__(self, i):
                del self._d[i]
            def __repr__(self):
                if self._f:
                    return 'DD(%r <= %s)' % (self._d, self._f.__name__)
                else:
                    return dict.__repr__(self._d)
            def __pow__(self, fct):
                return type(self)(self._d, fct)
        return proxy(self, factory)
------------------------------------------------------------

Regards,
Bengt Richter

From bokr at oz.net Tue Feb 21 07:53:41 2006 From: bokr at oz.net (Bengt Richter) Date: Tue, 21 Feb 2006 06:53:41 GMT Subject: [Python-Dev] Memory Error the right error for coding cookie promise violation? Message-ID: <43fab852.1356754059@news.gmane.org> Perhaps a more informative message would be nice. Here's an easy way to trigger it:

 >>> compile("#-*- coding: ascii -*-\nprint 'ab%c'\n"%0x80, '','exec')
 Traceback (most recent call last):
   File "", line 1, in ?
 MemoryError

Regards, Bengt Richter From jcarlson at uci.edu Tue Feb 21 08:03:08 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Mon, 20 Feb 2006 23:03:08 -0800 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <7e9b97090602202209j534d9addp3a1f797981ebed18@mail.gmail.com> References: <20060220204850.6004.JCARLSON@uci.edu> <7e9b97090602202209j534d9addp3a1f797981ebed18@mail.gmail.com> Message-ID: <20060220222933.6007.JCARLSON@uci.edu> "Almann T.
Goo" wrote: > > Mechanisms which rely on manipulating variables within closures or > > nested scopes to function properly can be elegant, but I've not yet seen > > one that *really* is. > > This really isn't a case for or against what I'm proposing since we > can already do this in today's Python with mutable variables in an > enclosing scope (see below). I am proposing a language change to help > make closures more orthogonal to the scoping constructs that are > already in place for the global scope. Actually, it is. Introducing these two new keywords is equivalent to encouraging nested scope use. Right now nested scope use is "limited" or "fraught with gotchas". Adding the 'use' and 'scope' keywords to label levels of scopes for name resolution will only encourage users to write closures which could have been written better or not written at all (see some of my later examples). Users who had been using closures to solve real problems "elegantly" likely have not been affected by the current state of affairs, so arguably may not gain much in 'use' and 'scope'. > > You state that using classes can be "heavy handed", > > but one of the major uses of classes is as a *namespace*. Many desired > > uses of closures (including the various uses you have outlined) > > is to hide a *namespace*, and combining both closures with classes can offer > > that to you, without requiring a language change. > > Closures are also used in more functional styles of programming for > defining customized control structures (those Ruby folks like them for > this purpose). Except that Python does not offer user-defined control structures, so this is not a Python use-case. > > Of course using > > classes directly with a bit of work can offer you everything you want > > from a closure, with all of the explicitness that you could ever want.
> > Really, the easiest way to emulate what I want in today's Python is to
> create a mutable object (like a dict or list) in the enclosing scope
> to work around the semantic that the first assignment in a local scope
> binds a new name. Doing this seems rather un-natural and forcing the
> use of classes doesn't seem more natural
>
> def incgen( inc = 1 ) :
>     env = [ 6 ]
>     def incrementor() :
>         env[ 0 ] += inc
>         return env[ 0 ]
>     return incrementor

Indeed, there are other "more natural" ways of doing that right now.

    #for inc=1 cases
    from itertools import count as incgen

    #for limited-range but arbitrary integer inc cases:
    from sys import maxint
    def incgen(env=6, inc=1):
        return iter(xrange(env, (-maxint-1, maxint)[inc>0], inc)).next

Or if you want to get fancier, a generator factory works quite well.

    def mycount(start, inc):
        while 1:
            yield start
            start += inc

    def incgen(env=6, inc=1):
        return mycount(env, inc).next

All of which I find clearer than the closure example... but this isn't a discussion on how to create counters, it's a discussion about the use of closures and nested scopes, or more specifically, Python's lack of orthogonality on lexically nested scopes. Which brings up a question: what is your actual use-case for nested scopes and closures which makes the current "use a mutable or class" awkward? I would like to see a non-toy example of its use which would not be clearer through the use of a class, and which is nontrivially hampered by the current state of Python's nested scopes and name resolution.

- Josiah

From aleaxit at gmail.com Tue Feb 21 08:15:00 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 20 Feb 2006 23:15:00 -0800 Subject: [Python-Dev] Memory Error the right error for coding cookie promise violation?
In-Reply-To: <43fab852.1356754059@news.gmane.org> References: <43fab852.1356754059@news.gmane.org> Message-ID: <65D61120-CD9F-4F67-982D-CF933939285E@gmail.com> On Feb 21, 2006, at 6:53 AM, Bengt Richter wrote: > Perhaps a more informative message would be nice. > Here's an easy way to trigger it: > >>>> compile("#-*- coding: ascii -*-\nprint 'ab%c'\n"%0x80, '','exec') > Traceback (most recent call last): > File "", line 1, in ? > MemoryError Definitely looks like a bug, please open a bug report for it. Thanks, Alex From rasky at develer.com Tue Feb 21 08:51:03 2006 From: rasky at develer.com (Giovanni Bajo) Date: Tue, 21 Feb 2006 08:51:03 +0100 Subject: [Python-Dev] defaultdict proposal round three References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> Message-ID: <074c01c636bb$96669f90$09b92997@bagio> Raymond Hettinger wrote: >> - It would be unwise to have a default value that would be called if >> it was callable: what if I wanted the default to be a class instance >> that happens to have a __call__ method for unrelated reasons? >> Callability is an elusive property; APIs should not attempt to >> dynamically decide whether an argument is callable or not. > > That makes sense, though it seems over-the-top to need a zero-factory > for a multiset. > > An alternative is to have two possible attributes: > d.default_factory = list > or > d.default_value = 0 > with an exception being raised when both are defined (the test is > done when the > attribute is created, not when the lookup is performed). What does this buy over just doing: d.default_factory = lambda: 0 which is also totally unambiguous wrt the semantic of usage of the default value (copy vs deepcopy vs whatever)? Given that most of the default values I have ever wanted to use do not even require a lambda (list, set, int come to mind).
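The design that eventually shipped as collections.defaultdict in Python 2.5 is essentially the one Giovanni sketches here: a single default_factory (any zero-argument callable) and no separate default_value. A minimal sketch of the shipped behaviour:

```python
from collections import defaultdict

# A factory, not a value: it is called once per missing key.
d = defaultdict(lambda: 0)
d['a'] += 1
d['a'] += 1
assert d['a'] == 2

# __contains__ reports only the keys actually present (the behaviour
# argued for earlier in this thread), but a bare lookup does
# materialize the default in the dict:
assert 'b' not in d
d['b']
assert 'b' in d and d['b'] == 0
```

For the common cases, list, set and int themselves serve as factories, so no lambda is needed at all.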
-- Giovanni Bajo From nnorwitz at gmail.com Tue Feb 21 09:09:12 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 21 Feb 2006 00:09:12 -0800 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> Message-ID: On 2/20/06, Tim Peters wrote: > > Speaking as a PSF director, I might not vote for that :-) Fact is > I've been keeping the build & tests 100% healthy on WinXP Pro, and > that requires more than just running the tests (it also requires > repairing compiler warnings and Unixisms). These are some ways we need buildbot to help us more. IMO compiler warnings should generate emails from buildbot. We would need to filter out a bunch, but it would be desirable to know about warnings on different architectures. Unfortunately, there are a ton of warnings on OS X right now. > Adding -r to the buildbot test recipe is a decent idea. Getting > _some_ debug-build test runs would also be good (or do we do that > already?). Buildbot runs "make testall" which does not run the tests in random order. There's nothing to prevent buildbot from making debug builds, though that is not currently done. The builds I run on the x86 box every 12 hours *do* use debug builds (Misc/build.sh). The results are here: http://docs.python.org/dev/results/ I also recently switched the email to go to python-checkins, though there haven't been any failures yet (unless they are sitting in a spam queue). 
There are some hangs (like right now):

Thread 1:
  Lib/threading.py (204): wait
  Lib/threading.py (543): join
  Lib/threading.py (637): __exitfunc
  Lib/atexit.py (25): _run_exitfuncs

Thread 2:
  Lib/socket.py (170): accept
  Lib/SocketServer.py (373): get_request
  Lib/SocketServer.py (218): handle_request
  Lib/test/test_socketserver.py (33): serve_a_few
  Lib/test/test_socketserver.py (82): run
  Lib/threading.py (445): __bootstrap

I've seen test_socketserver fail before, this could be due to running 2 tests simultaneously. > Anyway, since XP Pro is effectively covered, I'd be keener to see a > Windows buildbot running under a different flavor of Windows. +1 n From nnorwitz at gmail.com Tue Feb 21 09:28:06 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 21 Feb 2006 00:28:06 -0800 Subject: [Python-Dev] Memory Error the right error for coding cookie promise violation? In-Reply-To: <43fab852.1356754059@news.gmane.org> References: <43fab852.1356754059@news.gmane.org> Message-ID: On 2/20/06, Bengt Richter wrote: > Perhaps a more informative message would be nice. > Here's an easy way to trigger it: > > >>> compile("#-*- coding: ascii -*-\nprint 'ab%c'\n"%0x80, '','exec') > Traceback (most recent call last): > File "", line 1, in ? > MemoryError This was fixed in 2.5, but looks like it wasn't backported. I don't recall exactly why. -- n

Python 2.5a0 (trunk:42526M, Feb 20 2006, 16:00:48)
>>> compile("#-*- coding: ascii -*-\nprint 'ab%c'\n"%0x80, '','exec')
Traceback (most recent call last):
  File "", line 1, in
  File "", line 0
SyntaxError: unknown encoding: ascii

From jeff at taupro.com Tue Feb 21 09:57:27 2006 From: jeff at taupro.com (Jeff Rush) Date: Tue, 21 Feb 2006 02:57:27 -0600 Subject: [Python-Dev] Removing Non-Unicode Support?
In-Reply-To: <43F9C2CA.4010808@egenix.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> <43F98D89.2060102@taupro.com> <43F9C2CA.4010808@egenix.com> Message-ID: <43FAD5F7.3060006@taupro.com> M.-A. Lemburg wrote: > I'd say that the parties interested in non-Unicode versions of > Python should maintain these branches of Python. Ditto for other > stripped down versions. I understand where you're coming from but the embedded market I encounter tends to focus on the hardware side. If they can get a marketing star by grabbing Python off-the-shelf, tweak the build and produce something to include with their product, they will. But if they have to maintain a branch, they'll just go with the defacto C API most such devices use. > Note that this does not mean that we should forget about memory > consumption issues. It's just that if there's only marginal > interest in certain special builds of Python, I don't see the > requirement for the Python core developers to maintain them. These requirements of customization may not be a strong case for today but could be impacting future growth of the language in certain sectors. I'm a rabid Python evangelist and always try to push Python into more nooks and crannies of the marketplace, similar to how the Linux kernel is available from the tiniest machines to the largest iron. If the focus of Python is to be strictly a desktop, conventional (mostly ;-) language, restricting its adaptability to other less interesting environments may be a reasonable tradeoff to improve its maintainability. But adaptability, especially when you don't fully grok where or how it will be used, can also be a competitive advantage. -Jeff
Windows In-Reply-To: References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> Message-ID: <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> On 21-feb-2006, at 9:09, Neal Norwitz wrote: > Unfortunately, there are a ton of > warnings on OS X right now. How many of those do you see when you ignore the warnings you get while building the Carbon extensions? Those extensions wrap loads of deprecated functions, each of which will give a warning. Ronald From nnorwitz at gmail.com Tue Feb 21 10:19:03 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 21 Feb 2006 01:19:03 -0800 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> Message-ID: On 2/21/06, Ronald Oussoren wrote: > > On 21-feb-2006, at 9:09, Neal Norwitz wrote: > > > Unfortunately, there are a ton of > > warnings on OS X right now. > > How many of those do you see when you ignore the warnings you get > while building the Carbon extensions? Those extensions wrap loads of > deprecated functions, each of which will give a warning. RIght: http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/138/step-compile/0 Most but not all of the warnings are due to Carbon AFAICT. I'd like to fix those that are important, but it's so far down on the priority list. :-( n From ronaldoussoren at mac.com Tue Feb 21 10:26:42 2006 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Tue, 21 Feb 2006 10:26:42 +0100 Subject: [Python-Dev] buildbot vs. 
Windows In-Reply-To: References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> Message-ID: <073F56B7-3575-4AB0-BD91-43D886B9D0D6@mac.com> On 21-feb-2006, at 10:19, Neal Norwitz wrote: > On 2/21/06, Ronald Oussoren wrote: >> >> On 21-feb-2006, at 9:09, Neal Norwitz wrote: >> >>> Unfortunately, there are a ton of >>> warnings on OS X right now. >> >> How many of those do you see when you ignore the warnings you get >> while building the Carbon extensions? Those extensions wrap loads of >> deprecated functions, each of which will give a warning. > > RIght: > http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/ > 138/step-compile/0 > > Most but not all of the warnings are due to Carbon AFAICT. I'd like > to fix those that are important, but it's so far down on the priority > list. :-( I'm working with Bob I. on a universal binary build of python 2.4. Some of our patches fix warnings like the ones for _CFmodule.c. I'll be starting with submitting the less controversial patches once the universal build is mostly ready, which should be any day now. Ronald > > n From mal at egenix.com Tue Feb 21 10:36:29 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 21 Feb 2006 10:36:29 +0100 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: <43FA44E3.9090206@v.loewis.de> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> <43F98D89.2060102@taupro.com> <43F9C2CA.4010808@egenix.com> <43FA44E3.9090206@v.loewis.de> Message-ID: <43FADF1D.90603@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: >> Note that this does not mean that we should forget about memory >> consumption issues.
It's just that if there's only marginal >> interest in certain special builds of Python, I don't see the >> requirement for the Python core developers to maintain them. > > Well, the cost of Unicode support is not so much in the algorithmic > part, but in the tables that come along with it. AFAICT, everything > but unicodectype is optional; that is 5KiB of code and 20KiB of data > on x86. Actually, the size of the code *does* matter, at a second > glance. Here are the largest object files in the Python code base > on my system (not counting dynamic modules):
>
>    text    data    bss     dec    hex filename
>    4845   19968      0   24813   60ed Objects/unicodectype.o
>   22633    2432    352   25417   6349 Objects/listobject.o
>   29259    1412    152   30823   7867 Objects/classobject.o
>   20696   11488      4   32188   7dbc Python/bltinmodule.o
>   33579     740      0   34319   860f Objects/longobject.o
>   34119      16    288   34423   8677 Python/ceval.o
>   35179    2796      0   37975   9457 Modules/_sre.o
>   26539   15820    416   42775   a717 Modules/posixmodule.o
>   35283    8800   1056   45139   b053 Objects/stringobject.o
>   50360       0     28   50388   c4d4 Python/compile.o
>   68455    4624    440   73519  11f2f Objects/typeobject.o
>   69993    9316   1196   80505  13a79 Objects/unicodeobject.o
>
> So it appears that dropping Unicode support can indeed provide > some savings. > > For reference, we also have an option to drop complex numbers:
>
>    9654     692      4   10350   286e Objects/complexobject.o

So why not drop that as well ? Note that I'm not saying that these switches are useless - of course they do allow to strip down the Python interpreter. I believe that only very few people are interested in having these options and it's fair enough to put the burden of maintaining these branches on them. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 21 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Tue Feb 21 10:43:42 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 21 Feb 2006 10:43:42 +0100 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: <43FAD5F7.3060006@taupro.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> <43F98D89.2060102@taupro.com> <43F9C2CA.4010808@egenix.com> <43FAD5F7.3060006@taupro.com> Message-ID: <43FAE0CE.5000302@egenix.com> Jeff Rush wrote: > M.-A. Lemburg wrote: > >> I'd say that the parties interested in non-Unicode versions of >> Python should maintain these branches of Python. Dito for other >> stripped down versions. > > I understand where you're coming from but the embedded market I > encounter tends to focus on the hardware side. If they can get a > marketing star by grabbing Python off-the shelf, tweak the build and > produce something to include with their product, they will. But if they > have to maintain a branch, they'll just go with the defacto C API most > such devices use. > >> Note that this does not mean that we should forget about memory >> consumption issues. It's just that if there's only marginal >> interest in certain special builds of Python, I don't see the >> requirement for the Python core developers to maintain them. > > These requirements of customization may not be a strong case for today > but could be impacting future growth of the language in certain > sectors. I'm a rabid Python evangelist and alway try to push Python > into more nooks and crannies of the marketplace, similar to how the > Linux kernel is available from the tiniest machines to the largest > iron. 
If the focus of Python is to be strictly a desktop, conventional > (mostly ;-) language, restricting its adaptability to other less > interesting environments may be a reasonable tradeoff to improve its > maintainability. But adaptability, especially when you don't fully grok > where or how it will be used, can also be a competitive advantage. I don't think this is a strong enough case to warrant having to maintain a separate branch of the Python core. Even platforms like Palm nowadays have enough RAM to cope with the 100kB or so that Unicode support adds. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 21 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From greg.ewing at canterbury.ac.nz Tue Feb 21 10:50:17 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 22:50:17 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> Message-ID: <43FAE259.607@canterbury.ac.nz> Delaney, Timothy (Tim) wrote: > However, *because* Python uses duck typing, I tend to feel that > subclasses in Python *should* be drop-in replacements. Duck-typing means that the only reliable way to assess whether two types are sufficiently compatible for some purpose is to consult the documentation -- you can't just look at the base class list. I think this should work both ways. It should be okay to *not* document autodict as being a subclass of dict, even if it happens to be implemented that way. 
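Greg's point can be put in code (a toy illustration with made-up names): behavioural compatibility is about the methods an object answers to, not about its base-class list.

```python
class Squares:
    """Quacks like a mapping for subscript users, but is no dict subclass."""
    def __getitem__(self, key):
        return key * key

def lookup(mapping, key):
    # Cares only about __getitem__, not about isinstance(mapping, dict).
    return mapping[key]

assert lookup({3: 9}, 3) == 9
assert lookup(Squares(), 3) == 9        # a drop-in by behaviour...
assert not isinstance(Squares(), dict)  # ...with no inheritance relation
```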
I've adopted a convention like this in PyGUI, where I document the classes in terms of a conceptual interface hierarchy, without promising that they will be implemented that way. Greg From greg.ewing at canterbury.ac.nz Tue Feb 21 10:50:56 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 22:50:56 +1300 Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]] In-Reply-To: <43FA618D.7070804@ronadam.com> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <20060218051344.GC28761@panix.com> <43F6E1FA.7080909@v.loewis.de> <43f9424b.1261003927@news.gmane.org> <43FA618D.7070804@ronadam.com> Message-ID: <43FAE280.8040402@canterbury.ac.nz> Ron Adam wrote: > Storing byte information as 16 or 32 bits ints could take up a rather > lot of memory in some cases. I don't quite see the point here. Inside a bytes object, they would be stored 1 byte per byte. Nobody is suggesting that they would take up more than that just because a_bytes_object[i] happens to return an int. So the only reason to introduce a new "byte" type is to remove some of the operations that int has. We can already do bitwise operations on an int, so we don't need a new type to add that capability. What's more, I can see this leading to people asking for arithmetic operations to be *added* to the byte type so they can do wrap-around arithmetic, and then for 16-bit, 32-bit, 64-bit etc. versions of it, etc. etc. Do we really want to get onto that slope? 
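Python 3, in hindsight, settled on exactly the behaviour Greg describes: indexing a bytes object yields a plain int (while the object itself stores one byte per element), so bitwise operations need no new type and there is no fixed-width byte arithmetic to ask for. A quick check:

```python
b = bytes([0, 127, 255])
assert b[1] == 127            # indexing yields a plain int...
assert b[2] & 0x0f == 0x0f    # ...so bitwise ops come for free
assert b[2] + 1 == 256        # arithmetic is unbounded: no 8-bit wraparound
```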
Greg From greg.ewing at canterbury.ac.nz Tue Feb 21 10:51:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 22:51:20 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <43FA4E88.4090507@canterbury.ac.nz> Message-ID: <43FAE298.5040404@canterbury.ac.nz> Guido van Rossum wrote: > It's quite tricky to implement a fully > transparent wrapper that supports all the special > methods (__setitem__ etc.).
I was thinking the wrapper would only be a means of filling the dict -- it wouldn't even pretend to implement the full dict interface. The only method it would really need to have is __getitem__. > The semantics of defaultdict are crystal clear. __contains__(), keys() > and friends represent the *actual*, *current* keys. If you're happy with that, then I am too. I was never particularly attached to the wrapper idea -- I just mentioned it as a possible alternative. Just one more thing -- have you made a final decision about the name yet? I'd still prefer something like 'autodict', because to me 'defaultdict' suggests a type that just returns default values without modifying the dict. Maybe it should be reserved for some possible future type that behaves that way. Also, considering the intended use cases (accumulation, etc.) it seems more accurate to think of the value produced by the factory as an 'initial value' rather than a 'default value', and I'd prefer to see it described that way in the docs. If that is done, having 'default' in the name wouldn't be so appropriate. Greg From greg.ewing at canterbury.ac.nz Tue Feb 21 10:57:31 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 22:57:31 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43FAE40B.8090406@canterbury.ac.nz> Stephen J. Turnbull wrote: > What I advocate for Python is to require that the standard base64 > codec be defined only on bytes, and always produce bytes. I don't understand that. 
It seems quite clear to me that base64 encoding (in the general sense of encoding, not the unicode sense) takes binary data (bytes) and produces characters. That's the whole point of base64 -- so you can send arbitrary data over a channel that is only capable of dealing with characters. So in Py3k the correct usage would be

                   base64            unicode
                   encode            encode(x)
    original bytes --------> unicode ---------> bytes for transmission
                   <--------         <---------
                   base64            unicode
                   decode            decode(x)

where x is whatever unicode encoding the transmission channel uses for characters (probably ascii or an ascii superset, but not necessarily). So, however it's spelled, the typing is such that

    base64_encode(bytes) --> unicode

and

    base64_decode(unicode) --> bytes

-- Greg From greg.ewing at canterbury.ac.nz Tue Feb 21 10:58:55 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 22:58:55 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <20060220204850.6004.JCARLSON@uci.edu> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <20060220204850.6004.JCARLSON@uci.edu> Message-ID: <43FAE45F.3020000@canterbury.ac.nz> Josiah Carlson wrote: > Mechanisms which rely on manipulating variables within closures or > nested scopes to function properly can be elegant, but I've not yet seen > one that *really* is. It seems a bit inconsistent to say on the one hand that direct assignment to a name in an outer scope is not sufficiently useful to be worth supporting, while at the same time providing a way to do it for one particular scope, i.e. 'global'. Would you advocate doing away with it? > Of course using > classes directly with a bit of work can offer you everything you want > from a closure, with all of the explicitness that you could ever want.
There are cases where the overhead (in terms of amount of code) of defining a class and creating an instance of it swamps the code which does the actual work, and, I feel, actually obscures what is being done rather than clarifies it. These cases benefit from the ability to refer to names in enclosing scopes, and I believe they would benefit further from the ability to assign to such names. Certainly the feature could be abused, as can the existing nested scope facilities, or any other language feature for that matter. Mere potential for abuse is not sufficient reason to reject a feature, or the language would have no features at all.

Another consideration is efficiency. CPython currently implements access to local variables (both in the current scope and all outer ones except the module scope) in an extremely efficient way. There's always the worry that using attribute access in place of local variable access is greatly increasing the runtime overhead for no corresponding benefit.

You mention the idea of namespaces. Maybe an answer is to provide some lightweight way of defining a temporary, single-use namespace for use within nested scopes -- lightweight in terms of both code volume and runtime overhead. Perhaps something like

    def my_func():
        namespace foo
        foo.x = 42
        def inc_x():
            foo.x += 1

The idea here is that foo wouldn't be an object in its own right, but just a collection of names that would be implemented as local variables of my_func.
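(For comparison, the closest spelling that actually works today is an ordinary throwaway instance used as an attribute bag -- a minimal sketch, where the '_Namespace' class is a made-up helper standing in for the proposed statement, not anything that exists:)

```python
class _Namespace(object):
    # made-up helper: an empty class used purely as an attribute bag
    pass

def my_func():
    foo = _Namespace()      # plays the role of the proposed "namespace foo"
    foo.x = 42
    def inc_x():
        foo.x += 1          # rebinding the outer name works, via attribute access
    inc_x()
    return foo.x            # 43 after one increment
```

This is exactly the class-based workaround under discussion: it works, but foo is a real heap object with a real attribute dict, whereas the proposal would compile foo.x down to ordinary (fast) local-variable slots of my_func.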
Greg From greg.ewing at canterbury.ac.nz Tue Feb 21 10:59:06 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 22:59:06 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43fa60a7.1334311178@news.gmane.org> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> <43fa60a7.1334311178@news.gmane.org> Message-ID: <43FAE46A.7070308@canterbury.ac.nz> Bengt Richter wrote: > you could write > > d = dict()**list Or alternatively, ld = dict[list] i.e. "a dict of lists". In the maximally twisted form of this idea, the result wouldn't be a dict but a new *type* of dict, which you would then instantiate: d = ld(your_favourite_args_here) This solves both the constructor-argument problem (the new type can have the same constructor signature as a regular dict with no conflict) and the perceived-Liskov-nonsubstitutability problem (there's no requirement that the new type have any particular conceptual and/or actual inheritance relationship to any other type). Plus being a really cool introduction to the concepts of metaclasses, higher-order functions and all that neat head-exploding stuff. :-) Resolving-not-to-coin-any-more-multihyphenated- hyperpolysyllabic-words-like-'perceived-Liskov- nonsubstitutability'-this-week-ly, Greg From greg.ewing at canterbury.ac.nz Tue Feb 21 11:01:51 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 21 Feb 2006 23:01:51 +1300 Subject: [Python-Dev] Papal encyclical on the use of closures (Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <20060220222933.6007.JCARLSON@uci.edu> References: <20060220204850.6004.JCARLSON@uci.edu> <7e9b97090602202209j534d9addp3a1f797981ebed18@mail.gmail.com> <20060220222933.6007.JCARLSON@uci.edu> Message-ID: <43FAE50F.3090801@canterbury.ac.nz> Josiah Carlson wrote: > Introducing these two new keywords is equivalent to > encouraging nested scope use. 
> Right now nested scope use is "limited" or "fraught with gotchas".

What you seem to be saying here is: Nested scope use is Inherently Bad. Therefore we will keep them Limited and Fraught With Gotchas, so people will be discouraged from using them. Sounds a bit like the attitude of certain religious groups to condoms. (Might encourage people to have sex -- can't have that -- look at all the nasty diseases you can get!)

Greg

From fuzzyman at voidspace.org.uk Tue Feb 21 11:17:52 2006
From: fuzzyman at voidspace.org.uk (Fuzzyman)
Date: Tue, 21 Feb 2006 10:17:52 +0000
Subject: [Python-Dev] defaultdict proposal round three
In-Reply-To: <43FAE259.607@canterbury.ac.nz>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz>
Message-ID: <43FAE8D0.9040000@voidspace.org.uk>

Greg Ewing wrote:

>Delaney, Timothy (Tim) wrote:
>
>>However, *because* Python uses duck typing, I tend to feel that
>>subclasses in Python *should* be drop-in replacements.
>
>Duck-typing means that the only reliable way to
>assess whether two types are sufficiently compatible
>for some purpose is to consult the documentation --
>you can't just look at the base class list.

What's the API for that? I've had problems in code that needs to treat strings, lists and dictionaries differently (assigning values to a container where all three need different handling); telling them apart while still allowing duck typing is *problematic*.

Slightly-off-topic'ly-yours,

Michael Foord

From skip at pobox.com Tue Feb 21 12:58:04 2006
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 21 Feb 2006 05:58:04 -0600
Subject: [Python-Dev] buildbot vs.
Windows In-Reply-To: References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> Message-ID: <17403.76.288989.178176@montanaro.dyndns.org> Neal> IMO compiler warnings should generate emails from buildbot. It doesn't generate emails for any other condition. I think it should just turn the compilation section yellow. Skip From skip at pobox.com Tue Feb 21 13:01:24 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 21 Feb 2006 06:01:24 -0600 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> Message-ID: <17403.276.782271.715060@montanaro.dyndns.org> >> Unfortunately, there are a ton of warnings on OS X right now. Ronald> How many of those do you see when you ignore the warnings you Ronald> get while building the Carbon extensions? I see a bunch related to Py_ssize_t. Those have nothing to do with Carbon. I don't see them on the gentoo build, so I assume they just haven't been tackled yet. Skip From barry at python.org Tue Feb 21 13:31:54 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 21 Feb 2006 07:31:54 -0500 Subject: [Python-Dev] bytes type discussion In-Reply-To: References: <20060215223943.GL6027@xs4all.nl> <1140049757.14818.45.camel@geddy.wooz.org> Message-ID: <1140525114.17666.88.camel@geddy.wooz.org> On Fri, 2006-02-17 at 00:43 -0500, Steve Holden wrote: > Fredrik Lundh wrote: > > Barry Warsaw wrote: > > > > > >>We know at least there will never be a 2.10, so I think we still have > >>time. 
> > > > > > because there's no way to count to 10 if you only have one digit? > > > > we used to think that back when the gas price was just below 10 SEK/L, > > but they found a way... > > > IIRC Guido is on record as saying "There will be no Python 2.10 because > I hate the ambiguity of double-digit minor release numbers", or words to > that effect. I heard the same quote, so that's what I was referring to! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060221/81dc8e13/attachment.pgp From barry at python.org Tue Feb 21 13:55:31 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 21 Feb 2006 07:55:31 -0500 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87hd6vs9p1.fsf@tleepslib.sk.tsukuba.ac.jp> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <43F6FFBD.7080704@egenix.com> <20060218113358.GG23859@xs4all.nl> <43F7113E.8090300@egenix.com> <87hd6vs9p1.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <1140526531.17665.97.camel@geddy.wooz.org> On Sun, 2006-02-19 at 23:30 +0900, Stephen J. Turnbull wrote: > >>>>> "M" == "M.-A. Lemburg" writes: > M> * for Unicode codecs the original form is Unicode, the derived > M> form is, in most cases, a string > > First of all, that's Martin's point! > > Second, almost all Americans, a large majority of Japanese, and I > would bet most Western Europeans would say you have that backwards. > That's the problem, and it's the Unicode advocates' problem (ie, > ours), not the users'. Even if we're right: education will require > lots of effort. Rather, we should just make it as easy as possible to > do it right, and hard to do it wrong. I think you've hit the nail squarely on the head. 
Even though I /know/ what the intended semantics are, the originality of the string form is deeply embedded in my nearly 30 years of programming experience, almost all of it completely American English-centric. I always have to stop and think about which direction .encode() and .decode() go in because it simply doesn't feel natural. Or more simply put, my brain knows what's right, but my heart doesn't and that's why converting from one to the other is always a hiccup in the smooth flow of coding. And while I'm sympathetic to MAL's design decisions, the overlaying of the generalizations doesn't help. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060221/d6e4091b/attachment.pgp From jeremy at alum.mit.edu Tue Feb 21 14:02:08 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 21 Feb 2006 08:02:08 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> Message-ID: Almann, The lack of support for rebinding names in enclosing scopes is certainly a wart. I think option one is a better fit for Python, because it more closely matches the existing naming semantics. Namely that assignment in a block creates a new name unless a global statement indicates otherwise. The revised rules would be that assignment creates a new name unless a global or XXX statement indicates otherwise. The names of naming statements are quite hard to get right, I fear. I don't particularly like "use." It's too generic. (I don't particularly like "scope" for option 2, either, for similar reasons. It doesn't indicate what kind of scope issue is being declared.) 
The most specific thing I can think of is "free" to indicate that the variable is free in the current scope. It may be too specialized a term to be familiar to most people.

I think free == global in the absence of other bindings.

Jeremy

On 2/20/06, Almann T. Goo wrote:
> I am considering developing a PEP for enabling a mechanism to assign to free
> variables in a closure (nested function). My rationale is that with the
> advent of PEP 227, Python has proper nested lexical scopes, but can have
> undesirable behavior (especially with new developers) when a user
> wants to make an assignment to a free variable within a nested function.
> Furthermore, the numerous kludges I have seen to "solve" the problem with a
> mutable object, like a list, as the free variable do not seem "Pythonic." I
> have also seen mention that the use of classes can mitigate this, but that
> seems, IMHO, heavy handed in cases when an elegant solution using a closure
> would suffice and be more appropriate--especially when Python already has
> nested lexical scopes.
>
> I propose two possible approaches to solve this issue:
>
> 1. Adding a keyword such as "use" that would follow similar semantics as
> "global" does today. A nested scope could declare names with this keyword
> to enable assignment to such names to change the closest parent's binding.
> The semantic would be to keep the behavior we experience today but tell the
> compiler/interpreter that a name declared with the "use" keyword would
> explicitly use an enclosing scope. I personally like this approach the most
> since it would seem to be in keeping with the current way the language works
> and would probably be the most backwards compatible. The semantics for how
> this interacts with the global scope would also need to be defined (should
> "use" be equivalent to a global when no name exists in any parent scope, etc.)
>
> def incgen( inc = 1 ):
>     a = 6
>     def incrementer():
>         use a
>         #use a, inc <-- list of names okay too
>         a += inc
>         return a
>     return incrementer
>
> Of course, this approach suffers from a downside that every nested scope
> that wanted to assign to a parent scope's name would need to have the "use"
> keyword for those names--but one could argue that this is in keeping with
> one of Python's philosophies that "Explicit is better than implicit" (PEP
> 20). This approach also has to deal with a user declaring a name with
> "use" that is a named parameter--this would be a semantic error that could be
> handled like "global" does today with a SyntaxError.
>
> 2. Adding a keyword such as "scope" that would behave similarly to
> JavaScript's "var" keyword. A name could be declared with such a keyword
> optionally and all nested scopes would use the declaring scope's binding
> when accessing or assigning to a particular name. This approach has similar
> benefits to my first approach, but is clearly more top-down than the first
> approach. Subsequent "scope" declarations would create a new binding at the
> declaring scope for the declaring and child scopes to use. This could
> potentially be a gotcha for users expecting the binding semantics in place
> today. Also the scope keyword would have to be allowed to be used on
> parameters to allow such parameter names to be used in a similar fashion in
> a child scope.
>
> def incgen( inc = 1 ):
>     #scope inc <-- allow scope declaration for bound parameters (not a big fan of this)
>     scope a = 6
>     def incrementer():
>         a += inc
>         return a
>     return incrementer
>
> This approach would be similar to languages like JavaScript that allow for
> explicit scope binding with the use of "var" or more static languages that
> allow re-declaring names at lower scopes. I am less in favor of this,
> because I don't think it feels very "Pythonic".
> As a point of reference, some languages such as Ruby will only bind a new
> name to a scope on assignment when an enclosing scope does not have the name
> bound. I do believe the Python name binding semantics have issues (for
> which the "global" keyword was born), but I feel that "fixing" the
> Python semantic to a more "Ruby-like" one adds as many problems as it solves
> since the "Ruby-like" one is just as implicit in nature. Not to mention the
> backwards compatibility impact is probably much larger.
>
> I would like the community's opinion if there are enough out there who think
> this would be a worthwhile endeavour--or if there is already an initiative
> that I missed. Please let me know your questions, comments.
>
> Best Regards,
> Almann
>
> --
> Almann T. Goo
> almann.goo at gmail.com
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu

From almann.goo at gmail.com Tue Feb 21 14:16:08 2006
From: almann.goo at gmail.com (Almann T. Goo)
Date: Tue, 21 Feb 2006 08:16:08 -0500
Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
In-Reply-To:
References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com>
Message-ID: <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com>

Jeremy,

I definitely agree that option one is more in line with the semantics in place within Python today.

> The names of naming statements are quite hard to get right, I fear. I
> don't particularly like "use." It's too generic. (I don't
> particularly like "scope" for option 2, either, for similar reasons.
> It doesn't indicate what kind of scope issue is being declared.) The
> most specific thing I can think of is "free" to indicate that the
> variable is free in the current scope. It may be too specialized a
> term to be familiar to most people.
I am not married to any particular keyword for sure--I would be happy for the most part if the language was fixed regardless of the keyword chosen. "free" gives me the sense that I am de-allocating memory (my C background talking), I don't think most people would get the mathematical reference for "free". I certainly hope that an initiative like this doesn't get stymied by the lack of a good name for such a keyword. Maybe something like "outer"? > I think free == global in the absence of other bindings. I actually like this, would sort of make "global" obsolete (and thus making the global scope behave like other lexical scopes with regard to to re-binding, which is probably a good thing) -Almann -- Almann T. Goo almann.goo at gmail.com From guido at python.org Tue Feb 21 14:58:52 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 21 Feb 2006 05:58:52 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43fa60a7.1334311178@news.gmane.org> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> <43fa60a7.1334311178@news.gmane.org> Message-ID: On 2/20/06, Bengt Richter wrote: > How about doing it as an expression, empowering ( ;-) the dict just afer creation? > E.g., for > > d = dict() > d.default_factory = list > > you could write > > d = dict()**list Bengt, can you let your overactive imagination rest for a while? I recommend that you sit back, relax for a season, and reflect on the zen nature of Pythonicity. Then come back and hopefully you'll be able to post without embarrassing yourself continuously. 
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Tue Feb 21 15:04:34 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 21 Feb 2006 06:04:34 -0800
Subject: [Python-Dev] defaultdict proposal round three
In-Reply-To: <43FAE8D0.9040000@voidspace.org.uk>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz> <43FAE8D0.9040000@voidspace.org.uk>
Message-ID:

On 2/21/06, Fuzzyman wrote:
> I've had problems in code that needs to treat strings, lists and
> dictionaries differently (assigning values to a container where all
> three need different handling); telling them apart while still allowing
> duck typing is *problematic*.

Consider designing APIs that don't require you to make that kind of distinction, if you're worried about edge cases and classifying arbitrary other objects correctly. It's totally possible to create an object that behaves like a hybrid of a string and a dict. If you're only interested in classifying the three specific built-ins you mention, I'd check for the presence of certain attributes: hasattr(x, "lower") -> x is a string of some kind; hasattr(x, "sort") -> x is a list; hasattr(x, "update") -> x is a dict. Also, hasattr(x, "union") -> x is a set; hasattr(x, "readline") -> x is a file. That's duck typing!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com Tue Feb 21 15:07:56 2006
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 21 Feb 2006 08:07:56 -0600
Subject: [Python-Dev] buildbot vs.
Windows In-Reply-To: <17403.276.782271.715060@montanaro.dyndns.org> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> <17403.276.782271.715060@montanaro.dyndns.org> Message-ID: <17403.7868.343827.845670@montanaro.dyndns.org> Ronald> How many of those do you see when you ignore the warnings you Ronald> get while building the Carbon extensions? skip> I see a bunch related to Py_ssize_t. Those have nothing to do skip> with Carbon. I don't see them on the gentoo build, so I assume skip> they just haven't been tackled yet. Let me rephrase that. I assume the people digging through Py_ssize_t issues have been looking at compilation warnings for platforms other than Mac OSX. Skip From thomas at xs4all.net Tue Feb 21 15:27:54 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Tue, 21 Feb 2006 15:27:54 +0100 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> Message-ID: <20060221142754.GL23859@xs4all.nl> On Tue, Feb 21, 2006 at 08:02:08AM -0500, Jeremy Hylton wrote: > The lack of support for rebinding names in enclosing scopes is > certainly a wart. I think option one is a better fit for Python, > because it more closely matches the existing naming semantics. Namely > that assignment in a block creates a new name unless a global > statement indicates otherwise. The revised rules would be that > assignment creates a new name unless a global or XXX statement > indicates otherwise. I agree with Jeremy on this. I've been thinking about doing something like this myself, but never got 'round to it. 
It doesn't make working with closures much easier, and I doubt it'll encourage using closures much, but it does remove the wart of needing to use mutable objects to make them read-write.

> The names of naming statements are quite hard to get right, I fear. I
> don't particularly like "use." It's too generic. (I don't
> particularly like "scope" for option 2, either, for similar reasons.
> It doesn't indicate what kind of scope issue is being declared.) The
> most specific thing I can think of is "free" to indicate that the
> variable is free in the current scope. It may be too specialized a
> term to be familiar to most people.

I was contemplating 'enclosed' as a declaration, myself. Maybe, if there's enough of a consensus on any name before Python 2.5a1 is released, and the feature isn't going to make it into 2.5, we could ease the introduction of a new keyword by issuing a warning about the keyword in 2.5 already. (Rather than a future-import to enable it in 2.6.) Maybe, and only if there's no doubt about how it's going in, of course.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From bokr at oz.net Tue Feb 21 15:29:11 2006
From: bokr at oz.net (Bengt Richter)
Date: Tue, 21 Feb 2006 14:29:11 GMT
Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com>
Message-ID: <43fb1c0a.1382282887@news.gmane.org>

On Tue, 21 Feb 2006 08:02:08 -0500, "Jeremy Hylton" wrote:
>Almann,
>
>The lack of support for rebinding names in enclosing scopes is
>certainly a wart. I think option one is a better fit for Python,
>because it more closely matches the existing naming semantics. Namely
>that assignment in a block creates a new name unless a global
>statement indicates otherwise. The revised rules would be that
>assignment creates a new name unless a global or XXX statement
>indicates otherwise.
>
>The names of naming statements are quite hard to get right, I fear. I
>don't particularly like "use." It's too generic. (I don't
>particularly like "scope" for option 2, either, for similar reasons.
>It doesn't indicate what kind of scope issue is being declared.) The
>most specific thing I can think of is "free" to indicate that the
>variable is free in the current scope. It may be too specialized a
>term to be familiar to most people.
>
>I think free == global in the absence of other bindings.
>
>Jeremy

Hey, only Guido is allowed to top-post. He said so ;-)

But to the topic, it just occurred to me that any outer scopes could be given names (including the global namespace, but that would have the name global by default, so global.x would essentially mean what globals()['x'] means now, except it would be a name error if x didn't pre-exist when accessed via namespace_name.name_in_space notation).

    namespace g_alias  # g_alias.x becomes alternate spelling of global.x
    def outer():
        namespace mezzanine
        a = 123
        print a            # => 123
        print mezzanine.a  # => 123 (the namespace name is visible and functional locally)
        def inner():
            print mezzanine.a  # => 123
            mezzanine.a = 456
        inner()
        print a  # => 456

    global.x = ...  # re-binds global x, name error if not preexisting

This would allow creating mezzanine like an attribute view of the slots in that local namespace, as well as making namespace itself visible there, so the access to mezzanine would look like a read access to an ordinary object named mezzanine that happened to have attribute slots matching outer's local name space.

Efficiency might make it desirable not to extend named namespaces with new names, function locals being slotted in a fixed space tied into the frame (I think). But there are tricks I guess. Anyway, I hadn't seen this idea before. Seems

Regards,
Bengt Richter

>
>On 2/20/06, Almann T. Goo wrote:
>> I am considering developing a PEP for enabling a mechanism to assign to free
>> variables in a closure (nested function). My rationale is that with the
My rationale is that with the >> advent of PEP 227 , Python has proper nested lexical scopes, but can have >> undesirable behavior (especially with new developers) when a user makes >> wants to make an assignment to a free variable within a nested function. >> Furthermore, after seeing numerous kludges to "solve" the problem with a >> mutable object, like a list, as the free variable do not seem "Pythonic." I >> have also seen mention that the use of classes can mitigate this, but that >> seems, IMHO, heavy handed in cases when an elegant solution using a closure >> would suffice and be more appropriate--especially when Python already has >> nested lexical scopes. >> >> I propose two possible approaches to solve this issue: >> >> 1. Adding a keyword such as "use" that would follow similar semantics as >> "global" does today. A nested scope could declare names with this keyword >> to enable assignment to such names to change the closest parent's binding. >> The semantic would be to keep the behavior we experience today but tell the >> compiler/interpreter that a name declared with the "use" keyword would >> explicitly use an enclosing scope. I personally like this approach the most >> since it would seem to be in keeping with the current way the language works >> and would probably be the most backwards compatible. The semantics for how >> this interacts with the global scope would also need to be defined (should >> "use" be equivalent to a global when no name exists all parent scopes, etc.) >> >> >> def incgen( inc = 1 ) : >> a = 6 >> def incrementer() : >> use a >> #use a, inc <-- list of names okay too >> a += inc >> return a >> return incrementer >> >> Of course, this approach suffers from a downside that every nested scope >> that wanted to assign to a parent scope's name would need to have the "use" >> keyword for those names--but one could argue that this is in keeping with >> one of Python's philosophies that "Explicit is better than implicit" (PEP >> 20). 
This approach also has to deal with a user declaring a name with " >> use" that is a named parameter--this would be a semantic error that could be >> handled like "global " does today with a SyntaxError. >> >> 2. Adding a keyword such as "scope" that would behave similarly to >> JavaScript's " var" keyword. A name could be declared with such a keyword >> optionally and all nested scopes would use the declaring scope's binding >> when accessing or assigning to a particular name. This approach has similar >> benefits to my first approach, but is clearly more top-down than the first >> approach. Subsequent "scope" declarations would create a new binding at the >> declaring scope for the declaring and child scopes to use. This could >> potentially be a gotcha for users expecting the binding semantics in place >> today. Also the scope keyword would have to be allowed to be used on >> parameters to allow such parameter names to be used in a similar fashion in >> a child scope. >> >> >> def incgen( inc = 1 ) : >> #scope inc <-- allow scope declaration for bound parameters (not a big >> fan of this) >> scope a = 6 >> def incrementer() : >> a += inc >> return a >> return incrementer >> >> This approach would be similar to languages like JavaScript that allow for >> explicit scope binding with the use of "var" or more static languages that >> allow re-declaring names at lower scopes. I am less in favor of this, >> because I don't think it feels very "Pythonic". >> >> As a point of reference, some languages such as Ruby will only bind a new >> name to a scope on assignment when an enclosing scope does not have the name >> bound. I do believe the Python name binding semantics have issues (for >> which the "global" keyword was born), but I feel that the "fixing" the >> Python semantic to a more "Ruby-like" one adds as many problems as it solves >> since the "Ruby-like" one is just as implicit in nature. 
Not to mention the >> backwards compatibility impact is probably much larger. >> >> I would like the community's opinion if there is enough out there that think >> this would be a worthwile endevour--or if there is already an initiative >> that I missed. Please let me know your questions, comments. >> >> Best Regards, >> Almann >> >> -- >> Almann T. Goo >> almann.goo at gmail.com >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu >> >> >> >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > From jeremy at alum.mit.edu Tue Feb 21 15:32:55 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 21 Feb 2006 09:32:55 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43fb1c0a.1382282887@news.gmane.org> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43fb1c0a.1382282887@news.gmane.org> Message-ID: I had to lookup top-post :-). On 2/21/06, Bengt Richter wrote: > On Tue, 21 Feb 2006 08:02:08 -0500, "Jeremy Hylton" wrote: > >Jeremy > Hey, only Guido is allowed to top-post. He said so ;-) The Gmail UI makes it really easy to forget where the q > But to the topic, it just occurred to me that any outer scopes could be given names > (including global namespace, but that would have the name global by default, so > global.x would essentially mean what globals()['x'] means now, except it would > be a name error if x didn't pre-exist when accessed via namespace_name.name_in_space notation. 
> > > namespace g_alias # g_alias.x becomes alternate spelling of global.x > def outer(): > namespace mezzanine > a = 123 > print a # => 123 > print mezzanine.a # => 123 (the name space name is visible and functional locally) > def inner(): > print mezzanine.a => 123 > mezznine.a =456 > inner() > print a # = 456 > global.x = re-binds global x, name error if not preexisting. > > This would allow creating mezzanine like an attribute view of the slots in that local namespace, > as well as making namespace itself visible there, so the access to mezzanine would look like a read access to > an ordinary object named mezzanine that happened to have attribute slots matching outer's local name space. > > Efficiency might make it desirable not to extend named namespaces with new names, function locals being > slotted in a fixed space tied into the frame (I think). But there are tricks I guess. > Anyway, I hadn't seen this idea before. Seems > > Regards, > Bengt Richter > > > > >On 2/20/06, Almann T. Goo wrote: > >> I am considering developing a PEP for enabling a mechanism to assign to free > >> variables in a closure (nested function). My rationale is that with the > >> advent of PEP 227 , Python has proper nested lexical scopes, but can have > >> undesirable behavior (especially with new developers) when a user makes > >> wants to make an assignment to a free variable within a nested function. > >> Furthermore, after seeing numerous kludges to "solve" the problem with a > >> mutable object, like a list, as the free variable do not seem "Pythonic." I > >> have also seen mention that the use of classes can mitigate this, but that > >> seems, IMHO, heavy handed in cases when an elegant solution using a closure > >> would suffice and be more appropriate--especially when Python already has > >> nested lexical scopes. > >> > >> I propose two possible approaches to solve this issue: > >> > >> 1. 
Adding a keyword such as "use" that would follow similar semantics as > >> "global" does today. A nested scope could declare names with this keyword > >> to enable assignment to such names to change the closest parent's binding. > >> The semantic would be to keep the behavior we experience today but tell the > >> compiler/interpreter that a name declared with the "use" keyword would > >> explicitly use an enclosing scope. I personally like this approach the most > >> since it would seem to be in keeping with the current way the language works > >> and would probably be the most backwards compatible. The semantics for how > >> this interacts with the global scope would also need to be defined (should > >> "use" be equivalent to a global when no name exists all parent scopes, etc.) > >> > >> > >> def incgen( inc = 1 ) : > >> a = 6 > >> def incrementer() : > >> use a > >> #use a, inc <-- list of names okay too > >> a += inc > >> return a > >> return incrementer > >> > >> Of course, this approach suffers from a downside that every nested scope > >> that wanted to assign to a parent scope's name would need to have the "use" > >> keyword for those names--but one could argue that this is in keeping with > >> one of Python's philosophies that "Explicit is better than implicit" (PEP > >> 20). This approach also has to deal with a user declaring a name with " > >> use" that is a named parameter--this would be a semantic error that could be > >> handled like "global " does today with a SyntaxError. > >> > >> 2. Adding a keyword such as "scope" that would behave similarly to > >> JavaScript's " var" keyword. A name could be declared with such a keyword > >> optionally and all nested scopes would use the declaring scope's binding > >> when accessing or assigning to a particular name. This approach has similar > >> benefits to my first approach, but is clearly more top-down than the first > >> approach. 
Subsequent "scope" declarations would create a new binding at the > >> declaring scope for the declaring and child scopes to use. This could > >> potentially be a gotcha for users expecting the binding semantics in place > >> today. Also the scope keyword would have to be allowed to be used on > >> parameters to allow such parameter names to be used in a similar fashion in > >> a child scope. > >> > >> > >> def incgen( inc = 1 ) : > >> #scope inc <-- allow scope declaration for bound parameters (not a big > >> fan of this) > >> scope a = 6 > >> def incrementer() : > >> a += inc > >> return a > >> return incrementer > >> > >> This approach would be similar to languages like JavaScript that allow for > >> explicit scope binding with the use of "var" or more static languages that > >> allow re-declaring names at lower scopes. I am less in favor of this, > >> because I don't think it feels very "Pythonic". > >> > >> As a point of reference, some languages such as Ruby will only bind a new > >> name to a scope on assignment when an enclosing scope does not have the name > >> bound. I do believe the Python name binding semantics have issues (for > >> which the "global" keyword was born), but I feel that the "fixing" the > >> Python semantic to a more "Ruby-like" one adds as many problems as it solves > >> since the "Ruby-like" one is just as implicit in nature. Not to mention the > >> backwards compatibility impact is probably much larger. > >> > >> I would like the community's opinion if there is enough out there that think > >> this would be a worthwile endevour--or if there is already an initiative > >> that I missed. Please let me know your questions, comments. > >> > >> Best Regards, > >> Almann > >> > >> -- > >> Almann T. 
Goo > >> almann.goo at gmail.com > >> _______________________________________________ > >> Python-Dev mailing list > >> Python-Dev at python.org > >> http://mail.python.org/mailman/listinfo/python-dev > >> Unsubscribe: > >> http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > >> > >> > >> > >_______________________________________________ > >Python-Dev mailing list > >Python-Dev at python.org > >http://mail.python.org/mailman/listinfo/python-dev > >Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From jeremy at alum.mit.edu Tue Feb 21 15:37:06 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 21 Feb 2006 09:37:06 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43fb1c0a.1382282887@news.gmane.org> Message-ID: On 2/21/06, Jeremy Hylton wrote: > I had to lookup top-post :-). > > On 2/21/06, Bengt Richter wrote: > > On Tue, 21 Feb 2006 08:02:08 -0500, "Jeremy Hylton" wrote: > > >Jeremy > > Hey, only Guido is allowed to top-post. He said so ;-) > > The Gmail UI makes it really easy to forget where the q Sorry about that. Hit the send key by mistake. The Gmail UI makes it really easy to forget where the quoted text is in relation to your own text. > > But to the topic, it just occurred to me that any outer scopes could be given names > > (including global namespace, but that would have the name global by default, so > > global.x would essentially mean what globals()['x'] means now, except it would > > be a name error if x didn't pre-exist when accessed via namespace_name.name_in_space notation. 
Isn't this suggestion the same as Greg Ewing's? > > namespace g_alias # g_alias.x becomes alternate spelling of global.x > > def outer(): > > namespace mezzanine > > a = 123 > > print a # => 123 > > print mezzanine.a # => 123 (the name space name is visible and functional locally) > > def inner(): > > print mezzanine.a # => 123 > > mezzanine.a = 456 > > inner() > > print a # => 456 > > global.x = re-binds global x, name error if not preexisting. > > > > This would allow creating mezzanine like an attribute view of the slots in that local namespace, > > as well as making namespace itself visible there, so the access to mezzanine would look like a read access to > > an ordinary object named mezzanine that happened to have attribute slots matching outer's local name space. I don't think using attribute access is particularly clear here. It introduces an entirely new concept, a first-class namespace, in order to solve a small scoping problem. It looks too much like attribute access and not enough like accessing a variable. Jeremy > > Efficiency might make it desirable not to extend named namespaces with new names, function locals being > > slotted in a fixed space tied into the frame (I think). But there are tricks I guess. > > Anyway, I hadn't seen this idea before. Seems > > > > Regards, > > Bengt Richter > > > > > > > >On 2/20/06, Almann T. Goo wrote: > > >> I am considering developing a PEP for enabling a mechanism to assign to free > > >> variables in a closure (nested function). My rationale is that with the > > >> advent of PEP 227, Python has proper nested lexical scopes, but can have > > >> undesirable behavior (especially with new developers) when a user > > >> wants to make an assignment to a free variable within a nested function. > > >> Furthermore, the numerous kludges I have seen to "solve" the problem with a > > >> mutable object, like a list, as the free variable do not seem "Pythonic." 
> > >> [rest of Almann's proposal and the quoted list footers trimmed -- quoted in full in the previous message] From almann.goo at gmail.com Tue Feb 21 16:12:06 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Tue, 21 Feb 2006 10:12:06 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43fb1c0a.1382282887@news.gmane.org> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43fb1c0a.1382282887@news.gmane.org> Message-ID: <7e9b97090602210712x5af54877w71cd0fffdb60d779@mail.gmail.com> > But to the topic, it just occurred to me that any outer scopes could be given names > (including global namespace, but that would have the name global by default, so > global.x would essentially mean what globals()['x'] means now, except it would > be a name error if x didn't pre-exist when accessed via namespace_name.name_in_space notation. 
> > > namespace g_alias # g_alias.x becomes alternate spelling of global.x > def outer(): > namespace mezzanine > a = 123 > print a # => 123 > print mezzanine.a # => 123 (the name space name is visible and functional locally) > def inner(): > print mezzanine.a # => 123 > mezzanine.a = 456 > inner() > print a # => 456 > global.x = re-binds global x, name error if not preexisting. > > This would allow creating mezzanine like an attribute view of the slots in that local namespace, > as well as making namespace itself visible there, so the access to mezzanine would look like a read access to > an ordinary object named mezzanine that happened to have attribute slots matching outer's local name space. > This seems like a neat idea in principle, but I wonder if it removes consistency from the language. Consider that in both the scope that declares the namespace and its child scopes, names can be accessed either through the namespace object or directly, but *only* in the child scopes must re-binding of a name go through the namespace object. def outer() : namespace n a = 5 # <-- same as n.a = 5 def inner() : print a # <-- same as n.a n.a = 7 # <-- *not* the same as a = 7 print n.a I don't like how a child scope can access a free variable from an enclosing scope without the namespace object, but needs to use it for re-binding. Because of this, namespace objects have the potential to obfuscate things more than fix the language issue that I am addressing. -Almann -- Almann T. Goo almann.goo at gmail.com From rrr at ronadam.com Tue Feb 21 16:48:21 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 21 Feb 2006 09:48:21 -0600 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43fb1c0a.1382282887@news.gmane.org> Message-ID: <43FB3645.6050702@ronadam.com> Jeremy Hylton wrote: > On 2/21/06, Jeremy Hylton wrote: >> I had to lookup top-post :-). 
>> >> On 2/21/06, Bengt Richter wrote: >>> On Tue, 21 Feb 2006 08:02:08 -0500, "Jeremy Hylton" wrote: >>>> Jeremy >>> Hey, only Guido is allowed to top-post. He said so ;-) >> The Gmail UI makes it really easy to forget where the q > > Sorry about that. Hit the send key by mistake. > > The Gmail UI makes it really easy to forget where the quoted text is > in relation to your own text. > >>> But to the topic, it just occurred to me that any outer scopes could be given names >>> (including global namespace, but that would have the name global by default, so >>> global.x would essentially mean what globals()['x'] means now, except it would >>> be a name error if x didn't pre-exist when accessed via namespace_name.name_in_space notation. > > Isn't this suggestion that same as Greg Ewing's? > >>> namespace g_alias # g_alias.x becomes alternate spelling of global.x >>> def outer(): >>> namespace mezzanine >>> a = 123 >>> print a # => 123 >>> print mezzanine.a # => 123 (the name space name is visible and functional locally) >>> def inner(): >>> print mezzanine.a => 123 >>> mezznine.a =456 >>> inner() >>> print a # = 456 >>> global.x = re-binds global x, name error if not preexisting. >>> >>> This would allow creating mezzanine like an attribute view of the slots in that local namespace, >>> as well as making namespace itself visible there, so the access to mezzanine would look like a read access to >>> an ordinary object named mezzanine that happened to have attribute slots matching outer's local name space. Why not just use a class? def incgen(start=0, inc=1) : class incrementer(object): a = start - inc def __call__(self): self.a += inc return self.a return incrementer() a = incgen(7, 5) for n in range(10): print a(), 7 12 17 22 27 32 37 42 47 52 Cheers, Ronald Adam From barry at python.org Tue Feb 21 16:50:58 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 21 Feb 2006 10:50:58 -0500 Subject: [Python-Dev] Deprecate ``multifile``? 
In-Reply-To: References: Message-ID: <1140537058.10770.2.camel@geddy.wooz.org> On Fri, 2006-02-17 at 14:01 +0100, Georg Brandl wrote: > Fredrik Lundh wrote: > > Georg Brandl wrote: > > > >> as Jim Jewett noted, multifile is supplanted by email as much as mimify etc. > >> but it is not marked as deprecated. Should it be deprecated in 2.5? > > > > -0.5 (gratuitous breakage). > > > > I think the current "see also/supersedes" link is good enough. > > Well, it would be deprecated like the other email modules, that is, only > a note is added to the docs and it is added to PEP 4. There would be no > warning. IIRC, when I brought this up ages ago, there was some grumbling that multifile is useful for other than email/MIME applications. Still, I'm +1 on PEP 4'ing it. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060221/73f6edf0/attachment.pgp From aleaxit at gmail.com Tue Feb 21 16:52:22 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 21 Feb 2006 07:52:22 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FAE298.5040404@canterbury.ac.nz> References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> Message-ID: <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> On Feb 21, 2006, at 1:51 AM, Greg Ewing wrote: ... > Just one more thing -- have you made a final decision > about the name yet? I'd still prefer something like > 'autodict', because to me 'defaultdict' suggests autodict is shorter and sharper and I prefer it, too: +1 > etc.) it seems more accurate to think of the value > produced by the factory as an 'initial value' rather > than a 'default value', and I'd prefer to see it If we call the type autodict, then having the factory attribute named autofactory seems to fit. 
This leaves it open to the reader's imagination to choose whether to think of the value as "initial" or "default" -- it's the *auto* (automatic) value. Alex From fuzzyman at voidspace.org.uk Tue Feb 21 17:09:18 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Tue, 21 Feb 2006 16:09:18 +0000 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz> <43FAE8D0.9040000@voidspace.org.uk> Message-ID: <43FB3B2E.20000@voidspace.org.uk> Guido van Rossum wrote: >On 2/21/06, Fuzzyman wrote: > > >>I've had problems in code that needs to treat strings, lists and >>dictionaries differently (assigning values to a container where all >>three need different handling) and telling the difference but allowing >>duck typing is *problematic*. >> >> > >Consider designing APIs that don't require you to make that kind of >distinction, if you're worried about edge cases and classifying >arbitrary other objects correctly. It's totally possible to create an >object that behaves like a hybrid of a string and a dict. > > > Understood. >If you're only interested in classifying the three specific built-ins >you mention, I'd check for the presence of certain attributes: >hasattr(x, "lower") -> x is a string of some kind; hasattr(x, "sort") >-> x is a list; hasattr(x, "update") -> x is a dict. Also, hasattr(x, >"union") -> x is a set; hasattr(x, "readline") -> x is a file. > >That's duck typing! > > Sure, but that requires a "dictionary like object" to define an update method, and a "list like object" to define a sort method. The mapping and sequence protocols are so loosely defined that some arbitrary decision like this has to be made. (Any object that defines "__getitem__" could follow either or both and duck typing doesn't help you unless you're prepared to make an additional requirement that is outside the loose requirements of the protocol.) 
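Guido's hasattr probes quoted above can be turned into a tiny classifier; a sketch (the function name and the ordering of the checks are illustrative, not from the thread):

```python
def classify(x):
    # Attribute probes taken from Guido's message: "lower" marks a
    # string, "sort" a list, "update" a dict.  The order matters,
    # since e.g. sets also define an update method.
    if hasattr(x, "lower"):
        return "string"
    if hasattr(x, "sort"):
        return "list"
    if hasattr(x, "update"):
        return "dict"
    return "other"

print(classify("abc"))   # string
print(classify([3, 1]))  # list
print(classify({}))      # dict
```

Michael's follow-up resolution (isinstance for strings, __getitem__ plus update for "dictionary-alikes") amounts to reordering and tightening these same probes.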
I can't remember how we solved it, but I think we decided that an object would be treated as a string if it passes an isinstance check, and a dictionary or sequence if it has __getitem__ (but isn't a string instance or subclass). If it has update as well as __getitem__ it is a "dictionary-alike". All the best, Michael Foord >-- >--Guido van Rossum (home page: http://www.python.org/~guido/) > > > From g.brandl at gmx.net Tue Feb 21 17:13:12 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 21 Feb 2006 17:13:12 +0100 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FAE45F.3020000@canterbury.ac.nz> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > def my_func(): > namespace foo > foo.x = 42 > > def inc_x(): > foo.x += 1 > > The idea here is that foo wouldn't be an object in > its own right, but just a collection of names that > would be implemented as local variables of my_func. But why is that better than class namespace(object): pass def my_func(): foo = namespace() (...) ? Georg From almann.goo at gmail.com Tue Feb 21 17:15:40 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Tue, 21 Feb 2006 11:15:40 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FB3645.6050702@ronadam.com> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43fb1c0a.1382282887@news.gmane.org> <43FB3645.6050702@ronadam.com> Message-ID: <7e9b97090602210815u3e6ba9f4jc9eceea6fac0facb@mail.gmail.com> > Why not just use a class? 
> > > def incgen(start=0, inc=1) : > class incrementer(object): > a = start - inc > def __call__(self): > self.a += inc > return self.a > return incrementer() > > a = incgen(7, 5) > for n in range(10): > print a(), Because I think that this is a workaround for a concept that the language doesn't support elegantly with its lexically nested scopes. IMO, you are emulating name rebinding in a closure by creating an object to encapsulate the name you want to rebind--you don't need this workaround if you only need to access free variables in an enclosing scope. I provided a "lighter" example that didn't need a callable object but could use any mutable such as a list. This kind of workaround is needed as soon as you want to re-bind a parent scope's name, except in the case when the parent scope is the global scope (since there is the "global" keyword to handle this). It's this dichotomy that concerns me, since it seems to be against the elegance of Python--at least in my opinion. It seems artificially limiting that enclosing scope name rebinds are not provided for by the language especially since the behavior with the global scope is not so. In a nutshell I am proposing a solution to make nested lexical scopes to be orthogonal with the global scope and removing a "wart," as Jeremy put it, in the language. -Almann -- Almann T. Goo almann.goo at gmail.com From g.brandl at gmx.net Tue Feb 21 17:18:01 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 21 Feb 2006 17:18:01 +0100 Subject: [Python-Dev] Deprecate ``multifile``? In-Reply-To: <1140537058.10770.2.camel@geddy.wooz.org> References: <1140537058.10770.2.camel@geddy.wooz.org> Message-ID: Barry Warsaw wrote: > On Fri, 2006-02-17 at 14:01 +0100, Georg Brandl wrote: >> Fredrik Lundh wrote: >> > Georg Brandl wrote: >> > >> >> as Jim Jewett noted, multifile is supplanted by email as much as mimify etc. >> >> but it is not marked as deprecated. Should it be deprecated in 2.5? >> > >> > -0.5 (gratuitous breakage). 
>> > >> > I think the current "see also/supersedes" link is good enough. >> >> Well, it would be deprecated like the other email modules, that is, only >> a note is added to the docs and it is added to PEP 4. There would be no >> warning. > > IIRC, when I brought this up ages ago, there was some grumbling that > multifile is useful for other than email/MIME applications. Still, I'm > +1 on PEP 4'ing it. Which means "go ahead" or "wait for others to be -1"? Georg From g.brandl at gmx.net Tue Feb 21 17:17:11 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 21 Feb 2006 17:17:11 +0100 Subject: [Python-Dev] Removing Non-Unicode Support? In-Reply-To: <43FADF1D.90603@egenix.com> References: <20060216052542.1C7F21E4009@bag.python.org> <43F45039.2050308@egenix.com> <43F5D222.2030402@egenix.com> <43F90E9E.8060603@taupro.com> <43F98D89.2060102@taupro.com> <43F9C2CA.4010808@egenix.com> <43FA44E3.9090206@v.loewis.de> <43FADF1D.90603@egenix.com> Message-ID: M.-A. Lemburg wrote: > Note that I'm not saying that these switches are useless - of > course they do allow to strip down the Python interpreter. > I believe that only very few people are interested in having these > options and it's fair enough to put the burden of maintaining these > branches on them. Which is proven by the fact that many tests fail without unicode. So at least the people building --without-unicode don't care much about brokenness. Georg From tjreedy at udel.edu Tue Feb 21 17:20:13 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 21 Feb 2006 11:20:13 -0500 Subject: [Python-Dev] buildbot vs. Windows References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk><20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de><20060221011148.GA20714@panix.com><1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> Message-ID: "Neal Norwitz" wrote in message news:ee2a432c0602210009x2f4d1fffl3d49037b9b084d1e at mail.gmail.com... 
> There's nothing to prevent buildbot from making debug builds, though > that is not currently done. Now that there are separate report pages for 2.4 and 2.5, you could add pages for debug builds, perhaps with a lower frequency (once a day?), without cluttering up the main two pages. From guido at python.org Tue Feb 21 17:31:49 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 21 Feb 2006 08:31:49 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> Message-ID: On 2/21/06, Alex Martelli wrote: > > On Feb 21, 2006, at 1:51 AM, Greg Ewing wrote: > ... > > Just one more thing -- have you made a final decision > > about the name yet? I'd still prefer something like > > 'autodict', because to me 'defaultdict' suggests > > autodict is shorter and sharper and I prefer it, too: +1 Apart from it somehow hashing to the same place as "autodidact" in my brain :), I don't like it as much; someone who doesn't already know what it is doesn't have a clue what an "automatic dictionary" would offer compared to a regular one. IMO "default" conveys just enough of a hint that something is being defaulted. A name long enough to convey all the details of why, when, and how it defaults wouldn't be practical. (Look up the history of botanical names under Linnaeus for a simile.) I'll let it brew in SF for a while but I expect to be checking this in at PyCon. 
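The proposal under discussion did ship in Python 2.5 as collections.defaultdict, with the factory attribute spelled default_factory; a minimal sketch of the defaulting behaviour:

```python
from collections import defaultdict

# Missing keys invoke default_factory instead of raising KeyError.
d = defaultdict(list)          # same effect as setting d.default_factory = list
for word in ["apple", "avocado", "banana"]:
    d[word[0]].append(word)

print(d["a"])  # ['apple', 'avocado']
print(d["c"])  # [] -- the failed lookup itself inserted a fresh list
```

Note the side effect shown on the last line: merely reading a missing key inserts it, which is part of why the naming ("default" vs. "auto") was debated.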
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From bokr at oz.net Tue Feb 21 18:09:36 2006 From: bokr at oz.net (Bengt Richter) Date: Tue, 21 Feb 2006 17:09:36 GMT Subject: [Python-Dev] defaultdict proposal round three References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> <43fa60a7.1334311178@news.gmane.org> Message-ID: <43fb2472.1384434281@news.gmane.org> On Tue, 21 Feb 2006 05:58:52 -0800, "Guido van Rossum" wrote: >On 2/20/06, Bengt Richter wrote: >> How about doing it as an expression, empowering ( ;-) the dict just afer creation? >> E.g., for >> >> d = dict() >> d.default_factory = list >> >> you could write >> >> d = dict()**list > >Bengt, can you let your overactive imagination rest for a while? I >recommend that you sit back, relax for a season, and reflect on the >zen nature of Pythonicity. Then come back and hopefully you'll be able >to post without embarrassing yourself continuously. > It is tempting to seek vindication re "embarrassing yourself continuously" but I'll let it go, and treat it as an opportunity to explore the nature of my ego a little further ;-) I am not embarrassed by having an "overactive imagination," thank you, but if it is causing a problem for you here, I apologize, and will withdraw. Thanks for the nudge. I really have been wasting a lot of time using python trivial pursuits as an escape from tackling stuff that I haven't been ready for. It's time I focus. Thanks, and good luck. 
I'll be off now ;-) Regards, Bengt Richter From rrr at ronadam.com Tue Feb 21 18:40:37 2006 From: rrr at ronadam.com (Ron Adam) Date: Tue, 21 Feb 2006 11:40:37 -0600 Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]] In-Reply-To: <43FAE28C.1030009@canterbury.ac.nz> References: <43F5CFE6.3040502@egenix.com> <43F6338D.8050300@v.loewis.de> <20060217141138.5F99.JCARLSON@uci.edu> <43F6539F.5040707@v.loewis.de> <20060218051344.GC28761@panix.com> <43F6E1FA.7080909@v.loewis.de> <43f9424b.1261003927@news.gmane.org> <43FA618D.7070804@ronadam.com> <43FAE28C.1030009@canterbury.ac.nz> Message-ID: <43FB5095.5080308@ronadam.com> Greg Ewing wrote: > Ron Adam wrote: > >> Storing byte information as 16 or 32 bits ints could take up a rather >> lot of memory in some cases. > > I don't quite see the point here. Inside a bytes object, > they would be stored 1 byte per byte. Nobody is suggesting > that they would take up more than that just because > a_bytes_object[i] happens to return an int. Yes, and the above is the obvious reason why not. Not that I thought it was being considered. > So the only reason to introduce a new "byte" type is to > remove some of the operations that int has. We can already > do bitwise operations on an int, so we don't need a new > type to add that capability. Yes, and a byte type isn't needed if the individual bytes are always in a bytes object. A bytes object with a single byte would be an octet in that case. > What's more, I can see this leading to people asking for > arithmetic operations to be *added* to the byte type so > they can do wrap-around arithmetic, and then for 16-bit, > 32-bit, 64-bit etc. versions of it, etc. etc. I agree the bytes object shouldn't re implement arithmetic. I would like bitwise logic operations on bytes() and byte ranges() if possible. 
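The bitwise operations Ron asks for never did land on byte strings; a sketch of what they would do, written elementwise in today's Python 3 (the helper names are illustrative):

```python
def band(a, b):
    # Elementwise AND over two equal-length byte strings.
    return bytes(x & y for x, y in zip(a, b))

def bxor(a, b):
    # Elementwise XOR, e.g. for masking or simple checksums.
    return bytes(x ^ y for x, y in zip(a, b))

a = bytes([0b1100, 0b1010])
b = bytes([0b1010, 0b0110])
print(band(a, b) == bytes([0b1000, 0b0010]))  # True
print(bxor(a, b) == bytes([0b0110, 0b1100]))  # True
```

This also illustrates the point debated above: iterating a bytes object yields plain ints, so no separate "byte"/"octet" type is needed to get at the bit operations.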
Cheers, Ronald Adam From barry at python.org Tue Feb 21 19:19:24 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 21 Feb 2006 13:19:24 -0500 Subject: [Python-Dev] Deprecate ``multifile``? In-Reply-To: References: <1140537058.10770.2.camel@geddy.wooz.org> Message-ID: <1140545964.10794.5.camel@geddy.wooz.org> On Tue, 2006-02-21 at 17:18 +0100, Georg Brandl wrote: > > IIRC, when I brought this up ages ago, there was some grumbling that > > multifile is useful for other than email/MIME applications. Still, I'm > > +1 on PEP 4'ing it. > > Which means "go ahead" or "wait for others to be -1"? s/or/and/ ? :) I say go ahead and add it to PEP 4. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060221/81cfbe97/attachment.pgp From barry at python.org Tue Feb 21 19:26:45 2006 From: barry at python.org (Barry Warsaw) Date: Tue, 21 Feb 2006 13:26:45 -0500 Subject: [Python-Dev] A codecs nit In-Reply-To: <43F724BD.9060000@egenix.com> References: <43F397F6.4090402@egenix.com> <1140040661.14818.42.camel@geddy.wooz.org> <43F724BD.9060000@egenix.com> Message-ID: <1140546405.10770.13.camel@geddy.wooz.org> On Sat, 2006-02-18 at 14:44 +0100, M.-A. Lemburg wrote: > In Py 2.5 we'll change that. The encodings package search > function will only allow codecs in that package to be > imported. All other codec packages will have to provide > their own search function and register this with the > codecs registry. My weekend experimentation used the imp functions to constrain the module search path to encodings.__path__, but I'm not sure that's much better than prepending 'encodings.' on the module name and letting __import__() do its thing. 
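A minimal sketch of the "prepend 'encodings.'" approach Barry describes (the function name is illustrative, and a real codec search function must return codec entries -- a tuple of codec functions in the 2.x of this thread, a CodecInfo today -- rather than a bare module):

```python
def find_in_encodings(name):
    # Constrain the lookup to the encodings package: prefixing the
    # package name means a codec lookup can no longer import an
    # arbitrary top-level module.
    try:
        return __import__("encodings." + name, fromlist=["getregentry"])
    except ImportError:
        return None

print(find_in_encodings("ascii") is not None)      # True
print(find_in_encodings("no_such_codec") is None)  # True
```

Constraining the search path to encodings.__path__ with the imp machinery achieves the same containment, just without relying on __import__'s dotted-name handling.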
> The big question is: what to do about 2.3 and 2.4 - adding > the same patch will cause serious breakage, since popular > codec packages such as Tamito's Japanese package rely > on the existing behavior. FWIW, Mailman has had to do a bunch of special case loading of the 3rd party Japanese and Korean codecs for older Pythons, and the email package also has to do special tests for e.g. euc-jp before it'll do the Asian codec tests. I think most of the latter is unnecessary for 2.4 and beyond, and I suspect that the former is also unnecessary for 2.4 and beyond. It's probably still necessary for 2.3. IIUC, there are still people who prefer Tamito's package over the built-in Japanese codecs in 2.4, but I don't understand all the details. My preference would be to backport the fix to 2.4 but not worry about 2.3 since there are no plans to ever release a 2.3.6 AFAIK. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060221/d5e27126/attachment.pgp From rasky at develer.com Tue Feb 21 19:47:04 2006 From: rasky at develer.com (Giovanni Bajo) Date: Tue, 21 Feb 2006 19:47:04 +0100 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> Message-ID: <0afe01c63717$351ff4a0$bf03030a@trilan> Almann T. Goo wrote: >> 1. Adding a keyword such as "use" that would follow similar semantics as " >> global" does today. A nested scope could declare names with this keyword >> to enable assignment to such names to change the closest parent's binding. +0, and I like "outer". I like the idea, but I grepped several Python programs I wrote, and found out that I used the list trick many times, but almost always in quick-hack code in unittests. 
I wasn't able to find a single instance of this in real code I wrote, so I can't really be +1. -- Giovanni Bajo From jcarlson at uci.edu Tue Feb 21 19:52:48 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 21 Feb 2006 10:52:48 -0800 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FAE40B.8090406@canterbury.ac.nz> References: <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> Message-ID: <20060221104839.601C.JCARLSON@uci.edu> Greg Ewing wrote: > > Stephen J. Turnbull wrote: > > > What I advocate for Python is to require that the standard base64 > > codec be defined only on bytes, and always produce bytes. > > I don't understand that. It seems quite clear to me that > base64 encoding (in the general sense of encoding, not the > unicode sense) takes binary data (bytes) and produces characters. > That's the whole point of base64 -- so you can send arbitrary > data over a channel that is only capable of dealing with > characters. > > So in Py3k the correct usage would be > > base64 unicode > encode encode(x) > original bytes --------> unicode ---------> bytes for transmission > <-------- <--------- > base64 unicode > decode decode(x) > > where x is whatever unicode encoding the transmission > channel uses for characters (probably ascii or an ascii > superset, but not necessarily). It doesn't seem strange to you to need to encode data twice to be able to have a usable sequence of characters which can be embedded in an effectively 7-bit email; when base64 was, dare I say it, designed to have 7-bit email as its destination in the first place? It does to me. 
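Rendered in Python 3 terms, the two-step pipeline Greg diagrams looks like the following sketch; `b64encode` is bytes-to-bytes, and the `.decode`/`.encode` calls are the extra character-encoding hop Josiah objects to:

```python
import base64

payload = b"\x00\xff\x10 arbitrary binary"
b64_bytes = base64.b64encode(payload)   # base64 "encode": bytes -> ASCII-safe bytes
as_text = b64_bytes.decode("ascii")     # the extra hop: bytes -> characters
wire = as_text.encode("ascii")          # ...and back to bytes for transmission
assert base64.b64decode(wire) == payload
```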
- Josiah From python at rcn.com Tue Feb 21 19:53:24 2006 From: python at rcn.com (Raymond Hettinger) Date: Tue, 21 Feb 2006 13:53:24 -0500 Subject: [Python-Dev] defaultdict proposal round three References: <43FA4E88.4090507@canterbury.ac.nz><43FAE298.5040404@canterbury.ac.nz><71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> Message-ID: <007a01c63718$18335d40$6a01a8c0@RaymondLaptop1> >> > Just one more thing -- have you made a final decision >> > about the name yet? I'd still prefer something like >> > 'autodict', because to me 'defaultdict' suggests >> >> autodict is shorter and sharper and I prefer it, too: +1 > > Apart from it somehow hashing to the same place as "autodidact" in my > brain :), I don't like it as much.; someone who doesn't already know > what it is doesn't have a clue what an "automatic dictionary" would > offer compared to a regular one. IMO "default" conveys just enough of > a hint that something is being defaulted. A name long enough to convey > all the details of why, when, and it defaults wouldn't be practical. > (Look up the history of botanical names under Linnaeus for a simile.) I'm with Guido on this one. The word default is closely associated with what makes this different from regular dictionaries and it is closely associated with the name of the attribute, default_factory. Also, the word has a history of parallel use in the context of dict.get(). The word "auto" on the other hand is associated with nothing. You might as well argue to call it magicdictionary because "magic" has two letters less than "default" ;-) Raymond From tjreedy at udel.edu Tue Feb 21 20:16:17 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 21 Feb 2006 14:16:17 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> Message-ID: "Almann T. 
Goo" wrote in message news:7e9b97090602210516o5d1a823apedcea66846a271b5 at mail.gmail.com... > I certainly hope that an initiative like this doesn't get stymied by > the lack of a good name for such a keyword. Maybe something like > "outer"? Adding a keyword has a cost that you have so far ignored. Guido is rightfully very cautious about additions, especially for esthetic reasons. The issue of rebinding enclosed names was partly discussed in PEP 227. Sometime after the implementation of the PEP in 2.1, it was thoroughly discussed again (100+ posts?) in this forum. There were perhaps 10 different proposals, including, I believe, 'outer'. Guido rejected them all as having costs greater than the benefits. Perhaps you can find this discussion in the archives. I remember it as a Jan-Feb discussion but might be wrong. This thread so far seems like a rehash of parts of the earlier discussion. In the absence of indication from Guido that he is ready to reopen the issue, perhaps it would be better to go to comp.lang.python. In any case, reconsideration is more likely to be stimulated by new experience with problems in real code than by repeats of 'orthogonality' desires and rejected changes. --- In another post, you rejected the use of class instances by opining: >Because I think that this is a workaround for a concept that the >language doesn't support elegantly with its lexically nested scopes. >IMO, you are emulating name rebinding in a closure by creating an >object to encapsulate the name you want to rebind Guido, on the other hand, views classes and instances as Python's method of doing what other (functional) languages do with closures. From the PEP: "Given that this would encourage the use of local variables to hold state that is better stored in a class instance, it's not worth adding new syntax to make this possible (in Guido's opinion)." He reiterated this viewpoint in the post-PEP discussion mentioned above.
I think he would specifically reject the view that Python's alternative is a 'workaround' and 'emulation' of what you must consider to be the real thing. Terry Jan Reedy From jeremy at alum.mit.edu Tue Feb 21 20:25:06 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue, 21 Feb 2006 14:25:06 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> Message-ID: On 2/21/06, Terry Reedy wrote: > > "Almann T. Goo" wrote in message > news:7e9b97090602210516o5d1a823apedcea66846a271b5 at mail.gmail.com... > > > I certainly hope that an initiative like this doesn't get stymied by > > the lack of a good name for such a keyword. Maybe something like > > "outer"? > > Adding a keyword has a cost that you have so far ignored. Guido is > rightfully very cautious about additions, especially for esthetic reasons. > > The issue of rebinding enclosed names was partly discussed in PEP 227. > Sometime after the implementation of the PEP in 2.1, it was thoroughly > discussed again (100+ posts?) in this forum. There were perhaps 10 > different proposals, including, I believe, 'outer'. Guido rejected them > all as having costs greater than the benefits. Perhaps you can find this > discussion in the archives. I remember it as a Jan-Feb discussion but > might be wrong. If I recall the discussion correctly, Guido said he was open to a version of nested scopes that allowed rebinding. Not sure that the specifics of the previous discussion are necessary, but I recall being surprised by the change in opinion since 2.1 :-). Jeremy > > This thread so far seems like a rehash of parts of the earlier discussion. > In the absence of indication from Guido that he is ready to reopen the > issue, perhaps it would be better to go to comp.lang.python.
In and case, > reconsideration is more likely to be stimulated by new experience with > problems in real code than by repeats of 'orthogonality' desires and > rejected changes. > > --- > > In another post, you rejected the use of class instances by opining: > > >Because I think that this is a workaround for a concept that the > >language doesn't support elegantly with its lexically nested scopes. > > >IMO, you are emulating name rebinding in a closure by creating an > >object to encapsulate the name you want to rebind > > Guido, on the other hand, views classes and instances as Python's method of > doing what other (functional) languages do with closures. From the PEP: > "Given that this > would encourage the use of local variables to hold state that is > better stored in a class instance, it's not worth adding new > syntax to make this possible (in Guido's opinion)." > He reiterated this viewpoint in the post-PEP discussion mentioned above. I > think he would specificly reject the view that Python's alternative is a > 'workaround' and 'emulation' of what you must consider to be the real > thing. > > Terry Jan Reedy > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > From g.brandl at gmx.net Tue Feb 21 20:26:23 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 21 Feb 2006 20:26:23 +0100 Subject: [Python-Dev] Deprecate ``multifile``? In-Reply-To: <1140545964.10794.5.camel@geddy.wooz.org> References: <1140537058.10770.2.camel@geddy.wooz.org> <1140545964.10794.5.camel@geddy.wooz.org> Message-ID: Barry Warsaw wrote: > On Tue, 2006-02-21 at 17:18 +0100, Georg Brandl wrote: > >> > IIRC, when I brought this up ages ago, there was some grumbling that >> > multifile is useful for other than email/MIME applications. Still, I'm >> > +1 on PEP 4'ing it. 
>> >> Which means "go ahead" or "wait for others to be -1"? > > s/or/and/ ? :) > > I say go ahead and add it to PEP 4. Done, and added a note in the docs. More will not be needed until 3.0, I suppose. Georg From python at rcn.com Tue Feb 21 20:33:53 2006 From: python at rcn.com (Raymond Hettinger) Date: Tue, 21 Feb 2006 14:33:53 -0500 Subject: [Python-Dev] defaultdict proposal round three References: <007701c63666$aaf98080$7600a8c0@RaymondLaptop1> Message-ID: <003d01c6371d$c0364840$6a01a8c0@RaymondLaptop1> Then you will likely be happy with Guido's current version of the patch. ----- Original Message ----- From: "Crutcher Dunnavant" To: "Raymond Hettinger" Cc: "Python Dev" Sent: Monday, February 20, 2006 8:57 PM Subject: Re: [Python-Dev] defaultdict proposal round three in two ways: 1) dict.get doesn't work for object dicts or in exec/eval contexts, and 2) dict.get requires me to generate the default value even if I'm not going to use it, a process which may be expensive. On 2/20/06, Raymond Hettinger wrote: > [Crutcher Dunnavant ] > >> There are many times that I want d[key] to give me a value even when > >> it isn't defined, but that doesn't always mean I want to _save_ that > >> value in the dict. > > How does that differ from the existing dict.get method? > > > Raymond > -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com From jcarlson at uci.edu Tue Feb 21 20:31:50 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 21 Feb 2006 11:31:50 -0800 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FAE45F.3020000@canterbury.ac.nz> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> Message-ID: <20060221105309.601F.JCARLSON@uci.edu> Greg Ewing wrote: > > Josiah Carlson wrote: > > > Mechanisms which rely on manipulating variables within closures or > > nested scopes to function properly can be elegant, but I've not yet seen > > one that *really* is. 
> > It seems a bit inconsistent to say on the one hand > that direct assignment to a name in an outer scope > is not sufficiently useful to be worth supporting, > while at the same time providing a way to do it for > one particular scope, i.e. 'global'. Would you > advocate doing away with it? I didn't conceive of the idea or implementation of 'global', it was before my time. I have found that *using* global can be convenient (and sometimes even directly manipulating globals() can be even more convenient). However, I believe global was and is necessary for the same reasons for globals in any other language. Are accessors for lexically nested scopes necessary? Obviously no. The arguments for their inclusion are: easier access to parent scopes and potentially faster execution. The question which still remains in my mind, which I previously asked, is whether the use cases are compelling enough to warrant the feature addition. > > Of course using > > classes directly with a bit of work can offer you everything you want > > from a closure, with all of the explicitness that you could ever want. > > There are cases where the overhead (in terms of amount > of code) of defining a class and creating an instance of > it swamps the code which does the actual work, and, > I feel, actually obscures what is being done rather > than clarifies it. These cases benefit from the ability > to refer to names in enclosing scopes, and I believe > they would benefit further from the ability to assign > to such names. class namespace: pass def fcn(...): foo = namespace() ... Overwhelms the user? > Certainly the feature could be abused, as can the > existing nested scope facilities, or any other language > feature for that matter. Mere potential for abuse is > not sufficient reason to reject a feature, or the > language would have no features at all. Indeed, but as I have asked, I would like to see some potential nontrivial *uses*. No one has responded to this particular request.
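Josiah's two-line `class namespace: pass` pattern, filled out into a runnable sketch (combined here with the `foo.x` names from Greg's example; the function body is invented to complete the ellipsis):

```python
class namespace:
    pass

def fcn():
    foo = namespace()     # explicit shared state instead of rebinding an outer local
    foo.x = 42

    def inc_x():
        foo.x += 1        # attribute mutation needs no special scoping rules

    inc_x()
    return foo.x

assert fcn() == 43
```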
When I am confronted with a lack of uses, and the potential for abuses, I'm going to have to side on "no thanks, the potential abuse outweighs the nonexistent nontrivial use". > Another consideration is efficiency. CPython currently > implements access to local variables (both in the > current scope and all outer ones except the module > scope) in an extremely efficient way. There's > always the worry that using attribute access in > place of local variable access is greatly increasing > the runtime overhead for no corresponding benefit. Indeed, the only benefit to using classes is that you gain explicitness. To gain speed in current Python, one may need to do a bit more work (slots, call frame hacking, perhaps an AST manipulation with the new AST branch, etc.). > You mention the idea of namespaces. Maybe an answer > is to provide some lightweight way of defining a > temporary, single-use namespace for use within > nested scopes -- lightweight in terms of both code > volume and runtime overhead. Perhaps something like > > def my_func(): > namespace foo > foo.x = 42 > > def inc_x(): > foo.x += 1 Because this discussion is not about "how do I create a counter in Python", let's see some examples which are not counters and which are improved through the use of this "namespace", or "use", "scope", etc. > > Introducing these two new keywords is equivalent to > > encouraging nested scope use. Right now nested scope > > use is "limited" or "fraught with gotchas". > What you seem to be saying here is: Nested scope use > is Inherently Bad. Therefore we will keep them Limited > and Fraught With Gotchas, so people will be discouraged > from using them. > > Sounds a bit like the attitude of certain religious > groups to condoms. (Might encourage people to have > sex -- can't have that -- look at all the nasty diseases > you can get!)
If you take that statement within the context of the other things I had been saying in regards to closures and nested scopes, namely that I find their use rarely, if ever, truly elegant, it becomes less like "condom use" as purported by some organizations, and more like kicking a puppy for barking: it is of my opinion that there are usually better ways of dealing with the problem (don't kick puppies for barking and don't use closures). - Josiah From walter at livinglogic.de Tue Feb 21 20:36:31 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue, 21 Feb 2006 20:36:31 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <4f0b69dc0602190317g2b98df5cw2fcf42f948540b01@mail.gmail.com> References: <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> <43F5FD88.8090605@livinglogic.de> <43F63C07.9030901@egenix.com> <61425.89.54.8.114.1140279099.squirrel@isar.livinglogic.de> <43F7585E.4080909@egenix.com> <61847.89.54.8.114.1140296899.squirrel@isar.livinglogic.de> <4f0b69dc0602190317g2b98df5cw2fcf42f948540b01@mail.gmail.com> Message-ID: <43FB6BBF.7030101@livinglogic.de> Hye-Shik Chang wrote: > On 2/19/06, Walter Dörwald wrote: >> M.-A. Lemburg wrote: >>> Walter Dörwald wrote: >>>> Anyway, I've started implementing a patch that just adds >>>> codecs.StatefulEncoder/codecs.StatefulDecoder. UTF8, UTF8-Sig, >>>> UTF-16, UTF-16-LE and UTF-16-BE are already working. >>> Nice :-) >> gencodec.py is updated now too. The rest should be manageble too. >> I'll leave updating the CJKV codecs to Hye-Shik though. > > Okay. I'll look whether how CJK codecs can be improved by the > new protocol soon. I guess it'll be not so difficult because CJK > codecs have a their own common stateful framework already. OK, here's the patch: bugs.python.org/1436130 (assigned to MAL). > BTW, CJK codecs don't have V yet.
:-) Bye, Walter Dörwald From mal at egenix.com Tue Feb 21 20:48:06 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 21 Feb 2006 20:48:06 +0100 Subject: [Python-Dev] Stateful codecs [Was: str object going in Py3K] In-Reply-To: <43FB6BBF.7030101@livinglogic.de> References: <43F5DFE0.6090806@livinglogic.de> <43F5E5E9.2040809@egenix.com> <43F5F24D.2000802@livinglogic.de> <43F5F5F4.2000906@egenix.com> <43F5FD88.8090605@livinglogic.de> <43F63C07.9030901@egenix.com> <61425.89.54.8.114.1140279099.squirrel@isar.livinglogic.de> <43F7585E.4080909@egenix.com> <61847.89.54.8.114.1140296899.squirrel@isar.livinglogic.de> <4f0b69dc0602190317g2b98df5cw2fcf42f948540b01@mail.gmail.com> <43FB6BBF.7030101@livinglogic.de> Message-ID: <43FB6E76.5050407@egenix.com> Walter Dörwald wrote: > Hye-Shik Chang wrote: > >> On 2/19/06, Walter Dörwald wrote: >>> M.-A. Lemburg wrote: >>>> Walter Dörwald wrote: >>>>> Anyway, I've started implementing a patch that just adds >>>>> codecs.StatefulEncoder/codecs.StatefulDecoder. UTF8, UTF8-Sig, >>>>> UTF-16, UTF-16-LE and UTF-16-BE are already working. >>>> Nice :-) >>> gencodec.py is updated now too. The rest should be manageble too. >>> I'll leave updating the CJKV codecs to Hye-Shik though. >> >> Okay. I'll look whether how CJK codecs can be improved by the >> new protocol soon. I guess it'll be not so difficult because CJK >> codecs have a their own common stateful framework already. > > OK, here's the patch: bugs.python.org/1436130 (assigned to MAL). Thanks. I won't be able to look into it this week though, probably next week. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 21 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From pje at telecommunity.com Tue Feb 21 21:25:38 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 21 Feb 2006 15:25:38 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <20060221105309.601F.JCARLSON@uci.edu> References: <43FAE45F.3020000@canterbury.ac.nz> <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> Message-ID: <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> At 11:31 AM 2/21/2006 -0800, Josiah Carlson wrote: >Greg Ewing wrote: > > > > It seems a bit inconsistent to say on the one hand > > that direct assignment to a name in an outer scope > > is not sufficiently useful to be worth supporting, > > while at the same time providing a way to do it for > > one particular scope, i.e. 'global'. Would you > > advocate doing away with it? > >I didn't conceive of the idea or implementation of 'global', it was >before my time. I have found that *using* global can be convenient (and >sometimes even directly manipulating globals() can be even more >convenient). However, I believe global was and is necessary for the >same reasons for globals in any other language. Here's a crazy idea, that AFAIK has not been suggested before and could work for both globals and closures: using a leading dot, ala the new relative import feature. e.g.: def incrementer(val): def inc(): .val += 1 return .val return inc The '.' would mean "this name, but in the nearest outer scope that defines it". Note that this could include the global scope, so the 'global' keyword could go away in 2.5. And in Python 3.0, the '.' could become *required* for use in closures, so that it's not necessary for the reader to check a function's outer scope to see whether closure is taking place. EIBTI. 
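As it turned out, PEP 3104 later addressed exactly this with a `nonlocal` keyword rather than a leading dot; Phillip's `incrementer` example in that eventual Python 3 spelling:

```python
def incrementer(val):
    def inc():
        nonlocal val      # rebind val in the nearest enclosing scope that defines it
        val += 1
        return val
    return inc

inc = incrementer(41)
assert inc() == 42
assert inc() == 43
```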
Interestingly, the absence of a name before the dot seems to imply that the name is an attribute of the Unnameable. :) Or more prosaically, it treats lexical closures and module globals as special cases of objects. You could perhaps even extend it so that '.' by itself means the same thing as vars(), but that's probably going too far, assuming that the idea wasn't too far gone to begin with. I suspect functional folks will love the '.' idea, but also that folks who wanted to get rid of 'self' will probably scream bloody murder at the idea of using a leading dot to represent a scope instead of 'self'. :) From mbland at acm.org Tue Feb 21 21:39:53 2006 From: mbland at acm.org (Mike Bland) Date: Tue, 21 Feb 2006 12:39:53 -0800 Subject: [Python-Dev] PEP 343 "with" statement patch Message-ID: <57ff0ed00602211239r52a1e96ag4613610005f9b19c@mail.gmail.com> With Neal Norwitz's help, I've submitted an initial patch to implement the "with" statement from PEP 343 (SourceForge request ID 1435715). There is a little more work to be done (on the doc especially), and I have a couple of questions written up on the SourceForge page, but the code works to the best of my understanding of PEP 343 and has a fairly comprehensive set of unit tests to verify it. Looking forward to the review, Mike From mrussell at verio.net Tue Feb 21 21:41:26 2006 From: mrussell at verio.net (Mark Russell) Date: Tue, 21 Feb 2006 20:41:26 +0000 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> Message-ID: On 21 Feb 2006, at 19:25, Jeremy Hylton wrote: > If I recall the discussion correctly, Guido said he was open to a > version of nested scopes that allowed rebinding. PEP 227 mentions using := as a rebinding operator, but rejects the idea as it would encourage the use of closures.
But to me it seems more elegant than some special keyword, especially as it could also replace the "global" keyword. It doesn't handle things like "x += y" but I think you could deal with that by just writing "x := x + y". BTW I do think there are some cases where replacing a closure with a class is not an improvement. For example (and assuming the existence of :=): def check_items(items): had_error = False def err(mesg): print mesg had_error := True for item in items: if too_big(item): err("Too big") if too_small(item): err("Too small") if had_error: print "Some items were out of range" Using a class for this kind of trivial bookkeeping just adds boilerplate and obscures the main purpose of the code: def check_items(items): class NoteErrors (object): def __init__(self): self.had_error = False def __call__(self, mesg): print mesg self.had_error = True err = NoteErrors() for item in items: if too_big(item): err("Too big") if too_small(item): err("Too small") if err.had_error: print "Some items were out of range" Any chance of := (and removing "global") in python 3K? Mark Russell From just at letterror.com Tue Feb 21 22:01:55 2006 From: just at letterror.com (Just van Rossum) Date: Tue, 21 Feb 2006 22:01:55 +0100 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: Message-ID: Mark Russell wrote: > PEP 227 mentions using := as a rebinding operator, but rejects the > idea as it would encourage the use of closures. But to me it seems > more elegant than some special keyword, especially as it could also > replace the "global" keyword. It doesn't handle things like "x += y" > but I think you could deal with that by just writing "x := x + y".
Yet "declaring" a name local is also done trough an operator: a = 1 means a is local (unless it was declared global). It can definitely be argued either way. Btw, PJE's "crazy" idea (.name, to rebind an outer name) was proposed before, but Guido wanted to reserve .name for a (Pascal-like) 'with' statement. Hmm, http://mail.python.org/pipermail/python-dev/2004-March/043545.html confirms that, although it wasn't in response to a rebinding syntax. So maybe it wasn't proposed before after all... Just From ianb at colorstudy.com Tue Feb 21 22:13:22 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 21 Feb 2006 15:13:22 -0600 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> Message-ID: <43FB8272.7060906@colorstudy.com> Mark Russell wrote: > On 21 Feb 2006, at 19:25, Jeremy Hylton wrote: > >>If I recall the discussion correctly, Guido said he was open to a >>version of nested scopes that allowed rebinding. > > > PEP 227 mentions using := as a rebinding operator, but rejects the > idea as it would encourage the use of closures. But to me it seems > more elegant than some special keyword, especially is it could also > replace the "global" keyword. It doesn't handle things like "x += y" > but I think you could deal with that by just writing "x := x + y". By rebinding operator, does that mean it is actually an operator? I.e.: # Required assignment to declare?: chunk = None while chunk := f.read(1000): ... 
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From martin at v.loewis.de Tue Feb 21 22:25:49 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 22:25:49 +0100 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <013801c63682$eb39a5f0$7600a8c0@RaymondLaptop1> References: <406C5A8F-7AB6-4E17-A253-C54E4F6A9824@gmail.com> <013801c63682$eb39a5f0$7600a8c0@RaymondLaptop1> Message-ID: <43FB855D.4040004@v.loewis.de> Raymond Hettinger wrote: >>Yes, I now agree. This means that I'm withdrawing proposal A (new >>method) and championing only B (a subclass that implements >>__getitem__() calling on_missing() and on_missing() defined in that >>subclass as before, calling default_factory unless it's None). I don't >>think this crisis is big enough to need *two* solutions, and this >>example shows B's superiority over A. > > > FWIW, I'm happy with the proposal and think it is a nice addition to Py2.5. I agree. I would have preferred if dict itself was modified, but after ruling out changes to dict.__getitem__, d[k]+=1 is too important to not support it. Regards, Martin From g.brandl at gmx.net Tue Feb 21 22:27:26 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 21 Feb 2006 22:27:26 +0100 Subject: [Python-Dev] Two patches Message-ID: Hi, I have two patches lying around here, please comment: * I think I've submitted this one to the tracker, but can't remember: It's for PySequence_SetItem and makes something like this possible: tup = ([], ) tup[0] += [1] I can upload it once more to allow review. * One patch for staticmethod and classmethod, which currently silently accept keyword arguments and throw them away. The patch adds error messages. Georg From martin at v.loewis.de Tue Feb 21 22:34:27 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 22:34:27 +0100 Subject: [Python-Dev] buildbot vs. 
Windows In-Reply-To: <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> Message-ID: <43FB8763.7070802@v.loewis.de> Tim Peters wrote: > Speaking of which, a number of test failures over the past few weeks > were provoked here only under -r (run tests in random order) or under > a debug build, and didn't look like those were specific to Windows. > Adding -r to the buildbot test recipe is a decent idea. Getting > _some_ debug-build test runs would also be good (or do we do that > already?). So what is your recipe: Add -r to all buildbots? Only to those which have an 'a' in their name? Only to every third build? Duplicating the number of builders? Same question for --with-pydebug. Combining this with -r would multiply the number of builders by 4 already. I'm not keen on deciding this for myself. Somebody else please decide for me. Regards, Martin From martin at v.loewis.de Tue Feb 21 22:36:51 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 22:36:51 +0100 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> Message-ID: <43FB87F3.7000104@v.loewis.de> Neal Norwitz wrote: >>How many of those do you see when you ignore the warnings you get >>while building the Carbon extensions? Those extensions wrap loads of >>deprecated functions, each of which will give a warning. 
> > > Right: > http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/138/step-compile/0 > > Most but not all of the warnings are due to Carbon AFAICT. I'd like > to fix those that are important, but it's so far down on the priority > list. :-( Should we build with -Wno-deprecated (or whatever it is spelled) on OSX? In general, "deprecated" warnings are useless for Python. We *know* we are providing wrappers around many deprecated functions. We will (nearly automatically) discontinue wrapping the functions when they get removed.
Regards, Martin From martin at v.loewis.de Tue Feb 21 22:39:58 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 22:39:58 +0100 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <17403.76.288989.178176@montanaro.dyndns.org> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <17403.76.288989.178176@montanaro.dyndns.org> Message-ID: <43FB88AE.3040709@v.loewis.de> skip at pobox.com wrote: > Neal> IMO compiler warnings should generate emails from buildbot. > > It doesn't generate emails for any other condition. I think it should just > turn the compilation section yellow. It would be easy to run the builds with -Werror, making warnings let the compilation fail, which in turn is flagged red. Regards, Martin From martin at v.loewis.de Tue Feb 21 22:41:13 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 22:41:13 +0100 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk><20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de><20060221011148.GA20714@panix.com><1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> Message-ID: <43FB88F9.9050909@v.loewis.de> Terry Reedy wrote: > "Neal Norwitz" wrote in message > news:ee2a432c0602210009x2f4d1fffl3d49037b9b084d1e at mail.gmail.com... > >>There's nothing to prevent buildbot from making debug builds, though >>that is not currently done. > > > Now that there are separate report pages for 2.4 and 2.5, you could add > pages for debug builds, perhaps with a lower frequency (once a day?), > without cluttering up the main two pages. Not soon, unless somebody has a complete recipe how to change the master config. 
Regards, Martin From skip at pobox.com Tue Feb 21 22:41:42 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 21 Feb 2006 15:41:42 -0600 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <43FB8763.7070802@v.loewis.de> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> Message-ID: <17403.35094.354127.878532@montanaro.dyndns.org> Martin> So what is your recipe: Add -r to all buildbots? Only to those Martin> which have an 'a' in their name? Only to every third build? Martin> Duplicating the number of builders? Martin> Same question for --with-pydebug. Combining this with -r would Martin> multiply the number of builders by 4 already. Martin> I'm not keen on deciding this for myself. Somebody else please Martin> decide for me. Now that you've broken the buildbot page into two (trunk and 2.4) I assume breaking it down even further wouldn't be a major undertaking. If we can recruit a suitable number of boxes I see no particular reason you can't support a 2x, 4x, 8x or more increase in the number of buildbot slaves. Skip From martin at v.loewis.de Tue Feb 21 22:45:13 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 22:45:13 +0100 Subject: [Python-Dev] readline compilarion fails on OSX In-Reply-To: References: <2A98FC2D-2D3F-4DEB-B0DC-63E174EAEABE@redivi.com> Message-ID: <43FB89E9.8060000@v.loewis.de> Guido van Rossum wrote: > Thanks! That worked. > > But shouldn't we try to fix setup.py to detect this situation instead > of making loud clattering noises? One of my concerns with the distutils build process is that it takes failures lightly. Unlike make, it won't stop when an error occurs, but instead go on with the next module. 
Then, when you retry make, it will retry building the module again, which will again fail. Also happens with the curses module on Solaris. buildbot's detection of build failures is restricted to the exit status of the build process. The build either succeeds or fails. Regards, Martin

From steven.bethard at gmail.com Tue Feb 21 22:47:42 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 21 Feb 2006 14:47:42 -0700 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <20060221105309.601F.JCARLSON@uci.edu> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> Message-ID: On 2/21/06, Josiah Carlson wrote: > The question which still remains in my mind, which I previously asked, > is whether the use cases are compelling enough to warrant the feature > addition. I don't know whether I support the proposal or not, but in reading Mark Russell's email, I realized that I just recently ran into a use case:

----------------------------------------------------------------------
# group tokens into chunks by their chunk labels
token_groups = []
curr_suffix = ''
curr_tokens = []
for token in document.IterAnnotations('token', percent=80):
    label = token[attr_name]
    # determine the prefix and suffix of the label
    prefix, suffix = label[0], label[2:]
    # B labels start a new chunk
    if prefix == 'B':
        curr_suffix = suffix
        curr_tokens = [token]
        token_groups.append((curr_suffix, curr_tokens))
    # I labels continue the previous chunk
    elif prefix == 'I':
        if curr_suffix == suffix:
            curr_tokens.append(token)
        # error: change in suffix - this should be a B label
        else:
            # log the error
            message = '%r followed by %r'
            last_label = curr_tokens[-1][attr_name]
            self._logger.info(message % (last_label, label))
            # start a new chunk
            curr_suffix = suffix
            curr_tokens = [token]
            token_groups.append((curr_suffix, curr_tokens))
    # O labels end any previous chunks
    elif prefix == 'O':
        curr_suffix = suffix
        curr_tokens = [token]
----------------------------------------------------------------------

You can see that the code::

    curr_suffix = suffix
    curr_tokens = [token]
    token_groups.append((curr_suffix, curr_tokens))

is repeated in two places. I would have liked to factor this out into a function, but since the code requires rebinding curr_suffix and curr_tokens, I can't. I'm not sure I care that much -- it's only three lines of code and only duplicated once -- but using something like ``curr_suffix :=`` or Phillip J. Eby's suggestion of ``.curr_suffix =`` would allow this code to be factored out into a function. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy

From martin at v.loewis.de Tue Feb 21 22:53:40 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 22:53:40 +0100 Subject: [Python-Dev] Two patches In-Reply-To: References: Message-ID: <43FB8BE4.9090309@v.loewis.de> Georg Brandl wrote: > * I think I've submitted this one to the tracker, but can't remember: > It's for PySequence_SetItem and makes something like this possible: > > tup = ([], ) > tup[0] += [1] That definitely needs fixing:

py> tup = ([], )
py> tup[0] += [1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
py> tup
([1],)

Errors should never pass silently, but success shouldn't cause an error message, either. > * One patch for staticmethod and classmethod, which currently silently > accept keyword arguments and throw them away. The patch adds error > messages. Sounds good as well. Regards, Martin

From martin at v.loewis.de Tue Feb 21 23:00:08 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 21 Feb 2006 23:00:08 +0100 Subject: [Python-Dev] buildbot vs.
Windows In-Reply-To: <17403.35094.354127.878532@montanaro.dyndns.org> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> <17403.35094.354127.878532@montanaro.dyndns.org> Message-ID: <43FB8D68.8080209@v.loewis.de> skip at pobox.com wrote: > Now that you've broken the buildbot page into two (trunk and 2.4) I assume > breaking it down even further wouldn't be a major undertaking. If we can > recruit a suitable number of boxes I see no particular reason you can't > support a 2x, 4x, 8x or more increase in the number of buildbot slaves. Let me explain the procedure for breaking it down, then: - there are builder objects in buildbot, each displayed as a single lane. - each builder now gets a "category" attribute; currently, the categories are "trunk" and "2.4". - for each page, there is an instance of the Waterfall object, constructed with the list of categories to display. Each Waterfall gets its own port number (currently 9010, 9011, and 9012). - There are reverse proxy rules in Apache's httpd.conf, each page requiring 2 lines (giving currently 6 lines of Apache configuration). So for multiplying this by 8, I would have to create 48 lines of Apache configuration, and use 24 TCP ports. This can be done, but it would take some time to implement. And who is going to look at the 24 pages? Regards, Martin From jjl at pobox.com Tue Feb 21 22:38:03 2006 From: jjl at pobox.com (John J Lee) Date: Tue, 21 Feb 2006 21:38:03 +0000 (UTC) Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz> <43FAE8D0.9040000@voidspace.org.uk> Message-ID: On Tue, 21 Feb 2006, Guido van Rossum wrote: [...] 
> If you're only interested in classifying the three specific built-ins > you mention, I'd check for the presence of certain attributes: > hasattr(x, "lower") -> x is a string of some kind; hasattr(x, "sort") > -> x is a list; hasattr(x, "update") -> x is a dict. Also, hasattr(x, > "union") -> x is a set; hasattr(x, "readline") -> x is a file. dict and set instances both have an .update() method. I guess "keys" or "items" is a better choice for testing dict-ness, if using "LBYL" at all. (anybody new to "LBYL" can google for that and EAFP -- latter does not stand for European Assoc. of Fish Pathologists in this context, though ;-) > That's duck typing! >>> hasattr(python, "quack") True John

From g.brandl at gmx.net Tue Feb 21 23:13:13 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 21 Feb 2006 23:13:13 +0100 Subject: [Python-Dev] Two patches In-Reply-To: <43FB8BE4.9090309@v.loewis.de> References: <43FB8BE4.9090309@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Georg Brandl wrote: >> * I think I've submitted this one to the tracker, but can't remember: >> It's for PySequence_SetItem and makes something like this possible: >> >> tup = ([], ) >> tup[0] += [1] > > That definitely needs fixing:
>
> py> tup = ([], )
> py> tup[0] += [1]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: object doesn't support item assignment
> py> tup
> ([1],)
>
The patch is now at SF, item #1436226. >> * One patch for staticmethod and classmethod, which currently silently >> accept keyword arguments and throw them away. The patch adds error >> messages. > > Sounds good as well. Checked in to 2.5 branch.
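As a standalone check of the behavior in Martin's transcript — the augmented assignment mutates the list in place before the tuple item assignment raises, so the "failed" statement still has an effect (this snippet is illustrative, not part of the patch):

```python
t = ([],)
try:
    t[0] += [1]   # the list is extended in place first; the subsequent
                  # tuple item assignment then raises TypeError
except TypeError:
    pass

# the mutation stuck despite the exception
assert t == ([1],)
```

This is why the patched error message matters: without it, the partial success is easy to misread as a plain failure.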
Georg From tjreedy at udel.edu Tue Feb 21 23:17:47 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 21 Feb 2006 17:17:47 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com><7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> Message-ID: "Jeremy Hylton" wrote in message news:e8bf7a530602211125k28fc64bcx62430c375d060a8b at mail.gmail.com... > If I recall the discussion correctly, Guido said he was open to a > version of nested scopes that allowed rebinding. Yes. Among other places, he said in http://article.gmane.org/gmane.comp.python.devel/25153/match=nested+scopes ''' Your PEP wonders why I am against allowing assignment to intermediate levels. Here's my answer: all the syntaxes that have been proposed to spell this have problems. So let's not provide a way to spell it. I predict that it won't be a problem. If it becomes a problem, we can add a way to spell it later. '' tjr From tdelaney at avaya.com Tue Feb 21 23:23:39 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Wed, 22 Feb 2006 09:23:39 +1100 Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]] Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB977@au3010avexu1.global.avaya.com> Greg Ewing wrote: > I don't quite see the point here. Inside a bytes object, > they would be stored 1 byte per byte. Nobody is suggesting > that they would take up more than that just because > a_bytes_object[i] happens to return an int. Speaking of which, I suspect it'll be a lot more common to need integer objects in the full range [0, 255] than it is now. Perhaps we should extend the pre-allocated integer objects to cover the full byte range. 
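As context for this request: CPython already keeps a cache of preallocated small integer objects (the exact range is an implementation detail that has varied between versions), and the cache can be observed directly. The snippet below constructs values at runtime so compile-time constant folding cannot merge them:

```python
# build the ints at runtime so the compiler cannot share a constant
a = int("100")
b = int("100")
assert a is b    # both served from CPython's small-int cache

# for larger values only equality is guaranteed; identity is an
# implementation detail and should never be relied upon
x = int("99999999")
y = int("99999999")
assert x == y
```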
Tim Delaney From python at rcn.com Tue Feb 21 23:31:22 2006 From: python at rcn.com (Raymond Hettinger) Date: Tue, 21 Feb 2006 17:31:22 -0500 Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was: PEP332 revival in coordination with pep 349?]] References: <2773CAC687FD5F4689F526998C7E4E5F4DB977@au3010avexu1.global.avaya.com> Message-ID: <004c01c63736$8b33fe80$6a01a8c0@RaymondLaptop1> > Speaking of which, I suspect it'll be a lot more common to need integer > objects in the full range [0, 255] than it is now. > > Perhaps we should extend the pre-allocated integer objects to cover the > full byte range. +1 From tdelaney at avaya.com Tue Feb 21 23:34:37 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Wed, 22 Feb 2006 09:34:37 +1100 Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was: PEP332 revival in coordination with pep 349?]] Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB978@au3010avexu1.global.avaya.com> Raymond Hettinger wrote: >> Speaking of which, I suspect it'll be a lot more common to need >> integer objects in the full range [0, 255] than it is now. >> >> Perhaps we should extend the pre-allocated integer objects to cover >> the full byte range. > > +1 Want me to raise an SF request? Tim Delaney From tdelaney at avaya.com Tue Feb 21 23:44:10 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Wed, 22 Feb 2006 09:44:10 +1100 Subject: [Python-Dev] s/bytes/octet/ [Was:Re: bytes.from_hex() [Was:PEP332 revival in coordination with pep 349?]] Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB979@au3010avexu1.global.avaya.com> Delaney, Timothy (Tim) wrote: >>> Perhaps we should extend the pre-allocated integer objects to cover >>> the full byte range. >> >> +1 > > Want me to raise an SF request? Done. Item # 1436243. 
Tim Delaney From mrussell at verio.net Tue Feb 21 23:56:34 2006 From: mrussell at verio.net (Mark Russell) Date: Tue, 21 Feb 2006 22:56:34 +0000 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FB8272.7060906@colorstudy.com> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FB8272.7060906@colorstudy.com> Message-ID: <95A265A5-B378-42FB-BEB5-A2EACC4396CD@verio.net> On 21 Feb 2006, at 21:13, Ian Bicking wrote: > By rebinding operator, does that mean it is actually an operator? > I.e.: > > # Required assignment to declare?: > chunk = None > while chunk := f.read(1000): > ... No, I think that "x := y" should be a statement not an expression (i.e. just like "x = y" apart from the treatment of bindings). I'd be inclined to require that the target of := be already bound, if only to prevent people randomly using ":=" in places where it's not required. In a new language I would probably also make it an error to use = to do rebinding (i.e. insist on = for new bindings, and := for rebindings). But that's obviously not reasonable for python. Mark Russell From pje at telecommunity.com Wed Feb 22 00:19:09 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 21 Feb 2006 18:19:09 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <43FB992E.8050305@masklinn.net> References: <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FAE45F.3020000@canterbury.ac.nz> <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20060221180806.01e14400@mail.telecommunity.com> At 11:50 PM 2/21/2006 +0100, Morel Xavier wrote: >Phillip J. Eby wrote: >>The '.' would mean "this name, but in the nearest outer scope that >>defines it". 
Note that this could include the global scope, so the >>'global' keyword could go away in 2.5. And in Python 3.0, the '.' could >>become *required* for use in closures, so that it's not necessary for the >>reader to check a function's outer scope to see whether closure is taking >>place. EIBTI. > >While the idea is interesting, how would this solution behave if the >variable (the name) didn't exist in any outer scope? The compiler should consider it a name in the global scope, and for an assignment the name would be required to have an existing binding, or a NameError would result. (Indicating you are assigning to a global that hasn't been defined.) >Would it create and bind the name in the current scope? No, never. > If yes, why wouldn't this behavior become the default (without > any leading dot), efficiency issues of the lookup? No, it would be because explicit is better than implicit. The whole point of requiring '.' for closures in Python 3.0 would be to keep the person who's reading the code from having to inspect an entire function and its context to figure out which names are referring to variables in outer scopes. That is, it would go against the whole point of my idea, which is to make explicit what variables are part of your closure. From andrew-pythondev at puzzling.org Wed Feb 22 00:23:59 2006 From: andrew-pythondev at puzzling.org (Andrew Bennetts) Date: Wed, 22 Feb 2006 10:23:59 +1100 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <43FB8D68.8080209@v.loewis.de> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> <17403.35094.354127.878532@montanaro.dyndns.org> <43FB8D68.8080209@v.loewis.de> Message-ID: <20060221232359.GB27541@home.puzzling.org> Martin v. L?wis wrote: > skip at pobox.com wrote: [...] 
> > So for multiplying this by 8, I would have to create 48 lines of > Apache configuration, and use 24 TCP ports. This can be done, but > it would take some time to implement. And who is going to look > at the 24 pages? This last point is the most important, I think. Most of the time I look at Twisted's buildbot, it's to see at a glance which, if any, builds are broken. I think this is the #1 use case. Second is getting the details of what broke, and who broke it. So massively multiplying the pages seems counter-productive to me. I suspect there's nearly as much advantage to running randomised tests on just one platform as there is on many, so a good trade-off may be to just add one more builder (to each branch) that does -r on just one platform. I'm assuming most of the issues randomisation exposes aren't platform-dependent. -Andrew. From greg.ewing at canterbury.ac.nz Wed Feb 22 01:01:55 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 13:01:55 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> Message-ID: <43FBA9F3.7020003@canterbury.ac.nz> Jeremy Hylton wrote: > The names of naming statements are quite hard to get right, I fear. My vote goes for 'outer'. And if this gets accepted, remove 'global' in 3.0. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 22 01:34:06 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 13:34:06 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <20060221104839.601C.JCARLSON@uci.edu> References: <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <20060221104839.601C.JCARLSON@uci.edu> Message-ID: <43FBB17E.1060006@canterbury.ac.nz> Josiah Carlson wrote: > It doesn't seem strange to you to need to encode data twice to be able > to have a usable sequence of characters which can be embedded in an > effectively 7-bit email; I'm talking about a 3.0 world where all strings are unicode and the unicode <-> external coding is for the most part done automatically by the I/O objects. So you'd be building up your whole email as a string (aka unicode) which happens to only contain code points in the range 0..127, and then writing it to your socket or whatever. You wouldn't need to do the second encoding step explicitly very often. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 22 01:35:26 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 13:35:26 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> Message-ID: <43FBB1CE.6080506@canterbury.ac.nz> Georg Brandl wrote: > But why is that better than > > class namespace(object): pass > > def my_func(): > foo = namespace() > (...) 
Because then it would be extremely difficult for CPython to optimise accesses to foo into local variable lookups. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 22 01:35:29 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 13:35:29 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> Message-ID: <43FBB1D1.5060808@canterbury.ac.nz> Alex Martelli wrote: > If we call the type autodict, then having the factory attribute named > autofactory seems to fit. Or just 'factory', since it's the only kind of factory the object is going to have. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 22 01:35:32 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 13:35:32 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43fb1c0a.1382282887@news.gmane.org> Message-ID: <43FBB1D4.3080408@canterbury.ac.nz> Jeremy Hylton wrote: > On 2/21/06, Jeremy Hylton wrote: > >>On 2/21/06, Bengt Richter wrote: > >>>But to the topic, it just occurred to me that any outer scopes could be given names >>>(including global namespace, but that would have the name global by default, so >>>global.x would essentially mean what globals()['x'] means now, except it would >>>be a name error if x didn't pre-exist when accessed via namespace_name.name_in_space notation. > > Isn't this suggestion that same as Greg Ewing's? It's not quite the same, because in my scheme the namespace statement creates a new namespace embedded in the scope where it appears, whereas Bengt's one seems to just give a name to the scope itself. I'm not really in favour of either of these -- I'd be just as happy with a simple 'outer' statement. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Feb 22 01:35:33 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 13:35:33 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FAE8D0.9040000@voidspace.org.uk> References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz> <43FAE8D0.9040000@voidspace.org.uk> Message-ID: <43FBB1D5.4040309@canterbury.ac.nz> Fuzzyman wrote: > I've had problems in code that needs to treat strings, lists and > dictionaries differently (assigning values to a container where all > three need different handling) and telling the difference but allowing > duck typing is *problematic*. You need to rethink your design so that you don't have to make that kind of distinction. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From fumanchu at amor.org Wed Feb 22 01:48:46 2006 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 21 Feb 2006 16:48:46 -0800 Subject: [Python-Dev] Unifying trace and profile Message-ID: <6949EC6CD39F97498A57E0FA55295B2101C312B6@ex9.hostedexchange.local> There are a number of features I'd like to see happen with Python's tracing and profiling subsystems (but I don't have the C experience to do it myself). I started to write an SF feature-request and then realized it was too much for a single ticket. Maybe a PEP? All of these would be make my latest side project[1] a lot easier. Anyway, here they are (most important and easiest-to-implement first): 1. Allow trace hooks to receive c_call, c_return, and c_exception events (like profile does). 2. Allow profile hooks to receive line events (like trace does). 3. 
Expose new sys.gettrace() and getprofile() methods, so trace and profile functions that want to play nice can call sys.settrace/setprofile(None) only if they are the current hook. 4. Make "the same move" that sys.exitfunc -> atexit made (from a single function to multiple functions via registration), so multiple tracers/profilers can play nice together. 5. Allow the core to filter on the "event" arg before hook(frame, event, arg) is called. 6. Unify tracing and profiling, which would remove a lot of redundant code in ceval and sysmodule and free up some space in the PyThreadState struct to boot. 7. As if the above isn't enough of a dream, it would be nice to have a bytecode tracer, which didn't bother with the f_lineno logic in maybe_call_line_trace, but just called the hook on every instruction. Robert Brewer System Architect Amor Ministries fumanchu at amor.org [1] PyConquer, a trace hook to help understand and debug concurrent (threaded) code. http://projects.amor.org/misc/wiki/PyConquer From raymond.hettinger at verizon.net Wed Feb 22 01:54:57 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 21 Feb 2006 19:54:57 -0500 Subject: [Python-Dev] defaultdict proposal round three References: <43FA4E88.4090507@canterbury.ac.nz><43FAE298.5040404@canterbury.ac.nz><71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> <43FBB1D1.5060808@canterbury.ac.nz> Message-ID: <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> > Alex Martelli wrote: > >> If we call the type autodict, then having the factory attribute named >> autofactory seems to fit. > > Or just 'factory', since it's the only kind of factory > the object is going to have. Gack, no. You guys are drifting towards complete ambiguity. You might as well call it "thingie_that_doth_return_an_object". The word "factory" by itself says nothing about lookups and default values. Like "autodict" could mean anything. Keep in mind that we may well end-up having this side-by-side with collections.ordered_dict. 
The word "auto" tells you nothing about how this is different from a regular dict or ordered dictionary. It's meaningless. Please, stick with defaultdictionary and default_factory. While not perfectly descriptive, they are suggest just enough to jog the memory and make the code readable. Try to resist generalizing the name into nothingness. Raymond From tim.peters at gmail.com Wed Feb 22 02:04:20 2006 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 21 Feb 2006 20:04:20 -0500 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <43FB8763.7070802@v.loewis.de> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> Message-ID: <1f7befae0602211704p7b9cbc19jde245e9e1d6bea02@mail.gmail.com> [Martin v. L?wis] > So what is your recipe: I don't have one. I personally always use -uall and -r, and then run the tests 8 times, w/ and w/o -O, under debug and release builds, and w/ and w/o deleting .py[co] files first. Because that last one almost never finds problems anymore, perhaps it would be good to stop bothering with it routinely (it really doesn't have potential to find a problem unless someone has been mucking with the marshaling of code objects, right?). > Add -r to all buildbots? Sure. -r adds variety to testing at no cost (the same number of tests run, in the same pamount of time, with or without -r). > Only to those which have an 'a' in their name? Sorry, no idea what that means. > Only to every third build? Duplicating the number of builders? For -r, no. I'd always use -r (and always do anyway). > Same question for --with-pydebug. Combining this with -r would multiply > the number of builders by 4 already. I would much rather see a debug-build run than the current "with and without deleting .py[co] files first" variant. 
If the latter were dropped and the former were added, and -r were used all the time, the number of recipes wouldn't change. Testing time would increase, by the time to _do_ a debug build, and by the extra time a debug build test run requires. We should test with and without -O too, although that's another that rarely finds a problem. > I'm not keen on deciding this for myself. Somebody else please decide > for me. I don't know how hard it is to teach the system how to do something "not so often", and I expect that's an important unknown since I imagine that vastly increasing test time would discourage people from volunteering buildbot slaves. Since the most fruitful variations (IME) for finding code errors are using -r and running a debug build too, I'd change the current run-all-the-time recipes to: - Stop doing the second "without deleting .py[co]" run. - Do one run with a release build. - Do one run with a debug build. - Use -uall -r for both. If we know how to get something done "occasionally", then about once a week it would be prudent to also: - Try the "with and without deleting .py[co] files first" business. - Try with and without -O. Those last two choices cover 8 distinct modes, when paired with each other and with the "release versus debug build" choice. From kbk at shore.net Wed Feb 22 02:09:20 2006 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Tue, 21 Feb 2006 20:09:20 -0500 (EST) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200602220109.k1M19K9n008943@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 385 open (-14) / 3067 closed (+25) / 3452 total (+11) Bugs : 864 open (-59) / 5621 closed (+68) / 6485 total ( +9) RFE : 211 open ( +2) / 200 closed ( +2) / 411 total ( +4) New / Reopened Patches ______________________ GNU uses double-dashes not single (2006-02-16) http://python.org/sf/1433148 opened by splitscreen restrict codec lookup to encodings package (2006-02-16) CLOSED http://python.org/sf/1433198 reopened by lemburg restrict codec lookup to encodings package (2006-02-16) CLOSED http://python.org/sf/1433198 opened by Guido van Rossum add on_missing() and default_factory to dict (2006-02-17) http://python.org/sf/1433928 opened by Guido van Rossum CHM file contains proprietary link format (2006-02-18) http://python.org/sf/1434298 opened by Alexander Schremmer Patch to support lots of file descriptors (2006-02-19) http://python.org/sf/1434657 opened by Sven Berkvens-Matthijsse Add copy() method to zlib's compress and decompress objects (2006-02-20) http://python.org/sf/1435422 opened by Chris AtLee PEP 343 with statement (2006-02-21) http://python.org/sf/1435715 opened by mbland Incremental codecs (2006-02-21) http://python.org/sf/1436130 opened by Walter Dörwald fix inplace assignment for immutable sequences (2006-02-21) http://python.org/sf/1436226 opened by Georg Brandl Patches Closed ______________ GNU uses double-dashes not single (2006-02-16) http://python.org/sf/1433166 deleted by gvanrossum restrict codec lookup to encodings package (2006-02-16) http://python.org/sf/1433198 closed by lemburg restrict codec lookup to encodings package (2006-02-16) http://python.org/sf/1433198 closed by lemburg use computed goto's in ceval loop (2006-01-18) http://python.org/sf/1408710 closed by loewis have SimpleHTTPServer return last-modified headers 
(2006-01-28) http://python.org/sf/1417555 closed by birkenfeld Feed style codec API (2005-01-12) http://python.org/sf/1101097 closed by lemburg chunk.py can't handle >2GB chunks (2005-12-05) http://python.org/sf/1373643 closed by birkenfeld Fix of bug 1366000 (2005-11-30) http://python.org/sf/1370147 closed by birkenfeld Optional second argument for startfile (2005-12-29) http://python.org/sf/1393157 closed by birkenfeld Clairify docs on reference stealing (2006-01-26) http://python.org/sf/1415507 closed by birkenfeld urllib proxy_bypass broken (2006-02-07) http://python.org/sf/1426648 closed by birkenfeld Speed up EnumKey call (2004-06-22) http://python.org/sf/977553 closed by birkenfeld [PATCH] Bug #1351707 (2005-11-10) http://python.org/sf/1352711 closed by birkenfeld fileinput patch for bug #1336582 (2005-10-25) http://python.org/sf/1337756 closed by birkenfeld Fix for int(string, base) wrong answers (2005-10-22) http://python.org/sf/1334979 closed by birkenfeld [PATCH] 100x optimization for ngettext (2005-11-06) http://python.org/sf/1349274 closed by birkenfeld commands.getstatusoutput() (2005-11-02) http://python.org/sf/1346211 closed by birkenfeld two fileinput enhancements (fileno, openhook) (2005-06-05) http://python.org/sf/1215184 closed by birkenfeld mode argument for fileinput class (2005-05-31) http://python.org/sf/1212287 closed by birkenfeld do not add directory of sys.argv[0] into sys.path (2004-05-02) http://python.org/sf/946373 closed by gbrandl prefix and exec_prefix as root dir bug (2004-04-08) http://python.org/sf/931938 closed by gbrandl New / Reopened Bugs ___________________ optparse docs double-dash confusion (2006-02-16) http://python.org/sf/1432838 opened by John Veness Logging hangs thread after detaching a StreamHandler's termi (2006-02-13) CLOSED http://python.org/sf/1431253 reopened by yangzhang os.path.expandvars sometimes doesn't expand $HOSTNAME (2006-02-17) CLOSED http://python.org/sf/1433667 opened by Doug Fort normalize function 
in minidom unlinks empty child nodes (2006-02-17) http://python.org/sf/1433694 opened by RomanKliotzkin string parameter to ioctl not null terminated, includes fix (2006-02-17) http://python.org/sf/1433877 opened by Quentin Barnes pointer aliasing causes core dump, with workaround (2006-02-17) http://python.org/sf/1433886 opened by Quentin Barnes Python crash on __init__/__getattr__/__setattr__ interaction (2004-04-26) CLOSED http://python.org/sf/942706 reopened by hhas Crash when decoding UTF8 (2006-02-20) CLOSED http://python.org/sf/1435487 opened by Viktor Ferenczi CGIHTTPServer doesn't handle path names with embeded space (2006-02-21) http://python.org/sf/1436206 opened by Richard Coupland Bugs Closed ___________ Logging hangs thread after detaching a StreamHandler's termi (2006-02-14) http://python.org/sf/1431253 closed by vsajip logging module's setLoggerClass not really working (2005-09-08) http://python.org/sf/1284928 closed by vsajip pydoc still doesn't handle lambda well (2006-02-15) http://python.org/sf/1432260 closed by birkenfeld smtplib: empty mail addresses (2006-02-12) http://python.org/sf/1430298 closed by birkenfeld IMPORT PROBLEM: Local submodule shadows global module (2006-02-01) http://python.org/sf/1421513 closed by birkenfeld class dictionary shortcircuits __getattr__ (2006-01-31) http://python.org/sf/1419989 closed by birkenfeld SimpleHTTPServer doesn't return last-modified headers (2006-01-28) http://python.org/sf/1417554 closed by birkenfeld os.path.expandvars sometimes doesn't expand $HOSTNAME (2006-02-17) http://python.org/sf/1433667 closed by birkenfeld http response dictionary incomplete (2006-02-01) http://python.org/sf/1421696 closed by birkenfeld Bug bz2.BZ2File(...).seek(0,2) (2005-11-25) http://python.org/sf/1366000 closed by birkenfeld Incorrect Decimal-float behavior for + (2005-11-13) http://python.org/sf/1355842 closed by arigo http auth documentation/implementation conflict (2005-08-13) http://python.org/sf/1258485 closed by 
birkenfeld bsddb.__init__ causes error (2006-01-04) http://python.org/sf/1396678 closed by birkenfeld StreamReader.readline doesn't advance on decode errors (2005-12-13) http://python.org/sf/1379393 closed by birkenfeld zipimport produces incomplete IOError instances (2005-11-08) http://python.org/sf/1351707 closed by birkenfeld zipfile: inserting some filenames produces corrupt .zips (2006-01-24) http://python.org/sf/1413790 closed by birkenfeld socketmodule.c compile error using SunPro cc (2003-10-06) http://python.org/sf/818490 closed by birkenfeld Python-2.3.3c1, Solaris 2.7: socketmodule does not compile (2003-12-05) http://python.org/sf/854823 closed by birkenfeld README build instructions for fpectl (2004-01-07) http://python.org/sf/872175 closed by birkenfeld test_fcntl fails on netbsd2 (2005-01-12) http://python.org/sf/1101233 closed by birkenfeld Python 2.4 and 2.3.5 won't build on OpenBSD 3.7 (2005-11-01) http://python.org/sf/1345313 closed by loewis getwindowsversion() constants in sys module (2005-10-10) http://python.org/sf/1323369 closed by gbrandl No documentation for PyFunction_* (C-Api) (2004-08-22) http://python.org/sf/1013800 closed by gbrandl pickle files should be opened in binary mode (2005-01-14) http://python.org/sf/1102649 closed by gbrandl os.stat returning a time (2003-09-22) http://python.org/sf/810887 closed by gbrandl test_sax fails on python 2.2.3 & patch for regrtest.py (2003-06-21) http://python.org/sf/758504 closed by gbrandl a exception ocurrs when compiling a Python file (2003-12-09) http://python.org/sf/856841 closed by gbrandl Ms VC 2003 not supported (2003-06-18) http://python.org/sf/756842 closed by gbrandl site.py breaks if prefix is empty (2003-04-01) http://python.org/sf/713601 closed by gbrandl AESend on Jaguar (2002-08-14) http://python.org/sf/595105 closed by jackjansen source files using encoding ./. universal newlines (2003-07-31) http://python.org/sf/780730 closed by lemburg Solaris 8 declares gethostname(). 
(2005-08-12) http://python.org/sf/1257687 closed by gbrandl httplib.HTTPConnection._send_request header parsing bug (2003-10-27) http://python.org/sf/831271 closed by gbrandl Problem with ftplib on HP-UX11i (2003-09-25) http://python.org/sf/812376 closed by gbrandl locale.getdefaultlocale doesnt handle all locales gracefully (2003-09-27) http://python.org/sf/813449 closed by gbrandl NotImplemented return value misinterpreted in new classes (2003-11-22) http://python.org/sf/847024 closed by gbrandl fileinput does not use universal input (2003-12-15) http://python.org/sf/860515 closed by gbrandl Cursors not correctly closed after exception. (2005-05-28) http://python.org/sf/1210377 closed by gbrandl urllib2 doesn't handle username/password in url (2004-04-29) http://python.org/sf/944396 closed by gbrandl urllib2.HTTPBasicAuthHandler problem with [HOST]:[PORT] (2004-11-29) http://python.org/sf/1075427 closed by gbrandl urllib.urlopen() fails to raise exception (2004-05-04) http://python.org/sf/947571 closed by gbrandl __mul__ taken as __rmul__ for mul-by-int only (2003-10-25) http://python.org/sf/830261 closed by gbrandl Python crash on __init__/__getattr__/__setattr__ interaction (2004-04-26) http://python.org/sf/942706 closed by gbrandl Error "exec"ing python code (2005-03-21) http://python.org/sf/1167300 closed by gbrandl type() and isinstance() do not call __getattribute__ (2005-08-19) http://python.org/sf/1263635 closed by gbrandl Crash when decoding UTF8 (2006-02-20) http://python.org/sf/1435487 closed by nnorwitz New / Reopened RFE __________________ Implement preemptive threads in Python (2006-02-16) http://python.org/sf/1432694 reopened by darkprokoba Implement preemptive threads in Python (2006-02-16) http://python.org/sf/1432694 opened by Andrey Petrov Use new expat version 2.0 (2006-02-17) http://python.org/sf/1433435 opened by Wolfgang Langner python executable optionally should search script on PATH (2005-12-13) http://python.org/sf/1379573 reopened by 
cconrad Extend pre-allocated integers to cover [0, 255] (2006-02-21) http://python.org/sf/1436243 opened by Tim Delaney RFE Closed __________ Implement preemptive threads in Python (2006-02-16) http://python.org/sf/1432694 closed by mwh fileinput/gzip modules should play well (2002-06-01) http://python.org/sf/563141 closed by birkenfeld From guido at python.org Wed Feb 22 02:22:20 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 21 Feb 2006 20:22:20 -0500 Subject: [Python-Dev] Fixing copy.py to allow copying functions Message-ID: While playing around with the defaultdict patch, adding __reduce__ to make defaultdict objects properly copyable through the copy module, I noticed that copy.py doesn't support copying function objects. This seems an oversight, since the (closely related) pickle module *does* support copying functions. The semantics of pickling a function is that it just stores the module and function name in the pickle; that is, if you unpickle it in the same process it'll just return a reference to the same function object. This would translate into "atomic" semantics for copying functions: the "copy" is just the original, for shallow as well as deep copies. It's a simple patch:

--- Lib/copy.py (revision 42537)
+++ Lib/copy.py (working copy)
@@ -101,7 +101,8 @@
     return x
 for t in (type(None), int, long, float, bool, str, tuple,
           frozenset, type, xrange, types.ClassType,
-          types.BuiltinFunctionType):
+          types.BuiltinFunctionType,
+          types.FunctionType):
     d[t] = _copy_immutable
 for name in ("ComplexType", "UnicodeType", "CodeType"):
     t = getattr(types, name, None)
@@ -217,6 +218,7 @@
 d[xrange] = _deepcopy_atomic
 d[types.ClassType] = _deepcopy_atomic
 d[types.BuiltinFunctionType] = _deepcopy_atomic
+d[types.FunctionType] = _deepcopy_atomic

 def _deepcopy_list(x, memo):
     y = []

Any objections? Given that these are picklable, I can't imagine there are any but I thought I'd ask anyway. 
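[For illustration, a small sketch of the "atomic" copy semantics described above, assuming the patch is applied (this is also how later Python versions ended up behaving):]

```python
import copy

def greet(name):
    return "Hello, " + name

# "Copying" a function hands back the original object, for both shallow
# and deep copies -- the same semantics the pickle module already implies,
# since a pickled function is just a (module, name) reference.
assert copy.copy(greet) is greet
assert copy.deepcopy(greet) is greet

# Deep-copying a container holding a function keeps the same function
# object while still copying the container itself.
handlers = {"hello": greet}
clone = copy.deepcopy(handlers)
assert clone is not handlers
assert clone["hello"] is greet
```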
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Wed Feb 22 03:35:57 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 21 Feb 2006 20:35:57 -0600 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <43FB886D.5080403@v.loewis.de> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <3A5E4D8E-2171-4CF7-9095-D70322570A82@mac.com> <17403.276.782271.715060@montanaro.dyndns.org> <17403.7868.343827.845670@montanaro.dyndns.org> <43FB886D.5080403@v.loewis.de> Message-ID: <17403.52749.823708.596328@montanaro.dyndns.org> >> Let me rephrase that. I assume the people digging through Py_ssize_t >> issues have been looking at compilation warnings for platforms other >> than Mac OSX. Martin> In the buildbot log, I see only a single one of these, and only Martin> in an OSX-specific module. So no - "we" don't look into fixing Martin> them, as they don't occur on Linux at all (as _Qdmodule isn't Martin> built on Linux). Sure looks like core to me: Objects/bufferobject.c: In function `buffer_repr': Objects/bufferobject.c:250: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/bufferobject.c:258: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/bufferobject.c:258: warning: signed size_t format, Py_ssize_t arg (arg 5) ... Objects/funcobject.c: In function `func_set_code': Objects/funcobject.c:254: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/funcobject.c:254: warning: signed size_t format, Py_ssize_t arg (arg 5) Objects/funcobject.c: In function `func_new': Objects/funcobject.c:406: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/funcobject.c:406: warning: signed size_t format, Py_ssize_t arg (arg 5) ... 
Objects/listobject.c: In function `list_ass_subscript': Objects/listobject.c:2604: warning: signed size_t format, Py_ssize_t arg (arg 3) Objects/listobject.c:2604: warning: signed size_t format, Py_ssize_t arg (arg 4) ... Objects/dictobject.c: In function `PyDict_MergeFromSeq2': Objects/dictobject.c:1152: warning: signed size_t format, Py_ssize_t arg (arg 4) ... Objects/methodobject.c: In function `PyCFunction_Call': Objects/methodobject.c:85: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/methodobject.c:96: warning: signed size_t format, Py_ssize_t arg (arg 4) ... Objects/structseq.c: In function `structseq_new': Objects/structseq.c:129: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/structseq.c:129: warning: signed size_t format, Py_ssize_t arg (arg 5) Objects/structseq.c:137: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/structseq.c:137: warning: signed size_t format, Py_ssize_t arg (arg 5) Objects/structseq.c:146: warning: signed size_t format, Py_ssize_t arg (arg 4) Objects/structseq.c:146: warning: signed size_t format, Py_ssize_t arg (arg 5) ... Objects/typeobject.c: In function `check_num_args': Objects/typeobject.c:3378: warning: signed size_t format, Py_ssize_t arg (arg 4) ... Objects/unicodeobject.c: In function `unicode_decode_call_errorhandler': Objects/unicodeobject.c:794: warning: signed size_t format, Py_ssize_t arg (arg 3) Objects/unicodeobject.c: In function `unicode_encode_call_errorhandler': Objects/unicodeobject.c:2475: warning: signed size_t format, int arg (arg 3) Objects/unicodeobject.c: In function `unicode_translate_call_errorhandler': Objects/unicodeobject.c:3374: warning: signed size_t format, int arg (arg 3) ... This from the build on g5 osx.3 trunk from 22:54 today (21 Feb). 
Skip From jcarlson at uci.edu Wed Feb 22 03:42:38 2006 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 21 Feb 2006 18:42:38 -0800 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <20060221105309.601F.JCARLSON@uci.edu> Message-ID: <20060221183451.6035.JCARLSON@uci.edu> "Steven Bethard" wrote: > > On 2/21/06, Josiah Carlson wrote: > > The question which still remains in my mind, which I previously asked, > > is whether the use cases are compelling enough to warrant the feature > > addition. > > I don't know whether I support the proposal or not, but in reading > Mark Russell's email, I realized that I just recently ran into a use > case: [snip example where 3 lines are duplicated twice, and a 2-line subset is duplicated in a third location] > using something > like ``curr_suffix :=`` or Phillip J. Eby's suggestion of > ``.curr_suffix =`` would allow this code to be factored out into a > function. In this particular example, there is no net reduction in line use. The execution speed of your algorithm would be reduced due to function calling overhead. There may be a minor clarification improvement, but arguably no better than Richie Hindle's functional goto implementation for Python 2.3 and later. - Josiah From skip at pobox.com Wed Feb 22 03:41:16 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 21 Feb 2006 20:41:16 -0600 Subject: [Python-Dev] buildbot vs. 
Windows In-Reply-To: <43FB8D68.8080209@v.loewis.de> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> <17403.35094.354127.878532@montanaro.dyndns.org> <43FB8D68.8080209@v.loewis.de> Message-ID: <17403.53068.12339.400715@montanaro.dyndns.org> Martin> So for multiplying this by 8, I would have to create 48 lines of Martin> Apache configuration, and use 24 TCP ports. This can be done, Martin> but it would take some time to implement. I'm not too worried about that because it's a one-time cost. I'd be willing to help out. Just shoot me the httpd config file and other necessary bits and I'll return you the modified stuff. Martin> And who is going to look at the 24 pages? This is, of course, the bigger problem since it's ongoing. If we solicit buildbot slaves we should solicit a pair of eyeballs for each slave as well. That doesn't need to be the owner of the box, but the owner is the likely first candidate to trick^H^H^H^H^Hask. Skip From mark.m.mcmahon at gmail.com Mon Feb 20 17:06:45 2006 From: mark.m.mcmahon at gmail.com (Mark Mc Mahon) Date: Mon, 20 Feb 2006 11:06:45 -0500 Subject: [Python-Dev] Path PEP: some comments (equality) Message-ID: <71b6302c0602200806i4ae3dc60g2010d18e11e8be37@mail.gmail.com> Hi, It seems that the Path module as currently defined leaves equality testing up to the underlying string comparison. My guess is that this is fine for Unix (maybe not even) but it is a bit lacking for Windows. Should the path class implement an __eq__ method that might do some of the following things: - Get the absolute path of both self and the other path - normcase both - now see if they are equal This would make working with paths much easier for keys of a dictionary on windows. 
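[Sketched out, the comparison Mark describes might look like the following; the class name and attributes are hypothetical, not part of any proposed Path implementation:]

```python
import os.path

class ComparablePath:
    """Hypothetical sketch: compare absolute, case-normalised forms of a
    path rather than the raw strings."""

    def __init__(self, raw):
        self._raw = raw
        # normcase() lowercases and normalises separators on Windows;
        # it is a no-op on Unix, so comparison stays strict there.
        self._key = os.path.normcase(os.path.abspath(raw))

    def __eq__(self, other):
        if isinstance(other, ComparablePath):
            return self._key == other._key
        return NotImplemented

    def __hash__(self):
        # Required alongside __eq__ so instances can serve as dict keys.
        return hash(self._key)

    def __repr__(self):
        return "ComparablePath(%r)" % (self._raw,)

# Equivalent spellings of the same location collapse to one dict key.
p1 = ComparablePath("spam/eggs")
p2 = ComparablePath("./spam/eggs")
```

On Windows, `ComparablePath("C:\\Spam")` and `ComparablePath("c:/spam")` would also compare equal, which is exactly the dictionary-key behaviour Mark is after.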
(I frequently use a case-insensitive string class for paths if I need them to be keys of a dict.) My first email to python-dev :-) Mark From sergey at fidoman.ru Tue Feb 21 10:39:27 2006 From: sergey at fidoman.ru (Sergey Dorofeev) Date: Tue, 21 Feb 2006 12:39:27 +0300 Subject: [Python-Dev] calendar.timegm Message-ID: <000d01c636ca$b53411a0$201010ac@prodo.ru> Hello. Historical question ;) Can anyone explain why the function timegm is placed in module calendar, not in module time, where it would be near the similar function mktime? From skip at pobox.com Wed Feb 22 05:47:53 2006 From: skip at pobox.com (skip at pobox.com) Date: Tue, 21 Feb 2006 22:47:53 -0600 Subject: [Python-Dev] calendar.timegm In-Reply-To: <000d01c636ca$b53411a0$201010ac@prodo.ru> References: <000d01c636ca$b53411a0$201010ac@prodo.ru> Message-ID: <17403.60665.188360.106157@montanaro.dyndns.org> Sergey> Historical question ;) Sergey> Can anyone explain why the function timegm is placed in module Sergey> calendar, not in module time, where it would be near Sergey> the similar function mktime? Historical accident. ;-) Skip From nnorwitz at gmail.com Wed Feb 22 06:30:31 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 21 Feb 2006 21:30:31 -0800 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <1f7befae0602211704p7b9cbc19jde245e9e1d6bea02@mail.gmail.com> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> <1f7befae0602211704p7b9cbc19jde245e9e1d6bea02@mail.gmail.com> Message-ID: On 2/21/06, Tim Peters wrote: > > Since the most fruitful variations (IME) for finding code errors are > using -r and running a debug build too, I'd change the current > run-all-the-time recipes to: > > - Stop doing the second "without deleting .py[co]" run. > - Do one run with a release build. 
> - Do one run with a debug build. > - Use -uall -r for both. I agree with this, but don't know a clean way to do 2 builds. I modified buildbot to: - Stop doing the second "without deleting .py[co]" run. - Do one run with a debug build. - Use -uall -r for both. Buildbot does *not* also do a release build. That's the only difference from your request above. I agree that it would be desirable, but I think the debug build is more important than the release build right now. We don't have to make this perfect right now. We can talk about this at PyCon and resolve the remaining issues. One thing that would be nice is to have the master.cfg checked in somewhere so we can track changes. n From nnorwitz at gmail.com Wed Feb 22 06:36:36 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 21 Feb 2006 21:36:36 -0800 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: <43FB88AE.3040709@v.loewis.de> References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <17403.76.288989.178176@montanaro.dyndns.org> <43FB88AE.3040709@v.loewis.de> Message-ID: On 2/21/06, "Martin v. Löwis" wrote: > skip at pobox.com wrote: > > Neal> IMO compiler warnings should generate emails from buildbot. > > > > It doesn't generate emails for any other condition. I think it should just > > turn the compilation section yellow. > > It would be easy to run the builds with -Werror, making warnings let the > compilation fail, which in turn is flagged red. And previously: > Should we build with -Wno-deprecated (or whatever it is spelled) on OSX? Hmmm, I'm really tempted to add both of these flags (-Werror -Wno-deprecated). Let's discuss this at PyCon. We can make lots of changes then. We might want to wait until after the sprints so people don't have to deal with this churn. 
n From greg.ewing at canterbury.ac.nz Wed Feb 22 07:04:05 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 19:04:05 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> Message-ID: <43FBFED5.5080809@canterbury.ac.nz> Terry Reedy wrote: > There were perhaps 10 > different proposals, including, I believe, 'outer'. Guido rejected them > all as having costs greater than the benefits. As far as I remember, Guido wasn't particularly opposed to the idea, but the discussion fizzled out after having failed to reach a consensus on an obviously right way to go about it. Greg From greg.ewing at canterbury.ac.nz Wed Feb 22 07:45:29 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 19:45:29 +1300 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> References: <43FAE45F.3020000@canterbury.ac.nz> <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> Message-ID: <43FC0889.7040604@canterbury.ac.nz> Phillip J. Eby wrote: > def incrementer(val): > def inc(): > .val += 1 > return .val > return inc -1, too obscure. 
-- Greg From greg.ewing at canterbury.ac.nz Wed Feb 22 08:09:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 20:09:20 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> Message-ID: <43FC0E20.5080809@canterbury.ac.nz> Mark Russell wrote: > PEP 227 mentions using := as a rebinding operator, but rejects the > idea as it would encourage the use of closures. Well, anything that facilitates rebinding in outer scopes is going to encourage the use of closures, so I can't see that as being a reason to reject a particular means of rebinding. You either think such rebinding is a good idea or not -- and that seems to be a matter of highly individual taste. On this particular idea, I tend to think it's too obscure as well. Python generally avoids attaching randomly-chosen semantics to punctuation, and I'd like to see it stay that way. -- Greg From greg.ewing at canterbury.ac.nz Wed Feb 22 08:11:40 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 20:11:40 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: Message-ID: <43FC0EAC.9040301@canterbury.ac.nz> Just van Rossum wrote: > Btw, PJE's "crazy" idea (.name, to rebind an outer name) was proposed > before, but Guido wanted to reserve .name for a (Pascal-like) 'with' > statement. Hmm, I guess that doesn't apply any more, since we've already used "with" for something else. Regardless, names with leading dots just look ugly and perlish to me, so I wouldn't be in favour anyway. -- Greg From almann.goo at gmail.com Wed Feb 22 08:36:33 2006 From: almann.goo at gmail.com (Almann T. 
Goo) Date: Wed, 22 Feb 2006 02:36:33 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FBFED5.5080809@canterbury.ac.nz> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> Message-ID: <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> > As far as I remember, Guido wasn't particularly opposed > to the idea, but the discussion fizzled out after having > failed to reach a consensus on an obviously right way > to go about it. My apologies for bringing this debated topic again to the front-lines--that said, I think there has been good, constructive things said again and sometimes it doesn't hurt to kick up an old topic. After pouring through some of the list archive threads and reading through this thread, it seems clear to me that the community doesn't seem all that keen on fixing issue--which was my goal to ferret out. For me this is one of those things where the Pythonic thing to do is not so clear--and that mysterious, enigmatic definition of what it means to be Pythonic can be quite individual so I definitely don't want to waste my time arguing what that means. The most compelling argument for not doing anything about it is that the use cases are probably not that many--that in itself makes me less apt to push much harder--especially since my pragmatic side agrees with a lot of what has been said to this regard. IMO, Having properly nested scopes in Python in a sense made having closures a natural idiom to the language and part of its "user interface." By not allowing the name re-binding it almost seems like that "user interface" has a rough edge that is almost too easy to get cut on. This in-elegance seems very un-Pythonic to me. Anyhow, good discussion. Cheers, Almann -- Almann T. 
Goo almann.goo at gmail.com From greg.ewing at canterbury.ac.nz Wed Feb 22 08:47:40 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 20:47:40 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> <43FBB1D1.5060808@canterbury.ac.nz> <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> Message-ID: <43FC171C.8070304@canterbury.ac.nz> Raymond Hettinger wrote: > Like "autodict" could mean anything. Everything is meaningless until you know something about it. If you'd never seen Python before, would you know what 'dict' meant? If I were seeing "defaultdict" for the first time, I would need to look up the docs before I was confident I knew exactly what it did -- as I've mentioned before, my initial guess would have been wrong. The same procedure would lead me to an understanding of 'autodict' just as quickly. Maybe 'autodict' isn't the best term either -- I'm open to suggestions. But my instincts still tell me that 'defaultdict' is the best term for something *else* that we might want to add one day as well, so I'm just trying to make sure we don't squander it lightly. -- Greg From greg.ewing at canterbury.ac.nz Wed Feb 22 08:54:45 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 20:54:45 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <20060221183451.6035.JCARLSON@uci.edu> References: <20060221105309.601F.JCARLSON@uci.edu> <20060221183451.6035.JCARLSON@uci.edu> Message-ID: <43FC18C5.5010909@canterbury.ac.nz> Josiah Carlson wrote: > In this particular example, there is no net reduction in line use. The > execution speed of your algorithm would be reduced due to function > calling overhead. If there were more uses of the function, the line count reduction would be greater. 
In any case, line count and execution speed aren't the only issues -- there is DRY to consider. -- Greg From nnorwitz at gmail.com Wed Feb 22 09:10:43 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Wed, 22 Feb 2006 00:10:43 -0800 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> <1f7befae0602211704p7b9cbc19jde245e9e1d6bea02@mail.gmail.com> Message-ID: On 2/21/06, Neal Norwitz wrote: > > I agree with this, but don't know a clean way to do 2 builds. I > modified buildbot to: > > - Stop doing the second "without deleting .py[co]" run. > - Do one run with a debug build. > - Use -uall -r for both. I screwed it up, so now it does: - Do one run with a debug build. - Use -uall -r for both. - Still does the second "deleting .py[co]" run I couldn't think of a simple way to figure out that on most unixes the program is called python, but on Mac OS X, it's called python.exe. So I reverted back to using make testall. We can make a new test target to only run once. I also think I know how to do the "double builds" (one release and one debug). But it's too late for me to change it tonight without screwing it up. The good/bad news after this change is: http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/145/step-test/0 A seg fault on Mac OS when running with -r. 
:-( n From steve at holdenweb.com Wed Feb 22 09:17:25 2006 From: steve at holdenweb.com (Steve Holden) Date: Wed, 22 Feb 2006 03:17:25 -0500 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FC171C.8070304@canterbury.ac.nz> References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> <43FBB1D1.5060808@canterbury.ac.nz> <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> <43FC171C.8070304@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Raymond Hettinger wrote: > > >>Like "autodict" could mean anything. > > > Everything is meaningless until you know something > about it. If you'd never seen Python before, > would you know what 'dict' meant? > > If I were seeing "defaultdict" for the first time, > I would need to look up the docs before I was > confident I knew exactly what it did -- as I've > mentioned before, my initial guess would have > been wrong. The same procedure would lead me to > an understanding of 'autodict' just as quickly. > > Maybe 'autodict' isn't the best term either -- > I'm open to suggestions. But my instincts still > tell me that 'defaultdict' is the best term > for something *else* that we might want to add > one day as well, so I'm just trying to make > sure we don't squander it lightly. > Given that the default entries behind the non-existent keys don't actually exist, something like "virtual_dict" might be appropriate. Or "phantom_dict", or "ghost_dict". I agree that the naming of things is important. 
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ From greg.ewing at canterbury.ac.nz Wed Feb 22 09:23:10 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 21:23:10 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> <43FBB1D1.5060808@canterbury.ac.nz> <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> <43FC171C.8070304@canterbury.ac.nz> Message-ID: <43FC1F6E.5070505@canterbury.ac.nz> Steve Holden wrote: > Given that the default entries behind the non-existent keys don't > actually exist, something like "virtual_dict" might be appropriate. No, that would suggest to me something like a wrapper object that delegates most of the mapping protocol to something else. That's even less like what we're discussing. In our case the default values are only virtual until you use them, upon which they become real. Sort of like a wave function collapse... hmmm... I suppose 'heisendict' wouldn't fly, would it? -- Greg From fuzzyman at voidspace.org.uk Wed Feb 22 10:02:33 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Wed, 22 Feb 2006 09:02:33 +0000 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FBB1D5.4040309@canterbury.ac.nz> References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz> <43FAE8D0.9040000@voidspace.org.uk> <43FBB1D5.4040309@canterbury.ac.nz> Message-ID: <43FC28A9.4000500@voidspace.org.uk> Greg Ewing wrote: > Fuzzyman wrote: > > >> I've had problems in code that needs to treat strings, lists and >> dictionaries differently (assigning values to a container where all >> three need different handling) and telling the difference but allowing >> duck typing is *problematic*. 
>> > > You need to rethink your design so that you don't > have to make that kind of distinction. Well... to *briefly* explain the use case, it's for value assignment in ConfigObj. It basically accepts as valid values strings and lists of strings [#]_. You can also create new subsections by assigning a dictionary. It needs to be able to recognise lists in order to check each list member is a string. (See note below, it still needs to be able to recognise lists when writing, even if it is not doing type checking on assignment.) It needs to be able to recognise dictionaries in order to create a new section instance (rather than directly assigning the dictionary). This is *terribly* convenient for the user (trivial example of creating a new config file programmatically) : from configobj import ConfigObj cfg = ConfigObj(newfilename) cfg['key'] = 'value' cfg['key2'] = ['value1', 'value2', 'value3'] cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']} cfg.write() Writes out : key = value key2 = value1, value2, value3 [section] key = value key2 = value1, value2, value3 (Note none of those values needed quoting, so they aren't.) Obviously I could force the creation of sections and the assignment of list values to use separate methods, but it's much less readable and unnecessary. The code as is works and has a nice API. It still needs to be able to tell what *type* of value is being assigned. Mapping and sequence protocols are so loosely defined that in order to support 'list like objects' and 'dictionary like objects' some arbitrary decision about what methods they should support has to be made. (For example a read only mapping container is unlikely to implement __setitem__ or methods like update). At first we defined a mapping object as one that defines __getitem__ and keys (not update as I previously said), and list like objects as ones that define __getitem__ and *not* keys. For strings we required a basestring subclass.
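The concrete-type dispatch described here can be sketched as follows. This is a hypothetical helper for illustration only -- ``classify_value`` and its rules are invented, not ConfigObj's real code, and ``str`` stands in for the ``basestring`` check that 2.x code would use:

```python
def classify_value(value):
    """Decide how a ConfigObj-style container would treat an assigned value.

    Invented example: dispatch on concrete built-in types rather than
    guessing from which methods an object happens to define.
    """
    if isinstance(value, dict):
        return 'section'        # becomes a new subsection
    if isinstance(value, str):  # basestring in 2.x
        return 'string'         # a plain value
    if isinstance(value, (list, tuple)):
        if not all(isinstance(item, str) for item in value):
            raise TypeError('list values must contain only strings')
        return 'list'           # written out comma-separated
    raise TypeError('unsupported value type: %r' % type(value))

print(classify_value('value'))                       # string
print(classify_value(['value1', 'value2']))          # list
print(classify_value({'key': 'value'}))              # section
```

A duck-typed object that merely quacks like a dict would be rejected here, which is exactly the trade-off being discussed: users must pass concrete types (or convert with ``dict(...)``/``list(...)``) in exchange for unambiguous behaviour.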
In the end I think we ripped this out and just settled on isinstance tests. All the best, Michael Foord .. [#] Although it has two modes. In the 'default' mode you can assign any object as a value and a string representation is written out. A more strict mode checks values at the point you assign them - so errors will be raised at that point rather than propagating into the config file. When writing you still need to be able to recognise lists because each element is properly quoted. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/2a628df3/attachment-0001.html From fredrik at pythonware.com Wed Feb 22 10:38:38 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 22 Feb 2006 10:38:38 +0100 Subject: [Python-Dev] defaultdict proposal round three References: <43FA4E88.4090507@canterbury.ac.nz><43FAE298.5040404@canterbury.ac.nz><71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com><43FBB1D1.5060808@canterbury.ac.nz> <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > Like "autodict" could mean anything. fwiw, the first google hit for "autodict" appears to be part of someone's link farm At this website we have assistance with autodict. In addition to information for autodict we also have the best web sites concerning dictionary, non profit and new york. This makes autodict.com the most reliable guide for autodict on the Internet. and the second is a description of a self-initializing dictionary data type for Python. From stephen at xemacs.org Wed Feb 22 10:48:16 2006 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Wed, 22 Feb 2006 18:48:16 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FAE40B.8090406@canterbury.ac.nz> (Greg Ewing's message of "Tue, 21 Feb 2006 22:57:31 +1300") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> Message-ID: <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing writes: Greg> Stephen J. Turnbull wrote: >> What I advocate for Python is to require that the standard >> base64 codec be defined only on bytes, and always produce >> bytes. Greg> I don't understand that. It seems quite clear to me that Greg> base64 encoding (in the general sense of encoding, not the Greg> unicode sense) takes binary data (bytes) and produces Greg> characters. Base64 is a (family of) wire protocol(s). It's not clear to me that it makes sense to say that the alphabets used by "baseNN" encodings are composed of characters, but suppose we stipulate that. Greg> So in Py3k the correct usage would be [bytes<->unicode]. IMHO, as a wire protocol, base64 simply doesn't care what Python's internal representation of characters is. I don't see any case for "correctness" here, only for convenience, both for programmers on the job and students in the classroom. We can choose the character set that works best for us. I think that's 8-bit US ASCII. My belief is that bytes<->bytes is going to be the dominant use case, although I don't use binary representation in XML. 
However, AFAIK for on the wire use UTF-8 is strongly recommended for XML, and in that case it's also efficient to use bytes<->bytes for XML, since conversion of base64 bytes to UTF-8 characters is simply a matter of "Simon says, be UTF-8!" And in the classroom, you're just going to confuse students by telling them that UTF-8 --[Unicode codec]--> Python string is decoding but UTF-8 --[base64 codec]--> Python string is encoding, when MAL is telling them that --> Python string is always decoding. Sure, it all makes sense if you already know what's going on. But I have trouble remembering, especially in cases like UTF-8 vs UTF-16 where Perl and Python have opposite internal representations, and glibc has a third which isn't either. If base64 (and gzip, etc) are all considered bytes<->bytes, there just isn't an issue any more. The simple rule wins: to Python string is always decoding. Why fight it when we can run away with efficiency gains? (In the above, "Python string" means the unicode type, not str.) -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
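For what it's worth, the bytes<->bytes behaviour Stephen advocates here is exactly what Python 3's ``base64`` module eventually adopted: encoding takes bytes and returns bytes drawn from the ASCII-safe base64 alphabet, with no implicit promotion to text. A minimal sketch:

```python
import base64

payload = bytes(range(4))           # arbitrary binary data
encoded = base64.b64encode(payload)

# bytes in, bytes out -- no text type appears anywhere
assert isinstance(encoded, bytes)
assert encoded == b'AAECAw=='
assert base64.b64decode(encoded) == payload

# Only when embedding the result in a text document do you decode it --
# and since the base64 alphabet is a subset of ASCII, that step cannot fail.
text = encoded.decode('ascii')
print(text)  # AAECAw==
```

The "Simon says, be UTF-8!" point falls out of this: the encoded bytes are already valid UTF-8 (indeed valid ASCII), so splicing them into a UTF-8 byte stream needs no conversion at all.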
From greg.ewing at canterbury.ac.nz Wed Feb 22 11:29:40 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 Feb 2006 23:29:40 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FC28A9.4000500@voidspace.org.uk> References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz> <43FAE8D0.9040000@voidspace.org.uk> <43FBB1D5.4040309@canterbury.ac.nz> <43FC28A9.4000500@voidspace.org.uk> Message-ID: <43FC3D14.4030204@canterbury.ac.nz> Fuzzyman wrote: > cfg = ConfigObj(newfilename) > cfg['key'] = 'value' > cfg['key2'] = ['value1', 'value2', 'value3'] > cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']} If the main purpose is to support this kind of notational convenience, then I'd be inclined to require all the values used with this API to be concrete strings, lists or dicts. If you're going to make types part of the API, I think it's better to do so with a firm hand rather than being half- hearted and wishy-washy about it. Then, if it's really necessary to support a wider variety of types, provide an alternative API that separates the different cases and isn't type-dependent at all. If someone has a need for this API, using it isn't going to be much of an inconvenience, since he won't be able to write out constructors for his types using notation as compact as the above anyway. 
-- Greg From jeremy at alum.mit.edu Wed Feb 22 12:14:21 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 22 Feb 2006 06:14:21 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FC0E20.5080809@canterbury.ac.nz> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FC0E20.5080809@canterbury.ac.nz> Message-ID: On 2/22/06, Greg Ewing wrote: > Mark Russell wrote: > > > PEP 227 mentions using := as a rebinding operator, but rejects the > > idea as it would encourage the use of closures. > > Well, anything that facilitates rebinding in outer scopes > is going to encourage the use of closures, so I can't > see that as being a reason to reject a particular means > of rebinding. You either think such rebinding is a good > idea or not -- and that seems to be a matter of highly > individual taste. At the time PEP 227 was written, nested scopes were contentious. (I recall one developer who said he'd be embarrassed to tell his co-workers he worked on Python if it had this feature :-). Rebinding was more contentious, so the feature was left out. I don't think any particular syntax or spelling for rebinding was favored more or less. > On this particular idea, I tend to think it's too obscure > as well. Python generally avoids attaching randomly-chosen > semantics to punctuation, and I'd like to see it stay > that way. I agree.
Jeremy From fuzzyman at voidspace.org.uk Wed Feb 22 12:14:12 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Wed, 22 Feb 2006 11:14:12 +0000 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FC3D14.4030204@canterbury.ac.nz> References: <2773CAC687FD5F4689F526998C7E4E5F4DB971@au3010avexu1.global.avaya.com> <43FAE259.607@canterbury.ac.nz> <43FAE8D0.9040000@voidspace.org.uk> <43FBB1D5.4040309@canterbury.ac.nz> <43FC28A9.4000500@voidspace.org.uk> <43FC3D14.4030204@canterbury.ac.nz> Message-ID: <43FC4784.6030701@voidspace.org.uk> Greg Ewing wrote: > Fuzzyman wrote: > > >> cfg = ConfigObj(newfilename) >> cfg['key'] = 'value' >> cfg['key2'] = ['value1', 'value2', 'value3'] >> cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']} >> > > If the main purpose is to support this kind of notational > convenience, then I'd be inclined to require all the values > used with this API to be concrete strings, lists or dicts. > If you're going to make types part of the API, I think it's > better to do so with a firm hand rather than being half- > hearted and wishy-washy about it. > [snip..] > Thanks, that's the solution we settled on. We use ``isinstance`` tests to determine types. The user can always do something like : cfg['section'] = dict(dict_like_object) Which isn't so horrible. All the best, Michael > -- > Greg > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/abd30a28/attachment.htm From fuzzyman at voidspace.org.uk Wed Feb 22 12:33:03 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Wed, 22 Feb 2006 11:33:03 +0000 Subject: [Python-Dev] operator.is*Type Message-ID: <43FC4BEF.1060502@voidspace.org.uk> Hello all, Feel free to shoot this down, but a suggestion. The operator module defines two functions : isMappingType isSequenceType These return a guesstimation as to whether an object passed in supports the mapping and sequence protocols. These protocols are loosely defined. Any object which has a ``__getitem__`` method defined could support either protocol. Therefore : >>> from operator import isSequenceType, isMappingType >>> class anything(object): ... def __getitem__(self, index): ... pass ... >>> something = anything() >>> isMappingType(something) True >>> isSequenceType(something) True I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. An object prima facie supports the mapping protocol if it defines a ``__getitem__`` method, and a ``keys`` method. An object prima facie supports the sequence protocol if it defines a ``__getitem__`` method, and *not* a ``keys`` method. As a result code which needs to be able to tell the difference can use these functions and can sensibly refer to the definition of the mapping and sequence protocols when documenting what sort of objects an API call can accept. All the best, Michael Foord From raymond.hettinger at verizon.net Wed Feb 22 12:45:47 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 22 Feb 2006 06:45:47 -0500 Subject: [Python-Dev] defaultdict and on_missing() Message-ID: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> I'm concerned that the on_missing() part of the proposal is gratuitous.
The main use cases for defaultdict have a simple factory that supplies a zero, empty list, or empty set. The on_missing() hook is only there to support the rarer case of needing a key to compute a default value. The hook is not needed for the main use cases. As it stands, we're adding a method to regular dicts that cannot be usefully called directly. Essentially, it is a framework method meant to be overridden in a subclass. So, it only makes sense in the context of subclassing. In the meantime, we've added an oddball method to the main dict API, arguably the most important object API in Python. To use the hook, you write something like this: class D(dict): def on_missing(self, key): return somefunc(key) However, we can already do something like that without the hook: class D(dict): def __getitem__(self, key): try: return dict.__getitem__(self, key) except KeyError: self[key] = value = somefunc(key) return value The latter form is already possible, doesn't require modifying a basic API, and is arguably clearer about when it is called and what it does (the former doesn't explicitly show that the returned value gets saved in the dictionary). Since we can already do the latter form, we can get some insight into whether the need has ever actually arisen in real code. I scanned the usual sources (my own code, the standard library, and my most commonly used third-party libraries) and found no instances of code like that. The closest approximation was safe_substitute() in string.Template where missing keys returned themselves as a default value. Other than that, I conclude that there isn't sufficient need to warrant adding a funky method to the API for regular dicts. I wondered why the safe_substitute() example was unique. I think the answer is that we normally handle default computations through simple in-line code ("if k in d: do1() else do2()" or a try/except pair). 
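The second form above is runnable as it stands; filled out with an invented stand-in for ``somefunc``, it behaves like this:

```python
def somefunc(key):
    # stand-in default computation that uses the key (invented for the demo)
    return len(key)

class D(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # compute the default from the key and store it
            self[key] = value = somefunc(key)
            return value

d = D()
print(d['spam'])     # 4 -- computed from the missing key ...
print('spam' in d)   # True -- ... and saved back, unlike dict.get()
```

Note how the explicit assignment makes the save-back visible, which is the clarity argument being made here against the hook-method form.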
Overriding on_missing() then is really only useful when you need to create a type that can be passed to a client function that was expecting a regular dictionary. So it does come up but not much. Aside: Why on_missing() is an oddball among dict methods. When teaching dicts to beginners, all the methods are easily explainable except this one. You don't call this method directly, you only use it when subclassing, you have to override it to do anything useful, it hooks KeyError but only when raised by __getitem__ and not other methods, etc. I'm concerned that even having this method in the regular dict API will create confusion about when to use dict.get(), when to use dict.setdefault(), when to catch a KeyError, or when to LBYL. Adding this one extra choice makes the choice more difficult. My recommendation: Dump the on_missing() hook. That leaves the dict API unmolested and allows a more straight-forward implementation/explanation of collections.default_dict or whatever it ends-up being named. The result is delightfully simple and easy to understand/explain. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/1d64a51d/attachment.htm From greg.ewing at canterbury.ac.nz Wed Feb 22 12:35:39 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 00:35:39 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43FC4C8B.6080300@canterbury.ac.nz> Stephen J.
Turnbull wrote: > Base64 is a (family of) wire protocol(s). It's not clear to me that > it makes sense to say that the alphabets used by "baseNN" encodings > are composed of characters, Take a look at http://en.wikipedia.org/wiki/Base64 where it says ...base64 is a binary to text encoding scheme whereby an arbitrary sequence of bytes is converted to a sequence of printable ASCII characters. Also see RFC 2045 (http://www.ietf.org/rfc/rfc2045.txt) which defines base64 in terms of an encoding from octets to characters, and also says A 65-character subset of US-ASCII is used ... This subset has the important property that it is represented identically in all versions of ISO 646 ... and all characters in the subset are also represented identically in all versions of EBCDIC. Which seems to make it perfectly clear that the result of the encoding is to be considered as characters, which are not necessarily going to be encoded using ascii. So base64 on its own is *not* a wire protocol. Only after encoding the characters do you have a wire protocol. > I don't see any case for > "correctness" here, only for convenience, I'm thinking of convenience, too. Keep in mind that in Py3k, 'unicode' will be called 'str' (or something equally neutral like 'text') and you will rarely have to deal explicitly with unicode codings, this being done mostly for you by the I/O objects. So most of the time, using base64 will be just as convenient as it is today: base64_encode(my_bytes) and write the result out somewhere. The reason I say it's *correct* is that if you go straight from bytes to bytes, you're *assuming* the eventual encoding is going to be an ascii superset. The programmer is going to have to know about this assumption and understand all its consequences and decide whether it's right, and if not, do something to change it. Whereas if the result is text, the right thing happens automatically whatever the ultimate encoding turns out to be.
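The "result is text" argument can be made concrete with a short sketch (Python 3 names are used here for clarity): once base64 output is treated as text, it survives embedding in a document written under *any* encoding, including one that is not an ASCII superset, such as UTF-16:

```python
import base64

data = b'\x00\x01\xfe\xff'
b64_text = base64.b64encode(data).decode('ascii')   # now genuine text

# Combine the text with other text and write the whole document out in a
# non-ASCII-superset encoding; the base64 payload round-trips regardless.
document = '<blob>%s</blob>' % b64_text
wire = document.encode('utf-16')        # not an ASCII superset!

recovered = wire.decode('utf-16')
inner = recovered[len('<blob>'):-len('</blob>')]
assert base64.b64decode(inner) == data
print('round-trip ok')
```

Had the raw ``b64encode`` bytes been spliced directly into the UTF-16 byte stream instead, the result would have been garbage -- which is the hidden assumption being pointed out.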
You can take the text from your base64 encoding, combine it with other text from any other source to form a complete mail message or xml document or whatever, and write it out through a file object that's using any unicode encoding at all, and the result will be correct. > it's also efficient to use bytes<->bytes for XML, since > conversion of base64 bytes to UTF-8 characters is simply a matter of > "Simon says, be UTF-8!" Efficiency is an implementation concern. In Py3k, strings which contain only ascii or latin-1 might be stored as 1 byte per character, in which case this would not be an issue. > And in the classroom, you're just going to confuse students by telling > them that UTF-8 --[Unicode codec]--> Python string is decoding but > UTF-8 --[base64 codec]--> Python string is encoding, when MAL is > telling them that --> Python string is always decoding. Which is why I think that only *unicode* codings should be available through the .encode and .decode interface. Or alternatively there should be something more explicit like .unicode_encode and .unicode_decode that is thus restricted. Also, if most unicode coding is done in the I/O objects, there will be far less need for programmers to do explicit unicode coding in the first place, so likely it will become more of an advanced topic, rather than something you need to come to grips with on day one of using unicode, like it is now. -- Greg From fredrik at pythonware.com Wed Feb 22 12:54:28 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 22 Feb 2006 12:54:28 +0100 Subject: [Python-Dev] defaultdict and on_missing() References: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > Aside: Why on_missing() is an oddball among dict methods. When > teaching dicts to beginners, all the methods are easily explainable
except this one. You don't call this method directly, you only use it > when subclassing, you have to override it to do anything useful, it > hooks KeyError but only when raised by __getitem__ and not > other methods, etc. agreed. > My recommendation: Dump the on_missing() hook. That leaves > the dict API unmolested and allows a more straight-forward > implementation/explanation of collections.default_dict or whatever > it ends-up being named. The result is delightfully simple and easy > to understand/explain. agreed. a separate type in collections, a template object (or factory) passed to the constructor, and implementation inheritance, is more than good enough. and if I recall correctly, pretty much what Guido first proposed. I trust his intuition a lot more than I trust the design-by-committee-without-use-cases process. From greg.ewing at canterbury.ac.nz Wed Feb 22 12:42:16 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 00:42:16 +1300 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> References: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> Message-ID: <43FC4E18.20403@canterbury.ac.nz> Raymond Hettinger wrote: > I'm concerned that the on_missing() part of the proposal is gratuitous. I second all that. A clear case of YAGNI. -- Greg From python at rcn.com Wed Feb 22 12:59:59 2006 From: python at rcn.com (Raymond Hettinger) Date: Wed, 22 Feb 2006 06:59:59 -0500 Subject: [Python-Dev] operator.is*Type References: <43FC4BEF.1060502@voidspace.org.uk> Message-ID: <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> > >>> from operator import isSequenceType, isMappingType > >>> class anything(object): > ... def __getitem__(self, index): > ... pass > ...
> >>> something = anything() > >>> isMappingType(something) > True > >>> isSequenceType(something) > True > > I suggest we either deprecate these functions as worthless, *or* we > define the protocols slightly more clearly for user defined classes. They are not worthless. They do a damned good job of differentiating anything that CAN be differentiated. Your example simply highlights the consequences of one of Python's most basic, original design choices (using getitem for both sequences and mappings). That choice is now so fundamental to the language that it cannot possibly change. Get used to it. In your example, the results are correct. The "anything" class can be viewed as either a sequence or a mapping. In this and other posts, you seem to be focusing your design around notions of strong typing and mandatory interfaces. I would suggest that that approach is futile unless you control all of the code being run. Raymond From theller at python.net Wed Feb 22 13:03:55 2006 From: theller at python.net (Thomas Heller) Date: Wed, 22 Feb 2006 13:03:55 +0100 Subject: [Python-Dev] operator.is*Type In-Reply-To: <43FC4BEF.1060502@voidspace.org.uk> References: <43FC4BEF.1060502@voidspace.org.uk> Message-ID: <43FC532B.1040307@python.net> Fuzzyman wrote: > Hello all, > > Feel free to shoot this down, but a suggestion. > > The operator module defines two functions : > > isMappingType > isSequenceType > > > These return a guesstimation as to whether an object passed in supports > the mapping and sequence protocols. > > These protocols are loosely defined. Any object which has a > ``__getitem__`` method defined could support either protocol. The docs contain clear warnings about that. > I suggest we either deprecate these functions as worthless, *or* we > define the protocols slightly more clearly for user defined classes. I have no problems deprecating them since I've never used one of these functions.
If I want to know if something is a string I use isinstance(), for string-like objects I would use try: obj + "" except TypeError: and so on. > > An object prima facie supports the mapping protocol if it defines a > ``__getitem__`` method, and a ``keys`` method. > > An object prima facie supports the sequence protocol if it defines a > ``__getitem__`` method, and *not* a ``keys`` method. > > As a result code which needs to be able to tell the difference can use > these functions and can sensibly refer to the definition of the mapping > and sequence protocols when documenting what sort of objects an API call > can accept. Thomas From fuzzyman at voidspace.org.uk Wed Feb 22 13:18:09 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Wed, 22 Feb 2006 12:18:09 +0000 Subject: [Python-Dev] operator.is*Type In-Reply-To: <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> References: <43FC4BEF.1060502@voidspace.org.uk> <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> Message-ID: <43FC5681.8000903@voidspace.org.uk> Raymond Hettinger wrote: >> >>> from operator import isSequenceType, isMappingType >> >>> class anything(object): >> ... def __getitem__(self, index): >> ... pass >> ... >> >>> something = anything() >> >>> isMappingType(something) >> True >> >>> isSequenceType(something) >> True >> >> I suggest we either deprecate these functions as worthless, *or* we >> define the protocols slightly more clearly for user defined classes. > > They are not worthless. They do a damned good job of differentiating > anything that CAN be differentiated. > But as far as I can tell (and I may be wrong), they only work if the object is a subclass of a built in type, otherwise they're broken. So you'd have to do a type check as well, unless you document that an API call *only* works with a builtin type or subclass. In which case - an isinstance call does the same, with the advantage of not being broken if the object is a user-defined class. 
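The convention being proposed is easy to express as code. This is a sketch of the *suggestion* only -- nothing like these helpers exists in the operator module; the function names are invented:

```python
def seems_mapping(obj):
    # proposed convention: __getitem__ plus a keys() method => mapping
    return hasattr(obj, '__getitem__') and hasattr(obj, 'keys')

def seems_sequence(obj):
    # proposed convention: __getitem__ without keys() => sequence
    # (strings would be handled separately via an isinstance check)
    return hasattr(obj, '__getitem__') and not hasattr(obj, 'keys')

class Anything(object):
    def __getitem__(self, index):
        pass

# The ambiguous class from the earlier example now classifies one way only:
print(seems_mapping(Anything()), seems_sequence(Anything()))  # False True
print(seems_mapping({}), seems_sequence({}))                  # True False
print(seems_mapping([]), seems_sequence([]))                  # False True
```

The cost, of course, is that a read-only mapping without ``keys()`` would be misclassified as a sequence -- which is the loose-protocol problem the thread keeps circling.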
At the very least the function would be better renamed ``MightBeMappingType`` ;-) > Your example simply highlights the consequences of one of Python's > most basic, original design choices (using getitem for both sequences > and mappings). That choice is now so fundamental to the language that > it cannot possibly change. Get used to it. > I have no problem with it - it's useful. > In your example, the results are correct. The "anything" class can be > viewed as either a sequence or a mapping. > But in practice an object is *unlikely* to be both. (Although conceivably a mapping container *could* implement integer indexing and thus be both - but *very* rare). Therefore the current behaviour is not really useful in any conceivable situation - not that I can think of anyway. > In this and other posts, you seem to be focusing your design around > notions of strong typing and mandatory interfaces. I would suggest > that that approach is futile unless you control all of the code being > run. > Not directly. I'm suggesting that the loosely defined protocol (used with duck typing) can be made quite a bit more useful by making the definition *slightly* more specific. A preference for strong typing would require subclassing, surely ? The approach I suggest would allow a *less* 'strongly typed' approach to code, because it establishes a convention to decide whether a user defined class supports the mapping and sequence protocols. The simple alternative (which we took in ConfigObj) is to require a 'strongly typed' interface, because there is currently no useful way to determine whether an object that implements __getitem__ supports mapping or sequence. (Other than *assuming* that a mapping container implements a random choice from the other common mapping methods.) All the best, Michael > > Raymond > > > From mwh at python.net Wed Feb 22 14:53:18 2006 From: mwh at python.net (Michael Hudson) Date: Wed, 22 Feb 2006 13:53:18 +0000 Subject: [Python-Dev] buildbot vs.
Windows In-Reply-To: <43FB8763.7070802@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Tue, 21 Feb 2006 22:34:27 +0100") References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> Message-ID: <2mu0ar7b5d.fsf@starship.python.net> "Martin v. Löwis" writes: > Tim Peters wrote: >> Speaking of which, a number of test failures over the past few weeks >> were provoked here only under -r (run tests in random order) or under >> a debug build, and didn't look like those were specific to Windows. >> Adding -r to the buildbot test recipe is a decent idea. Getting >> _some_ debug-build test runs would also be good (or do we do that >> already?). > > So what is your recipe: Add -r to all buildbots? Only to those which > have an 'a' in their name? Only to every third build? Duplicating > the number of builders? > > Same question for --with-pydebug. Combining this with -r would multiply > the number of builders by 4 already. Instead of running release and debug builds, why not just run debug builds? They catch more problems, earlier. Cheers, mwh -- This song is for anyone ... fuck it. Shut up and listen.
-- Eminem, "The Way I Am" From raymond.hettinger at verizon.net Wed Feb 22 16:21:50 2006 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 22 Feb 2006 10:21:50 -0500 Subject: [Python-Dev] defaultdict proposal round three References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> Message-ID: <002101c637c3$b88dd0d0$6a01a8c0@RaymondLaptop1> [Alex] > I'd love to remove setdefault in 3.0 -- but I don't think it can be done > before that: default_factory won't cover the occasional use cases where > setdefault is called with different defaults at different locations, and, > rare as those cases may be, any 2.* should not break any existing code that > uses that approach. I'm not too concerned about this one. Whenever setdefault gets deprecated, then ALL code that used it would have to be changed. If there were cases with different defaults, a regular try/except would do the job just fine (heck, it might even be faster because there won't be a wasted instantiation in the cases where the key already exists). There may be other reasons to delay removing setdefault(), but the multiple-default use case isn't one of them. >> An alternative is to have two possible attributes: >> d.default_factory = list >> or >> d.default_value = 0 >> with an exception being raised when both are defined (the test is done when >> the >> attribute is created, not when the lookup is performed). > I see default_value as a way to get exactly the same beginner's error we > already have with function defaults: That makes sense. I'm somewhat happy with the patch as it stands now. The only part that needs serious rethinking is putting on_missing() in regular dicts. See my other email on that subject.
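A sketch of the try/except spelling Raymond describes, with a different default at each call site -- something a single default_factory couldn't supply (the helper name is illustrative, not from any patch):

```python
def get_or_create(d, key, factory):
    # Only build the default when the key is actually missing, so no
    # instantiation is wasted when the key already exists.
    try:
        return d[key]
    except KeyError:
        d[key] = value = factory()
        return value

d = {}
get_or_create(d, 'names', list).append('alex')   # list default here
get_or_create(d, 'seen', set).add('alex')        # set default there
get_or_create(d, 'names', list).append('tim')    # key exists: no new list built
```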
Raymond From chris at atlee.ca Wed Feb 22 16:40:05 2006 From: chris at atlee.ca (Chris AtLee) Date: Wed, 22 Feb 2006 10:40:05 -0500 Subject: [Python-Dev] Copying zlib compression objects In-Reply-To: References: <7790b6530602170848oe892897s4157c39b94082ce5@mail.gmail.com> Message-ID: <7790b6530602220740m1e1f59d4x5be675d5f4be7ee0@mail.gmail.com> On 2/17/06, Guido van Rossum wrote: > > Please submit your patch to SourceForge. I've submitted the zlib patch as patch #1435422. I added some test cases to test_zlib.py and documented the new methods. I'd like to test my gzip / tarfile changes more before creating a patch for it, but I'm interested in any feedback about the idea of adding snapshot() / restore() methods to the GzipFile and TarFile classes. It doesn't look like the underlying bz2 library supports copying compression / decompression streams, so for now it's impossible to make corresponding changes to the bz2 module. I also noticed that the tarfile module reimplements the gzip file format when dealing with streams. Would it make sense to refactor some of the gzip.py code to expose the methods that read/write the gzip file header, and have the tarfile module use those methods? Cheers, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/28bea403/attachment.htm From aleaxit at gmail.com Wed Feb 22 16:47:33 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Wed, 22 Feb 2006 07:47:33 -0800 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <002101c637c3$b88dd0d0$6a01a8c0@RaymondLaptop1> References: <008101c6363b$ad0fc4e0$b83efea9@RaymondLaptop1> <9E666586-CF68-4C26-BE43-E789ECE97FAD@gmail.com> <002101c637c3$b88dd0d0$6a01a8c0@RaymondLaptop1> Message-ID: <2EE63C9A-5E55-4176-AA9D-AFB89E6BAD0E@gmail.com> On Feb 22, 2006, at 7:21 AM, Raymond Hettinger wrote: ... > I'm somewhat happy with the patch as it stands now.
The only part > that needs serious rethinking is putting on_missing() in regular > dicts. See my other email on that subject. What if we named it _on_missing? Hook methods intended only to be overridden in subclasses are sometimes spelled that way, and it removes the need to teach about it to beginners -- it looks private so we don't explain it at that point. My favorite example is Queue.Queue: I teach it (and in fact evangelize for it as the one sane way to do threading;-) in "Python 101", *without* ever mentioning _get, _put etc -- THOSE I teach in "Patterns with Python" as the very best example of the GoF's classic "Template Method" design pattern. If dict had _on_missing I'd have another wonderful example to teach from! (I believe the Library Reference avoids teaching about _get, _put etc, too, though I haven't checked it for a while). TM is my favorite DP, so I'm biased in favor of Guido's design, and I think that by giving the hook method (not meant to be called, only overridden) a "private name" we're meeting enough of your and /F's concerns to let _on_missing remain. Its existence does simplify the implementation of defaultdict (and some other dict subclasses), and "if the implementation is easy to explain, it may be a good idea", after all;-) Alex From guido at python.org Wed Feb 22 16:49:48 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 22 Feb 2006 10:49:48 -0500 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> References: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> Message-ID: On 2/22/06, Raymond Hettinger wrote: > I'm concerned that the on_missing() part of the proposal is gratuitous. The > main use cases for defaultdict have a simple factory that supplies a zero, > empty list, or empty set. The on_missing() hook is only there to support > the rarer case of needing a key to compute a default value. The hook is not > needed for the main use cases.
The on_missing() hook is there to take the action of inserting the default value into the dict. For this it needs the key. It seems attractive to collapse default_factory and on_missing into a single attribute (my first attempt did this, and I was halfway posting about it before I realized the mistake). But on_missing() really needs the key, and at the same time you don't want to lose the convenience of being able to specify set, list, int etc. as default factories, so default_factory() must be called without the key. If you don't have on_missing, then the functionality of inserting the value produced by default_factory would have to be in-lined in __getitem__, which means the machinery put in place can't be reused for other use cases -- several people have claimed to have a use case for returning a value *without* inserting it into the dict. > As it stands, we're adding a method to regular dicts that cannot be usefully > called directly. Essentially, it is a framework method meant to be > overridden in a subclass. So, it only makes sense in the context of > subclassing. In the meantime, we've added an oddball method to the main > dict API, arguably the most important object API in Python. Which to me actually means it's a *good* place to put the hook functionality, since it allows for maximum reuse. > To use the hook, you write something like this: > > class D(dict): > def on_missing(self, key): > return somefunc(key) Or, more likely, def on_missing(self, key): self[key] = value = somefunc(key) return value > However, we can already do something like that without the hook: > > class D(dict): > def __getitem__(self, key): > try: > return dict.__getitem__(self, key) > except KeyError: > self[key] = value = somefunc(key) > return value > > The latter form is already possible, doesn't require modifying a basic API, > and is arguably clearer about when it is called and what it does (the former > doesn't explicitly show that the returned value gets saved in the > dictionary).
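Filled out so it runs, the "latter form" quoted above looks like this (somefunc here is a stand-in for whatever computes a default from the key):

```python
def somefunc(key):
    # Illustrative default computation: derive a fresh value from the key.
    return [key]

class D(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Compute the default, *save it*, then return it -- the saving
            # step is the part the hook-based spelling doesn't make explicit.
            self[key] = value = somefunc(key)
            return value

d = D()
d['x'].append('y')   # first access creates ['x'], then appends 'y'
```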
This is exactly what Google's internal DefaultDict does. But it is also its downfall, because now *all* __getitem__ calls are weighed down by going through Python code; in a particular case that came up at Google I had to recommend against using it for performance reasons. > Since we can already do the latter form, we can get some insight into > whether the need has ever actually arisen in real code. I scanned the usual > sources (my own code, the standard library, and my most commonly used > third-party libraries) and found no instances of code like that. The > closest approximation was safe_substitute() in string.Template where missing > keys returned themselves as a default value. Other than that, I conclude > that there isn't sufficient need to warrant adding a funky method to the API > for regular dicts. In this case I don't believe that the absence of real-life examples says much (and BTW Google's DefaultDict *is* such a real life example; it is used in other code). There is not much incentive for subclassing dict and overriding __getitem__ if the alternative is that in a few places you have to write two lines of code instead of one: if key not in d: d[key] = set() # this line would be unneeded d[key].add(value) > I wondered why the safe_substitute() example was unique. I think the answer > is that we normally handle default computations through simple in-line code > ("if k in d: do1() else do2()" or a try/except pair). Overriding > on_missing() then is really only useful when you need to create a type that > can be passed to a client function that was expecting a regular dictionary. > So it does come up but not much. I think the pattern hasn't been commonly known; people have been struggling with setdefault() all these years. > Aside: Why on_missing() is an oddball among dict methods. When teaching > dicts to beginners, all the methods are easily explainable except this one.
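For comparison, collections.defaultdict -- the shape this proposal eventually took in Python 2.5 -- collapses the two-line pattern above back to one:

```python
from collections import defaultdict

d = defaultdict(set)   # default_factory=set: a missing key gets a fresh set
d['low'].add(1)        # no "if key not in d: d[key] = set()" dance needed
d['low'].add(2)
d['high'].add(9)
```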
setdefault(), update(), copy() are all advanced material, and so are iteritems(), itervalues() and iterkeys() (*especially* the last since it's redundant through "for i in d:"). > You don't call this method directly, you only use it when subclassing, you > have to override it to do anything useful, it hooks KeyError but only when > raised by __getitem__ and not other methods, etc. The only other methods that raise KeyError are __delitem__, pop() and popitem(). I don't see how these could use the same hook as __getitem__ if the only real known use case for the latter is a hook that inserts the value -- these methods all *delete* an item, so they would need a different hook anyway (two different hooks, really, since __delitem__ doesn't need a value). And I can't even think of a theoretical use case for hooking these, let alone a real one. > I'm concerned that > evening having this method in regular dict API will create confusion about > when to use dict.get(), when to use dict.setdefault(), when to catch a > KeyError, or when to LBYL. Adding this one extra choice makes the choice > more difficult. Well, obviously if you're not subclassing you can't use on_missing(), so it doesn't really add much to the available choices, *unless* you subclass, which is a choice you're likely to make in a different phase of the design, and not lightly. > My recommendation: Dump the on_missing() hook. That leaves the dict API > unmolested and allows a more straight-forward implementation/explanation of > collections.default_dict or whatever it ends-up being named. The result is > delightfully simple and easy to understand/explain. I disagree. on_missing() is exactly the right refactoring. If we removed on_missing() from dict, we'd have to override __getitem__ in defaultdict (regardless of whether we give defaultdict an on_missing() hook or in-line it). But the base class __getitem__ is a careful piece of work! 
The override in defaultdict basically has two choices: invoke dict.__getitem__ and catch the KeyError exception, or copy all the code. (Using PyDict_GetItem would be even more wrong since it suppresses exceptions in the hash and comparison phase of the lookup.) Copying all the code is fraught with maintenance problems. Calling dict.__getitem__ has the problem that it *could* raise KeyError for reasons that have nothing to do (directly) with a missing item -- a broken hash or comparison could also raise this, and in that case it would be a mistake to call on_missing(). IMO pretty much the only reason for keeping the changes contained within the collections module would be code modularity; but the above argument about code reuse deconstructs that argument. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From foom at fuhm.net Wed Feb 22 17:17:41 2006 From: foom at fuhm.net (James Y Knight) Date: Wed, 22 Feb 2006 11:17:41 -0500 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FC4C8B.6080300@canterbury.ac.nz> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> Message-ID: <3886473F-A4F8-4B1A-9EEC-A60E9D221D45@fuhm.net> On Feb 22, 2006, at 6:35 AM, Greg Ewing wrote: > I'm thinking of convenience, too. Keep in mind that in Py3k, > 'unicode' will be called 'str' (or something equally neutral > like 'text') and you will rarely have to deal explicitly with > unicode codings, this being done mostly for you by the I/O > objects. 
So most of the time, using base64 will be just as > convenient as it is today: base64_encode(my_bytes) and write > the result out somewhere. > > The reason I say it's *correct* is that if you go straight > from bytes to bytes, you're *assuming* the eventual encoding > is going to be an ascii superset. The programmer is going to > have to know about this assumption and understand all its > consequences and decide whether it's right, and if not, do > something to change it. > > Whereas if the result is text, the right thing happens > automatically whatever the ultimate encoding turns out to > be. You can take the text from your base64 encoding, combine > it with other text from any other source to form a complete > mail message or xml document or whatever, and write it out > through a file object that's using any unicode encoding > at all, and the result will be correct. This makes little sense for mail. You combine *bytes*, in various and possibly different encodings to form a mail message. Some MIME sections might have a base64 Content-Transfer-Encoding, others might be 8bit encoded, others might be 7bit encoded, others might be quoted-printable encoded. Before the C-T-E encoding, you will have had to do the Content-Type encoding, converting your text into bytes with the desired character encoding: utf-8, iso-8859-1, etc. Having the final mail message be made up of "characters", right before transmission to the socket would be crazy. James From python at rcn.com Wed Feb 22 17:20:44 2006 From: python at rcn.com (Raymond Hettinger) Date: Wed, 22 Feb 2006 11:20:44 -0500 Subject: [Python-Dev] defaultdict and on_missing() References: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> Message-ID: <001301c637cb$f05fd960$6a01a8c0@RaymondLaptop1> [Guido van Rossum] > If we removed on_missing() from dict, we'd have to override > __getitem__ in defaultdict (regardless of whether we give > defaultdict an on_missing() hook or in-line it). You have another option.
Keep your current modifications to dict.__getitem__ but do not include dict.on_missing(). Let it only be called in a subclass IF it is defined; otherwise, raise KeyError. That keeps me happy since the basic dict API won't show on_missing(), but it still allows a user to attach an on_missing method to a dict subclass when or if needed. I think all your test cases would still pass without modification. This approach is not much different than for other magic methods which kick in if defined or revert to a default behavior if not. My core concern is to keep the dict API clean as a whistle. Raymond From bob at redivi.com Wed Feb 22 18:04:38 2006 From: bob at redivi.com (Bob Ippolito) Date: Wed, 22 Feb 2006 09:04:38 -0800 Subject: [Python-Dev] operator.is*Type In-Reply-To: <43FC5681.8000903@voidspace.org.uk> References: <43FC4BEF.1060502@voidspace.org.uk> <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> <43FC5681.8000903@voidspace.org.uk> Message-ID: <450556FF-7948-4094-A79C-B09DCDA3CF79@redivi.com> On Feb 22, 2006, at 4:18 AM, Fuzzyman wrote: > Raymond Hettinger wrote: >>>>>> from operator import isSequenceType, isMappingType >>>>>> class anything(object): >>> ... def __getitem__(self, index): >>> ... pass >>> ... >>>>>> something = anything() >>>>>> isMappingType(something) >>> True >>>>>> isSequenceType(something) >>> True >>> >>> I suggest we either deprecate these functions as worthless, *or* we >>> define the protocols slightly more clearly for user defined classes. >> >> They are not worthless. They do a damned good job of differentiating >> anything that CAN be differentiated. >> > But as far as I can tell (and I may be wrong), they only work if the > object is a subclass of a built-in type, otherwise they're broken. So > you'd have to do a type check as well, unless you document that an API > call *only* works with a builtin type or subclass.
If you really cared, you could check hasattr(something, 'get') and hasattr(something, '__getitem__'), which is a pretty good indicator that it's a mapping and not a sequence (in a dict-like sense, anyway). -bob From ianb at colorstudy.com Wed Feb 22 18:10:14 2006 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 22 Feb 2006 11:10:14 -0600 Subject: [Python-Dev] operator.is*Type In-Reply-To: <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> References: <43FC4BEF.1060502@voidspace.org.uk> <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> Message-ID: <43FC9AF6.3040506@colorstudy.com> Raymond Hettinger wrote: >>>>>from operator import isSequenceType, isMappingType >>>>>class anything(object): >> >>... def __getitem__(self, index): >>... pass >>... >> >>>>>something = anything() >>>>>isMappingType(something) >> >>True >> >>>>>isSequenceType(something) >> >>True >> >>I suggest we either deprecate these functions as worthless, *or* we >>define the protocols slightly more clearly for user defined classes. > > > They are not worthless. They do a damned good job of differentiating anything > that CAN be differentiated. But they are just identical...? They seem terribly pointless to me. Deprecation is one option, of course. I think Michael's suggestion also makes sense. *If* we distinguish between sequences and mapping types with two functions, *then* those two functions should be distinct. It seems kind of obvious, doesn't it? I think hasattr(obj, 'keys') is the simplest distinction of the two kinds of collections. > Your example simply highlights the consequences of one of Python's most basic, > original design choices (using getitem for both sequences and mappings). That > choice is now so fundamental to the language that it cannot possibly change. > Get used to it. > > In your example, the results are correct. The "anything" class can be viewed as > either a sequence or a mapping. 
> > In this and other posts, you seem to be focusing your design around > notions of > strong typing and mandatory interfaces. I would suggest > that that approach is > futile unless you control all of the code being > run. I think you are reading too much into it. If the functions exist, they should be useful. That's all I see in Michael's suggestion. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From guido at python.org Wed Feb 22 18:44:33 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 22 Feb 2006 12:44:33 -0500 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <001301c637cb$f05fd960$6a01a8c0@RaymondLaptop1> References: <001401c637a5$86053ea0$6a01a8c0@RaymondLaptop1> <001301c637cb$f05fd960$6a01a8c0@RaymondLaptop1> Message-ID: On 2/22/06, Raymond Hettinger wrote: > [Guido van Rossum] > > If we removed on_missing() from dict, we'd have to override > > __getitem__ in defaultdict (regardless of whether we give > > defaultdict an on_missing() hook or in-line it). > > You have another option. Keep your current modifications to > dict.__getitem__ but do not include dict.on_missing(). Let it only > be called in a subclass IF it is defined; otherwise, raise KeyError. OK. I don't have time right now for another round of patches -- if you do, please go ahead. The dict docs in my latest patch must be updated somewhat (since they document on_missing()).
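A pure-Python sketch of Raymond's compromise -- the hook fires only if a subclass actually defines it, otherwise lookup raises KeyError as before (the real change would of course live in the C implementation of dict):

```python
class HookableDict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            hook = getattr(self, 'on_missing', None)
            if hook is None:
                raise            # no subclass hook: plain KeyError
            return hook(key)

class Defaulting(HookableDict):
    def on_missing(self, key):   # defined only in the subclass
        self[key] = value = []
        return value

d = Defaulting()
d['a'].append(1)                 # hook supplies and caches a fresh list

plain = HookableDict()
try:
    plain['missing']
    hook_free_raises = False
except KeyError:
    hook_free_raises = True
```

This is essentially the shape the feature finally shipped in: Python 2.5 gave dict a __missing__() hook that is consulted only when a subclass defines it.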
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jason.orendorff at gmail.com Wed Feb 22 19:17:51 2006 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Wed, 22 Feb 2006 13:17:51 -0500 Subject: [Python-Dev] Path PEP: some comments (equality) In-Reply-To: <71b6302c0602200806i4ae3dc60g2010d18e11e8be37@mail.gmail.com> References: <71b6302c0602200806i4ae3dc60g2010d18e11e8be37@mail.gmail.com> Message-ID: On 2/20/06, Mark Mc Mahon wrote: > > It seems that the Path module as currently defined leaves equality > testing up to the underlying string comparison. My guess is that this > is fine for Unix (maybe not even) but it is a bit lacking for Windows. > > Should the path class implement an __eq__ method that might do some of > the following things: > - Get the absolute path of both self and the other path > - normcase both > - now see if they are equal > This has been suggested to me many times. Unfortunately, since Path is a subclass of string, this breaks stuff in weird ways. For example: 'x.py' == path('x.py') == path('X.PY') == 'X.PY', but 'x.py' != 'X.PY' And hashing needs to be consistent with __eq__: hash('x.py') == hash(path('X.PY')) == hash('X.PY') ??? Granted these problems would only pop up in code where people are mixing Path and string objects. But they would cause really obscure bugs in practice, very difficult for a non-expert to figure out and fix. It's safer for Paths to behave just like strings. -j -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/8577a431/attachment.html From pje at telecommunity.com Wed Feb 22 19:56:48 2006 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed, 22 Feb 2006 13:56:48 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <43FC0E20.5080809@canterbury.ac.nz> <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FC0E20.5080809@canterbury.ac.nz> Message-ID: <5.1.1.6.0.20060222135150.01e04328@mail.telecommunity.com> At 06:14 AM 2/22/2006 -0500, Jeremy Hylton wrote: >On 2/22/06, Greg Ewing wrote: > > Mark Russell wrote: > > > > > PEP 227 mentions using := as a rebinding operator, but rejects the > > > idea as it would encourage the use of closures. > > > > Well, anything that facilitates rebinding in outer scopes > > is going to encourage the use of closures, so I can't > > see that as being a reason to reject a particular means > > of rebinding. You either think such rebinding is a good > > idea or not -- and that seems to be a matter of highly > > individual taste. > >At the time PEP 227 was written, nested scopes were contentious. (I >recall one developer who said he'd be embarrassed to tell his >co-workers he worked on Python if it had this feature :-). Was this because of the implicit "inheritance" of variables from the enclosing scope? > Rebinding >was more contentious, so the feature was left out. I don't think any >particular syntax or spelling for rebinding was favored more or less. > > > On this particular idea, I tend to think it's too obscure > > as well. Python generally avoids attaching randomly-chosen > > semantics to punctuation, and I'd like to see it stay > > that way. > >I agree. Note that '.' for relative naming already exists (attribute access), and Python 2.5 is already introducing the use of a leading '.' (with no name before it) to mean "parent of the current namespace". So, using that approach to reference variables in outer scopes wouldn't be without precedents. IOW, I propose no new syntax for rebinding, but instead making variables' context explicit.
This would also fix the issue where right now you have to inspect a function and its context to find out whether there's a closure and what's in it. The leading dots will be quite visible. From tjreedy at udel.edu Wed Feb 22 20:09:41 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 22 Feb 2006 14:09:41 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com><7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> Message-ID: "Almann T. Goo" wrote in message news:7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1 at mail.gmail.com... > IMO, Having properly nested scopes in Python in a sense made having > closures a natural idiom to the language and part of its "user > interface." By not allowing the name re-binding it almost seems like > that "user interface" has a rough edge that is almost too easy to get > cut on. I can see now how it would look that way to someone who has experience with fully functional nested scopes in other languages and who learns Python after no-write nested scoping was added. What is not mentioned in the ref manual and what I suppose may not be obvious even reading the PEP is that Python added nesting to solve two particular problems. First was the inability to write nested recursive functions without the hack of stuffing its name in the global namespace (or of patching the byte code). Second was the need to misuse the default arg mechanism in nested functions. What we have now pretty well fixes both. 
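The two pre-PEP-227 problems Terry mentions can be shown next to the modern spelling (toy functions purely for illustration):

```python
# Problem 2's workaround: smuggle the outer name in as a default argument.
def make_adder_hack(n):
    def add(x, n=n):          # n=n captured the value by hand
        return x + n
    return add

# With nested scopes, the inner function simply sees n.
def make_adder(n):
    def add(x):
        return x + n
    return add

# Problem 1: a nested recursive function no longer needs its name
# stuffed into the global namespace to find itself.
def outer():
    def fact(k):
        return 1 if k < 2 else k * fact(k - 1)
    return fact(5)
```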
Terry Jan Reedy From tjreedy at udel.edu Wed Feb 22 20:32:30 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 22 Feb 2006 14:32:30 -0500 Subject: [Python-Dev] bytes.from_hex() References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp><43F69411.1020807@canterbury.ac.nz><20060217202813.5FA2.JCARLSON@uci.edu><43f6abab.1091371449@news.gmane.org><87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp><43F8C391.2070405@v.loewis.de><87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp><43FA3820.3070607@v.loewis.de><87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp><43FAE40B.8090406@canterbury.ac.nz><87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> Message-ID: "Greg Ewing" wrote in message news:43FC4C8B.6080300 at canterbury.ac.nz... > Efficiency is an implementation concern. It is also a user concern, especially if inefficiency overruns memory limits. > In Py3k, strings > which contain only ascii or latin-1 might be stored as > 1 byte per character, in which case this would not be an > issue. If 'might' becomes 'will', I and I suspect others will be happier with the change. And I would be happy if the choice of physical storage was pretty much handled behind the scenes, as with the direction int/long unification is going. > Which is why I think that only *unicode* codings should be > available through the .encode and .decode interface. Or > alternatively there should be something more explicit like > .unicode_encode and .unicode_decode that is thus restricted. I prefer the shorter names and using recode, for instance, for bytes to bytes. 
Terry Jan Reedy From steven.bethard at gmail.com Wed Feb 22 20:41:54 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Wed, 22 Feb 2006 12:41:54 -0700 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> Message-ID: On 2/21/06, Phillip J. Eby wrote: > Here's a crazy idea, that AFAIK has not been suggested before and could > work for both globals and closures: using a leading dot, ala the new > relative import feature. e.g.: > > def incrementer(val): > def inc(): > .val += 1 > return .val > return inc > > The '.' would mean "this name, but in the nearest outer scope that defines > it". Note that this could include the global scope, so the 'global' > keyword could go away in 2.5. And in Python 3.0, the '.' could become > *required* for use in closures, so that it's not necessary for the reader > to check a function's outer scope to see whether closure is taking > place. EIBTI. FWIW, I think this is nice. Since it uses the same dot-notation that normal attribute access uses, it's clearly accessing the attribute of *some* namespace. It's not perfectly intuitive that the accessed namespace is the enclosing one, but I do think it's at least more intuitive than the suggested := operator, and at least as intuitive as a ``global``-like declaration. And, as you mention, it's consistent with the relative import feature. I'm a little worried that this proposal will get lost amid the mass of other suggestions being thrown out right now. Any chance of turning this into a PEP? Steve -- Grammar am for people who can't think for myself. 
--- Bucky Katt, Get Fuzzy From python at rcn.com Wed Feb 22 20:43:51 2006 From: python at rcn.com (Raymond Hettinger) Date: Wed, 22 Feb 2006 14:43:51 -0500 Subject: [Python-Dev] operator.is*Type References: <43FC4BEF.1060502@voidspace.org.uk> <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> <43FC9AF6.3040506@colorstudy.com> Message-ID: <000801c637e8$547d8390$6a01a8c0@RaymondLaptop1> [Ian Bicking] > They seem terribly pointless to me. FWIW, here is the script that I had used while updating and improving the two functions (can't remember whether it was for Py2.3 or Py2.4). It lists comparative results for many different types of inputs. Since perfection was not possible, the goal was to have no false negatives and mostly accurate positives. IMO, they do a pretty good job and are able to access information not otherwise visible to pure Python code. With respect to user defined instances, I don't care that they can't draw a distinction where none exists in the first place -- at some point you have to either fall back on duck-typing or be in control of what kind of arguments you submit to your functions. Practicality beats purity -- especially when a pure solution doesn't exist (i.e. given a user defined class that defines just __getitem__, both mapping and sequence behavior are possibilities).
---- Analysis Script ----

from collections import deque
from UserList import UserList
from UserDict import UserDict
from operator import *

types = (set, int, float, complex, long, bool, str, unicode,
         list, UserList, tuple, deque, )

for t in types:
    print isMappingType(t()), isSequenceType(t()), repr(t()), repr(t)

class c:
    def __repr__(self):
        return 'Instance w/o getitem'

class cn(object):
    def __repr__(self):
        return 'NewStyle Instance w/o getitem'

class cg:
    def __repr__(self):
        return 'Instance w getitem'
    def __getitem__(self):
        return 10

class cng(object):
    def __repr__(self):
        return 'NewStyle Instance w getitem'
    def __getitem__(self):
        return 10

def f():
    return 1

def g():
    yield 1

for i in (None, NotImplemented, g(), c(), cn()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)
for i in (cg(), cng(), dict(), UserDict()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)

---- Output ----

False False set([])
False False 0
False False 0.0
False False 0j
False False 0L
False False False
False True ''
False True u''
False True []
True True []
False True ()
False True deque([])
False False None
False False NotImplemented
False False
False False Instance w/o getitem
False False NewStyle Instance w/o getitem
True True Instance w getitem
True True NewStyle Instance w getitem
True False {}
True True {}

From edcjones at comcast.net Wed Feb 22 21:27:56 2006 From: edcjones at comcast.net (Edward C. Jones) Date: Wed, 22 Feb 2006 15:27:56 -0500 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: References: Message-ID: <43FCC94C.6000405@comcast.net> Guido van Rossum wrote: > I think the pattern hasn't been commonly known; people have been > struggling with setdefault() all these years.
I use setdefault _only_ to speed up the following code pattern:

if akey not in somedict:
    somedict[akey] = list()
somedict[akey].append(avalue)

These lines of simple Python are much easier to read and write than

somedict.setdefault(akey, list()).append(avalue)

From rrr at ronadam.com Wed Feb 22 21:28:52 2006 From: rrr at ronadam.com (Ron Adam) Date: Wed, 22 Feb 2006 14:28:52 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp><43F69411.1020807@canterbury.ac.nz><20060217202813.5FA2.JCARLSON@uci.edu><43f6abab.1091371449@news.gmane.org><87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp><43F8C391.2070405@v.loewis.de><87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp><43FA3820.3070607@v.loewis.de><87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp><43FAE40B.8090406@canterbury.ac.nz><87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> Message-ID: <43FCC984.8020207@ronadam.com> Terry Reedy wrote: > "Greg Ewing" wrote in message > >> Which is why I think that only *unicode* codings should be >> available through the .encode and .decode interface. Or >> alternatively there should be something more explicit like >> .unicode_encode and .unicode_decode that is thus restricted. > > I prefer the shorter names and using recode, for instance, for bytes to > bytes. I prefer constructors with an explicit encode argument, and using a recode() method for 'like to like' coding. Then the whole encode/decode confusion goes away.
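For comparison, the grouping idiom discussed above can be written three ways; the third is a sketch using the defaultdict being debated in this thread, as it later shipped in Python 2.5 as collections.defaultdict:

```python
from collections import defaultdict

pairs = [('a', 1), ('b', 2), ('a', 3)]

# The explicit membership test (the pattern preferred above)
d1 = {}
for akey, avalue in pairs:
    if akey not in d1:
        d1[akey] = list()
    d1[akey].append(avalue)

# The setdefault one-liner (builds a fresh empty list on every iteration)
d2 = {}
for akey, avalue in pairs:
    d2.setdefault(akey, list()).append(avalue)

# defaultdict: the list() factory runs only on actual lookup misses
d3 = defaultdict(list)
for akey, avalue in pairs:
    d3[akey].append(avalue)

assert d1 == d2 == dict(d3) == {'a': [1, 3], 'b': [2]}
```

All three produce the same dictionary; they differ only in how the missing-key case is spelled and how often the default value is constructed.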
From fuzzyman at voidspace.org.uk Wed Feb 22 22:00:57 2006 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 22 Feb 2006 21:00:57 +0000 Subject: [Python-Dev] operator.is*Type In-Reply-To: <000801c637e8$547d8390$6a01a8c0@RaymondLaptop1> References: <43FC4BEF.1060502@voidspace.org.uk> <002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1> <43FC9AF6.3040506@colorstudy.com> <000801c637e8$547d8390$6a01a8c0@RaymondLaptop1> Message-ID: <43FCD109.7020803@voidspace.org.uk> Raymond Hettinger wrote: > [Ian Bicking] >> They seem terribly pointless to me. > > FWIW, here is the script that I had used while updating and improving > the two functions (can't remember whether it was for Py2.3 or Py2.4). > It lists comparative results for many different types of inputs. > Since perfection was not possible, the goal was to have no false > negatives and mostly accurate positives. IMO, they do a pretty good > job and are able to access information not otherwise visible to > pure Python code. With respect to user defined instances, I don't > care that they can't draw a distinction where none exists in the first > place -- at some point you have to either fall back on duck-typing or > be in control of what kind of arguments you submit to your functions. > Practicality beats purity -- especially when a pure solution doesn't > exist (i.e. given a user defined class that defines just __getitem__, > either mapping or sequence behavior is a possibility). > But given :

True True Instance w getitem
True True NewStyle Instance w getitem
True True []
True True {}

(Last one is UserDict) I can't conceive of circumstances where this is useful without duck typing *as well*.
The tests seem roughly analogous to :

def isMappingType(obj):
    return isinstance(obj, dict) or hasattr(obj, '__getitem__')

def isSequenceType(obj):
    return isinstance(obj, (basestring, list, tuple, collections.deque)) or hasattr(obj, '__getitem__')

If you want to allow sequence access you could either just use the isinstance or you *have* to trap an exception in the case of a mapping object being passed in. Redefining (effectively) as :

def isMappingType(obj):
    return isinstance(obj, dict) or (hasattr(obj, '__getitem__') and hasattr(obj, 'keys'))

def isSequenceType(obj):
    return isinstance(obj, (basestring, list, tuple, collections.deque)) or (hasattr(obj, '__getitem__') and not hasattr(obj, 'keys'))

Makes the test useful where you want to know you can safely treat an object as a mapping (or sequence) *and* where you want to tell the difference. The only code that would break is use of mapping objects that don't define ``keys`` and sequences that do. I imagine these must be very rare and *would* be interested in seeing real code that does break. Especially if that code cannot be trivially rewritten to use the first example.
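A rough, runnable adaptation of the refined checks above for present-day Python (the function names and the exact type tuple are illustrative; basestring from the 2.x original is replaced with str, and this is a heuristic convention, not anything the stdlib guarantees):

```python
import collections

def is_mapping_type(obj):
    # dict, or a duck-typed mapping: subscriptable and has a keys() method
    return isinstance(obj, dict) or (
        hasattr(obj, '__getitem__') and hasattr(obj, 'keys'))

def is_sequence_type(obj):
    # known sequence types, or subscriptable without a keys() method
    return isinstance(obj, (str, list, tuple, collections.deque)) or (
        hasattr(obj, '__getitem__') and not hasattr(obj, 'keys'))

class DuckMapping:
    def __getitem__(self, key):
        return 1
    def keys(self):
        return []

class DuckSequence:
    def __getitem__(self, index):
        return 1

assert is_mapping_type({}) and not is_sequence_type({})
assert is_sequence_type([]) and not is_mapping_type([])
assert is_mapping_type(DuckMapping()) and not is_sequence_type(DuckMapping())
assert is_sequence_type(DuckSequence()) and not is_mapping_type(DuckSequence())
```

Under this convention a __getitem__-only instance is classified as a sequence and a keys()-bearing one as a mapping, which is exactly the extra distinction being argued over in this thread.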
All the best,

Michael Foord

> ---- Analysis Script ----
>
> from collections import deque
> from UserList import UserList
> from UserDict import UserDict
> from operator import *
> types = (set,
>          int, float, complex, long, bool,
>          str, unicode,
>          list, UserList, tuple, deque,
>          )
>
> for t in types:
>     print isMappingType(t()), isSequenceType(t()), repr(t()), repr(t)
>
> class c:
>     def __repr__(self):
>         return 'Instance w/o getitem'
>
> class cn(object):
>     def __repr__(self):
>         return 'NewStyle Instance w/o getitem'
>
> class cg:
>     def __repr__(self):
>         return 'Instance w getitem'
>     def __getitem__(self):
>         return 10
>
> class cng(object):
>     def __repr__(self):
>         return 'NewStyle Instance w getitem'
>     def __getitem__(self):
>         return 10
>
> def f():
>     return 1
>
> def g():
>     yield 1
>
> for i in (None, NotImplemented, g(), c(), cn()):
>     print isMappingType(i), isSequenceType(i), repr(i), type(i)
>
> for i in (cg(), cng(), dict(), UserDict()):
>     print isMappingType(i), isSequenceType(i), repr(i), type(i)
>
> ---- Output ----
>
> False False set([])
> False False 0
> False False 0.0
> False False 0j
> False False 0L
> False False False
> False True ''
> False True u''
> False True []
> True True []
> False True ()
> False True deque([])
> False False None
> False False NotImplemented
> False False
> False False Instance w/o getitem
> False False NewStyle Instance w/o getitem
> True True Instance w getitem
> True True NewStyle Instance w getitem
> True False {}
> True True {}

From mcherm at mcherm.com Wed Feb 22 22:13:28 2006 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 22 Feb 2006 13:13:28 -0800 Subject: [Python-Dev] defaultdict and on_missing() Message-ID: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> A minor related point about on_missing(): Haven't we learned from regrets over the .next() method of iterators that all "magically" invoked methods should be named using the __xxx__ pattern?
Shouldn't it be named __on_missing__() instead? -- Michael Chermside From python at rcn.com Wed Feb 22 22:18:58 2006 From: python at rcn.com (Raymond Hettinger) Date: Wed, 22 Feb 2006 16:18:58 -0500 Subject: [Python-Dev] operator.is*Type References: <43FC4BEF.1060502@voidspace.org.uk><002101c637a7$81d7afa0$6a01a8c0@RaymondLaptop1><43FC9AF6.3040506@colorstudy.com><000801c637e8$547d8390$6a01a8c0@RaymondLaptop1> <43FCD109.7020803@voidspace.org.uk> Message-ID: <004801c637f5$9d79acb0$6a01a8c0@RaymondLaptop1> > But given :
>
> True True Instance w getitem
> True True NewStyle Instance w getitem
> True True []
> True True {}
>
> (Last one is UserDict)
>
> I can't conceive of circumstances where this is useful without duck
> typing *as well*.

Yawn. Give it up. For user defined instances, these functions can only discriminate between the presence or absence of __getitem__. If you're trying to distinguish between sequences and mappings for instances, you're on your own with duck-typing. Since there is no mandatory mapping or sequence API, the operator module functions cannot add more checks without getting some false negatives (your original example is a case in point). Use the functions as-is and add your own isinstance checks for your own personal definition of what makes a mapping a mapping and what makes a sequence a sequence.
Or better yet, stop designing APIs that require you to differentiate things that aren't really different ;-) Raymond From pedronis at strakt.com Wed Feb 22 22:27:51 2006 From: pedronis at strakt.com (Samuele Pedroni) Date: Wed, 22 Feb 2006 22:27:51 +0100 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> Message-ID: <43FCD757.7030006@strakt.com> Almann T. Goo wrote: >>As far as I remember, Guido wasn't particularly opposed >>to the idea, but the discussion fizzled out after having >>failed to reach a consensus on an obviously right way >>to go about it. > > > My apologies for bringing this debated topic again to the > front-lines--that said, I think there has been good, constructive > things said again and sometimes it doesn't hurt to kick up an old > topic. After poring through some of the list archive threads and > reading through this thread, it seems clear to me that the community > doesn't seem all that keen on fixing this issue--which was my goal to > ferret out. > > For me this is one of those things where the Pythonic thing to do is > not so clear--and that mysterious, enigmatic definition of what it > means to be Pythonic can be quite individual so I definitely don't > want to waste my time arguing what that means. > > The most compelling argument for not doing anything about it is that > the use cases are probably not that many--that in itself makes me less > apt to push much harder--especially since my pragmatic side agrees > with a lot of what has been said to this regard. > > IMO, having properly nested scopes in Python in a sense made having > closures a natural idiom to the language and part of its "user > interface."
By not allowing the name re-binding it almost seems like > that "user interface" has a rough edge that is almost too easy to get > cut on. This inelegance seems very un-Pythonic to me. > If you are looking for rough edges about nested scopes in Python this is probably worse:

>>> x = []
>>> for i in range(10):
...     x.append(lambda : i)
...
>>> [y() for y in x]
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

although experienced people can live with it. The fact is that when nested scopes were imported from the likes of Scheme, it was not considered that in Scheme, for example, looping constructs introduce new scopes. So this works more as expected there. There were long threads about this at some point too. Idioms and features mostly never port straightforwardly from language to language. For example Python has nothing with the explicit context introduction and grouping of a Scheme 'let', so it is arguable that nested scope code, especially with rebindings, would be less clear and readable than in Scheme (tastes in parentheses kept aside). > Anyhow, good discussion. > > Cheers, > Almann > > -- > Almann T. Goo > almann.goo at gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com From brett at python.org Wed Feb 22 22:22:19 2006 From: brett at python.org (Brett Cannon) Date: Wed, 22 Feb 2006 13:22:19 -0800 Subject: [Python-Dev] PEP 358 (bytes type) comments Message-ID: First off, thanks to Neil for writing this all down. The whole thread of discussion on the bytes type was rather long and thus hard to follow. Nice to finally have it written down in a PEP. Anyway, a few comments on the PEP. One, should the hex() method instead be an attribute, implemented as a property? Seems like static data that is entirely based on the value of the bytes object and thus is not properly represented by a method.
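A minimal sketch of that suggestion on today's bytes type (the Bytes subclass and its property here are purely illustrative; the real bytes type later gained a hex() *method*, in Python 3.5, rather than a property):

```python
import binascii

class Bytes(bytes):
    # Illustrative only: expose the hex form as a computed, read-only
    # attribute, since it depends solely on the object's value.
    @property
    def hex(self):
        return binascii.hexlify(self).decode('ascii')

b = Bytes(b'\x01\x02\xff')
assert b.hex == '0102ff'   # plain attribute access, no call
```

The property reads naturally for derived, immutable data, at the cost of hiding the (small) cost of computing it on every access.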
Next, why are the __*slice__ methods to be defined? Docs say they are deprecated. And for the open-ended questions, I don't think sort() is needed. Lastly, maybe I am just dense, but it took me a second to realize that it will most likely return the ASCII string for __str__() for use in something like socket.send(), but it isn't explicitly stated anywhere. There is a chance someone might think that __str__ will somehow return the sequence of integers as a string. -Brett From pedronis at strakt.com Wed Feb 22 22:32:01 2006 From: pedronis at strakt.com (Samuele Pedroni) Date: Wed, 22 Feb 2006 22:32:01 +0100 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FBA9F3.7020003@canterbury.ac.nz> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43FBA9F3.7020003@canterbury.ac.nz> Message-ID: <43FCD851.2070302@strakt.com> Greg Ewing wrote: > Jeremy Hylton wrote: > > >>The names of naming statements are quite hard to get right, I fear. > > > My vote goes for 'outer'. > > And if this gets accepted, remove 'global' in 3.0. > In 3.0 we could remove 'global' even without 'outer', and make module global scopes read-only, not rebindable after the top-level code has run (i.e. more like function body scopes). The only free-for-all namespaces would be class and instance ones. I can think of some gains from this. <0.3 wink> From nas at arctrix.com Wed Feb 22 22:28:44 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 22 Feb 2006 14:28:44 -0700 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: References: <20060216025515.GA474@mems-exchange.org> Message-ID: <20060222212844.GA15221@mems-exchange.org> On Thu, Feb 16, 2006 at 12:47:22PM -0800, Guido van Rossum wrote: > BTW, for folks who want to experiment, it's quite simple to create a > working bytes implementation by inheriting from array.array. Here's a > quick draft (which only takes str instance arguments): Here's a more complete prototype.
Also, I checked in the PEP as #358 after making changes suggested by Guido.

Neil

import sys
from array import array
import re
import binascii

class bytes(array):

    __slots__ = []

    def __new__(cls, initialiser=None, encoding=None):
        b = array.__new__(cls, "B")
        if isinstance(initialiser, basestring):
            if isinstance(initialiser, unicode):
                if encoding is None:
                    encoding = sys.getdefaultencoding()
                initialiser = initialiser.encode(encoding)
            initialiser = [ord(c) for c in initialiser]
        elif encoding is not None:
            raise TypeError("explicit encoding invalid for non-string "
                            "initialiser")
        b.extend(initialiser)
        return b

    @classmethod
    def fromhex(self, data):
        data = re.sub(r'\s+', '', data)
        return bytes(binascii.unhexlify(data))

    def __str__(self):
        return self.tostring()

    def __repr__(self):
        return "bytes(%r)" % self.tolist()

    def __add__(self, other):
        if isinstance(other, array):
            return bytes(super(bytes, self).__add__(other))
        return NotImplemented

    def __mul__(self, n):
        return bytes(super(bytes, self).__mul__(n))

    __rmul__ = __mul__

    def __getslice__(self, i, j):
        return bytes(super(bytes, self).__getslice__(i, j))

    def hex(self):
        return binascii.hexlify(self.tostring())

    def decode(self, encoding):
        return self.tostring().decode(encoding)

From anthony at interlink.com.au Wed Feb 22 22:50:12 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 23 Feb 2006 08:50:12 +1100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: <20060212105141.GS10226@xs4all.nl> References: <20060212105141.GS10226@xs4all.nl> Message-ID: <200602230850.14524.anthony@interlink.com.au> On Sunday 12 February 2006 21:51, Thomas Wouters wrote: > Well, in the past, features -- even syntax changes -- have gone in > between the last beta and the final release (but reminding Guido > might bring him to tears of regret. ;) Features have also gone into > what would have been 'bugfix releases' if you looked at the > numbering alone (1.5 -> 1.5.1 -> 1.5.2, for instance.)
"The past" > doesn't have a very impressive track record... *cough* Go on. Try slipping a feature into a bugfix release now, see how loudly you can make an Australian swear... See also PEP 006. Do I need to add a "bad language" caveat in it? From bob at redivi.com Wed Feb 22 23:03:29 2006 From: bob at redivi.com (Bob Ippolito) Date: Wed, 22 Feb 2006 14:03:29 -0800 Subject: [Python-Dev] PEP 358 (bytes type) comments In-Reply-To: References: Message-ID: On Feb 22, 2006, at 1:22 PM, Brett Cannon wrote: > First off, thanks to Neil for writing this all down. The whole thread > of discussion on the bytes type was rather long and thus hard to > follow. Nice to finally have it written down in a PEP. > > Anyway, a few comments on the PEP. One, should the hex() method > instead be an attribute, implemented as a property? Seems like static > data that is entirely based on the value of the bytes object and thus > is not properly represented by a method. > > Next, why are the __*slice__ methods to be defined? Docs say they are > deprecated. > > And for the open-ended questions, I don't think sort() is needed. sort would be totally useless for bytes. array.array doesn't have sort either. > Lastly, maybe I am just dense, but it took me a second to realize that > it will most likely return the ASCII string for __str__() for use in > something like socket.send(), but it isn't explicitly stated anywhere. > There is a chance someone might think that __str__ will somehow > return the sequence of integers as a string does exist. That would be a bad idea given that bytes are supposed make the str type go away. It's probably better to make __str__ return __repr__ like it does for most types. If bytes type supports the buffer API (one would hope so), functions like socket.send should do the right thing as-is. 
http://docs.python.org/api/bufferObjects.html -bob From guido at python.org Wed Feb 22 23:19:19 2006 From: guido at python.org (Guido van Rossum) Date: Wed, 22 Feb 2006 17:19:19 -0500 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: <200602230850.14524.anthony@interlink.com.au> References: <20060212105141.GS10226@xs4all.nl> <200602230850.14524.anthony@interlink.com.au> Message-ID: However the definition of "feature" vs. "bugfix" isn't always crystal clear. Some things that went into 2.4 recently felt like small features to me; but others may disagree: - fixing chunk.py to allow chunk size to be > 2GB - supporting Unicode filenames in fileinput.py Are these features or bugfixes? On 2/22/06, Anthony Baxter wrote: > On Sunday 12 February 2006 21:51, Thomas Wouters wrote: > > Well, in the past, features -- even syntax changes -- have gone in > > between the last beta and the final release (but reminding Guido > > might bring him to tears of regret. ;) Features have also gone into > > what would have been 'bugfix releases' if you looked at the > > numbering alone (1.5 -> 1.5.1 -> 1.5.2, for instance.) "The past" > > doesn't have a very impressive track record... > > *cough* Go on. Try slipping a feature into a bugfix release now, see > how loudly you can make an Australian swear... > > See also PEP 006. Do I need to add a "bad language" caveat in it? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Wed Feb 22 23:47:06 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Thu, 23 Feb 2006 09:47:06 +1100 Subject: [Python-Dev] [ python-Feature Requests-1436243 ] Extend pre-allocated integers to cover [0, 255] Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB97D@au3010avexu1.global.avaya.com> SourceForge.net wrote: > Status: Closed > Resolution: Accepted And here I was, thinking I might actually work on this and submit a patch on the weekend ... 
Tim Delaney From anthony at interlink.com.au Wed Feb 22 23:50:29 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 23 Feb 2006 09:50:29 +1100 Subject: [Python-Dev] release plan for 2.5 ? In-Reply-To: References: <200602230850.14524.anthony@interlink.com.au> Message-ID: <200602230950.31914.anthony@interlink.com.au> On Thursday 23 February 2006 09:19, Guido van Rossum wrote: > However the definition of "feature" vs. "bugfix" isn't always > crystal clear. > > Some things that went into 2.4 recently felt like small features to > me; but others may disagree: > > - fixing chunk.py to allow chunk size to be > 2GB > - supporting Unicode filenames in fileinput.py > > Are these features or bugfixes? Sure, the line isn't so clear sometimes. I consider both of these bugfixes, but others could disagree. True/False, on the other hand, I don't think anyone disagrees about. This stuff is always open for discussion, of course. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From tdelaney at avaya.com Thu Feb 23 00:03:05 2006 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Thu, 23 Feb 2006 10:03:05 +1100 Subject: [Python-Dev] operator.is*Type Message-ID: <2773CAC687FD5F4689F526998C7E4E5F074350@au3010avexu1.global.avaya.com> Raymond Hettinger wrote: > Your example simply highlights the consequences of one of Python's > most basic, original design choices (using getitem for both sequences > and mappings). That choice is now so fundamental to the language > that it cannot possibly change. Hmm - just a thought ... Since we're adding the __index__ magic method, why not have a __getindexed__ method for sequences. Then the semantics of indexing operations would be something like:

if hasattr(obj, '__getindexed__'):
    return obj.__getindexed__(val.__index__())
else:
    return obj.__getitem__(val)

Similarly __setindexed__ and __delindexed__. This would allow distinguishing between sequences and mappings in a fairly backwards-compatible way.
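Nothing implements __getindexed__, since it is only a proposal; a toy, purely hypothetical sketch of the dispatch described above (every name here is illustrative, and operator.index stands in for the __index__ coercion) might look like:

```python
import operator

def getindexed(obj, key):
    # Hypothetical dispatch from the proposal: types that opt in via
    # __getindexed__ get their key forced through __index__; all
    # other types fall back to plain __getitem__.
    meth = getattr(type(obj), '__getindexed__', None)
    if meth is not None:
        return meth(obj, operator.index(key))
    return obj[key]

class IndexedList(list):
    # Toy sequence opting in to the hypothetical protocol by
    # delegating to the inherited __getitem__.
    def __getindexed__(self, index):
        return list.__getitem__(self, index)

seq = IndexedList(['a', 'b', 'c'])
assert getindexed(seq, 1) == 'b'          # key coerced via __index__
assert getindexed({'k': 42}, 'k') == 42   # mappings keep __getitem__
```

Note how the sequence path rejects non-integer keys outright (operator.index raises TypeError for them), which is exactly the "only indexes can be used for sequences" enforcement mentioned below.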
It would also enforce that only indexes can be used for sequences. The backwards-incompatibility comes in when you have a type that implements __getindexed__, and a subclass that implements __getitem__ e.g. if `list` implemented __getindexed__ then any `list` subclass that overrode __getitem__ would fail. However, I think we could make it 100% backwards-compatible for the builtin sequence types if they just had __getindexed__ delegate to __getitem__. Effectively:

class list (object):
    def __getindexed__(self, index):
        return self.__getitem__(index)

Tim Delaney From anthony at interlink.com.au Thu Feb 23 00:36:12 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 23 Feb 2006 10:36:12 +1100 Subject: [Python-Dev] buildbot, and test failures Message-ID: <200602231036.24936.anthony@interlink.com.au> It took 2 hours, but I caught up on Python-dev email. Hoorah. So, couple of things - the trunk has test failures for me, right now.

test test_email failed -- Traceback (most recent call last):
  File "/home/anthony/src/py/pytrunk/python/Lib/email/test/test_email.py", line 2111, in test_parsedate_acceptable_to_time_functions
    eq(time.localtime(t)[:6], timetup[:6])
AssertionError: (2003, 2, 5, 14, 47, 26) != (2003, 2, 5, 13, 47, 26)

Right now, Australia's in daylight savings, I suspect that's the problem here. I also see intermittent failures from test_socketserver: test_socketserver test test_socketserver crashed -- socket.error: (111, 'Connection refused') is the only error message. When it fails, regrtest fails to exit - it just sits there after printing out the summary. This suggests that there's a threaded server not getting cleaned up correctly. test_socketserver could probably do with a rewrite. Who's the person who hands out buildbot username/password pairs? I have an Ubuntu x86 box here that can become one (I think the only linux, currently, is Gentoo...) Anthony -- Anthony Baxter It's never too late to have a happy childhood.
From spam4bsimons at yahoo.ca Thu Feb 23 00:45:18 2006 From: spam4bsimons at yahoo.ca (Brendan Simons) Date: Wed, 22 Feb 2006 18:45:18 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: Message-ID: <57473635-4E4C-4E00-9376-AC851230BF69@yahoo.ca> On 21-Feb-06, at 11:21 AM, Almann T. Goo wrote:

>> Why not just use a class?
>>
>> def incgen(start=0, inc=1) :
>>     class incrementer(object):
>>         a = start - inc
>>         def __call__(self):
>>             self.a += inc
>>             return self.a
>>     return incrementer()
>>
>> a = incgen(7, 5)
>> for n in range(10):
>>     print a(),

> Because I think that this is a workaround for a concept that the > language doesn't support elegantly with its lexically nested scopes. > > IMO, you are emulating name rebinding in a closure by creating an > object to encapsulate the name you want to rebind--you don't need this > workaround if you only need to access free variables in an enclosing > scope. I provided a "lighter" example that didn't need a callable > object but could use any mutable such as a list. > > This kind of workaround is needed as soon as you want to re-bind a > parent scope's name, except in the case when the parent scope is the > global scope (since there is the "global" keyword to handle this). > It's this dichotomy that concerns me, since it seems to be against the > elegance of Python--at least in my opinion. > > It seems artificially limiting that enclosing scope name rebinds are > not provided for by the language especially since the behavior with > the global scope is not so. In a nutshell I am proposing a solution > to make nested lexical scopes to be orthogonal with the global scope > and removing a "wart," as Jeremy put it, in the language. > > -Almann > > -- > Almann T. Goo > almann.goo at gmail.com If I may be so bold, couldn't this be addressed by introducing a "rebinding" operator?
So the ' = ' operator would continue to create a new name in the current scope, and the (say) ' := ' operator would force an existing name to rebind. The two operators would highlight the special way Python handles variable / name assignment, which many newbies miss. (from someone who was surprised by this quirk of Python before: http://www.thescripts.com/forum/thread43418.html) -Brendan -- Brendan Simons -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/94282906/attachment.html From tim.peters at gmail.com Thu Feb 23 00:59:28 2006 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 22 Feb 2006 18:59:28 -0500 Subject: [Python-Dev] buildbot vs. Windows In-Reply-To: References: <43FA487B.4070909@v.loewis.de> <43FA52C8.70201@voidspace.org.uk> <20060220154659.5FF5.JCARLSON@uci.edu> <43FA60A9.4030209@v.loewis.de> <20060221011148.GA20714@panix.com> <1f7befae0602201924i5d2e6568j7a55dbfd7eefff@mail.gmail.com> <43FB8763.7070802@v.loewis.de> <1f7befae0602211704p7b9cbc19jde245e9e1d6bea02@mail.gmail.com> Message-ID: <1f7befae0602221559l2360ea8bw7eb80f4086b41b23@mail.gmail.com> [Neal Norwitz] > ... > I also think I know how to do the "double builds" (one release and one > debug). But it's too late for me to change it tonight without > screwing it up. I'm not mad :-). The debug build is more fruitful than the release build for finding problems, so doing two debug-build runs is an improvement (keeping in mind that some bugs only show up in release builds, though -- for example, subtly incorrect C code that works differently depending on whether compiler optimization is in effect). > The good/bad news after this change is: > > http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/145/step-test/0 > > A seg fault on Mac OS when running with -r. :-( Yay! That's certainly good/bad news. Since I always run with -r, I've had the fun of tracking most of these down.
Sometimes it's very hard, sometimes not. regrtest's -f option is usually needed, to force running the tests in exactly the same order, then commenting test names out in binary-search fashion to get a minimal subset. Alas, half the time the cause for a -r segfault turns out to be an error in refcounting or in setting up gc'able containers, and has nothing in particular to do with the specific tests being run. Those are the "very hard" ones ;-) Setting the gc threshold to 1 (do a full collection on every allocation) can sometimes provoke such problems easily. From greg.ewing at canterbury.ac.nz Thu Feb 23 01:16:11 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 13:16:11 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <20060221105309.601F.JCARLSON@uci.edu> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> Message-ID: <43FCFECB.9010101@canterbury.ac.nz> Josiah Carlson wrote: > However, I believe global was and is necessary for the > same reasons for globals in any other language. Oddly, in Python, 'global' isn't actually necessary, since the module can always import itself and use attribute access. Clearly, though, Guido must have thought at the time that it was worth providing an alternative way. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 23 01:30:32 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 13:30:32 +1300 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> <43FBB1D1.5060808@canterbury.ac.nz> <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> Message-ID: <43FD0228.1070701@canterbury.ac.nz> Fredrik Lundh wrote: > fwiw, the first google hit for "autodict" appears to be part of someone's > link farm > > At this website we have assistance with autodict. In addition to > information for autodict we also have the best web sites concerning > dictionary, non profit and new york. Hmmm, looks like some sort of bot that takes the words in your search and stuffs them into its response. I wonder if they realise how silly the results end up sounding? I've seen these sorts of things before, but I haven't quite figured out yet how they manage to get into Google's database if they're auto-generated. Anyone have any clues what goes on? -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From almann.goo at gmail.com Thu Feb 23 02:12:29 2006 From: almann.goo at gmail.com (Almann T. 
Goo) Date: Wed, 22 Feb 2006 20:12:29 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FCFECB.9010101@canterbury.ac.nz> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <43FCFECB.9010101@canterbury.ac.nz> Message-ID: <7e9b97090602221712k29016577te0d87ebb504949d5@mail.gmail.com> > Oddly, in Python, 'global' isn't actually necessary, > since the module can always import itself and use > attribute access. > > Clearly, though, Guido must have thought at the time > that it was worth providing an alternative way. I believe that use cases for rebinding globals (module attributes) from within a module are more numerous than rebinding in an enclosing lexical scope (although rebinding a name in the global scope from a local scope is really just a specific case of that). I would think this was probably a motivator for the 'global' key word to avoid clumsier workarounds. Since there were no nested lexical scopes back then, there was no need to have a construct for arbitrary enclosing scopes. -Almann -- Almann T. Goo almann.goo at gmail.com From greg.ewing at canterbury.ac.nz Thu Feb 23 03:28:52 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 15:28:52 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> Message-ID: <43FD1DE4.4040503@canterbury.ac.nz> Terry Reedy wrote: > "Greg Ewing" wrote in message > >>Efficiency is an implementation concern. 
> > It is also a user concern, especially if inefficiency overruns memory > limits. Sure, but what I mean is that it's better to find what's conceptually right and then look for an efficient way of implementing it, rather than letting the implementation drive the design. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From spam4bsimons at yahoo.ca Thu Feb 23 03:46:09 2006 From: spam4bsimons at yahoo.ca (Brendan Simons) Date: Wed, 22 Feb 2006 21:46:09 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: Message-ID: <9E97E3D1-12DB-44EE-8155-F6E8933E8C88@yahoo.ca> On 22-Feb-06, at 9:28 PM, python-dev-request at python.org wrote: > On 21-Feb-06, at 11:21 AM, Almann T. Goo" > wrote: > >>> Why not just use a class? >>> >>> >>> def incgen(start=0, inc=1) : >>> class incrementer(object): >>> a = start - inc >>> def __call__(self): >>> self.a += inc >>> return self.a >>> return incrementer() >>> >>> a = incgen(7, 5) >>> for n in range(10): >>> print a(), >> >> Because I think that this is a workaround for a concept that the >> language doesn't support elegantly with its lexically nested scopes. >> >> IMO, you are emulating name rebinding in a closure by creating an >> object to encapsulate the name you want to rebind--you don't need >> this >> workaround if you only need to access free variables in an enclosing >> scope. I provided a "lighter" example that didn't need a callable >> object but could use any mutable such as a list. >> >> This kind of workaround is needed as soon as you want to re-bind a >> parent scope's name, except in the case when the parent scope is the >> global scope (since there is the "global" keyword to handle this). 
>> It's this dichotomy that concerns me, since it seems to be against >> the >> elegance of Python--at least in my opinion. >> >> It seems artificially limiting that enclosing scope name rebinds are >> not provided for by the language especially since the behavior with >> the global scope is not so. In a nutshell I am proposing a solution >> to make nested lexical scopes to be orthogonal with the global scope >> and removing a "wart," as Jeremy put it, in the language. >> >> -Almann >> >> -- >> Almann T. Goo >> almann.goo at gmail.com > > If I may be so bold, couldn't this be addressed by introducing a > "rebinding" operator? So the ' = ' operator would continue to > create a new name in the current scope, and the (say) ' := ' > operator would look for an existing name to rebind. The two operators > would highlight the special way Python handles variable / name > assignment, which many newbies miss. > > (from someone who was surprised by this quirk of Python before: > http://www.thescripts.com/forum/thread43418.html) > > -Brendan > -- > Brendan Simons Sorry, this got hung up in my email outbox. I see the thread has touched on this idea in the meantime. So, yeah. Go team. Brendan -- Brendan Simons -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/16807fe6/attachment.html From greg.ewing at canterbury.ac.nz Thu Feb 23 03:49:42 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 15:49:42 +1300 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> Message-ID: <43FD22C6.70108@canterbury.ac.nz> Steven Bethard wrote: > And, as you mention, it's consistent > with the relative import feature.
Only rather vaguely -- it's really somewhat different. With imports, .foo is an abbreviation for myself.foo, where myself is the absolute name for the current module, and you could replace all instances of .foo with that. But in the suggested scheme, .foo wouldn't have any such interpretation -- there would be no other way of spelling it. Also, with imports, the dot refers to a single well-defined point in the module-name hierarchy, but here it would imply a search upwards through the scope hierarchy. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 23 03:53:21 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 15:53:21 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FCC984.8020207@ronadam.com> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <43FCC984.8020207@ronadam.com> Message-ID: <43FD23A1.2070304@canterbury.ac.nz> Ron Adam wrote: > While I prefer constructors with an explicit encode argument, and use a > recode() method for 'like to like' coding. Then the whole encode/decode > confusion goes away. I'd be happy with that, too. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.)
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Thu Feb 23 04:01:52 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 22 Feb 2006 22:01:52 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <43FD22C6.70108@canterbury.ac.nz> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> At 03:49 PM 2/23/2006 +1300, Greg Ewing wrote: >Steven Bethard wrote: > > And, as you mention, it's consistent > > with the relative import feature. > >Only rather vaguely -- it's really somewhat different. > >With imports, .foo is an abbreviation for myself.foo, >where myself is the absolute name for the current module, >and you could replace all instances of .foo with that. Actually, "import .foo" is an abbreviation for "import myparent.foo", not "import myparent.myself.foo". From barry at python.org Thu Feb 23 04:29:08 2006 From: barry at python.org (Barry Warsaw) Date: Wed, 22 Feb 2006 22:29:08 -0500 Subject: [Python-Dev] getdefault(), the real replacement for setdefault() Message-ID: <1140665348.14539.114.camel@geddy.wooz.org> Guido's on_missing() proposal is pretty good for what it is, but it is not a replacement for setdefault(). The use cases for a derivable, definition- or instantiation-time framework are different from the call-site-based decision being made with setdefault(). The difference is that in the former case, the class designer or instantiator gets to decide what the default is, and in the latter (i.e. current) case, the user gets to decide.
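A sketch of the two idioms side by side may help here. The getdefault() below is a hypothetical free-function rendering of the proposal (not dict API, and not code from the thread): the key difference is that it takes a factory rather than a ready-made default.

```python
# Today's idiom: the default list is built on every call,
# even when the key is already present and the list is thrown away.
d = {}
d.setdefault('foo', []).append('bar')

# Hypothetical getdefault(): the factory is only called on a miss,
# so no default object is constructed needlessly.
_missing = object()

def getdefault(d, key, factory):
    value = d.get(key, _missing)
    if value is _missing:
        value = d[key] = factory()
    return value

getdefault(d, 'foo', list).append('baz')   # hit: list() is never called
getdefault(d, 'new', list).append('qux')   # miss: list() called once
print(d)  # {'foo': ['bar', 'baz'], 'new': ['qux']}
```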
Going back to first principles, the two biggest problems with today's setdefault() are 1) the default object gets instantiated whether you need it or not, and 2) the idiom is not very readable. To directly address these two problems, I propose a new method called getdefault() with the following signature: def getdefault(self, key, factory) This yields the following idiom: d.getdefault('foo', list).append('bar') Clearly this completely addresses problem #1. The implementation is simple and obvious, and there's no default object instantiated unless the key is missing. I think #2 is addressed nicely too because "getdefault()" shifts the focus to what the method returns rather than the effect of the method on the target dict. Perhaps that's enough to make the chained operation on the returned value feel more natural. "getdefault()" also looks more like "get()" so maybe that helps it be less jarring. This approach also seems to address Raymond's objections because getdefault() isn't "special" the way on_missing() would be. Anyway, I don't think it's an either/or choice with Guido's subclass. Instead I think they are different use cases. I would add getdefault() to the standard dict API, remove (eventually) setdefault(), and add Guido's subclass in a separate module. But I /wouldn't/ clutter the built-in dict's API with on_missing(). -Barry P.S. _missing = object() def getdefault(self, key, factory): value = self.get(key, _missing) if value is _missing: value = self[key] = factory() return value -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20060222/6c570cfa/attachment.pgp From steven.bethard at gmail.com Thu Feb 23 04:58:57 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Wed, 22 Feb 2006 20:58:57 -0700 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> Message-ID: Steven Bethard wrote: > And, as you mention, it's consistent with the relative import feature. Greg Ewing wrote: > With imports, .foo is an abbreviation for myself.foo, > where myself is the absolute name for the current module, > and you could replace all instances of .foo with that. Phillip J. Eby wrote: > Actually, "import .foo" is an abbreviation for "import myparent.foo", not > "import myparent.myself.foo". If we wanted to be fully consistent with the relative import mechanism, we would require as many dots as nested scopes. So: def incrementer(val): def inc(): .val += 1 return .val return inc but also: def incrementer_getter(val): def incrementer(): def inc(): ..val += 1 return ..val return inc return incrementer (Yes, I know the example is silly. It's not meant as a use case, just to demonstrate the usage of dots.) I actually don't care which way it goes here, but if you want to make the semantics as close to the relative import semantics as possible, then this is the way to go. STeVe -- Grammar am for people who can't think for myself. 
--- Bucky Katt, Get Fuzzy From greg.ewing at canterbury.ac.nz Thu Feb 23 05:07:33 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 17:07:33 +1300 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> Message-ID: <43FD3505.5080904@canterbury.ac.nz> Steven Bethard wrote: > Phillip J. Eby wrote: > >>Actually, "import .foo" is an abbreviation for "import myparent.foo", not >>"import myparent.myself.foo". Oops, sorry, you're right. s/myself/myparent/g -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Thu Feb 23 05:13:16 2006 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 23 Feb 2006 05:13:16 +0100 Subject: [Python-Dev] buildbot, and test failures In-Reply-To: <200602231036.24936.anthony@interlink.com.au> References: <200602231036.24936.anthony@interlink.com.au> Message-ID: <43FD365C.10801@v.loewis.de> Anthony Baxter wrote: > Who's the person who hands out buildbot username/password pairs? That's me. > I > have an Ubuntu x86 box here that can become one (I think the only > linux, currently, is Gentoo...) How different are the Linuxes, though? How many of them do we need? 
Regards, Martin From greg.ewing at canterbury.ac.nz Thu Feb 23 05:23:21 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 17:23:21 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <3886473F-A4F8-4B1A-9EEC-A60E9D221D45@fuhm.net> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <3886473F-A4F8-4B1A-9EEC-A60E9D221D45@fuhm.net> Message-ID: <43FD38B9.4040903@canterbury.ac.nz> James Y Knight wrote: > Some MIME sections > might have a base64 Content-Transfer-Encoding, others might be 8bit > encoded, others might be 7bit encoded, others might be quoted-printable > encoded. I stand corrected -- in that situation you would have to encode the characters before combining them with other material. However, this doesn't change my view that the result of base64 encoding by itself is characters, not bytes. To go straight to bytes would require assuming an encoding, and that would make it *harder* to use in cases where you wanted a different encoding, because you'd first have to undo the default encoding and then re-encode it using the one you wanted. It may be reasonable to provide an easy way to go straight from raw bytes to ascii-encoded-base64 bytes, but that should be a different codec. The plain base64 codec should produce text. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.)
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 23 05:25:30 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 17:25:30 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FCD757.7030006@strakt.com> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> <43FCD757.7030006@strakt.com> Message-ID: <43FD393A.80209@canterbury.ac.nz> Samuele Pedroni wrote: > If you are looking for rough edges about nested scopes in Python > this is probably worse: > > >>> x = [] > >>> for i in range(10): > ... x.append(lambda : i) > ... > >>> [y() for y in x] > [9, 9, 9, 9, 9, 9, 9, 9, 9, 9] As an aside, is there any chance that this could be changed in 3.0? I.e. have the for-loop create a new binding for the loop variable on each iteration. I know Guido seems to be attached to the idea of being able to use the value of the loop variable after the loop exits, but I find that to be a dubious practice readability-wise, and I can't remember ever using it. There are other ways of getting the same effect, e.g. assigning it to another variable before breaking out of the loop, or putting the loop in a function and using return. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 23 05:25:34 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 17:25:34 +1300 Subject: [Python-Dev] operator.is*Type In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F074350@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F074350@au3010avexu1.global.avaya.com> Message-ID: <43FD393E.1080703@canterbury.ac.nz> Delaney, Timothy (Tim) wrote: > Since we're adding the __index__ magic method, why not have a > __getindexed__ method for sequences. I don't think this is a good idea, since it would be re-introducing all the confusion that the existence of two C-level indexing slots has led to, this time for user-defined types. > The backwards-incompatibility comes in when you have a type that > implements __getindexed__, and a subclass that implements __getitem__ I don't think this is just a backwards-incompatibility issue. Having a single syntax that can correspond to more than one special method is inherently ambiguous. What do you do if both are defined? Sure you can come up with some rule to handle it, but it's better to avoid the situation in the first place. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 23 05:27:21 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 17:27:21 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <7e9b97090602221712k29016577te0d87ebb504949d5@mail.gmail.com> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <43FCFECB.9010101@canterbury.ac.nz> <7e9b97090602221712k29016577te0d87ebb504949d5@mail.gmail.com> Message-ID: <43FD39A9.6090108@canterbury.ac.nz> Almann T. Goo wrote: > (although rebinding a name in the global scope from a > local scope is really just a specific case of that). That's what rankles people about this, I think -- there doesn't seem to be a good reason for treating the global scope so specially, given that all scopes could be treated uniformly if only there were an 'outer' statement. All the arguments I've seen in favour of the status quo seem like rationalisations after the fact. > Since there were no nested lexical scopes back > then, there was no need to have a construct for arbitrary enclosing > scopes. However, if nested scopes *had* existed back then, I rather suspect we would have had an 'outer' statement from the beginning, or else 'global' would have been given the semantics we are now considering for 'outer'. Of all the suggestions so far, it seems to me that 'outer' is the least radical and most consistent with what we already have. How about we bung it in and see how it goes? We can always yank it out in 3.0 if it turns out to be a horrid mistake and we get swamped with a terabyte of grievously abusive nested scope code. :-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 23 05:27:28 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 17:27:28 +1300 Subject: [Python-Dev] Path PEP: some comments (equality) In-Reply-To: <71b6302c0602200806i4ae3dc60g2010d18e11e8be37@mail.gmail.com> References: <71b6302c0602200806i4ae3dc60g2010d18e11e8be37@mail.gmail.com> Message-ID: <43FD39B0.7000701@canterbury.ac.nz> Mark Mc Mahon wrote: > Should the path class implement an __eq__ method that might do some of > the following things: > - Get the absolute path of both self and the other path I don't think that any path operations should implicitly touch the file system like this. The paths may not represent real files or may be for a system other than the one the program is running on. > - normcase both Not sure about this one either. When dealing with remote file systems, it can be hard to know whether a path will be interpreted as case-sensitive or not. This can be a problem even with local filesystems, e.g. on MacOSX where you can have both HFS (case-insensitive) and Unix (case-sensitive) filesystems mounted. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Feb 23 05:27:33 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 23 Feb 2006 17:27:33 +1300 Subject: [Python-Dev] operator.is*Type In-Reply-To: <43FC4BEF.1060502@voidspace.org.uk> References: <43FC4BEF.1060502@voidspace.org.uk> Message-ID: <43FD39B5.1020008@canterbury.ac.nz> Fuzzyman wrote: > The operator module defines two functions : > > isMappingType > isSequenceType > > These protocols are loosely defined.
Any object which has a > ``__getitem__`` method defined could support either protocol. These functions are actually testing for the presence of two different __getitem__ methods at the C level, one in the "mapping" substructure of the type object, and the other in the "sequence" substructure. This only works for types implemented in C which make use of this distinction. It's not much use for user-defined classes, where the presence of a __getitem__ method causes both of these slots to become populated. Having two different slots for __getitem__ seems to have been an ill-considered feature in the first place and would probably best be removed in 3.0. I wouldn't mind if these two functions went away. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From stephen at xemacs.org Thu Feb 23 07:05:43 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 23 Feb 2006 15:05:43 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FC4C8B.6080300@canterbury.ac.nz> (Greg Ewing's message of "Thu, 23 Feb 2006 00:35:39 +1300") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> Message-ID: <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing writes: Greg> Stephen J. Turnbull wrote: >> Base64 is a (family of) wire protocol(s). 
It's not clear to me >> that it makes sense to say that the alphabets used by "baseNN" >> encodings are composed of characters, Greg> Take a look at [this that the other] Those references use "character" in an ambiguous and ill-defined way. Trying to impose Python unicode object semantics on "vague characters" is a bad idea IMO. Greg> Which seems to make it perfectly clear that the result of Greg> the encoding is to be considered as characters, which are Greg> not necessarily going to be encoded using ascii. Please define "character," and explain how its semantics map to Python's unicode objects. Greg> So base64 on its own is *not* a wire protocol. Only after Greg> encoding the characters do you have a wire protocol. No, base64 isn't a wire protocol. Rather, it's a schema for a family of wire protocols, whose alphabets are heuristically chosen on the assumption that code units which happen to correspond to alpha-numeric code points in a commonly-used coded character set are more likely to pass through a communication channel without corruption. Note that I have _precisely_ defined what I mean. You still have the problem that you haven't defined character, and that is a real problem, see below. >> I don't see any case for "correctness" here, only for >> convenience, Greg> I'm thinking of convenience, too. Keep in mind that in Py3k, Greg> 'unicode' will be called 'str' (or something equally neutral Greg> like 'text') and you will rarely have to deal explicitly Greg> with unicode codings, this being done mostly for you by the Greg> I/O objects. So most of the time, using base64 will be just Greg> as convenient as it is today: base64_encode(my_bytes) and Greg> write the result out somewhere. Convenient, yes, but incorrect. Once you mix those bytes with the Python string type, they become subject to all the usual operations on characters, and there's no way for Python to tell you that you didn't want to do that. 
Ie, Greg> Whereas if the result is text, the right thing happens Greg> automatically whatever the ultimate encoding turns out to Greg> be. You can take the text from your base64 encoding, combine Greg> it with other text from any other source to form a complete Greg> mail message or xml document or whatever, and write it out Greg> through a file object that's using any unicode encoding at Greg> all, and the result will be correct. Only if you do no transformations that will harm the base64-encoding. This is why I say base64 is _not_ based on characters, at least not in the way they are used in Python strings. It doesn't allow any of the usual transformations on characters that might be applied globally to a mail composition buffer, for example. In other words, you don't escape from the programmer having to know what he's doing. EIBTI, and the setup I advocate forces the programmer to explicitly decide where to convert base64 objects to a textual representation. This reminds him that he'd better not touch that text. Greg> The reason I say it's *corrrect* is that if you go straight Greg> from bytes to bytes, you're *assuming* the eventual encoding Greg> is going to be an ascii superset. The programmer is going Greg> to have to know about this assumption and understand all its Greg> consequences and decide whether it's right, and if not, do Greg> something to change it. I'm not assuming any such thing, except in the context of analysis of implementation efficiency. And the programmer needs to know about the semantics of text that is actually a base64-encoded object, and that they are different from string semantics. This is something that programmers are used to dealing with in the case of Python 2.x str and C char[]; the whole point of the unicode type is to allow the programmer to abstract from that when dealing human-readable text. Why confuse the issue. 
>> And in the classroom, you're just going to confuse students by >> telling them that UTF-8 --[Unicode codec]--> Python string is >> decoding but UTF-8 --[base64 codec]--> Python string is >> encoding, when MAL is telling them that --> Python string is >> always decoding. Greg> Which is why I think that only *unicode* codings should be Greg> available through the .encode and .decode interface. Or Greg> alternatively there should be something more explicit like Greg> .unicode_encode and .unicode_decode that is thus restricted. Greg> Also, if most unicode coding is done in the I/O objects, Greg> there will be far less need for programmers to do explicit Greg> unicode coding in the first place, so likely it will become Greg> more of an advanced topic, rather than something you need to Greg> come to grips with on day one of using unicode, like it is Greg> now. So then you bring it right back in with base64. Now they need to know about bytes<->unicode codecs. Of course it all comes down to a matter of judgment. I do find your position attractive, but I just don't think it will work for naive users the way you think it will. It's also possible to make a precise statement of the rationale for my approach, which I have not been able to achieve for the "base64 uses characters" approach, and nobody else has demonstrated one, yet. On the other hand, I don't think either approach imposes substantially more burden on the advanced programmer, nor does either proposal involve a specific restriction on usage (aka "dumbing down the language"). -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
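The bytes-versus-text question being argued above maps directly onto the modern `base64` module, which takes the bytes-to-ASCII-bytes route Greg calls a separate codec. A sketch (illustrative only, not code from the thread):

```python
import base64

raw = b"hello"                      # arbitrary binary payload
encoded = base64.b64encode(raw)     # bytes in, bytes out: b'aGVsbG8='

# Greg's position: the base64 alphabet is characters, so the "pure"
# result would be text, encoded to concrete bytes only at the wire:
as_text = encoded.decode("ascii")   # 'aGVsbG8='

# Stephen's caution: once it is a str, ordinary text operations apply
# and can silently corrupt the encoding -- e.g. case-folding:
corrupted = as_text.lower()         # 'agvsbg8=' -- still valid base64,
                                    # but it no longer decodes to b'hello'

assert base64.b64decode(encoded) == raw
assert base64.b64decode(corrupted) != raw
```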
From nick.bastin at gmail.com Thu Feb 23 07:14:12 2006 From: nick.bastin at gmail.com (Nicholas Bastin) Date: Thu, 23 Feb 2006 01:14:12 -0500 Subject: [Python-Dev] Unifying trace and profile In-Reply-To: <6949EC6CD39F97498A57E0FA55295B2101C312B6@ex9.hostedexchange.local> References: <6949EC6CD39F97498A57E0FA55295B2101C312B6@ex9.hostedexchange.local> Message-ID: <66d0a6e10602222214k2f3f874brac54dd495be5a441@mail.gmail.com> On 2/21/06, Robert Brewer wrote: > 1. Allow trace hooks to receive c_call, c_return, and c_exception events > (like profile does). I can easily make this modification. You can also register the same bound method for trace and profile, which sort of eliminates this problem. > 2. Allow profile hooks to receive line events (like trace does). You really don't want this in the general case. Line events make profiling *really* slow, and they're not that accurate (although many thanks to Armin last year for helping me make them much more accurate). I guess what you require is to be able to selectively turn on events, thus eliminating the notion of 'trace' or 'profile' entirely, but I don't have a good idea of how to implement that at least as efficiently as the current system at the moment - I'm sure it could be done, I just haven't put any thought into it. > 3. Expose new sys.gettrace() and getprofile() methods, so trace and > profile functions that want to play nice can call > sys.settrace/setprofile(None) only if they are the current hook. Not a bad idea, although are you really running into this problem a lot? > 4. Make "the same move" that sys.exitfunc -> atexit made (from a single > function to multiple functions via registration), so multiple > tracers/profilers can play nice together. It seems very unlikely that you'll want to have a trace hook and profile hook installed at the same time, given the extreme unreliability this will introduce into the profiler. > 5.
Allow the core to filter on the "event" arg before hook(frame, event, > arg) is called. What do you mean by this, exactly? How would you use this feature? > 6. Unify tracing and profiling, which would remove a lot of redundant > code in ceval and sysmodule and free up some space in the PyThreadState > struct to boot. The more events you throw in profiling makes it slow, however. Line events, while a nice thing to have, theoretically, would probably make a profiler useless. If you want to create line-by-line timing data, we're going to have to look for a more efficient way (like sampling). > 7. As if the above isn't enough of a dream, it would be nice to have a > bytecode tracer, which didn't bother with the f_lineno logic in > maybe_call_line_trace, but just called the hook on every instruction. I'm working on one, but given how much time I've had to work on my profiler in the last year, I'm not even going to guess when I'll get a real shot at looking at that. My long-term goal is to eliminate profiling and tracing from the core interpreter entirely and implement the functionality in such a way that they don't cost you when not in use (i.e., implement profilers and debuggers which poke into the process from the outside, rather than be supported natively through events). This isn't impossible, but it's difficult because of the large variety of platforms. I have access to most of them, but again, my time is hugely constrained right now for python development work. -- Nick From stephen at xemacs.org Thu Feb 23 07:21:17 2006 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Thu, 23 Feb 2006 15:21:17 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FCC984.8020207@ronadam.com> (Ron Adam's message of "Wed, 22 Feb 2006 14:28:52 -0600") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <43FCC984.8020207@ronadam.com> Message-ID: <8764n6mw82.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Ron" == Ron Adam writes: Ron> Terry Reedy wrote: >> I prefer the shorter names and using recode, for instance, for >> bytes to bytes. Ron> While I prefer constructors with an explicit encode argument, Ron> and use a recode() method for 'like to like' coding. 'Recode' is a great name for the conceptual process, but the methods are directional. Also, in internationalization work, "recode" strongly connotes "encodingA -> original -> encodingB", as in iconv. I do prefer constructors, as it's generally not a good idea to do encoding/decoding in-place for human-readable text, since the codecs are often lossy. Ron> Then the whole encode/decode confusion goes away. Unlikely. Errors like "A string".encode("base64").encode("base64") are all too easy to commit in practice. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From almann.goo at gmail.com Thu Feb 23 07:28:24 2006 From: almann.goo at gmail.com (Almann T. 
Goo) Date: Thu, 23 Feb 2006 01:28:24 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> Message-ID: <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> > If we wanted to be fully consistent with the relative import > mechanism, we would require as many dots as nested scopes. At first I was a bit taken aback by the syntax, but after reading PEP 328 (re: Relative Import) I think I can stomach the syntax a bit better ; ). That said, -1 because I believe it adds more problems than the one it is designed to fix. Part of me can appreciate using the prefixing "dot" as a way to spell "my parent's scope" since it does not add a new keyword and in this regard would appear to be equally as backwards compatible as the ":=" proposal (of which I am not a particularly big fan either, but could probably get used to it). Since the current semantics allow *evaluation* of an enclosing scope's name by an "un-punctuated" name, "var" is a synonym for ".var" (if "var" is bound in the immediately enclosing scope). However for *re-binding* an enclosing scope's name, the "punctuated" name is the only one we can use, so the semantics become more cluttered. This can create a problem that I would say is akin to the "dangling else" problem.

def incrementer_getter(val):
    def incrementer():
        val = 5
        def inc():
            ..val += 1
            return val
        return inc
    return incrementer

Building on an example that Steve wrote to demonstrate the proposed syntax, you can see how a user might inadvertently use the enclosing scope for the return instead of what would presumably be the outermost bound parameter.
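For comparison, the shadowing half of this trap already exists in today's Python without any new syntax; a runnable sketch (the function names just mirror the example above and are hypothetical):

```python
def incrementer_getter(val):
    def incrementer():
        val = 5            # this binding shadows the parameter 'val'
        def inc():
            return val     # closes over incrementer's val (5), not the parameter
        return inc
    return incrementer

inc = incrementer_getter(10)()
# inc() -> 5: the nearer binding silently wins
```

Under the proposed dot syntax, "..val += 1" inside inc would rebind that same nearer 'val', which is exactly the ambiguity being described.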
Now remove the binding in the incrementer function and it works the way the user probably thought. Because of this, I think by adding the "dot" to allow resolving a name in an explicit way hurts the language by adding a new "gotcha" with existing name binding semantics. I would be okay with this if all name access for enclosing scopes (binding and evaluation) required the "dot" syntax (as I believe Steve suggests for Python 3K)--thus keeping the semantics cleaner--but that would be incredibly backwards incompatible for what I would guess is *a lot* of code. This is where the case for the re-bind operator (i.e. ":=") or an "outer" type keyword is stronger--the semantics in the language today are not adversely affected. -Almann -- Almann T. Goo almann.goo at gmail.com From rrr at ronadam.com Thu Feb 23 08:18:42 2006 From: rrr at ronadam.com (Ron Adam) Date: Thu, 23 Feb 2006 01:18:42 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <8764n6mw82.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <43FCC984.8020207@ronadam.com> <8764n6mw82.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43FD61D2.1080200@ronadam.com> Stephen J. Turnbull wrote: >>>>>> "Ron" == Ron Adam writes: > > Ron> Terry Reedy wrote: > > >> I prefer the shorter names and using recode, for instance, for > >> bytes to bytes. > > Ron> While I prefer constructors with an explicit encode argument, > Ron> and use a recode() method for 'like to like' coding. > > 'Recode' is a great name for the conceptual process, but the methods > are directional. 
> Also, in internationalization work, "recode" > strongly connotes "encodingA -> original -> encodingB", as in iconv. We could call it transform or translate if needed. Words are reused constantly in languages, so I don't think it's a sticking point. As long as its meaning is documented well and doesn't change later, I think it would be just fine. If the concept of not having encode and decode as methods works (and has support from others besides me), the name can be decided later. > I do prefer constructors, as it's generally not a good idea to do > encoding/decoding in-place for human-readable text, since the codecs > are often lossy. > > Ron> Then the whole encode/decode confusion goes away. > > Unlikely. Errors like "A string".encode("base64").encode("base64") > are all too easy to commit in practice. Yes... and wouldn't the above just result in a copy, so it wouldn't be an outright error. But I understand that you mean similar cases where consecutive calls would change the bytes. In any case, I was referring to the confusion with the method names and how they are used. This is how I was thinking of it: * Given that the string type gains a __codec__ attribute to handle automatic decoding when needed. (is there a reason not to?)

    str(object[, codec][, error]) -> string coded with codec
    unicode(object[, error]) -> unicode
    bytes(object) -> bytes

* a recode() method is used for transformations that *do not* change the current codec. See any problems with it? (Other than gross misuse, of course, and your dislike of 'recode' as the name.) There may still be a __decode__() method on strings to do the actual decoding, but it wouldn't be part of the public interface. Or it could call a function from the codec to do it:

    return self.codec.decode(self)

The only sticking point I see is that an additional attribute on strings would increase the memory used by the many small strings a typical program creates. That may be why it wasn't done this way to start. (?)
Cheers, Ron From fumanchu at amor.org Thu Feb 23 08:53:58 2006 From: fumanchu at amor.org (Robert Brewer) Date: Wed, 22 Feb 2006 23:53:58 -0800 Subject: [Python-Dev] Unifying trace and profile References: <6949EC6CD39F97498A57E0FA55295B2101C312B6@ex9.hostedexchange.local> <66d0a6e10602222214k2f3f874brac54dd495be5a441@mail.gmail.com> Message-ID: <6949EC6CD39F97498A57E0FA55295B21411613@ex9.hostedexchange.local> I, Robert, wrote: > 1. Allow trace hooks to receive c_call, c_return, > and c_exception events (like profile does). and Nicholas Bastin replied: > I can easily make this modification. You can also > register the same bound method for trace and profile, > which sort of eliminates this problem. Wonderful! It looked easy. :) I worked around this by registering one function for trace, and another for profile. The profile function rejects any non-C event and then calls the trace function. Robert: > 3. Expose new sys.gettrace() and getprofile() methods, > so trace and profile functions that want to play nice > can call sys.settrace/setprofile(None) only if they > are the current hook. Nicholas: > Not a bad idea, although are you really running into > this problem a lot? Well, not "a lot", as I don't expect I'll write very many debuggers in my lifetime ;) But it's important when you have multiple, different debugging systems running at once, either to take advantage of the strengths of each, or to debug a debugger. Bob: > 4. Make "the same move" that sys.exitfunc -> atexit made > (from a single function to multiple functions via > registration), so multiple tracers/profilers can play > nice together. Nick: > It seems very unlikely that you'll want to have a trace > hook and profile hook installed at the same time, given > the extreme unreliability this will introduce into the > profiler. True; this request is partly driven by the differing capabilities of each (only profile can handle C events at the moment). 
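The workaround described above (one function registered for trace, another for profile, with the profile hook rejecting everything but C events and forwarding those to the trace function) can be sketched as follows; the event recording is a simplified stand-in for real debugger logic:

```python
import sys

events = []
C_EVENTS = ('c_call', 'c_return', 'c_exception')

def trace(frame, event, arg):
    events.append(event)   # stand-in for the real trace-hook work
    return trace

def profile(frame, event, arg):
    # Reject everything except the C-level events the trace hook
    # cannot receive on its own, and forward those to the trace function.
    if event in C_EVENTS:
        trace(frame, event, arg)

sys.settrace(trace)
sys.setprofile(profile)
len([1, 2, 3])             # a C call: reaches trace() only via profile()
sys.setprofile(None)
sys.settrace(None)
```

The interpreter suppresses hook re-entry while a hook is running, so the profile function can call the trace function directly without triggering further events.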
Being able to compose debuggers as described above is another reason. Anything else is just the usual (often ignorable) desire for elegance. Bob: > 5. Allow the core to filter on the "event" arg before > hook(frame, event, arg) is called. Nick: > What do you mean by this, exactly? How would you use > this feature? As you hinted, I mean that call_trace would only call the trace function if the current event were in a list of "events I want to monitor"; that list of events could be supplied, for example, with a new sys.settrace(func[, events]) signature, with "events" defaulting to all events for backward compatibility. A single int could be used internally, where each bit represents one of the event types. This would be necessary if trace and profile were unified (see next). If they're not, it's less compelling. Bob: > 6. Unify tracing and profiling, which would remove a > lot of redundant code in ceval and sysmodule and free > up some space in the PyThreadState struct to boot. Nick: > The more events you throw in profiling makes it slow, > however. Line events, while a nice thing to have, > theoretically, would probably make a profiler useless. Sure. If trace functions can receive C events, then there's no need to add that to profiling. I guess I just see profiling as a "stripped down" version of the general trace architecture, and wonder if it couldn't be that in reality as well as appearance; that is, profiling becomes tracing with the 'line' events ignored (before they reach back into your Python trace function and slow everything down). But I also note that the current hotshot uses PyEval_SetTrace "if (self->lineevents)", and PyEval_SetProfile otherwise. Bob: > 7. As if the above isn't enough of a dream, it would be nice > to have a bytecode tracer, which didn't bother with the > f_lineno logic in maybe_call_line_trace, but just called > the hook on every instruction.
Nick: > I'm working on one, but given how much time I've had to > work on my profiler in the last year, I'm not even going > to guess when I'll get a real shot at looking at that. > > My long-term goal is to eliminate profiling and tracing > from the core interpreter entirely and implement the > functionality in such a way that they don't cost you > when not in use (i.e., implement profilers and debuggers > which poke into the process from the outside, rather > than be supported natively through events). This isn't > impossible, but it's difficult because of the large > variety of platforms. I have access to most of them, > but again, my time is hugely constrained right now for > python development work. Ah. Sorry to hear that. :/ But no worries on my end; if only #1 can be done someday, I'll be extremely happy. Find me at PyCon, I'll buy you a drink. :) Robert Brewer System Architect Amor Ministries fumanchu at amor.org -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060222/9f11a3d4/attachment.html From abo at minkirri.apana.org.au Thu Feb 23 10:21:13 2006 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Thu, 23 Feb 2006 09:21:13 +0000 Subject: [Python-Dev] calendar.timegm In-Reply-To: <17403.60665.188360.106157@montanaro.dyndns.org> References: <000d01c636ca$b53411a0$201010ac@prodo.ru> <17403.60665.188360.106157@montanaro.dyndns.org> Message-ID: <1140686473.2435.6.camel@warna.dub.corp.google.com> On Tue, 2006-02-21 at 22:47 -0600, skip at pobox.com wrote: > Sergey> Historical question ;) > > Sergey> Anyone can explain why function timegm is placed into module > Sergey> calendar, not to module time, where it would be near with > Sergey> similar function mktime? > > Historical accident. ;-) It seems time contains a simple wrapper around the equivalent C functions. There is no C equivalent to timegm() (how do they do it?). 
The timegm() function is implemented in python using the datetime module. The name sux BTW. It would be nice if there was a time.mkgmtime(), but it would need to be implemented in C. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From fuzzyman at voidspace.org.uk Thu Feb 23 10:59:29 2006 From: fuzzyman at voidspace.org.uk (Fuzzyman) Date: Thu, 23 Feb 2006 09:59:29 +0000 Subject: [Python-Dev] defaultdict proposal round three In-Reply-To: <43FD0228.1070701@canterbury.ac.nz> References: <43FA4E88.4090507@canterbury.ac.nz> <43FAE298.5040404@canterbury.ac.nz> <71BF3600-9D62-4FE7-8DB6-6A667AB47BD3@gmail.com> <43FBB1D1.5060808@canterbury.ac.nz> <001401c6374a$9a3a2fd0$6a01a8c0@RaymondLaptop1> <43FD0228.1070701@canterbury.ac.nz> Message-ID: <43FD8781.4070103@voidspace.org.uk> Greg Ewing wrote: > Fredrik Lundh wrote: > > >> fwiw, the first google hit for "autodict" appears to be part of someone's >> link farm >> >> At this website we have assistance with autodict. In addition to >> information for autodict we also have the best web sites concerning >> dictionary, non profit and new york. >> > > Hmmm, looks like some sort of bot that takes the words in > your search and stuffs them into its response. I wonder > if they realise how silly the results end up sounding? > > I've seen these sorts of things before, but I haven't > quite figured out yet how they manage to get into Google's > database if they're auto-generated. Anyone have any clues > what goes on? I guess the question is, how would google know *not* to index them ? As soon as they are linked to (or more likely they re-use an expired domain name that is already in the google database) they will be indexed. They may be obviously autogenerated to a human, but it's a lot harder for a computer to tell. It seems that google indexes sites of dubious value - but gives them a low pagerank. This means they do appear in results, but only if there is nothing more relevant available. 
All the best, Michael Foord -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060223/98ed61eb/attachment.htm From tzot at mediconsa.com Thu Feb 23 12:01:53 2006 From: tzot at mediconsa.com (Christos Georgiou) Date: Thu, 23 Feb 2006 13:01:53 +0200 Subject: [Python-Dev] buildbot, and test failures References: <200602231036.24936.anthony@interlink.com.au> <43FD365C.10801@v.loewis.de> Message-ID: ""Martin v. Löwis"" wrote in message news:43FD365C.10801 at v.loewis.de... > Anthony Baxter wrote: >> I >> have an Ubuntu x86 box here that can become one (I think the only >> linux, currently, is Gentoo...) > > How different are the Linuxes, though? How many of them do we need? Actually, we would need enough to cover the libc/gcc combinations that are most common. This isn't feasible, though, so in case we add more Linux machines, at least make sure that the libc/gcc combo is not one already used in the existing ones.
Skip From jason.orendorff at gmail.com Thu Feb 23 16:25:11 2006 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Thu, 23 Feb 2006 10:25:11 -0500 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: <20060222212844.GA15221@mems-exchange.org> References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org> Message-ID: On 2/22/06, Neil Schemenauer wrote:
>
> @classmethod
> def fromhex(self, data):
>     data = re.sub(r'\s+', '', data)
>     return bytes(binascii.unhexlify(data))

If it's to be a classmethod, I guess that should be "return self(binascii.unhexlify(data))". -j -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20060223/09dff5c6/attachment.html From g.brandl at gmx.net Thu Feb 23 16:49:10 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 23 Feb 2006 16:49:10 +0100 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > At 03:49 PM 2/23/2006 +1300, Greg Ewing wrote: >>Steven Bethard wrote: >> > And, as you mention, it's consistent >> > with the relative import feature. >> >>Only rather vaguely -- it's really somewhat different. >> >>With imports, .foo is an abbreviation for myself.foo, >>where myself is the absolute name for the current module, >>and you could replace all instances of .foo with that. > > Actually, "import .foo" is an abbreviation for "import myparent.foo", not > "import myparent.myself.foo". Actually, "import .foo" won't work anyway.
nitpicking-ly yours, Georg From steven.bethard at gmail.com Thu Feb 23 17:19:08 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu, 23 Feb 2006 09:19:08 -0700 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> Message-ID: On 2/22/06, Almann T. Goo wrote: > Since the current semantics allow *evaluation* to an enclosing scope's > name by an "un-punctuated" name, "var" is a synonym to ".var" (if > "var" is bound in the immediately enclosing scope). However for > *re-binding* to an enclosing scope's name, the "punctuated" name is > the only one we can use, so the semantic becomes more cluttered. > > This can make a problem that I would say is akin to the "dangling else problem." >
> def incrementer_getter(val):
>     def incrementer():
>         val = 5
>         def inc():
>             ..val += 1
>             return val
>         return inc
>     return incrementer
>
> Building on an example that Steve wrote to demonstrate the syntax > proposed, you can see that if a user inadvertently uses the enclosing > scope for the return instead of what would presumably be the outer > most bound parameter. Now remove the binding in the incrementer > function and it works the way the user probably thought. Sorry, what way did the user think? I'm not sure what you think was supposed to happen. STeVe -- Grammar am for people who can't think for myself.
--- Bucky Katt, Get Fuzzy From chris at atlee.ca Thu Feb 23 22:01:08 2006 From: chris at atlee.ca (Chris AtLee) Date: Thu, 23 Feb 2006 16:01:08 -0500 Subject: [Python-Dev] Path PEP: some comments (equality) In-Reply-To: <71b6302c0602200806i4ae3dc60g2010d18e11e8be37@mail.gmail.com> References: <71b6302c0602200806i4ae3dc60g2010d18e11e8be37@mail.gmail.com> Message-ID: <7790b6530602231301m3e6e9a19v3ff513414676929e@mail.gmail.com> On 2/20/06, Mark Mc Mahon wrote: > Hi, > > It seems that the Path module as currently defined leaves equality > testing up to the underlying string comparison. My guess is that this > is fine for Unix (maybe not even) but it is a bit lacking for Windows. > > Should the path class implement an __eq__ method that might do some of > the following things: > - Get the absolute path of both self and the other path > - normcase both > - now see if they are equal > > This would make working with paths much easier for keys of a > dictionary on windows. (I frequently use a case insensitive string > class for paths if I need them to be keys of a dict.) The PEP specifies path.samefile(), which is useful in the case of files that actually exist, but pretty much useless for comparing paths that don't exist on the local machine. I think leaving __eq__ as the default string comparison is best. But what about providing an alternate platform-specific equality test?

def isequal(self, other, platform="native"):
    """Return True if self is equivalent to other using platform's
    path comparison rules.

    platform can be one of "native", "posix", "windows", "mac".
    """

This could do some combination of os.path.normpath() and os.path.normcase() depending on the platform. The docs for os.path.normpath() say that it may change the meaning of the path if it contains symlinks... it's not clear to me how, though.
Cheers, Chris From guido at python.org Thu Feb 23 22:18:57 2006 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Feb 2006 16:18:57 -0500 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> Message-ID: On 2/22/06, Michael Chermside wrote: > A minor related point about on_missing(): > > Haven't we learned from regrets over the .next() method of iterators > that all "magically" invoked methods should be named using the __xxx__ > pattern? Shouldn't it be named __on_missing__() instead? Good point. I'll call it __missing__. I've uploaded a new patch to python.org/sf/1433928. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at xs4all.net Thu Feb 23 22:24:02 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 23 Feb 2006 22:24:02 +0100 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FD393A.80209@canterbury.ac.nz> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> <43FCD757.7030006@strakt.com> <43FD393A.80209@canterbury.ac.nz> Message-ID: <20060223212402.GM23859@xs4all.nl> On Thu, Feb 23, 2006 at 05:25:30PM +1300, Greg Ewing wrote: > Samuele Pedroni wrote: > > > If you are looking for rough edges about nested scopes in Python > > this is probably worse: > >
> > >>> x = []
> > >>> for i in range(10):
> > ...     x.append(lambda : i)
> > ...
> > >>> [y() for y in x]
> > [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
>
> As an aside, is there any chance that this could be > changed in 3.0? I.e. have the for-loop create a new > binding for the loop variable on each iteration.
You can't do that without introducing a whole new scope for the body of the 'for' loop, and that means (in the current rules) you can't assign to any function-local names in the for loop. The nested scope in that 'lambda' refers to the 'slot' for the variable 'i' in the outer namespace (in this case, the global one.) You can't 'remove' the binding, either; 'del' will not allow you to. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas at xs4all.net Thu Feb 23 22:41:31 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 23 Feb 2006 22:41:31 +0100 Subject: [Python-Dev] getdefault(), the real replacement for setdefault() In-Reply-To: <1140665348.14539.114.camel@geddy.wooz.org> References: <1140665348.14539.114.camel@geddy.wooz.org> Message-ID: <20060223214131.GN23859@xs4all.nl> On Wed, Feb 22, 2006 at 10:29:08PM -0500, Barry Warsaw wrote: > d.getdefault('foo', list).append('bar') > Anyway, I don't think it's an either/or choice with Guido's subclass. > Instead I think they are different use cases. I would add getdefault() > to the standard dict API, remove (eventually) setdefault(), and add > Guido's subclass in a separate module. But I /wouldn't/ clutter the > built-in dict's API with on_missing(). +1. This is a much closer match to my own use of setdefault than Guido's dict subtype. I'm +0 on the subtype, but I prefer the call-time decision on whether to fall back to a default or not. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
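For reference, the getdefault() semantics discussed above can be written today as a plain helper function; the name and the store-on-miss behavior follow one reading of Barry's proposal and are not a settled API:

```python
def getdefault(d, key, factory):
    # Like dict.setdefault(), but the default is produced by calling a
    # factory only when the key is actually missing, so no throwaway
    # object is built on lookups that succeed.
    try:
        return d[key]
    except KeyError:
        value = d[key] = factory()
        return value

d = {}
getdefault(d, 'foo', list).append('bar')
getdefault(d, 'foo', list).append('baz')
# d == {'foo': ['bar', 'baz']}
```

The second call finds the key present and never invokes the factory, which is the efficiency point made in favor of passing a callable rather than a pre-built default.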
From thomas at xs4all.net Thu Feb 23 22:45:19 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 23 Feb 2006 22:45:19 +0100 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> Message-ID: <20060223214519.GO23859@xs4all.nl> On Wed, Feb 22, 2006 at 01:13:28PM -0800, Michael Chermside wrote: > Haven't we learned from regrets over the .next() method of iterators > that all "magically" invoked methods should be named using the __xxx__ > pattern? Shouldn't it be named __on_missing__() instead? I agree that on_missing should be __missing__ (or __missing_key__) but I don't agree on the claim that all 'magically' invoked methods should be two-way-double-underscored. __methods__ are methods that should only be called 'magically', or by the object itself. 'next' has quite a few use cases where it's desirable to call it directly (and I often do.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From walter at livinglogic.de Thu Feb 23 22:55:40 2006 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 23 Feb 2006 22:55:40 +0100 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> Message-ID: <43FE2F5C.8090700@livinglogic.de> Guido van Rossum wrote: > On 2/22/06, Michael Chermside wrote: >> A minor related point about on_missing(): >> >> Haven't we learned from regrets over the .next() method of iterators >> that all "magically" invoked methods should be named using the __xxx__ >> pattern? Shouldn't it be named __on_missing__() instead? > > Good point. I'll call it __missing__. I've uploaded a new patch to > python.org/sf/1433928. I always thought that __magic__ method calls are done by Python on objects it doesn't know about.
The special method name ensures that it is indeed the protocol Python is talking about, not some random method (with next() being the exception). In the defaultdict case this isn't a problem, because defaultdict is calling its own method. Bye, Walter Dörwald From greg.ewing at canterbury.ac.nz Fri Feb 24 01:25:50 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Feb 2006 13:25:50 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43FE528E.1040100@canterbury.ac.nz> Stephen J. Turnbull wrote: > Please define "character," and explain how its semantics map to > Python's unicode objects. One of the 65 abstract entities referred to in the RFC and represented in that RFC by certain visual glyphs. There is a subset of the Unicode code points that are conventionally associated with very similar glyphs, so that there is an obvious one-to-one mapping between these entities and those Unicode code points. These entities therefore have a natural and obvious representation using Python unicode strings. > No, base64 isn't a wire protocol. Rather, it's a schema for a family > of wire protocols, whose alphabets are heuristically chosen on the > assumption that code units which happen to correspond to alpha-numeric > code points in a commonly-used coded character set are more likely to > pass through a communication channel without corruption.
Yes, and it's up to the programmer to choose those code units (i.e. pick an encoding for the characters) that will, in fact, pass through the channel he is using without corruption. I don't see how any of this is inconsistent with what I've said. > Only if you do no transformations that will harm the base64-encoding. > ... It doesn't allow any of the > usual transformations on characters that might be applied globally to > a mail composition buffer, for example. I don't understand that. Obviously if you rot13 your mail message or turn it into pig latin or something, it's going to mess up any base64 it might contain. But that would be a silly thing to do to a message containing base64. Given any piece of text, there are things it makes sense to do with it and things it doesn't, depending entirely on the use to which the text will eventually be put. I don't see how base64 is any different in this regard. > So then you bring it right back in with base64. Now they need to know > about bytes<->unicode codecs. No, they need to know about the characteristics of the channel over which they're sending the data. Base64 is designed for situations in which you have a *text* channel that you know is capable of transmitting at least a certain subset of characters, where "character" means whatever is used as input to that channel. In Py3k, text will be represented by unicode strings. So a Py3k text channel should take unicode as its input, not bytes. I think we've got a bit sidetracked by talking about mime. I wasn't actually thinking about mime, but just a plain text message into which some base64 data was being inserted. That's the way we used to do things in the old days with uuencode etc, before mime was invented. Here, the "channel" is NOT the socket or whatever that the ultimate transmission takes place over -- it's the interface to your mail sending software that takes a piece of plain text and sends it off as a mail message somehow. 
In Py3k, if a channel doesn't take unicode as input, then it's not a text channel, and it's not appropriate to be using base64 with it directly. It might be appropriate to use base64 followed by some encoding, but the programmer needs to be aware of that and choose the encoding wisely. It's not possible to shield him from having to know about encodings in that situation, even if the encoding is just ascii. Trying to do so will just lead to more confusion, in my opinion.

Greg

From facundobatista at gmail.com Fri Feb 24 03:12:21 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Thu, 23 Feb 2006 23:12:21 -0300
Subject: [Python-Dev] OT: T-Shirts
Message-ID:

Python Argentina finally has T-Shirts (you can see a photo here: http://www.taniquetil.com.ar/plog/post/1/161).

Why this mail to python-dev? Because the group decided to give some, as a present, to some outstanding members of Python:

Guido van Rossum
Alex Martelli
Tim Peters
Fredrik Lundh
David Ascher
Mark Lutz
Mark Hammond

Also, some of us want to give one as a personal present:

Raymond Hettinger (from Facundo Batista)
Bob Ippolito (from Alejandro David Weil)
Glyph Lefkowitz (from Alejandro J. Cura)

The point is that I don't know some of you, so please grab my shoulder here in PyCon. And if you're not coming to the conference but somebody can carry it to you, just let me know.

And if you want to buy one, I brought some, only USD 12, ;).

Thank you very much and sorry for the OT.

Regards, .
Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From mcherm at mcherm.com Fri Feb 24 03:39:56 2006
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu, 23 Feb 2006 21:39:56 -0500
Subject: [Python-Dev] defaultdict and on_missing()
In-Reply-To: <43FE2F5C.8090700@livinglogic.de>
References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de>
Message-ID: <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com>

Walter Dörwald writes:

> I always thought that __magic__ method calls are done by Python on
> objects it doesn't know about. The special method name ensures that it
> is indeed the protocol Python is talking about, not some random method
> (with next() being the exception). In the defaultdict case this isn't a
> problem, because defaultdict is calling its own method.

I, instead, felt that the __xxx__ convention served a few purposes. First, it indicates that the method will be called in some means OTHER than by name (generally, the interpreter invokes it directly, although in this case it's a built-in method of dict that would invoke it). Secondly, it serves to flag the method as being special -- true newbies can safely ignore nearly all special methods aside from __init__(). And it serves to create a separate namespace... writers of Python code know that names beginning and ending with double-underscores are "reserved for the language". Of these, I always felt that special invocation was the most important feature.

The next() method of iterators was an interesting object lesson. The original reasoning (I think) for using next() not __next__() was that *sometimes* the method was called directly by name (when stepping an iterator manually, which one frequently does for perfectly good reasons).
Since it was sometimes invoked by name and sometimes by special mechanism, the choice was to use the unadorned name, but later experience showed that it would have been better the other way. -- Michael Chermside From aleaxit at gmail.com Fri Feb 24 06:15:26 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Thu, 23 Feb 2006 21:15:26 -0800 Subject: [Python-Dev] OT: T-Shirts In-Reply-To: References: Message-ID: On Feb 23, 2006, at 6:12 PM, Facundo Batista wrote: > Python Argentina finally have T-Shirts (you can see a photo here: > http://www.taniquetil.com.ar/plog/post/1/161). > > Why this mail to python-dev? Because the group decided to give some, > as a present, to some outstanding members of Python: > > Guido van Rossum > Alex Martelli T-shirts? I'm an absolute fan of T-shirts...!-) > The point is that I don't know some of you, so please grab my shoulder > here in PyCon. And if you're not coming to the conference but somebody > can carry it to you, just let me know. Anna can bring mine!!! > And if you want to buy one, I brought some, only USD 12, ;). Anna, please buy one for yourself before they run out -- they're cool, and this way we can go around as the AR (Anna Ravenscroft, of course!) Python Twins...!-) Alex From greg.ewing at canterbury.ac.nz Fri Feb 24 07:53:12 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Feb 2006 19:53:12 +1300 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> Message-ID: <43FEAD58.2020208@canterbury.ac.nz> Michael Chermside wrote: > The next() method of iterators was an interesting > object lesson. ... 
Since it was sometimes invoked by name > and sometimes by special mechanism, the choice was to use the > unadorned name, but later experience showed that it would have been > better the other way. Any thoughts about fixing this in 3.0? -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 24 07:54:07 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Feb 2006 19:54:07 +1300 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <20060223214519.GO23859@xs4all.nl> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <20060223214519.GO23859@xs4all.nl> Message-ID: <43FEAD8F.5030808@canterbury.ac.nz> Thomas Wouters wrote: > __methods__ are methods that should only be > called 'magically', or by the object itself. > 'next' has quite a few usecases where it's > desireable to call it directly That's why the proposal to replace .next() with .__next__() comes along with a function next(obj) which calls obj.__next__(). -- Greg From greg.ewing at canterbury.ac.nz Fri Feb 24 07:54:14 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 24 Feb 2006 19:54:14 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <20060223212402.GM23859@xs4all.nl> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> <43FCD757.7030006@strakt.com> <43FD393A.80209@canterbury.ac.nz> <20060223212402.GM23859@xs4all.nl> Message-ID: <43FEAD96.3000300@canterbury.ac.nz> Thomas Wouters wrote: > On Thu, Feb 23, 2006 at 05:25:30PM +1300, Greg Ewing wrote: > >>As an aside, is there any chance that this could be >>changed in 3.0? I.e. have the for-loop create a new >>binding for the loop variable on each iteration. > > You can't do that without introducing a whole new scope for the body of the > 'for' loop, There's no need for that. 
The new scope need only include the loop variable -- everything else could still refer to the function's main scope.

There's even a rather elegant way of implementing this in the current CPython. If a nested scope references the loop variable, then it will be in a cell. So you just create a new cell each time round the loop, instead of changing the existing one.

This would even still let you use the value after the loop finished, if that were considered a good idea. But it might be better not to allow that, since it could make alternative implementations difficult.

-- Greg

From hoffman at ebi.ac.uk Fri Feb 24 10:33:37 2006
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Fri, 24 Feb 2006 09:33:37 +0000
Subject: [Python-Dev] Pre-PEP: The "bytes" object
In-Reply-To:
References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org>
Message-ID:

[Neil Schemenauer]
>> @classmethod
>> def fromhex(self, data):
>>     data = re.sub(r'\s+', '', data)
>>     return bytes(binascii.unhexlify(data))

[Jason Orendorff]
> If it's to be a classmethod, I guess that should be
> "return self(binascii.unhexlify(data))".

Am I the only one who finds the use of "self" on a classmethod to be incredibly confusing? Can we please follow PEP 8 and use "cls" instead?

-- Michael Hoffman European Bioinformatics Institute

From stephen at xemacs.org Fri Feb 24 11:15:34 2006 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Fri, 24 Feb 2006 19:15:34 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FD61D2.1080200@ronadam.com> (Ron Adam's message of "Thu, 23 Feb 2006 01:18:42 -0600") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <43FCC984.8020207@ronadam.com> <8764n6mw82.fsf@tleepslib.sk.tsukuba.ac.jp> <43FD61D2.1080200@ronadam.com> Message-ID: <87fym9jc55.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Ron" == Ron Adam writes: Ron> We could call it transform or translate if needed. You're still losing the directionality, which is my primary objection to "recode". The absence of directionality is precisely why "recode" is used in that sense for i18n work. There really isn't a good reason that I can see to use anything other than the pair "encode" and "decode". In monolingual environments, once _all_ human-readable text (specifically including Python programs and console I/O) is automatically mapped to a Python (unicode) string, most programmers will never need to think about it as long as Python (the project) very very strongly encourages that all Python programs be written in UTF-8 if there's any chance the program will be reused in a locale other than the one where it was written. (Alternatively you can depend on PEP 263 coding cookies.) Then the user (or the Python interpreter) just changes console and file I/O codecs to the encoding in use in that locale, and everything just works. 
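The directionality Stephen insists on is, in fact, how Python 3 eventually settled the encode/decode pair: `encode` always maps text toward bytes, and `decode` always maps bytes back toward text, so the directionless "recode" confusion cannot arise (a present-day sketch, not part of the original exchange):

```python
text = "naïve café"

# str.encode: text -> bytes; the codec name states the target encoding
data = text.encode("utf-8")
assert isinstance(data, bytes)

# bytes.decode: bytes -> text; the reverse direction, never ambiguous
assert data.decode("utf-8") == text

# the directionless operations simply do not exist: a str cannot be
# "decoded" and bytes cannot be "encoded" in Python 3
assert not hasattr(text, "decode")
assert not hasattr(data, "encode")
```

Removing the symmetric methods is exactly the kind of "monolingual programmers never think about it" outcome Stephen hopes for above.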
So the remaining uses of "encode" and "decode" are for advanced users and specialists: people using stuff like base64 or gzip, and those who need to use unicode codecs explicitly. I could be wrong about the possibility to get rid of explicit unicode codec use in monolingual environments, but I hope that we can at least try to achieve that.

>> Unlikely. Errors like "A
>> string".encode("base64").encode("base64") are all too easy to
>> commit in practice.

Ron> Yes,... and wouldn't the above just result in a copy so it
Ron> wouldn't be an out right error.

No, you either get the following:

A string. -> QSBzdHJpbmcu -> UVNCemRISnBibWN1

or you might get an error if base64 is defined as bytes->unicode.

Ron> * Given that the string type gains a __codec__ attribute
Ron> to handle automatic decoding when needed. (is there a reason
Ron> not to?)

Ron> str(object[,codec][,error]) -> string coded with codec
Ron> unicode(object[,error]) -> unicode
Ron> bytes(object) -> bytes

str == unicode in Py3k, so this is a non-starter. What do you want to say?

Ron> * a recode() method is used for transformations that
Ron> *do_not* change the current codec.

I'm not sure what you mean by the "current codec". If it's attached to an "encoded object", it should be the codec needed to decode the object. And it should be allowed to be a "codec stack". So suppose you start with a unicode object "obj". Then

>>> bytes = bytes (obj, 'utf-8') # implicit .encode()
>>> print bytes.codec
['utf-8']
>>> wire = bytes.encode ('base64') # with apologies to Greg E.
>>> print wire.codec
['base64', 'utf-8']
>>> obj2 = wire.decode ('gzip')
CodecMatchException
>>> obj2 = wire.decode (wire.codec)
>>> print obj == obj2
True
>>> print obj2.codec
[]

or maybe None for the last. I think this would be very nice as a basis for improving the email module (for one), but I don't really think it belongs in Python core.

Ron> That may be why it wasn't done this way to start. (?)
I suspect the real reason is that Marc-Andre had the generalized codec in mind from Day 0, and your proposal only works with duck-typing if codecs always have a well-defined signature with two different types for the argument and return of the "constructor". -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From stephen at xemacs.org Fri Feb 24 12:05:55 2006 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 24 Feb 2006 20:05:55 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FE528E.1040100@canterbury.ac.nz> (Greg Ewing's message of "Fri, 24 Feb 2006 13:25:50 +1300") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> Message-ID: <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing writes: Greg> Stephen J. Turnbull wrote: >> No, base64 isn't a wire protocol. It's a family[...]. Greg> Yes, and it's up to the programmer to choose those code Greg> units (i.e. pick an encoding for the characters) that will, Greg> in fact, pass through the channel he is using without Greg> corruption. I don't see how any of this is inconsistent with Greg> what I've said. It's not. It just shows that there are other "correct" ways to think about the issue. >> Only if you do no transformations that will harm the >> base64-encoding. ... 
It doesn't allow any of the usual >> transformations on characters that might be applied globally to >> a mail composition buffer, for example. Greg> I don't understand that. Obviously if you rot13 your mail Greg> message or turn it into pig latin or something, it's going Greg> to mess up any base64 it might contain. But that would be a Greg> silly thing to do to a message containing base64. What "message containing base64"? "Any base64 in there?" "Nope, nobody here but us Unicode characters!" I certainly hope that in Py3k bytes objects will have neither ROT13 nor case-changing methods, but str objects certainly will. Why give up the safety of that distinction? Greg> Given any piece of text, there are things it makes sense to Greg> do with it and things it doesn't, depending entirely on the Greg> use to which the text will eventually be put. I don't see Greg> how base64 is any different in this regard. If you're going to be binary about it, it's not different. However the kind of "text" for which Unicode was designed is normally produced and consumed by people, who wll pt up w/ ll knds f nnsns. Base64 decoders will not put up with the same kinds of nonsense that people will. You're basically assuming that the person who implements the code that processes a Unicode string is the same person who implemented the code that converts a binary object into base64 and inserts it into a string. I think that's a dangerous (and certainly invalid) assumption. I know I've lost time and data to applications that make assumptions like that. In fact, that's why "MULE" is a four-letter word in Emacs channels. >> So then you bring it right back in with base64. Now they need >> to know about bytes<->unicode codecs. Greg> No, they need to know about the characteristics of the Greg> channel over which they're sending the data. I meant it in a trivial sense: "How do you use a bytes<->unicode codec properly without knowing that it's a bytes<->unicode codec?" 
In most environments, it should be possible to hide bytes<->unicode codecs almost all the time, and I think that's a very good thing. I don't think it's a good idea to gratuitously introduce wire protocols as unicode codecs, even if a class of bit patterns which represent the integer 65 are denoted "A" in various sources. Practicality beats purity (especially when you're talking about the purity of a pregnant virgin). Greg> It might be appropriate to to use base64 followed by some Greg> encoding, but the programmer needs to be aware of that and Greg> choose the encoding wisely. It's not possible to shield him Greg> from having to know about encodings in that situation, even Greg> if the encoding is just ascii. What do you think the email module does? Assuming conforming MIME messages and receivers capable of handling UTF-8, the user of the email module does not need to know anything about any encodings at all. With a little more smarts, the email module could even make a good choice of output encoding based on the _language_ of the text, removing the restriction to UTF-8 on the output side, too. With the aid of file(1), it can make excellent guesses about attachments. Sure, the email module programmer needs to know, but the email module programmer needs to know an awful lot about codecs anyway, since mail at that level is a binary channel, while users will be throwing a mixed bag of binary and textual objects at it. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
From barry at python.org Fri Feb 24 15:22:19 2006 From: barry at python.org (Barry Warsaw) Date: Fri, 24 Feb 2006 09:22:19 -0500 Subject: [Python-Dev] getdefault(), the real replacement for setdefault() In-Reply-To: <20060223214131.GN23859@xs4all.nl> References: <1140665348.14539.114.camel@geddy.wooz.org> <20060223214131.GN23859@xs4all.nl> Message-ID: <40262D6D-F950-4096-BB01-B5B339C2DC3D@python.org> On Feb 23, 2006, at 4:41 PM, Thomas Wouters wrote: > On Wed, Feb 22, 2006 at 10:29:08PM -0500, Barry Warsaw wrote: >> d.getdefault('foo', list).append('bar') > >> Anyway, I don't think it's an either/or choice with Guido's subclass. >> Instead I think they are different use cases. I would add >> getdefault() >> to the standard dict API, remove (eventually) setdefault(), and add >> Guido's subclass in a separate module. But I /wouldn't/ clutter the >> built-in dict's API with on_missing(). > > +1. This is a much closer match to my own use of setdefault than > Guido's > dict subtype. I'm +0 on the subtype, but I prefer the call-time > decision on > whether to fall back to a default or not. Cool! As your reward: SF patch #1438113 https://sourceforge.net/tracker/index.php? 
func=detail&aid=1438113&group_id=5470&atid=305470

-Barry

From foom at fuhm.net Fri Feb 24 16:40:57 2006
From: foom at fuhm.net (James Y Knight)
Date: Fri, 24 Feb 2006 10:40:57 -0500
Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
In-Reply-To: <43FEAD96.3000300@canterbury.ac.nz>
References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> <43FCD757.7030006@strakt.com> <43FD393A.80209@canterbury.ac.nz> <20060223212402.GM23859@xs4all.nl> <43FEAD96.3000300@canterbury.ac.nz>
Message-ID: <1259B2EC-DF60-4C7B-84E9-83C6DA665A61@fuhm.net>

On Feb 24, 2006, at 1:54 AM, Greg Ewing wrote:
> Thomas Wouters wrote:
>> On Thu, Feb 23, 2006 at 05:25:30PM +1300, Greg Ewing wrote:
>>
>>> As an aside, is there any chance that this could be
>>> changed in 3.0? I.e. have the for-loop create a new
>>> binding for the loop variable on each iteration.
>>
>> You can't do that without introducing a whole new scope
>> for the body of the 'for' loop,
>
> There's no need for that. The new scope need only
> include the loop variable -- everything else could
> still refer to the function's main scope.

No, that would be insane. You get the exact same problem, now even more confusing:

l = []
for x in range(10):
    y = x
    l.append(lambda: (x, y))

print l[0]()

With your suggestion, that would print (0, 9). Unless python grows a distinction between creating a binding and assigning to one as most other languages have, this problem is here to stay.
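Under the behaviour James describes, `l[0]()` returns `(9, 9)` in every Python since: closures capture variables, not values. The conventional workaround, then as now, is to freeze the current value through a default argument (a runnable sketch in modern Python, not code from the thread):

```python
# the behaviour James describes: all ten closures share one binding
l = []
for x in range(10):
    y = x
    l.append(lambda: (x, y))
assert l[0]() == (9, 9)

# the workaround: default arguments are evaluated at definition time,
# so each lambda keeps its own copy of the loop's current values
frozen = []
for x in range(10):
    y = x
    frozen.append(lambda x=x, y=y: (x, y))
assert frozen[0]() == (0, 0)
assert frozen[9]() == (9, 9)
```

The default-argument trick is effectively a per-iteration binding made by hand, which is why Greg's proposed per-iteration cell would only move the surprise from `x` to `y`.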
James

From python at rcn.com Fri Feb 24 16:59:27 2006
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 24 Feb 2006 10:59:27 -0500
Subject: [Python-Dev] defaultdict and on_missing()
References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz>
Message-ID: <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1>

> Michael Chermside wrote:
>> The next() method of iterators was an interesting
>> object lesson. ... Since it was sometimes invoked by name
>> and sometimes by special mechanism, the choice was to use the
>> unadorned name, but later experience showed that it would have been
>> better the other way.

[Greg]
> Any thoughts about fixing this in 3.0?

IMO, it isn't broken. It was an intentional divergence from naming conventions. The reasons for the divergence haven't changed. Code that uses next() is more understandable, friendly, and readable without the walls of underscores.

Raymond

From aleaxit at gmail.com Fri Feb 24 17:47:45 2006
From: aleaxit at gmail.com (Alex Martelli)
Date: Fri, 24 Feb 2006 08:47:45 -0800
Subject: [Python-Dev] defaultdict and on_missing()
In-Reply-To: <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1>
References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz> <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1>
Message-ID:

On 2/24/06, Raymond Hettinger wrote:
> > Michael Chermside wrote:
> >> The next() method of iterators was an interesting
> >> object lesson. ... Since it was sometimes invoked by name
> >> and sometimes by special mechanism, the choice was to use the
> >> unadorned name, but later experience showed that it would have been
> >> better the other way.
>
> [Greg]
> > Any thoughts about fixing this in 3.0?
>
> IMO, it isn't broken.
It was an intentional divergence from naming conventions. > The reasons for the divergence haven't changed. Code that uses next() is more > understandable, friendly, and readable without the walls of underscores. Wouldn't, say, next(foo) [[with a hypothetical builtin 'next' internally calling foo.__next__(), just like builtin 'len' internally calls foo.__len__()]] be just as friendly etc? No biggie either way, but that would seem to be more aligned with Python's usual approach. Alex From nnorwitz at gmail.com Fri Feb 24 18:44:39 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Fri, 24 Feb 2006 11:44:39 -0600 Subject: [Python-Dev] Dropping support for Win9x in 2.6 Message-ID: Martin and I were talking about dropping support for older versions of Windows (of the non-NT flavor). We both thought that it was reasonable to stop supporting Win9x (including WinME) in Python 2.6. I updated PEP 11 to reflect this. The Python 2.5 installer will present a warning message on the systems which will not be supported in Python 2.6. n From g.brandl at gmx.net Fri Feb 24 19:00:06 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 24 Feb 2006 19:00:06 +0100 Subject: [Python-Dev] Dropping support for Win9x in 2.6 In-Reply-To: References: Message-ID: Neal Norwitz wrote: > Martin and I were talking about dropping support for older versions of > Windows (of the non-NT flavor). We both thought that it was > reasonable to stop supporting Win9x (including WinME) in Python 2.6. > I updated PEP 11 to reflect this. > > The Python 2.5 installer will present a warning message on the systems > which will not be supported in Python 2.6. Hey, someone even wanted to continue supporting DOS... 
Georg

From fuzzyman at voidspace.org.uk Fri Feb 24 19:06:59 2006
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 24 Feb 2006 18:06:59 +0000
Subject: [Python-Dev] Dropping support for Win9x in 2.6
In-Reply-To:
References:
Message-ID: <43FF4B43.2080705@voidspace.org.uk>

Georg Brandl wrote:
> Neal Norwitz wrote:
>
>> Martin and I were talking about dropping support for older versions of
>> Windows (of the non-NT flavor). We both thought that it was
>> reasonable to stop supporting Win9x (including WinME) in Python 2.6.
>> I updated PEP 11 to reflect this.
>>
>> The Python 2.5 installer will present a warning message on the systems
>> which will not be supported in Python 2.6.
>>
>
> Hey, someone even wanted to continue supporting DOS...
>

A lot of people are still using Windows 98. But I guess if no one is volunteering to maintain the code...

Michael Foord

> Georg

From aahz at pythoncraft.com Fri Feb 24 19:29:27 2006
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 24 Feb 2006 10:29:27 -0800
Subject: [Python-Dev] Dropping support for Win9x in 2.6
In-Reply-To: <43FF4B43.2080705@voidspace.org.uk>
References: <43FF4B43.2080705@voidspace.org.uk>
Message-ID: <20060224182927.GA21532@panix.com>

On Fri, Feb 24, 2006, Michael Foord wrote:
> Georg Brandl wrote:
>> Neal Norwitz wrote:
>>
>>> Martin and I were talking about dropping support for older versions of
>>> Windows (of the non-NT flavor). We both thought that it was
>>> reasonable to stop supporting Win9x (including WinME) in Python 2.6.
>>> I updated PEP 11 to reflect this.
>>>
>>> The Python 2.5 installer will present a warning message on the systems
>>> which will not be supported in Python 2.6.
>> >> Hey, someone even wanted to continue supporting DOS... > > A lot of people are still using Windows 98. But I guess if noone is > volunteering to maintain the code... DOS has some actual utility for low-grade devices and is overall a simpler platform to deliver code for. At the standard 18-month release cycle, it will be beginning of 2008 for the release of 2.6, which is ten years after Win98. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis From 2005a at usenet.alexanderweb.de Fri Feb 24 20:02:21 2006 From: 2005a at usenet.alexanderweb.de (Alexander Schremmer) Date: Fri, 24 Feb 2006 20:02:21 +0100 Subject: [Python-Dev] Dropping support for Win9x in 2.6 References: <43FF4B43.2080705@voidspace.org.uk> <20060224182927.GA21532@panix.com> Message-ID: <17lpeg5x3bxdv.dlg@usenet.alexanderweb.de> On Fri, 24 Feb 2006 10:29:27 -0800, Aahz wrote: > DOS has some actual utility for low-grade devices and is overall a > simpler platform to deliver code for. At the standard 18-month release > cycle, it will be beginning of 2008 for the release of 2.6, which is ten > years after Win98. The last Windows release of that branch was Windows ME, in September 2000, i.e. you have to wait till 2010 in order to be ten years after the last legacy OS release. Kind regards, Alexander From guido at python.org Fri Feb 24 21:04:06 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Feb 2006 14:04:06 -0600 Subject: [Python-Dev] Dropping support for Win9x in 2.6 In-Reply-To: <43FF4B43.2080705@voidspace.org.uk> References: <43FF4B43.2080705@voidspace.org.uk> Message-ID: On 2/24/06, Michael Foord wrote: > A lot of people are still using Windows 98. But I guess if noone is > volunteering to maintain the code... Agreed. If they're so keen on using an antiquated OS, perhaps they would be perfectly happy using a matching Python version... 
Somehow I doubt this is going to be a big deal for anyone affected. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From trentm at ActiveState.com Fri Feb 24 21:09:14 2006 From: trentm at ActiveState.com (Trent Mick) Date: Fri, 24 Feb 2006 12:09:14 -0800 Subject: [Python-Dev] Dropping support for Win9x in 2.6 In-Reply-To: References: Message-ID: <20060224200914.GA18542@activestate.com> [Neal Norwitz wrote] > Martin and I were talking about dropping support for older versions of > Windows (of the non-NT flavor). We both thought that it was > reasonable to stop supporting Win9x (including WinME) in Python 2.6. > I updated PEP 11 to reflect this. Are there specific code areas in mind that would be ripped out for this or is this mainly to avoid having to test on and ensure new code is compatible with? Trent -- Trent Mick TrentM at ActiveState.com From facundobatista at gmail.com Fri Feb 24 21:12:48 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Fri, 24 Feb 2006 17:12:48 -0300 Subject: [Python-Dev] Dropping support for Win9x in 2.6 In-Reply-To: References: Message-ID: 2006/2/24, Neal Norwitz : > Martin and I were talking about dropping support for older versions of > Windows (of the non-NT flavor). We both thought that it was > reasonable to stop supporting Win9x (including WinME) in Python 2.6. +1 . 
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From jeremy at alum.mit.edu Fri Feb 24 23:38:26 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 24 Feb 2006 17:38:26 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <1259B2EC-DF60-4C7B-84E9-83C6DA665A61@fuhm.net> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> <43FCD757.7030006@strakt.com> <43FD393A.80209@canterbury.ac.nz> <20060223212402.GM23859@xs4all.nl> <43FEAD96.3000300@canterbury.ac.nz> <1259B2EC-DF60-4C7B-84E9-83C6DA665A61@fuhm.net> Message-ID: On 2/24/06, James Y Knight wrote: > On Feb 24, 2006, at 1:54 AM, Greg Ewing wrote: > > Thomas Wouters wrote: > >> On Thu, Feb 23, 2006 at 05:25:30PM +1300, Greg Ewing wrote: > >> > >>> As an aside, is there any chance that this could be > >>> changed in 3.0? I.e. have the for-loop create a new > >>> binding for the loop variable on each iteration. > >> > >> You can't do that without introducing a whole new scope > >> for the body of the 'for' loop, > > > > There's no need for that. The new scope need only > > include the loop variable -- everything else could > > still refer to the function's main scope. > > No, that would be insane. You get the exact same problem, now even > more confusing: > > l=[] > for x in range(10): > y = x > l.append(lambda: (x, y)) > > print l[0]() > > With your suggestion, that would print (0, 9). > > Unless python grows a distinction between creating a binding and > assigning to one as most other languages have, this problem is here > to stay. The more practical complaint is that list comprehensions use the same namespace as the block that contains them. 
It's much easier to miss an assignment to, say, i in a list comprehension than it is in a separate statement in the body of a for loop. Since list comps are expressions, the only variable at issue is the index variable. It would be simple to fix by renaming, but I suspect we're stuck with the current behavior for backwards compatibility reasons. Jeremy From rrr at ronadam.com Fri Feb 24 23:46:00 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 24 Feb 2006 16:46:00 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87fym9jc55.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <43FCC984.8020207@ronadam.com> <8764n6mw82.fsf@tleepslib.sk.tsukuba.ac.jp> <43FD61D2.1080200@ronadam.com> <87fym9jc55.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43FF8CA8.9020509@ronadam.com> * The following reply is a rather longer than I intended explanation of why codings (and how they differ) like 'rot' aren't the same thing as pure unicode codecs and probably should be treated differently. If you already understand that, then I suggest skipping this. But if you like detailed logical analysis, it might be of some interest even if it's reviewing the obvious to those who already know. (And hopefully I didn't make any really obvious errors myself.) Stephen J. Turnbull wrote: >>>>>> "Ron" == Ron Adam writes: > > Ron> We could call it transform or translate if needed. > > You're still losing the directionality, which is my primary objection > to "recode". 
The absence of directionality is precisely why "recode" > is used in that sense for i18n work. I think you're not understanding what I suggested. It might help if we could agree on some points and then go from there. So, let's consider a "codec" and a "coding" as being two different things, where a codec is a character subset of unicode characters expressed in a native format. And a coding is *not* a subset of the unicode character set, but an _operation_ performed on text. So you would have the following properties. codec -> text is always in *one_codec* at any time. coding -> operation performed on text. Let's add a special default coding called 'none' to represent a do-nothing coding. (figuratively for explanation purposes) 'none' -> return the input as is, or the uncoded text Given the above relationships we have the following possible transformations. 1. codec to like codec: 'ascii' to 'ascii' 2. codec to unlike codec: 'ascii' to 'latin1' And we have coding relationships of: a. coding to like coding # Unchanged, do nothing b. coding to unlike coding Then we can express all the possible combinations as... [1.a, 1.b, 2.a, 2.b] 1.a -> coding in codec to like coding in like codec: 'none' in 'ascii' to 'none' in 'ascii' 1.b -> coding in codec to diff coding in like codec: 'none' in 'ascii' to 'base64' in 'ascii' 2.a -> coding in codec to same coding in diff codec: 'none' in 'ascii' to 'none' in 'latin1' 2.b -> coding in codec to diff coding in diff codec: 'none' in 'latin1' to 'base64' in 'ascii' This last one is a problem, as some codecs combine coding with character set encoding and return text in a different encoding than they received. The line is also blurred between types and encodings. Is unicode an encoding? Will bytes also be an encoding? Using the above combinations: (1.a) is just creating a new copy of an object. s = str(s) (1.b) is recoding an object; it returns a copy of the object in the same encoding.
s = s.encode('hex-codec') # ascii str -> ascii str coded in hex s = s.decode('hex-codec') # ascii str coded in hex -> ascii str * these are really two different operations. And encoding repeatedly results in nested codings. Codecs (as a pure subset of unicode) don't have that property. * the hex-codec also fits the 2.b pattern below if the source string is of a different type than ascii. (or the default string?) (2.a) creates a copy encoded in a new codec. s = s.encode('latin1') * I believe string constructors should have an encoding argument for use with unicode strings. s = str(u, 'latin1') # This would match the bytes constructor. (2.b) are combinations of the above. s = u.encode('base64') # unicode to ascii string as base64 coded characters u = unicode(s.decode('base64')) # ascii string coded in base64 to unicode characters or >>> u = unicode(s, 'base64') Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: decoder did not return an unicode object (type=str) Ooops... ;) So is coding the same as a codec? I think they have different properties and should be treated differently, except when the practicality-over-purity rule is needed. And in those cases maybe the names could clearly state the result. u.decode('base64ascii') # name indicates coding to codec > A string. -> QSBzdHJpbmcu -> UVNCemRISnBibWN1 Looks like the underlying sequence is: native string -> unicode -> unicode coded base64 -> coded ascii str And the decode operation would be... coded ascii str -> unicode coded base64 -> unicode -> ascii str Except it may combine some of these steps to speed it up. Since it's a hybrid codec including a coding operation, we have to treat it as a codec. > Ron> * Given that the string type gains a __codec__ attribute > Ron> to handle automatic decoding when needed. (is there a reason > Ron> not to?)
> > Ron> str(object[,codec][,error]) -> string coded with codec > > Ron> unicode(object[,error]) -> unicode > > Ron> bytes(object) -> bytes > > str == unicode in Py3k, so this is a non-starter. What do you want to > say? > > Ron> * a recode() method is used for transformations that > Ron> *do_not* change the current codec. > > I'm not sure what you mean by the "current codec". If it's attached > to an "encoded object", it should be the codec needed to decode the > object. And it should be allowed to be a "codec stack". I wasn't thinking in terms of stacks, but in that case the current codec would be the top of the stack. I think stackable codecs are a very bad idea, for the record. Back to recode vs encode/decode, the example used above might be useful. s = s.encode('hex-codec') # ascii str -> ascii str coded in hex s = s.decode('hex-codec') # ascii str coded in hex -> ascii str In my opinion these are actually two very different (although related) operations that would be better expressed with different names. Currently it's a hybrid codec that converts its input to an ascii string (or default encoding?), but when decoding you end up with an ascii encoding even if you started with something else. So the decode isn't a true inverse of encode in some cases. As a coding operation it would be. u = u.recode('to_hex') u = u.recode('from_hex') Where this would work with both unicode and strings without changing the codec. It also keeps the 'if I do it again, it will *recode* the coded text' relationship. So I think the name is appropriate. IMHO, pure codecs such as latin-1 can be invoked over and over and you can always get back what you put in in a single step. >>> s = 'abc' >>> for n in range(100): ... s = s.encode('latin-1') ... >>> print s, type(s) abc <type 'str'> Supposedly a lot of these issues will go away in Python 3000. And we can probably live with the current state of things.
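Ron's distinction — a "coding" nests when applied repeatedly, while a pure character-set codec round-trips — can still be demonstrated with the bytes-to-bytes transforms in today's codecs module (a sketch in the Python 3 spelling, not the 2.x str methods discussed here):

```python
import codecs

# A transform like hex nests: each application wraps the previous result.
s = codecs.encode(b'abc', 'hex_codec')   # b'616263'
s = codecs.encode(s, 'hex_codec')        # b'363136323633' -- hex of the hex
# Undoing two encodes therefore takes two decodes.
assert codecs.decode(codecs.decode(s, 'hex_codec'), 'hex_codec') == b'abc'

# A pure character-set codec round-trips in one step, however many
# times the round trip is repeated.
t = 'abc'
for _ in range(100):
    t = t.encode('latin-1').decode('latin-1')
assert t == 'abc'
```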
But even after Python 3000 it seems to me we will still need access to codecs, as we may run across encoded text input from various sources. Cheers, Ron From nnorwitz at gmail.com Sat Feb 25 00:11:26 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Fri, 24 Feb 2006 17:11:26 -0600 Subject: [Python-Dev] problem with genexp In-Reply-To: References: Message-ID: On 2/20/06, Jiwon Seo wrote: > Regarding this Grammar change; (last October) > from argument: [test '=' ] test [gen_for] > to argument: test [gen_for] | test '=' test ['(' gen_for ')'] > > - to raise error for "bar(a = i for i in range(10)) )" > > I think we should change it to > argument: test [gen_for] | test '=' test > > instead of > argument: test [gen_for] | test '=' test ['(' gen_for ')'] > > that is, without ['(' gen_for ')'] . We don't need that extra term, > because "test" itself includes generator expressions - with all those > parentheses. Works for me, committed. n From nas at arctrix.com Sat Feb 25 00:52:48 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Fri, 24 Feb 2006 23:52:48 +0000 (UTC) Subject: [Python-Dev] Pre-PEP: The "bytes" object References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org> Message-ID: Michael Hoffman wrote: > Am I the only one who finds the use of "self" on a classmethod to be > incredibly confusing? Can we please follow PEP 8 and use "cls" > instead? Sorry, using "self" was an oversight. It should be "cls", IMO.
Neil From greg.ewing at canterbury.ac.nz Sat Feb 25 01:20:25 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 25 Feb 2006 13:20:25 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <43FFA2C9.4010204@canterbury.ac.nz> Stephen J. Turnbull wrote: > the kind of "text" for which Unicode was designed is normally produced > and consumed by people, who wll pt up w/ ll knds f nnsns. Base64 > decoders will not put up with the same kinds of nonsense that people > will. The Python compiler won't put up with that sort of nonsense either. Would you consider that makes Python source code binary data rather than text, and that it's inappropriate to represent it using a unicode string? > You're basically assuming that the person who implements the code that > processes a Unicode string is the same person who implemented the code > that converts a binary object into base64 and inserts it into a > string. No, I'm assuming the user of base64 knows the characteristics of the channel he's using. You can only use base64 if you know the channel promises not to munge the particular characters that base64 uses. If you don't know that, you shouldn't be trying to send base64 through that channel. 
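Greg's scenario — binary data pushed through a channel that only accepts text — is what the base64 round trip looks like in practice, sketched here with Python 3's types, where the text/bytes split the thread is arguing about actually exists:

```python
import base64

payload = bytes(range(256))  # arbitrary binary data, not decodable as text

# Encode for a channel that only accepts character strings.
text = base64.b64encode(payload).decode('ascii')
assert isinstance(text, str)

# ...send `text` through the text-only channel; base64's alphabet survives
# any channel that leaves ASCII letters, digits, '+', '/' and '=' alone...

recovered = base64.b64decode(text.encode('ascii'))
assert recovered == payload
```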
> In most environments, it should be possible to hide bytes<->unicode > codecs almost all the time, But it *is* hidden in the situation I'm talking about, because all the Unicode encoding/decoding takes place inside the implementation of the text channel, which I'm taking as a given. > I don't think it's a good idea to gratuitously introduce > wire protocols as unicode codecs, I am *not* saying that base64 is a unicode codec! If that's what you thought I was saying, it's no wonder we're confusing each other. It's just a transformation from bytes to text. I'm only calling it unicode because all text will be unicode in Py3k. In py2.x it could just as well be a str -- but a str interpreted as text, not binary. > What do you think the email module does? > Assuming conforming MIME messages But I'm not assuming mime in the first place. If I have a mail interface that will accept chunks of binary data and encode them as a mime message for me, then I don't need to use base64 in the first place. The only time I need to use something like base64 is when I have something that will only accept text. In Py3k, "accepts text" is going to mean "takes a character string as input", where "character string" is a distinct type from "binary data". So having base64 produce anything other than a character string would be awkward and inconvenient. I phrased that paragraph carefully to avoid using the word "unicode" anywhere. Does that make it clearer what I'm getting at? 
-- Greg From greg.ewing at canterbury.ac.nz Sat Feb 25 01:28:34 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 25 Feb 2006 13:28:34 +1300 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz> <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1> Message-ID: <43FFA4B2.7060303@canterbury.ac.nz> Raymond Hettinger wrote: > Code that > uses next() is more understandable, friendly, and readable without the > walls of underscores. There wouldn't be any walls of underscores, because y = x.next() would become y = next(x) The only time you would need to write underscores is when defining a __next__ method. That would be no worse than defining an __init__ or any other special method, and has the advantage that it clearly marks the method as being special. -- Greg From greg.ewing at canterbury.ac.nz Sat Feb 25 01:32:23 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 25 Feb 2006 13:32:23 +1300 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <7e9b97090602210516o5d1a823apedcea66846a271b5@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> <43FCD757.7030006@strakt.com> <43FD393A.80209@canterbury.ac.nz> <20060223212402.GM23859@xs4all.nl> <43FEAD96.3000300@canterbury.ac.nz> <1259B2EC-DF60-4C7B-84E9-83C6DA665A61@fuhm.net> Message-ID: <43FFA597.3090804@canterbury.ac.nz> Jeremy Hylton wrote: > The more practical complaint is that list comprehensions use the same > namespace as the block that contains them. > ... but I suspect we're stuck with the > current behavior for backwards compatibility reasons. 
There will be no backwards compatibility in 3.0, so perhaps this could be fixed then? Greg From guido at python.org Sat Feb 25 01:48:23 2006 From: guido at python.org (Guido van Rossum) Date: Fri, 24 Feb 2006 18:48:23 -0600 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FFA597.3090804@canterbury.ac.nz> References: <7e9b97090602202029v2af11e53vafef43e9ba25698b@mail.gmail.com> <43FBFED5.5080809@canterbury.ac.nz> <7e9b97090602212336ka0b5fd8r2c85b1c0e914aff1@mail.gmail.com> <43FCD757.7030006@strakt.com> <43FD393A.80209@canterbury.ac.nz> <20060223212402.GM23859@xs4all.nl> <43FEAD96.3000300@canterbury.ac.nz> <1259B2EC-DF60-4C7B-84E9-83C6DA665A61@fuhm.net> <43FFA597.3090804@canterbury.ac.nz> Message-ID: On 2/24/06, Greg Ewing wrote: > Jeremy Hylton wrote: > > The more practical complaint is that list comprehensions use the same > > namespace as the block that contains them. > > ... but I suspect we're stuck with the > > current behavior for backwards compatibility reasons. > > There will be no backwards compatibility in 3.0, > so perhaps this could be fixed then? Yes that's the plan. [f(x) for x in S] will be syntactic sugar for list(f(x) for x in S) which already avoids the scope problem. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rrr at ronadam.com Sat Feb 25 01:47:04 2006 From: rrr at ronadam.com (Ron Adam) Date: Fri, 24 Feb 2006 18:47:04 -0600 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org> Message-ID: <43FFA908.6040208@ronadam.com> Neil Schemenauer wrote: > Michael Hoffman wrote: >> Am I the only one who finds the use of "self" on a classmethod to be >> incredibly confusing? Can we please follow PEP 8 and use "cls" >> instead? > > Sorry, using "self" was an oversight. It should be "cls", IMO. 
> > Neil IMO2 Why was it decided that the unicode encoding argument should be ignored if the first argument is a string? Wouldn't an exception be better rather than give the impression it does something when it doesn't? Ron From nas at arctrix.com Sat Feb 25 02:14:37 2006 From: nas at arctrix.com (Neil Schemenauer) Date: Sat, 25 Feb 2006 01:14:37 +0000 (UTC) Subject: [Python-Dev] Pre-PEP: The "bytes" object References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org> <43FFA908.6040208@ronadam.com> Message-ID: Ron Adam wrote: > Why was it decided that the unicode encoding argument should be ignored > if the first argument is a string? Wouldn't an exception be better > rather than give the impression it does something when it doesn't? >From the PEP: There is no sane meaning that the encoding can have in that case. str objects *are* byte arrays and they know nothing about the encoding of character data they contain. We need to assume that the programmer has provided str object that already uses the desired encoding. Raising an exception would be a valid option. However, passing the string through unchanged makes the transition from str to bytes easier. Neil From tim.peters at gmail.com Sat Feb 25 06:12:19 2006 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 24 Feb 2006 23:12:19 -0600 Subject: [Python-Dev] Dropping support for Win9x in 2.6 In-Reply-To: <20060224200914.GA18542@activestate.com> References: <20060224200914.GA18542@activestate.com> Message-ID: <1f7befae0602242112p3e051e3pfa65f2868bea034@mail.gmail.com> [Neal Norwitz] >> Martin and I were talking about dropping support for older versions of >> Windows (of the non-NT flavor). We both thought that it was >> reasonable to stop supporting Win9x (including WinME) in Python 2.6. >> I updated PEP 11 to reflect this. 
It's OK by me, but I have the same question as Trent: [Trent Mick] > Are there specific code areas in mind that would be ripped out for this > or is this mainly to avoid having to test on and ensure new code is > compatible with? Seem unlikely it's the latter, since I'm not sure any Python developer tests on a pre-NT Windows anymore anyway. Maybe Raymond is still running WinME? About the former, I don't see much potential. The ugliest 9x-ism is w9xpopen.exe, but comments in the places it's used say it's needed on NT too if the user is running command.com. If so, it stays. There's a bit of excruciating Win9x-specific code in winsound.c that could go away, and I suppose we could assume that Unicode filenames are always supported on Windows. Maybe best is that if someone reports a Win9x-specific bug against 2.6+, we could close it as Won't-Fix at once instead of letting it sit around ignored for years :-) From rrr at ronadam.com Sat Feb 25 07:23:59 2006 From: rrr at ronadam.com (Ron Adam) Date: Sat, 25 Feb 2006 00:23:59 -0600 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org> <43FFA908.6040208@ronadam.com> Message-ID: <43FFF7FF.9050301@ronadam.com> Neil Schemenauer wrote: > Ron Adam wrote: >> Why was it decided that the unicode encoding argument should be ignored >> if the first argument is a string? Wouldn't an exception be better >> rather than give the impression it does something when it doesn't? > >>From the PEP: > > There is no sane meaning that the encoding can have in that > case. str objects *are* byte arrays and they know nothing about > the encoding of character data they contain. We need to assume > that the programmer has provided str object that already uses > the desired encoding. > > Raising an exception would be a valid option. However, passing the > string through unchanged makes the transition from str to bytes > easier. 
> > Neil I guess I'm concerned that if the string isn't already in the specified encoding it could pass through without complaining and not be encoded as expected. >>> b.bytes(u'abc', 'hex-codec') bytes([54, 49, 54, 50, 54, 51]) >>> b.bytes('abc', 'hex-codec') bytes([97, 98, 99]) # not hex If this was in a function I would need to do a check of some sort anyway, or cast to unicode beforehand, or encode beforehand. Which negates the advantage of having the codec argument in bytes, unfortunately. def hexabyte(s): s = unicode(s) return bytes(s, 'hex-codec') or def hexabyte(s): s = s.encode('hex-codec') return bytes(s) It seems to me that if you are specifying a codec for bytes, then you will not be expecting to get an already encoded string, and if you do, it may not be in the codec you want, since you are probably not specifying the default codec. Ron From martin at v.loewis.de Sat Feb 25 14:22:35 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Sat, 25 Feb 2006 14:22:35 +0100 Subject: [Python-Dev] Dropping support for Win9x in 2.6 In-Reply-To: <20060224200914.GA18542@activestate.com> References: <20060224200914.GA18542@activestate.com> Message-ID: <1140873755.44005a1bd2f70@www.domainfactory-webmail.de> Quoting Trent Mick: > Are there specific code areas in mind that would be ripped out for this > or is this mainly to avoid having to test on and ensure new code is > compatible with? Primarily the non-W versions of the file system API. I think the W9x popen support could also go away. I don't think any testing happens for W9x; I (at least) can't test it myself (I installed a Windows 95 system to test the 2.4 installer, but had to give up the machine shortly after that). Regards, Martin From stephen at xemacs.org Sat Feb 25 19:05:38 2006 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Sun, 26 Feb 2006 03:05:38 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FFA2C9.4010204@canterbury.ac.nz> (Greg Ewing's message of "Sat, 25 Feb 2006 13:20:25 +1300") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> <43FFA2C9.4010204@canterbury.ac.nz> Message-ID: <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing writes: Greg> Stephen J. Turnbull wrote: >> the kind of "text" for which Unicode was designed is normally >> produced and consumed by people, who wll pt up w/ ll knds f >> nnsns. Base64 decoders will not put up with the same kinds of >> nonsense that people will. Greg> The Python compiler won't put up with that sort of nonsense Greg> either. Would you consider that makes Python source code Greg> binary data rather than text, and that it's inappropriate to Greg> represent it using a unicode string? The reason that Python source code is text is that the primary producers/consumers of Python source code are human beings, not compilers. There are no such human producers/consumers of base64. Unless you prefer that I expressed that last sentence as "VGhlIHJlYXNvbiB0aG F0IFB5dGhvbiBzb3VyY2UgY29kZSBpcyB0ZXh0IGlzIGJlY2F1c2UgdGhlIHByaW1 hcnkKcHJvZHVjZXJzL2NvbnN1bWVycyBvZiBQeXRob24gc291cmNlIGNvZGUgYXJl IGh1bWFuIGJlaW5ncywgbm90CmNvbXBpbGVycy4="? 
>> You're basically assuming that the person who implements the >> code that processes a Unicode string is the same person who >> implemented the code that converts a binary object into base64 >> and inserts it into a string. Greg> No, I'm assuming the user of base64 knows the Greg> characteristics of the channel he's using. Yes, which implies that you assume he has control of the data all the way to the channel that actually requires base64. Use case: the Gnus MUA supports the RFC that allows non-ASCII names in MIME headers that take file names. The interface was written for message-at-a-time use, which makes sense for composition. Somebody else added "save and strip part" editing capability, but this only works one MIME part at a time. So if you have a message with four MIME parts and you save and strip all of them, the first one gets encoded four times. The reason for *this* bug, and scores like it over the years, is that somebody made it convenient to put wire protocols into a text document. Shouldn't Python do better than that? Shouldn't Python text be for humans, rather than be whatever had the tag "character" attached to it for convenience of definition of a protocol for communication of data humans can't process without mechanical assistance? >> I don't think it's a good idea to gratuitously introduce wire >> protocols as unicode codecs, Greg> I am *not* saying that base64 is a unicode codec! If that's Greg> what you thought I was saying, it's no wonder we're Greg> confusing each other. I know you don't think that it's a duck, but it waddles and quacks. Ie, the question is not what I think you're saying. It's "what is the Python compiler/interpreter going to think?" AFAICS, it's going to think that base64 is a unicode codec. Greg> The only time I need to use something like base64 is when I Greg> have something that will only accept text. 
In Py3k, "accepts Greg> text" is going to mean "takes a character string as input", Characters are inherently abstract; as a class they can't be instantiated as input or output---only derived (ie, encoded) characters can. I don't believe that "takes a character string as input" has any intrinsic meaning. Greg> Does that make it clearer what I'm getting at? No. I already understood what you're getting at. As I said, I'm sympathetic in principle. In practice, I think it's a loaded gun aimed at my foot. And yours. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From thomas at xs4all.net Sat Feb 25 19:26:01 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sat, 25 Feb 2006 19:26:01 +0100 Subject: [Python-Dev] PEP 328 Message-ID: <20060225182601.GQ23859@xs4all.nl> Since I implemented[*] PEP 328, Aahz suggested I take over editing the PEP, too, as there were some minor discussion points to add still. I haven't been around for the discussions, though, and it's been a while for everyone else, I think, so I'd like to rehash and ask for any other open points. The one open point that Aahz forwarded me, and is expressed somewhat in http://mail.python.org/pipermail/python-dev/2004-September/048695.html , is the case where you have a package that you want to transparently supply a particular version of a module for forward/backward compatibility, replacing a version elsewhere on sys.path (if any.) I see four distinct situations for this: 1) Replacing a stdlib module (or a set of them) with a newer version, if the stdlib module is too old, where you want the whole stdlib to use the newer version. 2) Same as 1), but private to your package; modules not in your package should get the stdlib version when they import the 'replaced' module.
3) Providing a module (or a set of them) that the stdlib might be missing (but which will be a new enough version if it's there) 1) and 3) are easy to solve: put the module in a separate directory, insert that into sys.path; at the front for 1), at the end for 3). Mailman, IIRC, does this, and I think it works fine. 2) is easy if it's a single module; include it in your package and import it relatively. If it's a package itself, it's again pretty easy; include the package and import it relatively. The package itself is hopefully already using relative imports to get sibling packages. If the package is using absolute imports to get sibling packages, well, crap. I don't think we can solve that issue whatever we do: that already breaks. The real problem with 2) is when you have tightly coupled modules that are not together in a package and not using relative imports, or perhaps when you want to *partially* override a package. I would argue that tightly coupled modules should always use relative imports, whether they are together in a package or not (even though they should probably be in a package anyway.) I'd also argue that having different modules import different versions of existing modules is a bad idea. It's workable if the modules are only used internally, but exposing anything is troublesome. For instance, an instance of a class defined in foo (1.0) imported by bar will not be an instance of the same class defined in foo (1.1) imported by feeble. Am I missing anything? ([*] incorrectly, to be sure, but I have a 'correct' version ready that I'll upload in a second; I was trying to confuse Guido into accepting my version, instead.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From stephen at xemacs.org Sat Feb 25 20:44:14 2006 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Sun, 26 Feb 2006 04:44:14 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <43FF8CA8.9020509@ronadam.com> (Ron Adam's message of "Fri, 24 Feb 2006 16:46:00 -0600") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <43FCC984.8020207@ronadam.com> <8764n6mw82.fsf@tleepslib.sk.tsukuba.ac.jp> <43FD61D2.1080200@ronadam.com> <87fym9jc55.fsf@tleepslib.sk.tsukuba.ac.jp> <43FF8CA8.9020509@ronadam.com> Message-ID: <87hd6ngr5d.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Ron" == Ron Adam writes: Ron> So, lets consider a "codec" and a "coding" as being two Ron> different things where a codec is a character sub set of Ron> unicode characters expressed in a native format. And a Ron> coding is *not* a subset of the unicode character set, but an Ron> _opperation_ performed on text. Ron> codec -> text is always in *one_codec* at any time. No, a codec is an operation, not a state. And text qua text has no need of state; the whole point of defining text (as in the unicode type) is to abstract from such representational issues. Ron> Pure codecs such as latin-1 can be envoked over and over and Ron> you can always get back what you put in in a single step. Maybe you'd like to define them that way, but it doesn't work in general. Given that str and unicode currently don't carry state with them, it's not possible for "to ASCII" and "to EBCDIC" to be idempotent at the same time. And for the languages spoken by 75% of the world's population, "to latin-1" cannot be successfully invoked even once, let alone be idempotent. 
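Stephen's idempotence point is easy to make concrete in today's Python: latin-1 round-trips only for text it can represent at all, and for Russian text the very first invocation fails, while a codec designed for Russian handles it (a small sketch; the Cyrillic sample string is mine):

```python
# Round-tripping through latin-1 is harmless for text it can represent...
s = 'caf\xe9'  # 'café'
assert s.encode('latin-1').decode('latin-1') == s

# ...but for Russian text the *first* call already raises,
# while a codec designed for Russian round-trips it fine.
ru = '\u0434\u0430'  # 'да'
failed = False
try:
    ru.encode('latin-1')
except UnicodeEncodeError:
    failed = True
assert failed
assert ru.encode('koi8_r').decode('koi8_r') == ru
```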
You really need to think about how your examples apply to codecs like KOI8-R for Russian and Shift JIS for Japanese. In practice, I just don't think you can distinguish "codecs" from "coding" using the kind of mathematical properties you have described here. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From skip at pobox.com Sat Feb 25 21:26:18 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 25 Feb 2006 14:26:18 -0600 Subject: [Python-Dev] cProfile prints to stdout? Message-ID: <17408.48490.846187.170007@montanaro.dyndns.org> I just noticed that cProfile (like profile) prints to stdout. Yuck. I guess that's to be expected because the pstats module does the actual printing and it's used by both modules. I'm willing to give up backward compatibility to achieve a little more sanity and flexibility here. I propose rewriting the necessary bits to add a stream= keyword argument where necessary and using stream.write(...) or print >> stream, ... instead of the current bare print. I'd prefer the default for the stream be sys.stderr as well. Thoughts? Skip From brett at python.org Sat Feb 25 21:35:22 2006 From: brett at python.org (Brett Cannon) Date: Sat, 25 Feb 2006 12:35:22 -0800 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: <17408.48490.846187.170007@montanaro.dyndns.org> References: <17408.48490.846187.170007@montanaro.dyndns.org> Message-ID: On 2/25/06, skip at pobox.com wrote: > I just noticed that cProfile (like profile) prints to stdout. Yuck. I > guess that's to be expected because the pstats module does the actual > printing and it's used by both modules. I'm willing to give up backward > compatibility to achieve a little more sanity and flexibility here.
I > propose rewriting the necessary bits to add a stream= keyword argument where > necessary and using stream.write(...) or print >> stream, ... instead of the > current bare print. I'd prefer the default for the stream be sys.stderr as > well. > > Thoughts? +0 from me (would be +1 since it seems very reasonable, but I never use profile and this will break some code somewhere). -Brett From python at rcn.com Sat Feb 25 21:39:24 2006 From: python at rcn.com (Raymond Hettinger) Date: Sat, 25 Feb 2006 14:39:24 -0600 Subject: [Python-Dev] cProfile prints to stdout? References: <17408.48490.846187.170007@montanaro.dyndns.org> Message-ID: <001e01c63a4b$97bfde90$c913020a@RaymondLaptop1> [Skip] >I just noticed that cProfile (like profile) prints to stdout. Yuck. I > guess that's to be expected because the pstats module does the actual > printing and it's used by both modules. I'm willing to give up backward > compatibility to achieve a little more sanity and flexibility here. I > propose rewriting the necessary bits to add a stream= keyword argument where > necessary and using stream.write(...) or print >> stream, ... instead of the > current bare print. I'd prefer the default for the stream be sys.stderr as > well. > > Thoughts? FWIW, this idea has come up a couple of times before, so it should probably get fixed once and for all. Raymond From g.brandl at gmx.net Sat Feb 25 21:48:06 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 25 Feb 2006 21:48:06 +0100 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: <17408.48490.846187.170007@montanaro.dyndns.org> References: <17408.48490.846187.170007@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > I just noticed that cProfile (like profile) prints to stdout. Yuck. I > guess that's to be expected because the pstats module does the actual > printing and it's used by both modules. I'm willing to give up backward > compatibility to achieve a little more sanity and flexibility here.
I > propose rewriting the necessary bits to add a stream= keyword argument where > necessary and using stream.write(...) or print >> stream, ... instead of the > current bare print. I'd prefer the default for the stream be sys.stderr as > well. Probably related: http://python.org/sf/1235266 Cheers, Georg From skip at pobox.com Sat Feb 25 21:57:42 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 25 Feb 2006 14:57:42 -0600 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: References: <17408.48490.846187.170007@montanaro.dyndns.org> Message-ID: <17408.50374.360818.741500@montanaro.dyndns.org> Georg> Probably related: Georg> http://python.org/sf/1235266 Don't think so. That was just a documentation nit (and is now fixed and closed at any rate). Skip From g.brandl at gmx.net Sat Feb 25 21:59:18 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 25 Feb 2006 21:59:18 +0100 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: <17408.50374.360818.741500@montanaro.dyndns.org> References: <17408.48490.846187.170007@montanaro.dyndns.org> <17408.50374.360818.741500@montanaro.dyndns.org> Message-ID: skip at pobox.com wrote: > Georg> Probably related: > > Georg> http://python.org/sf/1235266 > > Don't think so. That was just a documentation nit (and is now fixed and > closed at any rate). Well, it is another module that prints to stdout instead of stderr. Okay, not so closely related ;) Georg From guido at python.org Sat Feb 25 22:31:56 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Feb 2006 15:31:56 -0600 Subject: [Python-Dev] PEP 328 In-Reply-To: <20060225182601.GQ23859@xs4all.nl> References: <20060225182601.GQ23859@xs4all.nl> Message-ID: On 2/25/06, Thomas Wouters wrote: > > Since I implemented[*] PEP 328, Aahz suggested I take over editing the PEP, > too, as there were some minor discussion points to add still.
I haven't been > around for the discussions, though, and it's been a while for everyone else, > I think, so I'd like to rehash and ask for any other open points. > > The one open point that Aahz forwarded me, and is expressed somewhat in > http://mail.python.org/pipermail/python-dev/2004-September/048695.html , is > the case where you have a package that you want to transparently supply a > particular version of a module for forward/backward compatibility, replacing > a version elsewhere on sys.path (if any.) I see four distinct situations for > this: > > 1) Replacing a stdlib module (or a set of them) with a newer version, if the > stdlib module is too old, where you want the whole stdlib to use the > newer version. > > 2) Same as 1), but private to your package; modules not in your package > should get the stdlib version when they import the 'replaced' module. > > 3) Providing a module (or a set of them) that the stdlib might be missing > (but which will be a new enough version if it's there) > > 1) and 3) are easy to solve: put the module in a separate directory, insert > that into sys.path; at the front for 1), at the end for 3). Mailman, IIRC, > does this, and I think it works fine. > > 2) is easy if it's a single module; include it in your package and import it > relatively. If it's a package itself, it's again pretty easy; include the > package and include it relatively. The package itself is hopefully already > using relative imports to get sibling packages. If the package is using > absolute imports to get sibling packages, well, crap. I don't think we can > solve that issue whatever we do: that already breaks. > > The real problem with 2) is when you have tightly coupled modules that are > not together in a package and not using relative imports, or perhaps when > you want to *partially* override a package.
I would argue that tightly > coupled modules should always use relative imports, whether they are > together in a package or not (even though they should probably be in a > package anyway.) I'd also argue that having different modules import > different versions of existing modules is a bad idea. It's workable if the > modules are only used internally, but exposing anything is troublesome. For > instance, an instance of a class defined in foo (1.0) imported by bar will > not be an instance of the same class defined in foo (1.1) imported by > feeble. > > Am I missing anything? > > ([*] incorrectly, to be sure, but I have a 'correct' version ready that I'll > upload in a second; I was trying to confuse Guido into accepting my version, > instead.) One thing you're missing here is that the original assertion about the impossibility of editing the source code of the third-party package that's being incorporated into your distribution, is simply wrong. Systematically modifying all modules in a package to change their imports to assume a slightly different hierarchy can easily be done mechanically. I'd also add that eggs promise to provide a different solution for most concerns. I believe we should go ahead and implement PEP 328 faithfully without revisiting the decisions. If we were wrong (which I doubt) we'll have the opportunity to take a different direction in 2.6. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Feb 25 22:36:33 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Feb 2006 15:36:33 -0600 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: <17408.48490.846187.170007@montanaro.dyndns.org> References: <17408.48490.846187.170007@montanaro.dyndns.org> Message-ID: On 2/25/06, skip at pobox.com wrote: > I just noticed that cProfile (like profile) prints to stdout. Yuck. Can you clarify? Why is it wrong to send the output of the profiler to stdout?
It seems to make sense to me that you should be able to redirect the profiler's output to a file with a simple ">file". -- --Guido van Rossum (home page: http://www.python.org/~guido/) From facundobatista at gmail.com Sat Feb 25 22:59:49 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Sat, 25 Feb 2006 18:59:49 -0300 Subject: [Python-Dev] Translating docs Message-ID: After a small talk with Raymond, yesterday at breakfast, I proposed in PyAr the idea of starting to translate the Library Reference. You'll agree with me that this is a BIG effort. But not only big, it's dynamic! So, we decided that we need a system that provides us with management of the translations. And it'd be a good idea for the system to be available for translations in other languages. One of the guys proposed to use Launchpad (https://launchpad.net/). The question is, it's ok to use a third party system for this initiative? Or you (we) prefer to host it in-house? Someone already thought of this? Thank you! . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From tjreedy at udel.edu Sat Feb 25 23:00:05 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 25 Feb 2006 17:00:05 -0500 Subject: [Python-Dev] PEP 328 References: <20060225182601.GQ23859@xs4all.nl> Message-ID: "Thomas Wouters" wrote in message news:20060225182601.GQ23859 at xs4all.nl... > The one open point that Aahz forwarded me, and is expressed somewhat in > http://mail.python.org/pipermail/python-dev/2004-September/048695.html , > is > the case where you have a package that you want to transparently supply a > particular version of a module for forward/backward compatibility, > replacing > a version elsewhere on sys.path (if any.) I see four distinct situations > for > this: Did you mean three? > 1) Replacing a stdlib module (or a set of them) with a newer version, if > the > stdlib module is too old, where you want the whole stdlib to use the > newer version.
> > 2) Same as 1), but private to your package; modules not in your package > should get the stdlib version when they import the 'replaced' module. > > 3) Providing a module (or a set of them) that the stdlib might be missing > (but which will be a new enough version if it's there) Or did you forget the fourth? In any case, the easy solution to breaking code is to not do it until 3.0. There might never be a 2.7 to worry about. Terry Jan Reedy From skip at pobox.com Sat Feb 25 23:14:04 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 25 Feb 2006 16:14:04 -0600 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: References: <17408.48490.846187.170007@montanaro.dyndns.org> Message-ID: <17408.54956.508734.992814@montanaro.dyndns.org> >> I just noticed that cProfile (like profile) prints to stdout. Yuck. Guido> Can you clarify? Why is it wrong to send the output of the Guido> profiler to stdout? If the program emits a bunch of output of its own I want to keep it separate from what is arguably the debug output of the profiler (even though the profiler prints all its output at the end): foo.py > /dev/null 2> foo.prof Guido> It seems to make sense to me that you should be able to redirect Guido> the profiler's output to a file with a simple ">file". It is currently impossible to separate profile output from the program's output. Skip From guido at python.org Sat Feb 25 23:42:03 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Feb 2006 16:42:03 -0600 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: <17408.54956.508734.992814@montanaro.dyndns.org> References: <17408.48490.846187.170007@montanaro.dyndns.org> <17408.54956.508734.992814@montanaro.dyndns.org> Message-ID: On 2/25/06, skip at pobox.com wrote: > > >> I just noticed that cProfile (like profile) prints to stdout. Yuck. > > Guido> Can you clarify? Why is it wrong to send the output of the > Guido> profiler to stdout? 
> > If the program emits a bunch of output of its own I want to keep it separate > from what is arguably the debug output of the profiler (even though the > profiler prints all its output at the end): > > foo.py > /dev/null 2> foo.prof > > Guido> It seems to make sense to me that you should be able to redirect > Guido> the profiler's output to a file with a simple ">file". > > It is currently impossible to separate profile output from the program's > output. It is if you use the "advanced" use of the profiler -- the profiling run just saves the profiling data to a file, and the pstats module invoked separately prints the output. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Feb 26 00:16:21 2006 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Feb 2006 17:16:21 -0600 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> Message-ID: FWIW this has now been checked in. Enjoy! --Guido On 2/23/06, Guido van Rossum wrote: > On 2/22/06, Michael Chermside wrote: > > A minor related point about on_missing(): > > > > Haven't we learned from regrets over the .next() method of iterators > > that all "magically" invoked methods should be named using the __xxx__ > > pattern? Shouldn't it be named __on_missing__() instead? > > Good point. I'll call it __missing__. I've uploaded a new patch to > python.org/sf/1433928. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Sun Feb 26 00:18:54 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 26 Feb 2006 12:18:54 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> <43FFA2C9.4010204@canterbury.ac.nz> <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4400E5DE.3070102@canterbury.ac.nz> Stephen J. Turnbull wrote: > The reason that Python source code is text is that the primary > producers/consumers of Python source code are human beings, not > compilers I disagree with "primary" -- I think human and computer use of source code have equal importance. Because of the fact that Python source code must be acceptable to the Python compiler, a great many transformations that would be harmless to English text (upper casing, paragraph wrapping, etc.) would cause disaster if applied to a Python program. I don't see how base64 is any different. > Yes, which implies that you assume he has control of the data all the > way to the channel that actually requires base64. Yes. If he doesn't, he can't safely use base64 at all. That's true regardless of how the base64-encoded data is represented. It's true of any data of any kind. > Use case: the Gnus MUA supports the RFC that allows non-ASCII names in > MIME headers that take file names... 
I'm not familiar with all the details you're alluding to here, but if there's a bug here, I'd say it's due to somebody not thinking something through properly. It shouldn't matter if something gets encoded four times as long as it gets decoded four times at the other end. If it's not possible to do that, someone made an assumption about the channel that wasn't true. > It's "what is the Python compiler/interpreter going > to think?" AFAICS, it's going to think that base64 is > a unicode codec. Only if it's designed that way, and I specifically think it shouldn't -- i.e. it should be an error to attempt the likes of a_unicode_string.encode("base64") or unicode(something, "base64"). The interface for doing base64 encoding should be something else. > I don't believe that "takes a character string as > input" has any intrinsic meaning. I'm using that phrase in the context of Python, where it means "a function that takes a Python character string as input". In the particular case of base64, it has the added restriction that it must preserve the particular 65 characters used. > In practice, I think it's a loaded gun > aimed at my foot. And yours. Whereas it seems quite the opposite to me, i.e. *failing* to clearly distinguish between text and binary data here is what will lead to confusion and foot-shooting. I think we need some concrete use cases to talk about if we're to get any further with this. Do you have any such use cases in mind? Greg From skip at pobox.com Sun Feb 26 01:13:04 2006 From: skip at pobox.com (skip at pobox.com) Date: Sat, 25 Feb 2006 18:13:04 -0600 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: References: <17408.48490.846187.170007@montanaro.dyndns.org> <17408.54956.508734.992814@montanaro.dyndns.org> Message-ID: <17408.62096.624270.517776@montanaro.dyndns.org> >> It is currently impossible to separate profile output from the >> program's output. 
Guido> It is if you use the "advanced" use of the profiler -- the Guido> profiling run just saves the profiling data to a file, and the Guido> pstats module invoked separately prints the output. Sure, but then it's not "simple". Your original example was "... > file". I'd like it to be (nearly) as easy to do it right yet keep it simple. Skip From martin at v.loewis.de Sun Feb 26 01:43:34 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Sun, 26 Feb 2006 01:43:34 +0100 Subject: [Python-Dev] Translating docs In-Reply-To: References: Message-ID: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> Zitat von Facundo Batista : > The question is, it's ok to use a third party system for this > initiative? Or you (we) prefer to host it in-house? Someone already > thought of this? I thought about it at one time, and I think the doc strings can be translated very well using gettext-based procedures; I once submitted a POT file to the translation project: http://www.iro.umontreal.ca/translation/registry.cgi?domain=python Translating the library reference as such is more difficult, because it can't be translated in small chunks very well. Some group of French translators once translated everything for 1.5.2, and that translation never got updated. Regards, Martin From aleaxit at gmail.com Sun Feb 26 02:34:41 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Sat, 25 Feb 2006 17:34:41 -0800 Subject: [Python-Dev] Translating docs In-Reply-To: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> References: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> Message-ID: On Feb 25, 2006, at 4:43 PM, martin at v.loewis.de wrote: > Zitat von Facundo Batista : >> The question is, it's ok to use a third party system for this >> initiative? Or you (we) prefer to host it in-house? Someone already >> thought of this?
> > I thought about it at one time, and I think the doc strings can be > translated very well using gettext-based procedures; I once submitted > a POT file to the translation project: > > http://www.iro.umontreal.ca/translation/registry.cgi?domain=python > > Translating the library reference as such is more difficult, because > it can't be translated in small chunks very well. > > Some group of French translators once translated everything for 1.5.2, > and that translation never got updated. A similar situation applies to Italy -- a lot of stuff is translated at http://www.python.it/doc/Python-Docs/html/ (the C-API and Extending and Embedding aren't translated, though), but it's 2.3.4- vintage docs. There's no real mechanism or process to ensure updates. Alex From rrr at ronadam.com Sun Feb 26 02:50:24 2006 From: rrr at ronadam.com (Ron Adam) Date: Sat, 25 Feb 2006 19:50:24 -0600 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <4400E5DE.3070102@canterbury.ac.nz> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> <43FFA2C9.4010204@canterbury.ac.nz> <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> <4400E5DE.3070102@canterbury.ac.nz> Message-ID: <44010960.3060906@ronadam.com> Greg Ewing wrote: > Stephen J. Turnbull wrote: >> It's "what is the Python compiler/interpreter going > > to think?" AFAICS, it's going to think that base64 is > > a unicode codec. 
> > Only if it's designed that way, and I specifically > think it shouldn't -- i.e. it should be an error > to attempt the likes of a_unicode_string.encode("base64") > or unicode(something, "base64"). The interface for > doing base64 encoding should be something else. I agree > I think we need some concrete use cases to talk > about if we're to get any further with this. Do > you have any such use cases in mind? > > Greg Or at least somewhere more concrete than trying to debate abstract meanings. ;-) Running a short test over all the codecs and converting u"Python" to string and back to unicode resulted in the following output. These are the only ones of the 92 that couldn't do the round trip successfully. It seems to me these will need to be moved and/or made to work with unicode at some point.

Test 1: UNICODE -> STRING -> UNICODE

1: 'bz2_codec'
Traceback (most recent call last):
  File "codectest.py", line 29, in test1
    u2 = unicode(s, c) # to unicode
TypeError: decoder did not return an unicode object (type=str)

2: 'hex_codec'
Traceback (most recent call last):
  File "codectest.py", line 29, in test1
    u2 = unicode(s, c) # to unicode
TypeError: decoder did not return an unicode object (type=str)

3: 'uu_codec'
Traceback (most recent call last):
  File "codectest.py", line 29, in test1
    u2 = unicode(s, c) # to unicode
TypeError: decoder did not return an unicode object (type=str)

4: 'quopri_codec'
Traceback (most recent call last):
  File "codectest.py", line 29, in test1
    u2 = unicode(s, c) # to unicode
TypeError: decoder did not return an unicode object (type=str)

5: 'base64_codec'
Traceback (most recent call last):
  File "codectest.py", line 29, in test1
    u2 = unicode(s, c) # to unicode
TypeError: decoder did not return an unicode object (type=str)

6: 'zlib_codec'
Traceback (most recent call last):
  File "codectest.py", line 29, in test1
    u2 = unicode(s, c) # to unicode
TypeError: decoder did not return an unicode object (type=str)

7: 'tactis'
Traceback (most recent call last):
  File "codectest.py", line 28, in test1
    s = u1.encode(c) # to string
LookupError: unknown encoding: tactis

From facundobatista at gmail.com Sun Feb 26 03:13:51 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Sat, 25 Feb 2006 23:13:51 -0300 Subject: [Python-Dev] Translating docs In-Reply-To: References: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> Message-ID: 2006/2/25, Alex Martelli : > A similar situation applies to Italy -- a lot of stuff is translated > at http://www.python.it/doc/Python-Docs/html/ (the C-API and > Extending and Embedding aren't translated, though), but it's 2.3.4- > vintage docs. There's no real mechanism or process to ensure updates. We don't want that to happen, no. BTW, Alex, too bad you're not here. We miss you, :) . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From facundobatista at gmail.com Sun Feb 26 03:14:20 2006 From: facundobatista at gmail.com (Facundo Batista) Date: Sat, 25 Feb 2006 23:14:20 -0300 Subject: [Python-Dev] Fwd: Translating docs In-Reply-To: References: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> Message-ID: 2006/2/25, martin at v.loewis.de : > Translating the library reference as such is more difficult, because > it can't be translated in small chunks very well. The SVN directory "python/dist/src/Doc/lib/" has 276 .tex's, with an average of 250 lines each. Maybe managing each file independently could work. > Some group of French translators once translated everything for 1.5.2, > and that translation never got updated. We're afraid of this. And that's why we think that it'd be necessary to have some automated system that tells us if the original file got updated, if there're new files to translate, to show the state of the translation (in process, finished, not even started, etc...). I think that a system like this is not so difficult, but if the wheel is already invented... .
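[In hindsight on the codec round-trip test upthread: the transforms that failed it (bz2, hex, uu, quopri, base64, zlib) are all bytes-to-bytes operations, and Python 3 ended up drawing roughly the line Greg argues for -- base64 is handled by the base64 module as a bytes-to-bytes transform, not as a text codec. A minimal sketch:]

```python
import base64

raw = "Python".encode("utf-8")     # text -> bytes is an explicit codec step
enc = base64.b64encode(raw)        # bytes -> bytes (ASCII octets)
print(enc)                         # -> b'UHl0aG9u'

round_trip = base64.b64decode(enc)
print(round_trip == raw)           # -> True

# The wire form can still be *viewed* as text where a header needs it,
# but that is a separate, explicit decode:
header_text = enc.decode("ascii")
print(header_text)                 # -> UHl0aG9u
```

[Keeping the base64 payload as bytes until the last moment is essentially Stephen's point 2: carry blobs opaquely and let the wire-protocol code do the conversion.]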
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From almann.goo at gmail.com Sun Feb 26 07:46:21 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Sun, 26 Feb 2006 01:46:21 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> Message-ID: <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> On 2/23/06, Steven Bethard wrote: > On 2/22/06, Almann T. Goo wrote:
> > def incrementer_getter(val):
> >     def incrementer():
> >         val = 5
> >         def inc():
> >             ..val += 1
> >             return val
> >         return inc
> >     return incrementer
> > Sorry, what way did the user think? I'm not sure what you think was > supposed to happen. My apologies ... I shouldn't use vague terms like what the "user thinks." My problem, as is demonstrated in the above example, is that the implicit nature of evaluating a name in Python conflicts with the explicit nature of the proposed "dot" notation. It makes it easier for a user to write obscure code (until Python 3K when we force users to use "dot" notation for all enclosing scope access ;-) ). This sort of thing can be done today with code using attribute access on its module object to evaluate and rebind global names. With the "global" keyword however, users don't have to resort to this sort of trick. Because of Python's name binding semantics and the semantic for the "global" keyword, I think the case for an "outer"-type keyword is stronger and we could deprecate "global" going forward in Python 3K.
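[The enclosing-scope rebinding debated in this thread is, in hindsight, what PEP 3104 later gave Python 3 as the nonlocal keyword -- not available at the time of these messages. A minimal sketch of the "outer"-style semantics under discussion:]

```python
# A counter closure that rebinds a name in the enclosing function's scope,
# using the nonlocal keyword Python 3 eventually adopted for this.
def incgen(start, inc):
    val = start - inc
    def incrementer():
        nonlocal val      # rebind the enclosing function's 'val'
        val += inc
        return val
    return incrementer

f = incgen(100, 2)
g = incgen(200, 3)
print([f() for _ in range(3)])    # -> [100, 102, 104]
print([g() for _ in range(3)])    # -> [200, 203, 206]
```

[Without the nonlocal line, `val += inc` would make `val` a local of `incrementer` and raise UnboundLocalError -- the exact limitation the thread is trying to lift.]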
One of the biggest points of contention to this is of course the backwards incompatibility with a new keyword ... Python has already recently added "yield" and we're about to get "with" and "as" in 2.5. As far as the "user-interface" of the language getting bloated, I personally think trading "global" for an "outer" mitigates that some. -Almann -- Almann T. Goo almann.goo at gmail.com From greg.ewing at canterbury.ac.nz Sun Feb 26 07:48:03 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 26 Feb 2006 19:48:03 +1300 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> Message-ID: <44014F23.80409@canterbury.ac.nz> Almann T. Goo wrote: > One of the biggest points of contention to this is of course the > backwards incompatibility with a new keyword ... Alternatively, 'global' could be redefined to mean what we're thinking of for 'outer'. Then there would be no change in keywordage. There would be potential for breaking code, but I suspect the actual amount of breakage would be small, since there would have to be 3 scopes involved, with something in the middle one shadowing a global that was referenced in the inner one with a global statement. Given the rarity of global statement usage to begin with, I'd say that narrows things down to something well within the range of acceptable breakage in 3.0. Greg From almann.goo at gmail.com Sun Feb 26 08:07:32 2006 From: almann.goo at gmail.com (Almann T. 
Goo) Date: Sun, 26 Feb 2006 02:07:32 -0500 Subject: [Python-Dev] PEP for Better Control of Nested Lexical Scopes In-Reply-To: <43FD39A9.6090108@canterbury.ac.nz> References: <20060220204850.6004.JCARLSON@uci.edu> <43FAE45F.3020000@canterbury.ac.nz> <20060221105309.601F.JCARLSON@uci.edu> <43FCFECB.9010101@canterbury.ac.nz> <7e9b97090602221712k29016577te0d87ebb504949d5@mail.gmail.com> <43FD39A9.6090108@canterbury.ac.nz> Message-ID: <7e9b97090602252307y71887238j9b93d0da35bf7f4b@mail.gmail.com> On 2/22/06, Greg Ewing wrote: > That's what rankles people about this, I think -- there > doesn't seem to be a good reason for treating the global > scope so specially, given that all scopes could be > treated uniformly if only there were an 'outer' statement. > All the arguments I've seen in favour of the status quo > seem like rationalisations after the fact. I agree, hence my initial pre-PEP feeler on the topic ;). > > Since there were no nested lexical scopes back > > then, there was no need to have a construct for arbitrary enclosing > > scopes. > > However, if nested scopes *had* existed back then, I > rather suspect we would have had an 'outer' statement > from the beginning, or else 'global' would have been > given the semantics we are now considering for 'outer'. Would it not be so horrible to make "global" be the "outer"-type keyword--basically meaning "lexically global" versus "the global scope"? It would make the semantics for Python's nested lexical scopes to be more in line with other languages with this feature and fix my orthogonality gripes. As far as backwards compatibility, I doubt there would be too much impact in this regard, as places that would break would be where "global" was used in a closure where the name was shadowed in an enclosing scope. A "from __future__ import lexical_global" (which we'd have for adding the "outer"-like keyword anyway) could help diminish the growing pains. -Almann -- Almann T. 
Goo almann.goo at gmail.com From almann.goo at gmail.com Sun Feb 26 08:15:20 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Sun, 26 Feb 2006 02:15:20 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <44014F23.80409@canterbury.ac.nz> References: <20060220204850.6004.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> <44014F23.80409@canterbury.ac.nz> Message-ID: <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> On 2/26/06, Greg Ewing wrote: > Alternatively, 'global' could be redefined to mean > what we're thinking of for 'outer'. Then there would > be no change in keywordage. > > There would be potential for breaking code, but I > suspect the actual amount of breakage would be > small, since there would have to be 3 scopes > involved, with something in the middle one > shadowing a global that was referenced in the > inner one with a global statement. > > Given the rarity of global statement usage to begin > with, I'd say that narrows things down to something > well within the range of acceptable breakage in 3.0. You read my mind--I made a reply similar to this on another branch of this thread just minutes ago :). I am curious to see what the community thinks about this. -Almann -- Almann T. 
Goo almann.goo at gmail.com From g.brandl at gmx.net Sun Feb 26 08:50:57 2006 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 26 Feb 2006 08:50:57 +0100 Subject: [Python-Dev] Fwd: Translating docs In-Reply-To: References: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> Message-ID: Facundo Batista wrote: > 2006/2/25, martin at v.loewis.de : > >> Translating the library reference as such is more difficult, because >> it can't be translated in small chunks very well. > > The SVN directory "python/dist/src/Doc/lib/" has 276 .tex's, with an > average of 250 lines each. > > Maybe manage each file independently could work. > > >> Some group of French translators once translated everything for 1.5.2, >> and that translation never got updated. > > We're afraid of this. And that's why we think that it'd be necessary > to have some automated system that tell us if the original file got > updated, if there're new files to translate, to show the state of the > translation (in process, finished, not even started, etc...). > > I think that a system like this is not so difficult, but if the wheel > is already invented... Wouldn't a post-commit hook in SVN be enough? Also, the docs could be managed in a Wiki (or, if the translators know how to use it, in SVN too) so that translators can correct and revise what others have translated... Martin: There aren't any German docs, are there? Georg From 2005a at usenet.alexanderweb.de Sun Feb 26 11:08:23 2006 From: 2005a at usenet.alexanderweb.de (Alexander Schremmer) Date: Sun, 26 Feb 2006 11:08:23 +0100 Subject: [Python-Dev] Fwd: Translating docs References: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> Message-ID: On Sun, 26 Feb 2006 08:50:57 +0100, Georg Brandl wrote: > Martin: There aren't any German docs, are there? There is e.g. 
http://starship.python.net/~gherman/publications/tut-de/ Kind regards, Alexander From martin at v.loewis.de Sun Feb 26 15:30:13 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Sun, 26 Feb 2006 15:30:13 +0100 Subject: [Python-Dev] Fwd: Translating docs In-Reply-To: References: <1140914613.4400f9b60125e@www.domainfactory-webmail.de> Message-ID: <1140964213.4401bb75dfbb7@www.domainfactory-webmail.de> Zitat von Georg Brandl : > Martin: There aren't any German docs, are there? I started translating the doc strings once, but never got to complete it. I still believe that the doc string translation is the only approach that could work in a reasonable way - you would have to use pydoc to view the translations, though. There are, of course, various German books. Regards, Martin From massimiliano.leoni at katamail.com Sun Feb 26 15:27:34 2006 From: massimiliano.leoni at katamail.com (Massimiliano Leoni) Date: Sun, 26 Feb 2006 15:27:34 +0100 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) Message-ID: <7.0.1.0.0.20060226104058.019c63b8@katamail.com> Why would you change the Python scoping rules, instead of using the function attributes, available from release 2.1 (PEP 232)? For example, you may write:

def incgen(start, inc):
    def incrementer():
        incrementer.a += incrementer.b
        return incrementer.a
    incrementer.a = start - inc
    incrementer.b = inc
    return incrementer

f = incgen(100, 2)
g = incgen(200, 3)
for i in range(5):
    print f(), g()

The result is:

100 200
102 203
104 206
106 209
108 212

From stephen at xemacs.org Sun Feb 26 17:05:51 2006 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Mon, 27 Feb 2006 01:05:51 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <4400E5DE.3070102@canterbury.ac.nz> (Greg Ewing's message of "Sun, 26 Feb 2006 12:18:54 +1300") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> <43FFA2C9.4010204@canterbury.ac.nz> <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> <4400E5DE.3070102@canterbury.ac.nz> Message-ID: <87d5hagl5s.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Greg" == Greg Ewing writes:

Greg> I think we need some concrete use cases to talk about if
Greg> we're to get any further with this. Do you have any such use
Greg> cases in mind?

I gave you one, MIME processing in email, and a concrete bug that is possible with the design you propose, but not in mine. You said, "the programmers need to try harder." If that's an acceptable answer, I have to concede it beats any use case I can imagine.

I think it's your turn. Give me a use case where it matters practically that the output of the base64 codec be Python unicode characters rather than 8-bit ASCII characters. I don't think you can. Everything you have written so far is based on defending your maintained assumption that because Python implements text processing via the unicode type, everything that is described as a "character" must be coerced to that type. If you give up that assumption, you get

1. an automatic reduction in base64.upcase() bugs because it's a type error, ie, binary objects are not text objects, no matter what their representation

2. encouragement to programmer teams to carry along binary objects as opaque blobs until they're just about to put them on the wire, then let the wire protocol guy implement the conversion at that point

3. efficiency for a very common case where ASCII octets are the wire representation

4. efficient and clear implementation and documentation using the codec framework and API

I don't really see a downside, except for the occasional double conversion ASCII -> unicode -> UTF-16, as is allowed (but not mandated) in XML's use of base64. What downside do you see? -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From gvanrossum at gmail.com Sun Feb 26 18:12:25 2006 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 26 Feb 2006 11:12:25 -0600 Subject: [Python-Dev] cProfile prints to stdout? In-Reply-To: <17408.62096.624270.517776@montanaro.dyndns.org> References: <17408.48490.846187.170007@montanaro.dyndns.org> <17408.54956.508734.992814@montanaro.dyndns.org> <17408.62096.624270.517776@montanaro.dyndns.org> Message-ID: On 2/25/06, skip at pobox.com wrote:

> >> It is currently impossible to separate profile output from the
> >> program's output.
>
> Guido> It is if you use the "advanced" use of the profiler -- the
> Guido> profiling run just saves the profiling data to a file, and the
> Guido> pstats module invoked separately prints the output.
>
> Sure, but then it's not "simple". Your original example was "... > file".
> I'd like it to be (nearly) as easy to do it right yet keep it simple.

OK.
I believe the default should be stdout though, and the convenience method print_stats() in profile.py should be the only place that references stderr. The smallest code mod would be to redirect stdout temporarily inside print_stats(); but I won't complain if you're more ambitious and modify pstats.py.

def print_stats(self, sort=-1, stream=None):
    import pstats
    if stream is None:
        stream = sys.stderr
    save = sys.stdout
    try:
        if stream is not None:
            sys.stdout = stream
        pstats.Stats(self).strip_dirs().sort_stats(sort). \
            print_stats()
    finally:
        sys.stdout = save

-- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at xs4all.net Sun Feb 26 18:14:18 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Sun, 26 Feb 2006 18:14:18 +0100 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <7.0.1.0.0.20060226104058.019c63b8@katamail.com> References: <7.0.1.0.0.20060226104058.019c63b8@katamail.com> Message-ID: <20060226171418.GR23859@xs4all.nl> On Sun, Feb 26, 2006 at 03:27:34PM +0100, Massimiliano Leoni wrote:

> Why would you change the Python scoping rules, instead of using the
> function attributes, available from release 2.1 (PEP 232)?

Because closures allow for data that isn't trivially reachable by the caller (or anyone but the function itself.) You can argue that that's unpythonic or what not, but fact is that the current closures allow that. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
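The private-state behaviour Thomas describes can be sketched concretely (a minimal illustration, not code from the thread; the names are made up):

```python
# Sketch of the closure-privacy point: `count` lives only in a closure
# cell, so callers of make_counter() get no ordinary name through which
# to reach or rebind it.
def make_counter():
    count = [0]  # mutable container: the classic pre-`nonlocal` idiom
    def counter():
        count[0] += 1
        return count[0]
    return counter

c = make_counter()
print(c(), c(), c())  # 1 2 3 -- the list behind `count` stays hidden
```

A function attribute, by contrast, is plainly visible and writable as `c.attr` from the outside, which is exactly the difference Thomas is pointing at.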
From steven.bethard at gmail.com Sun Feb 26 18:17:31 2006 From: steven.bethard at gmail.com (Steven Bethard) Date: Sun, 26 Feb 2006 10:17:31 -0700 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> References: <20060220204850.6004.JCARLSON@uci.edu> <20060221105309.601F.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> Message-ID: On 2/25/06, Almann T. Goo wrote:

> On 2/23/06, Steven Bethard wrote:
> > On 2/22/06, Almann T. Goo wrote:
> > > def incrementer_getter(val):
> > >     def incrementer():
> > >         val = 5
> > >         def inc():
> > >             ..val += 1
> > >             return val
> > >         return inc
> > >     return incrementer
> >
> > Sorry, what way did the user think? I'm not sure what you think was
> > supposed to happen.
>
> My apologies ... I shouldn't use vague terms like what the "user
> thinks." My problem, as is demonstrated in the above example, is that
> the implicit nature of evaluating a name in Python conflicts with the
> explicit nature of the proposed "dot" notation. It makes it easier
> for a user to write obscure code (until Python 3K when we force users
> to use "dot" notation for all enclosing scope access ;-) ).

Then do you also dislike the original proposal: that only a single dot be allowed, and that the '.' would mean "this name, but in the nearest outer scope that defines it"? Then:

def incrementer_getter(val):
    def incrementer():
        val = 5
        def inc():
            .val += 1
            return val
        return inc
    return incrementer
Note that I only suggested extending the dot-notation to allow multiple dots because of Greg Ewing's complaint that it wasn't enough like the relative import notation. Personally I find PJE's original proposal more intuitive, and based on your example, I suspect so do you. [1] That is, increment the ``val`` in incrementer(), return the same ``val``, and never modify the ``val`` in incrementer_getter(). STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy From tjreedy at udel.edu Sun Feb 26 18:45:24 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 26 Feb 2006 12:45:24 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) References: <20060220204850.6004.JCARLSON@uci.edu><5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com><43FD22C6.70108@canterbury.ac.nz><5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com><7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com><7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com><44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> Message-ID: "Almann T. Goo" wrote in message news:7e9b97090602252315mf6d4686ud86dd5163ea76b37 at mail.gmail.com... > On 2/26/06, Greg Ewing wrote: >> Alternatively, 'global' could be redefined to mean >> what we're thinking of for 'outer'. Then there would >> be no change in keywordage. >> Given the rarity of global statement usage to begin >> with, I'd say that narrows things down to something >> well within the range of acceptable breakage in 3.0. > > You read my mind--I made a reply similar to this on another branch of > this thread just minutes ago :). > > I am curious to see what the community thinks about this. I *think* I like this better than more complicated proposals. I don't think I would ever have a problem with the intermediate scope masking the module scope. 
After all, if I really meant to access the current global scope from a nested function, I simply would not use that name in the intermediate scope. tjr From almann.goo at gmail.com Sun Feb 26 20:06:57 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Sun, 26 Feb 2006 14:06:57 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> Message-ID: <7e9b97090602261106w646c15c3q16b763e62737d6a9@mail.gmail.com> On 2/26/06, Steven Bethard wrote:

> Then do you also dislike the original proposal: that only a single dot
> be allowed, and that the '.' would mean "this name, but in the nearest
> outer scope that defines it"? Then:
>
> def incrementer_getter(val):
>     def incrementer():
>         val = 5
>         def inc():
>             .val += 1
>             return val
>         return inc
>     return incrementer
>
> would do what I think you want it to[1]. Note that I only suggested
> extending the dot-notation to allow multiple dots because of Greg
> Ewing's complaint that it wasn't enough like the relative import
> notation. Personally I find PJE's original proposal more intuitive,
> and based on your example, I suspect so do you.
>
> [1] That is, increment the ``val`` in incrementer(), return the same
> ``val``, and never modify the ``val`` in incrementer_getter().

I'm not sure if I find this more intuitive, but I think it is more convenient than the "explicit dots" for each scope. However my biggest issue is still there. I am not a big fan of letting users have synonyms for names. Notice how ".var" means the same as "var" in some contexts in the example above--that troubles me.
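The status quo Almann is contrasting against can be shown concretely: under PEP 227 a plain read closes over the enclosing name, while an assignment silently creates a new local. A minimal sketch (names are illustrative):

```python
# PEP 227 semantics today: reading an enclosing name works, but
# assigning to the same name makes it local, so the enclosing binding
# cannot be rebound from the inner function at all.
def outer():
    val = 5
    def reader():
        return val       # closes over outer()'s val
    def rebinder():
        val = 99         # a brand-new local val; outer()'s is untouched
        return val
    rebinder()
    return reader()

print(outer())  # 5 -- rebinder() never touched the enclosing val
```

The dot proposal would add a second, rebindable spelling for the same name, which is the synonym problem being objected to.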
PEP 227 addresses this concern with regard to the class scope:

    Names in class scope are not accessible. Names are resolved in the
    innermost enclosing function scope. If a class definition occurs in
    a chain of nested scopes, the resolution process skips class
    definitions. This rule prevents odd interactions between class
    attributes and local variable access.

As the PEP further states:

    An alternative would have been to allow name binding in class scope
    to behave exactly like name binding in function scope. This rule
    would allow class attributes to be referenced either via attribute
    reference or simple name. This option was ruled out because it
    would have been inconsistent with all other forms of class and
    instance attribute access, which always use attribute references.
    Code that used simple names would have been obscure.

I especially don't want to add an issue that is similar to one that PEP 227 went out of its way to avoid. -Almann -- Almann T. Goo almann.goo at gmail.com From rrr at ronadam.com Sun Feb 26 20:47:18 2006 From: rrr at ronadam.com (Ron Adam) Date: Sun, 26 Feb 2006 13:47:18 -0600 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu><5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com><43FD22C6.70108@canterbury.ac.nz><5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com><7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com><7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com><44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> Message-ID: <440205C6.3030301@ronadam.com> Terry Reedy wrote:

> "Almann T. Goo" wrote in message
> news:7e9b97090602252315mf6d4686ud86dd5163ea76b37 at mail.gmail.com...
> On 2/26/06, Greg Ewing wrote:
>> Alternatively, 'global' could be redefined to mean
>> what we're thinking of for 'outer'. Then there would
>> be no change in keywordage.
>>> Given the rarity of global statement usage to begin
>>> with, I'd say that narrows things down to something
>>> well within the range of acceptable breakage in 3.0.

>> You read my mind--I made a reply similar to this on another branch of
>> this thread just minutes ago :).
>>
>> I am curious to see what the community thinks about this.

> I *think* I like this better than more complicated proposals. I don't
> think I would ever have a problem with the intermediate scope masking the
> module scope. After all, if I really meant to access the current global
> scope from a nested function, I simply would not use that name in the
> intermediate scope.
>
> tjr

Would this apply to reading intermediate scopes without the global keyword? How would you know you aren't inadvertently masking a name in a function you call? In most cases it will probably break something in an obvious way, but I suppose in some cases it won't be so obvious. Ron From martin at v.loewis.de Sun Feb 26 20:59:09 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Sun, 26 Feb 2006 20:59:09 +0100 Subject: [Python-Dev] Exposing the abstract syntax Message-ID: <1140983949.4402088d88ae0@www.domainfactory-webmail.de> At PyCon, there was general reluctance to incorporate the ast-objects branch, primarily because people were concerned about what the reference counting would do to maintainability, and about what (potentially troublesome) possibilities direct exposure of AST objects would open up. OTOH, the approach of creating a shadow tree did not find opposition, so I implemented that. Currently, you can use compile() to create an AST out of source code, by passing PyCF_ONLY_AST (0x400) to compile. The mapping of AST to Python objects is as follows:

- there is a Python type for every sum, product, and constructor.
- The constructor types inherit from their sum types (e.g. ClassDef inherits from stmt)
- Each constructor and product type has an _fields member, giving the names of the fields of the product.
- Each node in the AST has members with the names given in _fields
- If the field is optional, it might be None
- if the field is zero-or-more, it is represented as a list.

It might be reasonable to expose this through a separate module, in particular to provide access to the type objects. Regards, Martin From aleaxit at gmail.com Sun Feb 26 21:07:42 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Sun, 26 Feb 2006 12:07:42 -0800 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: <440205C6.3030301@ronadam.com> References: <20060220204850.6004.JCARLSON@uci.edu><5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com><43FD22C6.70108@canterbury.ac.nz><5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com><7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com><7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com><44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> Message-ID: On Feb 26, 2006, at 11:47 AM, Ron Adam wrote: ...

> How would you know you aren't inadvertently masking a name in a
> function you call?

What does calling have to do with it? Nobody's proposing a move to (shudder) dynamic scopes, we're talking of saner concepts such as lexical scopes anyway. Can you give an example of what you mean? For the record: I detest the existing 'global' (could I change but ONE thing in Python, that would be the one -- move from hated 'global' to a decent namespace use, e.g. glob.x=23 rather than global x;x=23), and I'd detest a similar 'outer' just as intensely (again, what I'd like instead is a decent namespace) -- so I might well be sympathetic to your POV, if I could but understand it;-). Alex From almann.goo at gmail.com Sun Feb 26 21:18:12 2006 From: almann.goo at gmail.com (Almann T.
Goo) Date: Sun, 26 Feb 2006 15:18:12 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: <440205C6.3030301@ronadam.com> References: <20060220204850.6004.JCARLSON@uci.edu> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> <44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> Message-ID: <7e9b97090602261218s1bdeb24fkad05c90e2ba61f80@mail.gmail.com>

> Would this apply to reading intermediate scopes without the global keyword?

Using a name from an enclosing scope without re-binding to it would not require the "global" keyword. This actually is the case today with "global" and accessing a name from the global scope versus re-binding to it--this would make "global" more general than explicitly overriding to the global scope.

> How would you know you aren't inadvertently masking a name in a
> function you call?

I think this is really an issue with the name binding semantics in Python. There are benefits to not having variable declarations, but with assignment meaning bind locally, you can already shadow a name in a nested scope inadvertently today.

> In most cases it will probably break something in an obvious way, but I
> suppose in some cases it won't be so obvious.

Having the "global" keyword semantics changed to be "lexically global" would break in the cases that "global" is used on a name within a nested scope that has an enclosing scope with the same name. I would suppose that actual instances in real code of this would be rare. Consider:

>>> x = 1
>>> def f() :
...     x = 2
...     def inner() :
...         global x
...         print x
...     inner()
...
>>> f()
1

Under the proposed rules:

>>> f()
2

PEP 227 also had backwards incompatibilities that were similar and I suggest handling them the same way by issuing a warning in these cases when the new semantics are not being used (i.e. no "from __future__"). -Almann -- Almann T. Goo almann.goo at gmail.com From almann.goo at gmail.com Sun Feb 26 21:28:56 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Sun, 26 Feb 2006 15:28:56 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> <44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> Message-ID: <7e9b97090602261228x721a4d17g65a027c39cf2f49c@mail.gmail.com> On 2/26/06, Alex Martelli wrote:

> For the record: I detest the existing 'global' (could I change but
> ONE thing in Python, that would be the one -- move from hated
> 'global' to a decent namespace use, e.g. glob.x=23 rather than global
> x;x=23), and I'd detest a similar 'outer' just as intensely (again,
> what I'd like instead is a decent namespace) -- so I might well be
> sympathetic to your POV, if I could but understand it;-).

I would prefer a more explicit means to accomplish this too (I sort of like the prefix dot in this regard), however the fundamental problem with allowing this lies in how accessing and binding names works in Python today (sorry if I sound like a broken record in this regard).
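As a historical footnote, the enclosing-scope rebinding debated in this thread is what PEP 3104 later standardized as the `nonlocal` statement; a minimal sketch in that later syntax:

```python
# What this discussion eventually became: `nonlocal` (PEP 3104) rebinds
# the nearest enclosing binding explicitly -- no dots, and no change to
# the meaning of `global`.
def incgen(start, inc):
    val = start - inc
    def incrementer():
        nonlocal val
        val += inc
        return val
    return incrementer

f = incgen(100, 2)
print(f(), f(), f())  # 100 102 104
```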
Unless we change how names can be accessed/re-bound (very bad for backwards compatibility), any proposal that forces explicit name spaces would have to allow for both accessing "simple names" (like just "var") and names via attribute access (name spaces) like "glob.var"--I think this adds the problem of introducing obscurity to the language. -Almann -- Almann T. Goo almann.goo at gmail.com From rrr at ronadam.com Mon Feb 27 01:20:07 2006 From: rrr at ronadam.com (Ron Adam) Date: Sun, 26 Feb 2006 18:20:07 -0600 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu><5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com><43FD22C6.70108@canterbury.ac.nz><5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com><7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com><7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com><44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> Message-ID: <440245B7.5090903@ronadam.com> Alex Martelli wrote:

> On Feb 26, 2006, at 11:47 AM, Ron Adam wrote:
> ...
>> How would you know you aren't inadvertently masking a name in a
>> function you call?
>
> What does calling have to do with it? Nobody's proposing a move to
> (shudder) dynamic scopes, we're talking of saner concepts such as
> lexical scopes anyway. Can you give an example of what you mean?

(sigh of relief) Ok, so the following example will still be true.

def foo(n):     #foo is a global
    return n

def bar(n):
    return foo(n)    #call to foo is set at compile time

def baz(n):
    foo = lambda x: 7    #will not replace foo called in bar.
    return bar(n)

print baz(42)

I guess I don't quite get what they are proposing yet. It seems to me adding intermediate scopes is making functions act more like classes. After you add naming conventions to functions they begin to look like this.
""" Multiple n itemiter """ class baz(object): def getn(baz, n): start = baz.start baz.start += n return baz.lst[start:start+n] def __init__(baz, lst): baz.lst = lst baz.start = 0 b = baz(range(100)) for n in range(1,10): print b.getn(n) > For the record: I detest the existing 'global' (could I change but ONE > thing in Python, that would be the one -- move from hated 'global' to a > decent namespace use, e.g. glob.x=23 rather than global x;x=23), and I'd > detest a similar 'outer' just as intensely (again, what I'd like instead > is a decent namespace) -- so I might well be sympathetic to your POV, if > I could but understand it;-). Maybe something explicit like: >>> import __main__ as glob >>> glob.x = 10 >>> globals() {'__builtins__': , '__name__': '__main__', 'glo b': , '__doc__': None, 'x': 10} >>> That could eliminate the global keyword. I'm -1 on adding the intermediate (outer) scopes to functions. I'd even like to see closures gone completely, but there's probably a reason they are there. What I like about functions is they are fast, clean up behind themselves, and act *exactly* the same on consecutive calls. Cheers, Ron From thomas at xs4all.net Mon Feb 27 01:31:56 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 27 Feb 2006 01:31:56 +0100 Subject: [Python-Dev] PEP 308 Message-ID: <20060227003156.GT23859@xs4all.nl> Since I was on a streak of implementing not-quite-the-right-thing, I checked in my PEP 308 implementation *with* backward compatibility -- just to spite Guido's latest change to the PEP. It jumps through a minor hoop (two new grammar rules) in order to be backwardly compatible, but that hoop can go away in Python 3.0, and that shouldn't be too long from now. I apologize for the test failures of compile, transform and parser: they seem to all depend on the parsermodule being updated. If no one feels responsible for it, I'll do it later in the week (I'll be sprinting until Thursday anyway.) -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! From greg.ewing at canterbury.ac.nz Mon Feb 27 01:47:25 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 27 Feb 2006 13:47:25 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87d5hagl5s.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> <43FFA2C9.4010204@canterbury.ac.nz> <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> <4400E5DE.3070102@canterbury.ac.nz> <87d5hagl5s.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <44024C1D.3020009@canterbury.ac.nz> Stephen J. Turnbull wrote: > I gave you one, MIME processing in email If implementing a mime packer is really the only use case for base64, then it might as well be removed from the standard library, since 99.99999% of all programmers will never touch it. Those that do will need to have boned up on the subject of encoding until it's coming out their ears, so they'll know what they're doing in any case. And they'll be quite competent to write their own base64 encoder that works however they want it to. I don't have any real-life use cases for base64 that a non-mime-implementer might come across, so all I can do is imagine what shape such a use case might have. When I do that, I come up with what I've already described. The programmer wants to send arbitrary data over a channel that only accepts text. 
He doesn't know, and doesn't want to have to know, how the channel encodes that text -- it might be ASCII or EBCDIC or morse code, it shouldn't matter. If his Python base64 encoder produces a Python character string, and his Python channel interface accepts a Python character string, he doesn't have to know. > I think it's your turn. Give me a use case where it matters > practically that the output of the base64 codec be Python unicode > characters rather than 8-bit ASCII characters. I'd be perfectly happy with ascii characters, but in Py3k, the most natural place to keep ascii characters will be in character strings, not byte arrays. > Everything you have written so far is based on > defending your maintained assumption that because Python implements > text processing via the unicode type, everything that is described as > a "character" must be coerced to that type. I'm not just blindly assuming that because the RFC happens to use the word "character". I'm also looking at how it uses that word in an effort to understand what it means. It *doesn't* specify what bit patterns are to be used to represent the characters. It *does* mention two "character sets", namely ASCII and EBCDIC, with the implication that the characters it is talking about could be taken as being members of either of those sets. Since the Unicode character set is a superset of the ASCII character set, it doesn't seem unreasonable that they could also be thought of as Unicode characters. > I don't really see a downside, except for the occasional double > conversion ASCII -> unicode -> UTF-16, as is allowed (but not > mandated) in XML's use of base64. What downside do you see? It appears that all your upsides I see as downsides, and vice versa. We appear to be mutually upside-down. :-) XML is another example. Inside a Python program, the most natural way to represent an XML is as a character string. 
Your way, embedding base64 in it would require converting the bytes produced by the base64 encoder into a character string in some way, taking into account the assumed ascii encoding of said bytes. My way, you just use the result directly, with no coding involved at all. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From aleaxit at gmail.com Mon Feb 27 01:55:21 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Sun, 26 Feb 2006 16:55:21 -0800 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: <440245B7.5090903@ronadam.com> References: <20060220204850.6004.JCARLSON@uci.edu><5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com><43FD22C6.70108@canterbury.ac.nz><5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com><7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com><7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com><44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> <440245B7.5090903@ronadam.com> Message-ID: <1956B0C6-4FE7-41EC-BCBA-37C25CBE470D@gmail.com> On Feb 26, 2006, at 4:20 PM, Ron Adam wrote: ... > (sigh of relief) Ok, so the following example will still be true. Yep, no danger of dynamic scoping, be certain of that. > Maybe something explicit like: > >>>> import __main__ as glob Sure, or the more general ''glob=__import__(__name__)''. > I'm -1 on adding the intermediate (outer) scopes to functions. I'd > even > like to see closures gone completely, but there's probably a reason > they > are there. What I like about functions is they are fast, clean up > behind themselves, and act *exactly* the same on consecutive calls. 
Except that the latter assertion is just untrue in Python -- we already have a bazilion ways to perform side effects, and, since there is no procedure/function distinction, side effects in functions are an extremely common thing. If you're truly keen on having the "exactly the same" property, you may want to look into functional languages, such as Haskell -- there, all data is immutable, so the property does hold (any *indispensable* side effects, e.g. I/O, are packed into 'monads' -- but that's another story). Closures in Python are often extremely handy, as long as you use them much as you would in Haskell -- treating data as immutable (and in particular outer names as unrebindable). You'd think that functional programming fans wouldn't gripe so much about Python closures being meant for use like Haskell ones, hm?-) But, of course, they do want to have their closure and rebind names too... Alex From almann.goo at gmail.com Mon Feb 27 02:27:42 2006 From: almann.goo at gmail.com (Almann T. Goo) Date: Sun, 26 Feb 2006 20:27:42 -0500 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: <440245B7.5090903@ronadam.com> References: <20060220204850.6004.JCARLSON@uci.edu> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> <44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> <440245B7.5090903@ronadam.com> Message-ID: <7e9b97090602261727i44e8a2ecy219655a9131b3cac@mail.gmail.com> On 2/26/06, Ron Adam wrote: > I'm -1 on adding the intermediate (outer) scopes to functions. I'd even > like to see closures gone completely, but there's probably a reason they > are there. We already have enclosing scopes since Python 2.1--this is PEP 227 (http://www.python.org/peps/pep-0227.html). 
The proposal is for a mechanism to allow for re-binding of enclosing scopes which seems like a logical step to me. The rest of the scoping semantics would remain as they are today in Python. -Almann -- Almann T. Goo almann.goo at gmail.com From rrr at ronadam.com Mon Feb 27 02:43:49 2006 From: rrr at ronadam.com (Ron Adam) Date: Sun, 26 Feb 2006 19:43:49 -0600 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: <1956B0C6-4FE7-41EC-BCBA-37C25CBE470D@gmail.com> References: <20060220204850.6004.JCARLSON@uci.edu><5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com><43FD22C6.70108@canterbury.ac.nz><5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com><7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com><7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com><44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> <440245B7.5090903@ronadam.com> <1956B0C6-4FE7-41EC-BCBA-37C25CBE470D@gmail.com> Message-ID: <44025955.7090709@ronadam.com> Alex Martelli wrote: >> I'm -1 on adding the intermediate (outer) scopes to functions. I'd even >> like to see closures gone completely, but there's probably a reason they >> are there. What I like about functions is they are fast, clean up >> behind themselves, and act *exactly* the same on consecutive calls. > > Except that the latter assertion is just untrue in Python -- we already > have a bazilion ways to perform side effects, and, since there is no > procedure/function distinction, side effects in functions are an > extremely common thing. If you're truly keen on having the "exactly the > same" property, you may want to look into functional languages, such as > Haskell -- there, all data is immutable, so the property does hold (any > *indispensable* side effects, e.g. I/O, are packed into 'monads' -- but > that's another story). 
True, I should have said mostly act the same when using them in a common and direct way. I know we can change all sorts of behaviors fairly easily if we choose to. > Closures in Python are often extremely handy, as long as you use them > much as you would in Haskell -- treating data as immutable (and in > particular outer names as unrebindable). You'd think that functional > programming fans wouldn't gripe so much about Python closures being > meant for use like Haskell ones, hm?-) But, of course, they do want to > have their closure and rebind names too... So far everywhere I've seen closures used, a class would work. But maybe not as conveniently or as fast? On the other side of the coin there are those who want to get rid of the "self" variable in classes also. Which would cause classes to look more like nested functions. Haskell sounds interesting, maybe I'll try a bit of it sometime. But I like Python. ;-) Ron From aleaxit at gmail.com Mon Feb 27 03:54:19 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Sun, 26 Feb 2006 18:54:19 -0800 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: <44025955.7090709@ronadam.com> References: <20060220204850.6004.JCARLSON@uci.edu><5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com><43FD22C6.70108@canterbury.ac.nz><5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com><7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com><7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com><44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> <440245B7.5090903@ronadam.com> <1956B0C6-4FE7-41EC-BCBA-37C25CBE470D@gmail.com> <44025955.7090709@ronadam.com> Message-ID: On Feb 26, 2006, at 5:43 PM, Ron Adam wrote: ... > So far everywhere I've seen closures used, a class would work. But > maybe not as conveniently or as fast? Yep.
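[Editorial note: Ron's point that a class can usually stand in for a closure can be sketched with a hypothetical counter, written both ways. The one-element list is the usual workaround for the outer-name re-binding limit being debated; all names here are made up.]

```python
def make_counter():
    count = [0]               # mutable cell: works around no re-binding of outer names

    def counter():
        count[0] += 1
        return count[0]
    return counter

class Counter:
    """The same state, carried explicitly on an instance."""
    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count

c1, c2 = make_counter(), Counter()
print(c1(), c1(), c2())       # -> 1 2 1
```

Both objects behave identically from the caller's side; the closure is shorter, the class is more explicit and easier to extend.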
In this, closures are like generators: much more convenient than purpose-built classes, but not as general. > Haskel sounds interesting, maybe I'll try a bit of it sometime. But I > like Python. ;-) So do I, so do many others: the first EuroHaskell was held the day right after a EuroPython, in the same venue (a Swedish University, Chalmers) -- that was convenient because so many delegates were interested in both languages, see. We stole list comprehensions and genexps from Haskell (the idea and most of the semantics, not the syntax, which was Pythonized relentlessly) -- and the two languages share the concept of indentation being significant for grouping, with some minor differences in details since they developed these concepts independently. Hey, what more do you need?-) Alex From tim.peters at gmail.com Mon Feb 27 06:36:20 2006 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 26 Feb 2006 23:36:20 -0600 Subject: [Python-Dev] Current trunk test failures Message-ID: <1f7befae0602262136g5aac3dfbp836264464019c002@mail.gmail.com> The buildbot shows that the debug-build test_grammar is dying with a C assert failure on all boxes. 
In case it helps, in a Windows release build test_transformer is also failing: test_transformer test test_transformer failed -- Traceback (most recent call last): File "C:\Code\python\lib\test\test_transformer.py", line 16, in testMultipleLHS a = transformer.parse(s) File "C:\Code\python\lib\compiler\transformer.py", line 52, in parse return Transformer().parsesuite(buf) File "C:\Code\python\lib\compiler\transformer.py", line 129, in parsesuite return self.transform(parser.suite(text)) File "C:\Code\python\lib\compiler\transformer.py", line 125, in transform return self.compile_node(tree) File "C:\Code\python\lib\compiler\transformer.py", line 158, in compile_node return self.file_input(node[1:]) File "C:\Code\python\lib\compiler\transformer.py", line 189, in file_input self.com_append_stmt(stmts, node) File "C:\Code\python\lib\compiler\transformer.py", line 1036, in com_append_stmt result = self.lookup_node(node)(node[1:]) File "C:\Code\python\lib\compiler\transformer.py", line 305, in stmt return self.com_stmt(nodelist[0]) File "C:\Code\python\lib\compiler\transformer.py", line 1029, in com_stmt result = self.lookup_node(node)(node[1:]) File "C:\Code\python\lib\compiler\transformer.py", line 315, in simple_stmt self.com_append_stmt(stmts, nodelist[i]) File "C:\Code\python\lib\compiler\transformer.py", line 1036, in com_append_stmt result = self.lookup_node(node)(node[1:]) File "C:\Code\python\lib\compiler\transformer.py", line 305, in stmt return self.com_stmt(nodelist[0]) File "C:\Code\python\lib\compiler\transformer.py", line 1029, in com_stmt result = self.lookup_node(node)(node[1:]) File "C:\Code\python\lib\compiler\transformer.py", line 353, in expr_stmt exprNode = self.lookup_node(en)(en[1:]) File "C:\Code\python\lib\compiler\transformer.py", line 763, in lookup_node return self._dispatch[node[0]] KeyError: 324 Also test_parser: C:\Code\python\PCbuild>python -E -tt ../lib/test/regrtest.py -v test_parser test_parser test_assert 
(test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_basic_import_statement (test.test_parser.RoundtripLegalSyntaxTestCase) ... ok test_class_defs (test.test_parser.RoundtripLegalSyntaxTestCase) ... ok test_expressions (test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_function_defs (test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_import_from_statement (test.test_parser.RoundtripLegalSyntaxTestCase) ... ok test_pep263 (test.test_parser.RoundtripLegalSyntaxTestCase) ... ok test_print (test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_simple_assignments (test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_simple_augmented_assignments (test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_simple_expression (test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_yield_statement (test.test_parser.RoundtripLegalSyntaxTestCase) ... FAIL test_a_comma_comma_c (test.test_parser.IllegalSyntaxTestCase) ... ok test_illegal_operator (test.test_parser.IllegalSyntaxTestCase) ... ok test_illegal_yield_1 (test.test_parser.IllegalSyntaxTestCase) ... ok test_illegal_yield_2 (test.test_parser.IllegalSyntaxTestCase) ... ok test_junk (test.test_parser.IllegalSyntaxTestCase) ... ok test_malformed_global (test.test_parser.IllegalSyntaxTestCase) ... ok test_print_chevron_comma (test.test_parser.IllegalSyntaxTestCase) ... ok test_compile_error (test.test_parser.CompileTestCase) ... ok test_compile_expr (test.test_parser.CompileTestCase) ... ok test_compile_suite (test.test_parser.CompileTestCase) ... 
ok ====================================================================== FAIL: test_assert (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 180, in test_assert self.check_suite("assert alo < ahi and blo < bhi\n") File "C:\Code\python\lib\test\test_parser.py", line 28, in check_suite self.roundtrip(parser.suite, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'assert alo < ahi and blo < bhi\n': Expected node type 303, got 302. ====================================================================== FAIL: test_expressions (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 50, in test_expressions self.check_expr("foo(1)") File "C:\Code\python\lib\test\test_parser.py", line 25, in check_expr self.roundtrip(parser.expr, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'foo(1)': Expected node type 303, got 302. 
====================================================================== FAIL: test_function_defs (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 119, in test_function_defs self.check_suite("def f(foo=bar): pass") File "C:\Code\python\lib\test\test_parser.py", line 28, in check_suite self.roundtrip(parser.suite, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'def f(foo=bar): pass': Expected node type 303, got 302. ====================================================================== FAIL: test_print (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 86, in test_print self.check_suite("print 1") File "C:\Code\python\lib\test\test_parser.py", line 28, in check_suite self.roundtrip(parser.suite, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'print 1': Expected node type 303, got 302. ====================================================================== FAIL: test_simple_assignments (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 97, in test_simple_assignments self.check_suite("a = b") File "C:\Code\python\lib\test\test_parser.py", line 28, in check_suite self.roundtrip(parser.suite, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'a = b': Expected node type 303, got 302. 
====================================================================== FAIL: test_simple_augmented_assignments (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 101, in test_simple_augmented_assignments self.check_suite("a += b") File "C:\Code\python\lib\test\test_parser.py", line 28, in check_suite self.roundtrip(parser.suite, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'a += b': Expected node type 303, got 302. ====================================================================== FAIL: test_simple_expression (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 94, in test_simple_expression self.check_suite("a") File "C:\Code\python\lib\test\test_parser.py", line 28, in check_suite self.roundtrip(parser.suite, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'a': Expected node type 303, got 302. 
====================================================================== FAIL: test_yield_statement (test.test_parser.RoundtripLegalSyntaxTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_parser.py", line 31, in test_yield_statement self.check_suite("def f(): yield 1") File "C:\Code\python\lib\test\test_parser.py", line 28, in check_suite self.roundtrip(parser.suite, s) File "C:\Code\python\lib\test\test_parser.py", line 19, in roundtrip self.fail("could not roundtrip %r: %s" % (s, why)) AssertionError: could not roundtrip 'def f(): yield 1': Expected node type 303, got 302. ---------------------------------------------------------------------- Ran 22 tests in 0.015s FAILED (failures=8) test test_parser failed -- errors occurred; run in verbose mode for details 1 test failed: test_parser and also test_compiler. From stephen at xemacs.org Mon Feb 27 06:59:44 2006 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Mon, 27 Feb 2006 14:59:44 +0900 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <44024C1D.3020009@canterbury.ac.nz> (Greg Ewing's message of "Mon, 27 Feb 2006 13:47:25 +1300") References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> <43FFA2C9.4010204@canterbury.ac.nz> <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> <4400E5DE.3070102@canterbury.ac.nz> <87d5hagl5s.fsf@tleepslib.sk.tsukuba.ac.jp> <44024C1D.3020009@canterbury.ac.nz> Message-ID: <87irr1fijz.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing writes: Greg> Stephen J. Turnbull wrote: >> I gave you one, MIME processing in email Greg> If implementing a mime packer is really the only use case Greg> for base64, then it might as well be removed from the Greg> standard library, since 99.99999% of all programmers will Greg> never touch it. I don't have any real-life use cases for Greg> base64 that a non-mime-implementer might come across, so all Greg> I can do is imagine what shape such a use case might have. I guess we don't have much to talk about, then. >> Give me a use case where it matters practically that the output >> of the base64 codec be Python unicode characters rather than >> 8-bit ASCII characters. Greg> I'd be perfectly happy with ascii characters, but in Py3k, Greg> the most natural place to keep ascii characters will be in Greg> character strings, not byte arrays. Natural != practical. 
Anyway, I disagree, and I've lived with the problems that come with an environment that mixes objects with various underlying semantics into a single "text stream" for a decade and a half. That doesn't make me authoritative, but as we agree to disagree, I hope you'll keep in mind that someone with real-world experience that is somewhat relevant[1] to the issue doesn't find that natural at all. Greg> Since the Unicode character set is a superset of the ASCII Greg> character set, it doesn't seem unreasonable that they could Greg> also be thought of as Unicode characters. I agree. However, as soon as I go past that intuition to thinking about what that implies for _operations_ on the base64 string, it begins to seem unreasonable, unnatural, and downright dangerous. The base64 string is a representation of an object that doesn't have text semantics. Nor do base64 strings have text semantics: they can't even be concatenated as text (the pad character '=' is typically a syntax error in a profile of base64, except as terminal padding). So if you wish to concatenate the underlying objects, the base64 strings must be decoded, concatenated, and re-encoded in the general case. IMO it's not worth preserving the very superficial coincidence of "character representation" in the face of such semantics. I think that fact that favoring the coincidence of representation leads you to also deprecate the very natural use of the codec API to implement and understand base64 is indicative of a deep problem with the idea of implementing base64 as bytes->unicode. Footnotes: [1] That "somewhat" is intended literally; my specialty is working with codecs for humans in Emacs, but I've also worked with more abstract codecs such as base64 in contexts like email, in both LISP and Python. 
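[Editorial note: Stephen's concatenation point can be shown concretely with the base64 module; the sample byte strings below are made up.]

```python
import base64

a, b = b"spam", b"eggs!"
ea, eb = base64.b64encode(a), base64.b64encode(b)

# The pad character "=" terminates a base64 stream, so gluing two encoded
# strings together is not the encoding of the glued data:
print(ea + eb)                   # -> b'c3BhbQ==ZWdncyE='  (padding in the middle)
print(base64.b64encode(a + b))   # -> b'c3BhbWVnZ3Mh'

# In the general case you must decode, concatenate, and re-encode:
joined = base64.b64encode(base64.b64decode(ea) + base64.b64decode(eb))
assert joined == base64.b64encode(a + b)
```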
-- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From greg.ewing at canterbury.ac.nz Mon Feb 27 10:40:08 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 27 Feb 2006 22:40:08 +1300 Subject: [Python-Dev] Using and binding relative names (was Re: PEP forBetter Control of Nested Lexical Scopes) In-Reply-To: References: <20060220204850.6004.JCARLSON@uci.edu> <5.1.1.6.0.20060221150640.037adcd0@mail.telecommunity.com> <43FD22C6.70108@canterbury.ac.nz> <5.1.1.6.0.20060222215929.01df05f0@mail.telecommunity.com> <7e9b97090602222228y755bd4d8n9e38805bcc222112@mail.gmail.com> <7e9b97090602252246q229c6815xdeff8cf431807f41@mail.gmail.com> <44014F23.80409@canterbury.ac.nz> <7e9b97090602252315mf6d4686ud86dd5163ea76b37@mail.gmail.com> <440205C6.3030301@ronadam.com> <440245B7.5090903@ronadam.com> <1956B0C6-4FE7-41EC-BCBA-37C25CBE470D@gmail.com> <44025955.7090709@ronadam.com> Message-ID: <4402C8F8.1040604@canterbury.ac.nz> Alex Martelli wrote: > We stole list comprehensions and genexps from Haskell The idea predates Haskell, I think. I first saw it in Miranda, and it may have come from something even earlier -- SETL, maybe? 
Greg From greg.ewing at canterbury.ac.nz Mon Feb 27 12:41:25 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 28 Feb 2006 00:41:25 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <87irr1fijz.fsf@tleepslib.sk.tsukuba.ac.jp> References: <87wtfuv66i.fsf@tleepslib.sk.tsukuba.ac.jp> <43F69411.1020807@canterbury.ac.nz> <20060217202813.5FA2.JCARLSON@uci.edu> <43f6abab.1091371449@news.gmane.org> <87zmknql7b.fsf@tleepslib.sk.tsukuba.ac.jp> <43F8C391.2070405@v.loewis.de> <87slqeqw0g.fsf@tleepslib.sk.tsukuba.ac.jp> <43FA3820.3070607@v.loewis.de> <87d5hhpf53.fsf@tleepslib.sk.tsukuba.ac.jp> <43FAE40B.8090406@canterbury.ac.nz> <87mzgjn2qn.fsf@tleepslib.sk.tsukuba.ac.jp> <43FC4C8B.6080300@canterbury.ac.nz> <87accimwy0.fsf@tleepslib.sk.tsukuba.ac.jp> <43FE528E.1040100@canterbury.ac.nz> <87bqwxj9t8.fsf@tleepslib.sk.tsukuba.ac.jp> <43FFA2C9.4010204@canterbury.ac.nz> <87lkvzgvpp.fsf@tleepslib.sk.tsukuba.ac.jp> <4400E5DE.3070102@canterbury.ac.nz> <87d5hagl5s.fsf@tleepslib.sk.tsukuba.ac.jp> <44024C1D.3020009@canterbury.ac.nz> <87irr1fijz.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4402E565.2070408@canterbury.ac.nz> Stephen J. Turnbull wrote: > Greg> I'd be perfectly happy with ascii characters, but in Py3k, > Greg> the most natural place to keep ascii characters will be in > Greg> character strings, not byte arrays. > > Natural != practical. That seems to be another thing we disagree about -- to me it seems both natural *and* practical. The whole business of stuffing binary data down a text channel is a practicality-beats-purity kind of thing. You wouldn't do it if you had a real binary channel available, but if you don't, it's better than nothing. > The base64 string is a representation of an object > that doesn't have text semantics. But the base64 string itself *does* have text semantics. That's the whole point of base64 -- to represent a non-text object *using* text. 
To me this is no different than using a string of decimal digit characters to represent an integer, or a string of hexadecimal digit characters to represent a bit pattern. Would you say that those are not text, either? What about XML? What would you consider the proper data type for an XML document to be inside a Python program -- bytes or text? I'm genuinely interested in your answer to that, because I'm trying to understand where you draw the line between text and non-text. You seem to want to reserve the term "text" for data that doesn't ever have to be understood even a little bit by a computer program, but that seems far too restrictive to me, and a long way from established usage. > Nor do base64 strings have text semantics: they can't even > be concatenated as text ... So if you > wish to concatenate the underlying objects, the base64 strings must be > decoded, concatenated, and re-encoded in the general case. You can't add two integers by concatenating their base-10 character representation, either, but I wouldn't take that as an argument against putting decimal numbers into text files. Also, even if we follow your suggestion and store our base64-encoded data in byte arrays, we *still* wouldn't be able to concatenate the original data just by concatenating those byte arrays. So this argument makes no sense either way. > IMO it's not worth preserving the very superficial > coincidence of "character representation" I disagree entirely that it's superficial. On the contrary, it seems to me to be the very essence of what base64 is all about. If there's any "coincidence of representation" it's in the idea of storing the result as ASCII bit patterns in a byte array, on the assumption that that's probably how they're going to end up being represented down the line. That assumption could be very wrong. What happens if it turns out they really need to be encoded as UTF-16, or as EBCDIC?
All hell breaks loose, as far as I can see, unless the programmer has kept very firmly in mind that there is an implicit ASCII encoding involved. It's exactly to avoid the need for those kinds of mental gymnastics that Py3k will have a unified, encoding-agnostic data type for all character strings. > I think that fact that favoring the coincidence of representation > leads you to also deprecate the very natural use of the codec API to > implement and understand base64 is indicative of a deep problem with > the idea of implementing base64 as bytes->unicode. Not sure I'm following you. I don't object to implementing base64 as a codec, only to exposing it via the same interface as the "real" unicode codecs like utf8, etc. I thought we were in agreement about that. If you're thinking that the mere fact its input type is bytes and its output type is characters is going to lead to its mistakenly appearing via that interface, that would be a bug or design flaw in the mechanism that controls which codecs appear via that interface. It needs to be controlled by something more than just the input and output types. Greg From ncoghlan at gmail.com Mon Feb 27 13:55:05 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2006 22:55:05 +1000 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <43FFA4B2.7060303@canterbury.ac.nz> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz> <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1> <43FFA4B2.7060303@canterbury.ac.nz> Message-ID: <4402F6A9.1030906@gmail.com> Greg Ewing wrote: > Raymond Hettinger wrote: >> Code that >> uses next() is more understandable, friendly, and readable without the >> walls of underscores. 
> > There wouldn't be any walls of underscores, because > > y = x.next() > > would become > > y = next(x) > > The only time you would need to write underscores is > when defining a __next__ method. That would be no worse > than defining an __init__ or any other special method, > and has the advantage that it clearly marks the method > as being special. I wouldn't mind seeing one of the early ideas from PEP 340 being resurrected some day, such that the signature for the special method was "__next__(self, input)" and for the builtin "next(iterator, input=None)" That would go hand in hand with the idea of allowing the continue statement to accept an argument though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From fredrik at pythonware.com Mon Feb 27 16:34:00 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 27 Feb 2006 16:34:00 +0100 Subject: [Python-Dev] PEP 332 revival in coordination with pep 349? [Was:Re: release plan for 2.5 ?] References: Message-ID: Just van Rossum wrote: > > If bytes support the buffer interface, we get another interesting > > issue -- regular expressions over bytes. Brr. > > We already have that: > > >>> import re, array > >>> re.search('\2', array.array('B', [1, 2, 3, 4])).group() > array('B', [2]) > >>> > > Not sure whether to blame array or re, though... SRE. iirc, the design rationale was to support RE over mmap'ed regions. 
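[Editorial note: the "__next__(self, input)" / continue-with-an-argument idea Nick raises above is essentially what PEP 342 gave generators as send(); a minimal made-up illustration in later-Python syntax, where the next() builtin also exists.]

```python
def averager():
    """A generator that receives values and yields the running average."""
    total, count, avg = 0.0, 0, None
    while True:
        value = yield avg        # "continue with a value" arrives here
        total += value
        count += 1
        avg = total / count

g = averager()
next(g)                          # prime: run to the first yield
print(g.send(10))                # -> 10.0
print(g.send(20))                # -> 15.0
```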
From guido at python.org Mon Feb 27 17:02:12 2006 From: guido at python.org (Guido van Rossum) Date: Mon, 27 Feb 2006 10:02:12 -0600 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <4402F6A9.1030906@gmail.com> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz> <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1> <43FFA4B2.7060303@canterbury.ac.nz> <4402F6A9.1030906@gmail.com> Message-ID: On 2/27/06, Nick Coghlan wrote: > I wouldn't mind seeing one of the early ideas from PEP 340 being resurrected > some day, such that the signature for the special method was "__next__(self, > input)" and for the builtin "next(iterator, input=None)" > > That would go hand in hand with the idea of allowing the continue statement to > accept an argument though. Yup. The continue thing we might add in 2.6. The __next__ API in 3.0. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at xs4all.net Mon Feb 27 17:33:31 2006 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 27 Feb 2006 17:33:31 +0100 Subject: [Python-Dev] Current trunk test failures In-Reply-To: <1f7befae0602262136g5aac3dfbp836264464019c002@mail.gmail.com> References: <1f7befae0602262136g5aac3dfbp836264464019c002@mail.gmail.com> Message-ID: <20060227163331.GU23859@xs4all.nl> On Sun, Feb 26, 2006 at 11:36:20PM -0600, Tim Peters wrote: > The buildbot shows that the debug-build test_grammar is dying with a C > assert failure on all boxes. > > In case it helps, in a Windows release build test_transformer is also failing: All build/test failures introduced by the PEP 308 patch should be fixed (thanks, Martin!) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From janssen at parc.com Mon Feb 27 18:38:41 2006 From: janssen at parc.com (Bill Janssen) Date: Mon, 27 Feb 2006 09:38:41 PST Subject: [Python-Dev] bytes.from_hex() In-Reply-To: Your message of "Sun, 26 Feb 2006 16:47:25 PST." <44024C1D.3020009@canterbury.ac.nz> Message-ID: <06Feb27.093841pst."58633"@synergy1.parc.xerox.com> > If implementing a mime packer is really the only use case > for base64, then it might as well be removed from the > standard library, since 99.99999% of all programmers will > never touch it. Those that do will need to have boned up I use it quite a bit for image processing (converting to and from the "data:" URL form), and various checksum applications (converting SHA into a string). Bill From mal at egenix.com Mon Feb 27 18:40:34 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 27 Feb 2006 18:40:34 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! Message-ID: <44033992.8040805@egenix.com> Microsoft has recently released their express version of the Visual C++. Given that this version is free for everyone, wouldn't it make sense to ship Python 2.5 compiled with this version ?! http://msdn.microsoft.com/vstudio/express/default.aspx I suppose this would make compiling extensions easier for people who don't have a standard VC++ .NET installed. Note: This is just a thought - I haven't looked into the consequences of building with VC8 yet, e.g. from the list of pre-requisites, it's possible that .NET 2.0 would become a requirement. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 27 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From aleaxit at gmail.com Mon Feb 27 18:51:40 2006 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 27 Feb 2006 09:51:40 -0800 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: <44033992.8040805@egenix.com> References: <44033992.8040805@egenix.com> Message-ID: On 2/27/06, M.-A. Lemburg wrote: > Microsoft has recently released their express version of the Visual C++. > Given that this version is free for everyone, wouldn't it make sense > to ship Python 2.5 compiled with this version ?! > > http://msdn.microsoft.com/vstudio/express/default.aspx > > I suppose this would make compiling extensions easier for people > who don't have a standard VC++ .NET installed. It would sure be nice for people like me with "occasional dabbler in Windows" status, so, selfishly, I'd be all in favor. However...: What I hear from the rumor mill (not perhaps a reliable source) is a bit discouraging about the stability of VS2005 (e.g. internal rebellion at MS in which groups which need to ship a lot of code pushed back against any attempt to make them use VS2005, and managed to win the internal fight and stick with VS2003), but I don't know if any such worry applies to something as simple as the mere compilation of C code... > Note: This is just a thought - I haven't looked into the consequences > of building with VC8 yet, e.g. from the list of pre-requisites, > it's possible that .NET 2.0 would become a requirement. You mean, to RUN vc8-compiled Python?! That would be perhaps the first C compiler ever unable to produce "native", stand-alone code, wouldn't it? Alex From phil at riverbankcomputing.co.uk Mon Feb 27 19:00:50 2006 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Mon, 27 Feb 2006 18:00:50 +0000 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: References: <44033992.8040805@egenix.com> Message-ID: <200602271800.50820.phil@riverbankcomputing.co.uk> On Monday 27 February 2006 5:51 pm, Alex Martelli wrote: > On 2/27/06, M.-A. 
Lemburg wrote: > > Microsoft has recently released their express version of the Visual C++. > > Given that this version is free for everyone, wouldn't it make sense > > to ship Python 2.5 compiled with this version ?! > > > > http://msdn.microsoft.com/vstudio/express/default.aspx > > > > I suppose this would make compiling extensions easier for people > > who don't have a standard VC++ .NET installed. > > It would sure be nice for people like me with "occasional dabbler in > Windows" status, so, selfishly, I'd be all in favor. However...: > > What I hear from the rumor mill (not perhaps a reliable source) is a > bit discouraging about the stability of VS2005 (e.g. internal > rebellion at MS in which groups which need to ship a lot of code > pushed back against any attempt to make them use VS2005, and managed > to win the internal fight and stick with VS2003), but I don't know if > any such worry applies to something as simple as the mere compilation > of C code... ...but some extension modules are 500,000 lines of C++. Phil From benji at benjiyork.com Mon Feb 27 19:42:32 2006 From: benji at benjiyork.com (Benji York) Date: Mon, 27 Feb 2006 13:42:32 -0500 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: <44033992.8040805@egenix.com> References: <44033992.8040805@egenix.com> Message-ID: <44034818.7020101@benjiyork.com> M.-A. Lemburg wrote: > Microsoft has recently released their express version of the Visual C++. > Given that this version is free for everyone, wouldn't it make sense > to ship Python 2.5 compiled with this version ?! > > http://msdn.microsoft.com/vstudio/express/default.aspx The express editions are only "free" until November 7th: http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx#pricing -- Benji York From martin at v.loewis.de Mon Feb 27 20:58:46 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 27 Feb 2006 20:58:46 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! 
In-Reply-To: <44033992.8040805@egenix.com> References: <44033992.8040805@egenix.com> Message-ID: <1141070326.440359f60adf0@www.domainfactory-webmail.de> Zitat von "M.-A. Lemburg" : > Microsoft has recently released their express version of the Visual C++. > Given that this version is free for everyone, wouldn't it make sense > to ship Python 2.5 compiled with this version ?! Not in my opinion. People have also commented that they want to continue with this version (i.e. 7.1.). I actually hope that Python can skip VS 2005, and go right away to the next version. Regards, Martin From mal at egenix.com Mon Feb 27 21:03:06 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 27 Feb 2006 21:03:06 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: References: <44033992.8040805@egenix.com> Message-ID: <44035AFA.1050701@egenix.com> Alex Martelli wrote: > On 2/27/06, M.-A. Lemburg wrote: >> Microsoft has recently released their express version of the Visual C++. >> Given that this version is free for everyone, wouldn't it make sense >> to ship Python 2.5 compiled with this version ?! >> >> http://msdn.microsoft.com/vstudio/express/default.aspx >> >> I suppose this would make compiling extensions easier for people >> who don't have a standard VC++ .NET installed. > > It would sure be nice for people like me with "occasional dabbler in > Windows" status, so, selfishly, I'd be all in favor. However...: > > What I hear from the rumor mill (not perhaps a reliable source) is a > bit discouraging about the stability of VS2005 (e.g. internal > rebellion at MS in which groups which need to ship a lot of code > pushed back against any attempt to make them use VS2005, and managed > to win the internal fight and stick with VS2003), but I don't know if > any such worry applies to something as simple as the mere compilation > of C code... Should I read this as: VC8 is unstable ? Perhaps that's the reason they decided to give it away for free for the first year. 
>> Note: This is just a thought - I haven't looked into the consequences >> of building with VC8 yet, e.g. from the list of pre-requisites, >> it's possible that .NET 2.0 would become a requirement. > > You mean, to RUN vc8-compiled Python?! That would be perhaps the > first C compiler ever unable to produce "native", stand-alone code, > wouldn't it? Well, the code that VC7 generates relies on MSVCR71.DLL which appears to be part of .NET 1.1. It's hard to tell since I don't have a system around without .NET on it. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 27 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Mon Feb 27 21:12:35 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 27 Feb 2006 21:12:35 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: <44035AFA.1050701@egenix.com> References: <44033992.8040805@egenix.com> <44035AFA.1050701@egenix.com> Message-ID: <1141071155.44035d333e674@www.domainfactory-webmail.de> Zitat von "M.-A. Lemburg" : > > What I hear from the rumor mill (not perhaps a reliable source) is a > > bit discouraging about the stability of VS2005 (e.g. internal > > rebellion at MS in which groups which need to ship a lot of code > > pushed back against any attempt to make them use VS2005, and managed > > to win the internal fight and stick with VS2003), but I don't know if > > any such worry applies to something as simple as the mere compilation > > of C code... > > Should I read this as: VC8 is unstable ? 
Not sure how Alex interprets this; I think that one of the good reasons not to use VS2005 is that they managed to "break" the C library: change it from standard C in an incompatible way that they think is better for the end user. One of these changes broke Python; we now have a work-around for this breakage. In addition to changing the library behaviour, they also produce tons of warnings about perfectly correct code. > Well, the code that VC7 generates relies on MSVCR71.DLL > which appears to be part of .NET 1.1. It's hard to tell > since I don't have a system around without .NET on it. I don't believe .NET 1.1 ships msvcr71.dll. Actually, Microsoft discourages installing msvcr into system32, so that would be against their own guidelines. Regards, Martin From jason.orendorff at gmail.com Mon Feb 27 21:14:00 2006 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Mon, 27 Feb 2006 15:14:00 -0500 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: <43FFF7FF.9050301@ronadam.com> References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org> <43FFA908.6040208@ronadam.com> <43FFF7FF.9050301@ronadam.com> Message-ID: Neil Schemenauer wrote: > Ron Adam wrote: >> Why was it decided that the unicode encoding argument should be ignored >> if the first argument is a string? Wouldn't an exception be better >> rather than give the impression it does something when it doesn't? > >From the PEP: > > There is no sane meaning that the encoding can have in that > case. str objects *are* byte arrays and they know nothing about > the encoding of character data they contain. We need to assume > that the programmer has provided str object that already uses > the desired encoding. > > Raising an exception would be a valid option. However, passing the > string through unchanged makes the transition from str to bytes > easier. Does it? I am quite certain the bytes PEP is dead wrong on this. It should be changed. 
Suppose I have code like this: def faz(s): return s.encode('utf-16be') If I want to transition from str to bytes, how should I change this code? def faz(s): return bytes(s, 'utf-16be') # OOPS - subtle bug This silently does the wrong thing when s is a str. If I hadn't read the PEP, I would confidently assume that bytes(str, encoding) == bytes(unicode, encoding), modulo the default encoding. I'd be wrong. But there's a really good reason to think this. Wherever a unicode argument is expected in Python 2.x, you can pass a str and it'll be silently decoded. This is an extremely strong convention. It's even embedded in PyArg_ParseTuple(). I can't think of any exceptions to the rule, offhand. Is this special case special enough to break the rules? Arguable. I suspect not. But even if so, allowing the breakage to pass silently is surely a mistake. It should just refuse the temptation to guess, and throw an exception--right? Now you may be thinking: the str/unicode duality of text, and the bytes/text duality of the "str" type, are *bad* things, and we're trying to get rid of them. True. My view is, we'll be rid of them in 3.0 regardless. In the meantime, there is no point trying to pretend that 2.0 "str" is bytes and not text. It just ain't so; you'll only succeed in confusing people and causing bugs. (And in 3.0 you're going to turn around and tell them "str" *is* text!) Good APIs make simple, sensible, comprehensible promises. I like these promises: - bytes(arg) works like array.array('b', arg) - bytes(arg1, arg2) works like bytes(arg1.encode(arg2)) I dislike these promises: - bytes(s, [ignored]), where s is a str, works like array.array('b', s) - bytes(u, [encoding]), where u is a unicode, works like bytes(u.encode(encoding)) It seems more Pythonic to differentiate based on the number of arguments, rather than the type. -j P.S. 
As someone who gets a bit agitated when the word "Pythonic" or the Zen of Python is taken in vain, I'd like to know if anyone feels I've done so here, so I can properly apologize. Thanks. From trentm at ActiveState.com Mon Feb 27 22:18:07 2006 From: trentm at ActiveState.com (Trent Mick) Date: Mon, 27 Feb 2006 13:18:07 -0800 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: References: <44033992.8040805@egenix.com> Message-ID: <20060227211807.GA31807@activestate.com> [Alex Martelli wrote] > What I hear from the rumor mill (not perhaps a reliable source) is a > bit discouraging about the stability of VS2005 (e.g. internal > rebellion at MS in which groups which need to ship a lot of code > pushed back against any attempt to make them use VS2005, and managed > to win the internal fight and stick with VS2003), but I don't know if > any such worry applies to something as simple as the mere compilation > of C code... As a (perhaps significant) datapoint: the Mozilla guys are moving to building with VS2005. That's lots of C++ and widely run -- though probably not the C runtime so much. Trent -- Trent Mick TrentM at ActiveState.com From fredrik at pythonware.com Mon Feb 27 22:53:26 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 27 Feb 2006 22:53:26 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! References: <44033992.8040805@egenix.com> Message-ID: M.-A. Lemburg wrote: > Microsoft has recently released their express version of the Visual C++. > Given that this version is free for everyone, wouldn't it make sense > to ship Python 2.5 compiled with this version ?! > > http://msdn.microsoft.com/vstudio/express/default.aspx > > I suppose this would make compiling extensions easier for people > who don't have a standard VC++ .NET installed. it also causes more work for those of us who provide ready-made Windows binaries for more than just the latest and greatest Python release. 
if I could choose, I'd use the same compiler for at least one more release... From martin at v.loewis.de Mon Feb 27 22:57:41 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 27 Feb 2006 22:57:41 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: References: <44033992.8040805@egenix.com> Message-ID: <1141077461.440375d59c265@www.domainfactory-webmail.de> Zitat von Fredrik Lundh : > it also causes more work for those of us who provide ready-made Windows > binaries for more than just the latest and greatest Python release. > > if I could choose, I'd use the same compiler for at least one more release... I find this argument convincing. Regards, Martin From fredrik at pythonware.com Mon Feb 27 23:21:19 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 27 Feb 2006 23:21:19 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! References: <44033992.8040805@egenix.com> Message-ID: > if I could choose, I'd use the same compiler for at least one more release... to clarify, the guideline should be "does the new compiler version add something important ?", rather than just "is there a new version ?" From fredrik at pythonware.com Mon Feb 27 23:46:32 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 27 Feb 2006 23:46:32 +0100 Subject: [Python-Dev] Translating docs References: Message-ID: Facundo Batista wrote: > After a small talk with Raymond yesterday at breakfast, I > proposed in PyAr the idea of starting to translate the Library Reference. > > You'll agree with me that this is a BIG effort. But not only big, it's dynamic! > > So, we decided that we need a system that provides us with management of > the translations. And it'd be a good idea for the system to be available > for translations in other languages. > > One of the guys proposed to use Launchpad (https://launchpad.net/). > > The question is, is it ok to use a third party system for this > initiative? Or do you (we) prefer to host it in-house?
Someone already thought of this? localized editions (with editing support) are definitely within the scope for a more dynamic library reference platform [1]. with a more granular structure, you can easily track changes on a method/function level, and dynamically generate pages that suit the reader ("official english for version X.Y", "experimental norwegian", "mixed latest english/german", etc). (but until we get there (if ever), I see no reason not to use an existing infrastructure, of course). 1) http://effbot.org/zone/pyref.htm From martin at v.loewis.de Mon Feb 27 23:57:13 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 27 Feb 2006 23:57:13 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: References: <44033992.8040805@egenix.com> Message-ID: <1141081033.440383c99a4cc@www.domainfactory-webmail.de> Zitat von Fredrik Lundh : > to clarify, the guideline should be "does the new compiler version add something > important ?", rather than just "is there a new version ?" In this specific case, the new thing added is the availability of Visual Studio Express. Whether this is important, and outweighs the disadvantages, I don't know. In addition, I'm uncertain whether this is a new feature. I thought you could get the VS 2003 compiler (VC 7.1) with the .NET 1.1 SDK. But maybe I'm misremembering. Regards, Martin From nnorwitz at gmail.com Tue Feb 28 00:35:58 2006 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 27 Feb 2006 17:35:58 -0600 Subject: [Python-Dev] Making ascii the default encoding Message-ID: PEP 263 states that in Phase 2 the default encoding will be set to ASCII. Although the PEP is marked final, this isn't actually implemented. The warning about using non-ASCII characters started in 2.3. Does anyone think we shouldn't enforce the default being ASCII? This means that if a # -*- coding: ... -*- is not set and non-ASCII characters are used, an error will be generated.
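The rule Neal describes can be sketched as a one-line check (the helper below is illustrative, not code from the actual compiler change): a module needs a coding declaration exactly when its source contains any byte outside the 7-bit ASCII range.

```python
def needs_coding_declaration(source_bytes):
    """Illustrative check: would this module source trip a strict
    ASCII default, i.e. does it contain any non-ASCII byte?"""
    return any(byte > 127 for byte in bytearray(source_bytes))

# Pure-ASCII source is fine without a # -*- coding: ... -*- line:
print(needs_coding_declaration(b"s = 'cafe'"))     # False
# Latin-1 bytes in a literal would need an explicit declaration:
print(needs_coding_declaration(b"s = 'caf\xe9'"))  # True
```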
n From bencvt at gmail.com Tue Feb 28 00:50:28 2006 From: bencvt at gmail.com (Ben Cartwright) Date: Mon, 27 Feb 2006 18:50:28 -0500 Subject: [Python-Dev] str.count is slow In-Reply-To: <1141083127.970403.147100@v46g2000cwv.googlegroups.com> References: <1141073696.107136.318600@j33g2000cwa.googlegroups.com> <1141083127.970403.147100@v46g2000cwv.googlegroups.com> Message-ID: From comp.lang.python: chrisperkins99 at gmail.com wrote:

> It seems to me that str.count is awfully slow. Is there some reason
> for this? Evidence:
>
> ######## str.count time test ########
> import string
> import time
> import array
>
> s = string.printable * int(1e5)  # 10**7 character string
> a = array.array('c', s)
> u = unicode(s)
> RIGHT_ANSWER = s.count('a')
>
> def main():
>     print 'str:    ', time_call(s.count, 'a')
>     print 'array:  ', time_call(a.count, 'a')
>     print 'unicode:', time_call(u.count, 'a')
>
> def time_call(f, *a):
>     start = time.clock()
>     assert RIGHT_ANSWER == f(*a)
>     return time.clock()-start
>
> if __name__ == '__main__':
>     main()
>
> ###### end ########
>
> On my machine, the output is:
>
> str:     0.29365715475
> array:   0.448095498171
> unicode: 0.0243757237303
>
> If a unicode object can count characters so fast, why should an str
> object be ten times slower? Just curious, really - it's still fast
> enough for me (so far).
>
> This is with Python 2.4.1 on WinXP.
>
> Chris Perkins

Your evidence points to some unoptimized code in the underlying C implementation of Python. As such, this should probably go to the python-dev list (http://mail.python.org/mailman/listinfo/python-dev). The problem is that the C library function memcmp is slow, and str.count calls it frequently.
See lines 2165+ in stringobject.c (inside function string_count):

    r = 0;
    while (i < m) {
        if (!memcmp(s+i, sub, n)) {
            r++;
            i += n;
        } else {
            i++;
        }
    }

This could be optimized as:

    r = 0;
    while (i < m) {
        if (s[i] == *sub && !memcmp(s+i, sub, n)) {
            r++;
            i += n;
        } else {
            i++;
        }
    }

This tactic typically avoids most (sometimes all) of the calls to memcmp. Other string search functions, including unicode.count, unicode.index, and str.index, use this tactic, which is why you see unicode.count performing better than str.count. The above might be optimized further for cases such as yours, where a single character appears many times in the string:

    r = 0;
    if (n == 1) {
        /* optimize for a single character */
        while (i < m) {
            if (s[i] == *sub)
                r++;
            i++;
        }
    } else {
        while (i < m) {
            if (s[i] == *sub && !memcmp(s+i, sub, n)) {
                r++;
                i += n;
            } else {
                i++;
            }
        }
    }

Note that there might be some subtle reason why neither of these optimizations are done that I'm unaware of... in which case a comment in the C source would help. :-) --Ben

From tjreedy at udel.edu Tue Feb 28 01:07:40 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2006 19:07:40 -0500 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! References: <44033992.8040805@egenix.com> <44034818.7020101@benjiyork.com> Message-ID: "Benji York" wrote in message news:44034818.7020101 at benjiyork.com... >> http://msdn.microsoft.com/vstudio/express/default.aspx > > The express editions are only "free" until November 7th: > http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx#pricing One can keep using any version downloaded before that date, but I would not be surprised to see a bugfix sometime after. There is also this: " 2.What can I do with the Express Editions? ... Evaluate the .NET Framework for Windows and Web development. " and this " 13.Can I develop applications using the Visual Studio Express Editions to target the .NET Framework 1.1?
No, each release of Visual Studio is tied to a specific version of the .NET Framework. The Express Editions can only be used to create applications that run on the .NET Framework 2.0. " 'Free' is not always free. This appears to be a .NET 2 promotion. Perhaps the Firefox people are using the professional version, without such a limitation? Terry Jan Reedy From fredrik at pythonware.com Tue Feb 28 01:06:50 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 28 Feb 2006 01:06:50 +0100 Subject: [Python-Dev] str.count is slow References: <1141073696.107136.318600@j33g2000cwa.googlegroups.com><1141083127.970403.147100@v46g2000cwv.googlegroups.com> Message-ID: (manually cross-posting from comp.lang.python) Ben Cartwright wrote: > Your evidence points to some unoptimized code in the underlying C > implementation of Python. As such, this should probably go to the > python-dev list (http://mail.python.org/mailman/listinfo/python-dev). > This tactic typically avoids most (sometimes all) of the calls to > memcmp. Other string search functions, including unicode.count, > unicode.index, and str.index, use this tactic, which is why you see > unicode.count performing better than str.count. it's about time that someone sat down and merged the string and unicode implementations into a single "stringlib" code base (see the SRE sources for an efficient way to do this in plain C). [1] moving to (basic) C++ might also be a good idea (in 3.0, perhaps). is anyone still stuck with pure C89 these days ? 1) anyone want me to start working on this ? From tjreedy at udel.edu Tue Feb 28 01:09:34 2006 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2006 19:09:34 -0500 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! References: <44033992.8040805@egenix.com> Message-ID: "M.-A. Lemburg" wrote in message news:44033992.8040805 at egenix.com... > Note: This is just a thought - I haven't looked into the consequences > of building with VC8 yet, e.g.
from the list of pre-requisites, > it's possible that .NET 2.0 would become a requirement. From the FAQ (see other reply), it appears that this *is* a requirement for the Express editions. From martin at v.loewis.de Tue Feb 28 01:20:57 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 28 Feb 2006 01:20:57 +0100 Subject: [Python-Dev] str.count is slow In-Reply-To: References: <1141073696.107136.318600@j33g2000cwa.googlegroups.com><1141083127.970403.147100@v46g2000cwv.googlegroups.com> Message-ID: <1141086057.44039769919a3@www.domainfactory-webmail.de> Zitat von Fredrik Lundh : > it's about time that someone sat down and merged the string and unicode > implementations into a single "stringlib" code base (see the SRE sources for > an efficient way to do this in plain C). [1] [...] > 1) anyone want me to start working on this ? This would be a waste of time: In Python 3, the string type will be gone (or, rather, the unicode type, depending on the point of view). Regards, Martin From fredrik at pythonware.com Tue Feb 28 01:24:50 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 28 Feb 2006 01:24:50 +0100 Subject: [Python-Dev] str.count is slow References: <1141073696.107136.318600@j33g2000cwa.googlegroups.com><1141083127.970403.147100@v46g2000cwv.googlegroups.com> <1141086057.44039769919a3@www.domainfactory-webmail.de> Message-ID: martin at v.loewis.de wrote: > > it's about time that someone sat down and merged the string and unicode > > implementations into a single "stringlib" code base (see the SRE sources for > > an efficient way to do this in plain C). [1] > [...] > > 1) anyone want me to start working on this ? > > This would be a waste of time: In Python 3, the string type will be > gone (or, rather, the unicode type, depending on the point of view). no matter what ends up in Python 3, you'll still need to perform operations on both 8-bit buffers and Unicode buffers.
(not to mention that a byte type that doesn't support find/split/count etc is pretty useless). From martin at v.loewis.de Tue Feb 28 01:28:21 2006 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 28 Feb 2006 01:28:21 +0100 Subject: [Python-Dev] Switch to MS VC++ 2005 ?! In-Reply-To: References: <44033992.8040805@egenix.com> <44034818.7020101@benjiyork.com> Message-ID: <1141086501.4403992585357@www.domainfactory-webmail.de> Zitat von Terry Reedy : > " > 2.What can I do with the Express Editions? > ... > Evaluate the .NET Framework for Windows and Web development. > " > and this > > " Yes, but also this: """4. Can I use Express Editions for commercial use? Yes, there are no licensing restrictions for applications built using the Express Editions. """ > 13.Can I develop applications using the Visual Studio Express Editions to > target the .NET Framework 1.1? > No ... > 'Free' is not always free. This appears to be a .NET 2 promotion. Well, this is completely irrelevant for Python. Python does not use any .NET whatsoever (except for IronPython, of course). What framework version the C# links with is irrelevant for the to-native C compiler. > Perhaps the Firefox people are using the professional version, without such > a limitation? I guess the Express version can also build firefox, just fine. Regards, Martin From jeremy at alum.mit.edu Tue Feb 28 03:54:04 2006 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 27 Feb 2006 21:54:04 -0500 Subject: [Python-Dev] quick status report Message-ID: I made a few more minor revisions to the AST on the plane this afternoon. I'll check them in tomorrow when I get a chance to do a full test run. * Remove asdl_seq_APPEND. All uses replaced with set * Fix set_context() comments and check return value every where. 
* Reimplement real arena for pyarena.c Jeremy From tim.peters at gmail.com Tue Feb 28 05:48:44 2006 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 27 Feb 2006 22:48:44 -0600 Subject: [Python-Dev] Long-time shy failure in test_socket_ssl In-Reply-To: <1f7befae0601241553v10f7dd41i364e92b98f9de4a2@mail.gmail.com> References: <1f7befae0601241422sd783c37u5d676893de2f6055@mail.gmail.com> <1f7befae0601241553v10f7dd41i364e92b98f9de4a2@mail.gmail.com> Message-ID: <1f7befae0602272048x50d260b3i29819389709e0ab6@mail.gmail.com> [1/24/06, Tim Peters] >> ... >> test_rude_shutdown() is dicey, relying on a sleep() instead of proper >> synchronization to make it probable that the `listener` thread goes >> away before the main thread tries to connect, but while that race may >> account for bogus TestFailed deaths, it doesn't seem possible that it >> could account for the kind of failure above. [Tim Peters] > Well, since it's silly to try to guess about one weird failure when a > clear cause for another kind of weird failure is known, I checked in > changes to do "proper" thread synchronization and termination in that > test. Hasn't failed here since, but that's not surprising (it was > always a "once in a light blue moon" kind of thing). Neal plugged another hole later, but-- alas --I have seen the same shy failure since then on WinXP. One of the most recent buildbot test runs saw it too, on a non-Windows box: http://www.python.org/dev/buildbot/trunk/g5%20osx.3%20trunk/builds/204/step-test/0 test_socket_ssl test test_socket_ssl crashed -- exceptions.TypeError: 'NoneType' object is not callable in the second test run there. Still no theory! 
Maybe we can spend the next 3 days sprinting on it :-) From greg.ewing at canterbury.ac.nz Tue Feb 28 06:40:51 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 28 Feb 2006 18:40:51 +1300 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <4402F6A9.1030906@gmail.com> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz> <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1> <43FFA4B2.7060303@canterbury.ac.nz> <4402F6A9.1030906@gmail.com> Message-ID: <4403E263.1060505@canterbury.ac.nz> Nick Coghlan wrote: > I wouldn't mind seeing one of the early ideas from PEP 340 being resurrected > some day, such that the signature for the special method was "__next__(self, > input)" and for the builtin "next(iterator, input=None)" Aren't we getting an argument to next() anyway? Or was that idea dropped? Greg From greg.ewing at canterbury.ac.nz Tue Feb 28 06:43:20 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 28 Feb 2006 18:43:20 +1300 Subject: [Python-Dev] bytes.from_hex() In-Reply-To: <06Feb27.093841pst.58633@synergy1.parc.xerox.com> References: <06Feb27.093841pst.58633@synergy1.parc.xerox.com> Message-ID: <4403E2F8.2070106@canterbury.ac.nz> Bill Janssen wrote: > I use it quite a bit for image processing (converting to and from the > "data:" URL form), and various checksum applications (converting SHA > into a string). Aha! We have a customer! For those cases, would you find it more convenient for the result to be text or bytes in Py3k? 
Greg From greg.ewing at canterbury.ac.nz Tue Feb 28 07:03:06 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 28 Feb 2006 19:03:06 +1300 Subject: [Python-Dev] Pre-PEP: The "bytes" object In-Reply-To: References: <20060216025515.GA474@mems-exchange.org> <20060222212844.GA15221@mems-exchange.org> <43FFA908.6040208@ronadam.com> <43FFF7FF.9050301@ronadam.com> Message-ID: <4403E79A.4000306@canterbury.ac.nz> Jason Orendorff wrote: > I like these promises: > - bytes(arg) works like array.array('b', arg) > - bytes(arg1, arg2) works like bytes(arg1.encode(arg2)) +1. That's exactly how I think it should work, too. > I dislike these promises: > - bytes(s, [ignored]), where s is a str, works like array.array('b', s) > - bytes(u, [encoding]), where u is a unicode, > works like bytes(u.encode(encoding)) Agreed. Greg From greg.ewing at canterbury.ac.nz Tue Feb 28 07:38:38 2006 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 28 Feb 2006 19:38:38 +1300 Subject: [Python-Dev] str.count is slow In-Reply-To: References: <1141073696.107136.318600@j33g2000cwa.googlegroups.com> <1141083127.970403.147100@v46g2000cwv.googlegroups.com> Message-ID: <4403EFEE.80002@canterbury.ac.nz> Fredrik Lundh wrote: > moving to (basic) C++ might also be a good idea (in 3.0, perhaps). is any- > one still stuck with pure C89 these days ? Some of us actually *prefer* working with plain C when we have a choice, and don't consider ourselves "stuck" with it. My personal goal in life right now is to stay as far away from C++ as I can get. If CPython becomes C++-based (C++Python?) I will find it quite distressing, because my most favourite language will then be built on top of my least favourite language. 
Greg From fredrik at pythonware.com Tue Feb 28 08:45:49 2006 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 28 Feb 2006 08:45:49 +0100 Subject: [Python-Dev] str.count is slow References: <1141073696.107136.318600@j33g2000cwa.googlegroups.com><1141083127.970403.147100@v46g2000cwv.googlegroups.com> <4403EFEE.80002@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Fredrik Lundh wrote: > > > moving to (basic) C++ might also be a good idea (in 3.0, perhaps). is any- > > one still stuck with pure C89 these days ? > > Some of us actually *prefer* working with plain C > when we have a choice, and don't consider ourselves > "stuck" with it. perhaps, but polymorphic code is a lot easier to write in C++ than in C. > My personal goal in life right now is to stay as > far away from C++ as I can get. so what C compiler are you using ? From ncoghlan at gmail.com Tue Feb 28 10:03:29 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2006 19:03:29 +1000 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <4403E263.1060505@canterbury.ac.nz> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz> <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1> <43FFA4B2.7060303@canterbury.ac.nz> <4402F6A9.1030906@gmail.com> <4403E263.1060505@canterbury.ac.nz> Message-ID: <440411E1.6040401@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> I wouldn't mind seeing one of the early ideas from PEP 340 being >> resurrected some day, such that the signature for the special method >> was "__next__(self, input)" and for the builtin "next(iterator, >> input=None)" > > Aren't we getting an argument to next() anyway? > Or was that idea dropped? PEP 342 opted to extend the generator API instead (using "send") and leave the iterator protocol alone for the time being. Cheers, Nick. 
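For readers who haven't followed PEP 342: send() passes a value into a suspended generator, which is why the iterator protocol itself could stay unchanged. A minimal sketch, written with the modern next() builtin spelling (in 2.5 the priming call was spelled acc.next()):

```python
def accumulator():
    total = 0
    while True:
        value = yield total      # receives whatever the caller send()s
        if value is not None:
            total += value

acc = accumulator()
print(next(acc))      # 0 -- prime the generator up to the first yield
print(acc.send(10))   # 10
print(acc.send(5))    # 15
```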
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From mal at egenix.com Tue Feb 28 11:23:55 2006 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 28 Feb 2006 11:23:55 +0100 Subject: [Python-Dev] Making ascii the default encoding In-Reply-To: References: Message-ID: <440424BB.4040904@egenix.com> Neal Norwitz wrote: > PEP 263 states that in Phase 2 the default encoding will be set to > ASCII. Although the PEP is marked final, this isn't actually > implemented. The warning about using non-ASCII characters started in > 2.3. Does anyone think we shouldn't enforce the default being ASCII? > > This means if an # -*- coding: ... -*- is not set and non-ASCII > characters are used, an error will be generated. +1. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 28 2006) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From anthony at interlink.com.au Tue Feb 28 16:39:12 2006 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed, 1 Mar 2006 02:39:12 +1100 Subject: [Python-Dev] 2.4.3 for end of March? Message-ID: <200603010239.13342.anthony@interlink.com.au> So I'm planning a 2.4.3c1 around the 22nd-23rd of March, with a 2.4.3 final a week later. This will be the first release since the svn cutover, which should make things exciting. This is to get things cleared out before we start the cycle of pain - ahem - the 2.5 release cycle. A 2.4.4 would then follow when 2.5 final is done, hopefully October or so... Anyone have any screaming issues with this? Martin's ok to do the Windows release, and the doc build should be fine, too. 
Anthony -- Anthony Baxter It's never too late to have a happy childhood. From guido at python.org Tue Feb 28 18:02:55 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Feb 2006 11:02:55 -0600 Subject: [Python-Dev] defaultdict and on_missing() In-Reply-To: <440411E1.6040401@gmail.com> References: <20060222131328.nzferzk2xxk448cg@login.werra.lunarpages.com> <43FE2F5C.8090700@livinglogic.de> <327376ff0602231839h4c27c999la52039be37c28230@mail.gmail.com> <43FEAD58.2020208@canterbury.ac.nz> <015901c6395d$3b9fd4b0$2e2a960a@RaymondLaptop1> <43FFA4B2.7060303@canterbury.ac.nz> <4402F6A9.1030906@gmail.com> <4403E263.1060505@canterbury.ac.nz> <440411E1.6040401@gmail.com> Message-ID: On 2/28/06, Nick Coghlan wrote: > Greg Ewing wrote: > > Nick Coghlan wrote: > > > >> I wouldn't mind seeing one of the early ideas from PEP 340 being > >> resurrected some day, such that the signature for the special method > >> was "__next__(self, input)" and for the builtin "next(iterator, > >> input=None)" > > > > Aren't we getting an argument to next() anyway? > > Or was that idea dropped? > > PEP 342 opted to extend the generator API instead (using "send") and leave the > iterator protocol alone for the time being. One of the main reasons for this was the backwards compatibility problems at the C level. The C implementation doesn't take an argument. Adding an argument would cause all sorts of code breakage and possible segfaults (if there's 3rd party code calling tp_next for example). In 3.0 we could fix this. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Feb 28 18:34:35 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Feb 2006 11:34:35 -0600 Subject: [Python-Dev] with-statement heads-up Message-ID: I just realized that there's a bug in the with-statement as currently checked in. __exit__ is supposed to re-raise the exception if there was one; if it returns normally, the finally clause is NOT to re-raise it. 
The fix is relatively simple (I believe) but requires updating lots of unit tests. It'll be a while. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mbland at acm.org Tue Feb 28 18:52:04 2006 From: mbland at acm.org (Mike Bland) Date: Tue, 28 Feb 2006 09:52:04 -0800 Subject: [Python-Dev] with-statement heads-up In-Reply-To: References: Message-ID: <57ff0ed00602280952l556dfa98qb78cb04a72783a81@mail.gmail.com> On 2/28/06, Guido van Rossum wrote: > I just realized that there's a bug in the with-statement as currently > checked in. __exit__ is supposed to re-raise the exception if there > was one; if it returns normally, the finally clause is NOT to re-raise > it. The fix is relatively simple (I believe) but requires updating > lots of unit tests. It'll be a while. Hmm. My understanding was that __exit__ was *not* to reraise it, but was simply given the opportunity to record the exception-in-progress. Mike From guido at python.org Tue Feb 28 19:07:36 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Feb 2006 12:07:36 -0600 Subject: [Python-Dev] with-statement heads-up In-Reply-To: <57ff0ed00602280952l556dfa98qb78cb04a72783a81@mail.gmail.com> References: <57ff0ed00602280952l556dfa98qb78cb04a72783a81@mail.gmail.com> Message-ID: On 2/28/06, Mike Bland wrote: > On 2/28/06, Guido van Rossum wrote: > > I just realized that there's a bug in the with-statement as currently > > checked in. __exit__ is supposed to re-raise the exception if there > > was one; if it returns normally, the finally clause is NOT to re-raise > > it. The fix is relatively simple (I believe) but requires updating > > lots of unit tests. It'll be a while. > > Hmm. My understanding was that __exit__ was *not* to reraise it, but > was simply given the opportunity to record the exception-in-progress. Yes, that's what the PEP said. 
:-( Unfortunately the way the PEP is specified, the intended equivalence between writing a try/except in a @contextmanager-decorated generator and writing things out explicitly is lost. The plan was that this:

@contextmanager
def foo():
    try:
        yield
    except Exception:
        pass

with foo():
    1/0

would be equivalent to this:

try:
    1/0
except Exception:
    pass

IOW

with GENERATOR():
    BLOCK

becomes a macro call, and GENERATOR() becomes a macro definition; its body is the macro expansion with "yield" replaced by BLOCK. But in order to get those semantics, it must be possible for __exit__() to signal that the exception passed into it should *not* be re-raised. The current expansion uses roughly this:

finally:
    ctx.__exit__(*exc)

and here the finally clause will re-raise the exception (if there was one). I ran into this when writing unit tests for @contextmanager. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mbland at acm.org Tue Feb 28 19:25:41 2006 From: mbland at acm.org (Mike Bland) Date: Tue, 28 Feb 2006 10:25:41 -0800 Subject: [Python-Dev] with-statement heads-up In-Reply-To: References: <57ff0ed00602280952l556dfa98qb78cb04a72783a81@mail.gmail.com> Message-ID: <57ff0ed00602281025l79e96f69wf5b3268d7d3aff1d@mail.gmail.com> On 2/28/06, Guido van Rossum wrote: > On 2/28/06, Mike Bland wrote: > > On 2/28/06, Guido van Rossum wrote: > > > I just realized that there's a bug in the with-statement as currently > > > checked in. __exit__ is supposed to re-raise the exception if there > > > was one; if it returns normally, the finally clause is NOT to re-raise > > > it. The fix is relatively simple (I believe) but requires updating > > > lots of unit tests. It'll be a while. > > > > Hmm. My understanding was that __exit__ was *not* to reraise it, but > > was simply given the opportunity to record the exception-in-progress. > > Yes, that's what the PEP said.
:-( > > Unfortunately the way the PEP is specified, the intended equivalence > between writing a try/except in a @contextmanager-decorated generator > and writing things out explicitly is lost. The plan was that this: > > @contextmanager > def foo(): > try: > yield > except Exception: > pass > > with foo(): > 1/0 > > would be equivalent to this: > > try: > 1/0 > except Exception: > pass > > IOW > > with GENERATOR(): > BLOCK > > becomes a macro call, and GENERATOR() becomes a macro definition; its > body is the macro expansion with "yield" replaced by BLOCK. But in > order to get those semantics, it must be possible for __exit__() to > signal that the exception passed into it should *not* be re-raised. > > The current expansion uses roughly this: > > finally: > ctx.__exit__(*exc) > > and here the finally clause will re-raise the exception (if there was one). > > I ran into this when writing unit tests for @contextmanager. This may just be my inexperience talking, and I don't have the code in front of me right this moment, but in my mind these semantics would simplify the original version of my patch, as we wouldn't have to juggle the stack at all. (Other than rotating the three exception objects, that is). We could then just pass the exception objects into __exit__ without having to leave a copy on the stack, and could forego the END_FINALLY. (I *think*.) Does that make sense? 
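Mike's reading can be checked against a pure-Python model of the expansion. The sketch below models the semantics the discussion converged on — __exit__ sees the exception info, and a false return value means "re-raise" — rather than the exact 2006-era bytecode; run_with is a hypothetical helper, not a real API:

```python
import sys

def run_with(manager, body):
    # Hypothetical model of "with manager: body(value)" under the
    # semantics being debated: __exit__ receives the exception info,
    # and a false return value means the exception is re-raised.
    value = manager.__enter__()
    try:
        body(value)
    except BaseException:
        if not manager.__exit__(*sys.exc_info()):
            raise
    else:
        manager.__exit__(None, None, None)

class Tolerant:
    def __enter__(self):
        return self
    def __exit__(self, etype, evalue, tb):
        return etype is ZeroDivisionError  # suppress only this one

run_with(Tolerant(), lambda _: 1 / 0)  # suppressed
print("survived")
```

Any other exception type makes __exit__ return False, so the bare raise in run_with re-raises it — no stack juggling needed in the model, which matches Mike's intuition.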
Mike From guido at python.org Tue Feb 28 20:07:14 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Feb 2006 13:07:14 -0600 Subject: [Python-Dev] with-statement heads-up In-Reply-To: <57ff0ed00602281025l79e96f69wf5b3268d7d3aff1d@mail.gmail.com> References: <57ff0ed00602280952l556dfa98qb78cb04a72783a81@mail.gmail.com> <57ff0ed00602281025l79e96f69wf5b3268d7d3aff1d@mail.gmail.com> Message-ID: On 2/28/06, Mike Bland wrote: > On 2/28/06, Guido van Rossum wrote: > > On 2/28/06, Mike Bland wrote: > > > On 2/28/06, Guido van Rossum wrote: > > > > I just realized that there's a bug in the with-statement as currently > > > > checked in. __exit__ is supposed to re-raise the exception if there > > > > was one; if it returns normally, the finally clause is NOT to re-raise > > > > it. The fix is relatively simple (I believe) but requires updating > > > > lots of unit tests. It'll be a while. > > > > > > Hmm. My understanding was that __exit__ was *not* to reraise it, but > > > was simply given the opportunity to record the exception-in-progress. > > > > Yes, that's what the PEP said. :-( > > > > Unfortunately the way the PEP is specified, the intended equivalence > > between writing a try/except in a @contextmanager-decorated generator > > and writing things out explicitly is lost. The plan was that this: > > > > @contextmanager > > def foo(): > > try: > > yield > > except Exception: > > pass > > > > with foo(): > > 1/0 > > > > would be equivalent to this: > > > > try: > > 1/0 > > except Exception: > > pass > > > > IOW > > > > with GENERATOR(): > > BLOCK > > > > becomes a macro call, and GENERATOR() becomes a macro definition; its > > body is the macro expansion with "yield" replaced by BLOCK. But in > > order to get those semantics, it must be possible for __exit__() to > > signal that the exception passed into it should *not* be re-raised. 
> > > > The current expansion uses roughly this: > > > > finally: > > ctx.__exit__(*exc) > > > > and here the finally clause will re-raise the exception (if there was one). > > > > I ran into this when writing unit tests for @contextmanager. > > This may just be my inexperience talking, and I don't have the code in > front of me right this moment, but in my mind these semantics would > simplify the original version of my patch, as we wouldn't have to > juggle the stack at all. (Other than rotating the three exception > objects, that is). We could then just pass the exception objects into > __exit__ without having to leave a copy on the stack, and could forego > the END_FINALLY. (I *think*.) Does that make sense? Yes, it does. Except there's yet another wrinkle: non-local gotos (break, continue, return). The special WITH_CLEANUP opcode that I added instead of your ROT4 magic now considers the following cases: - if the "exception indicator" is None or an int, leave it, and push three Nones - otherwise, replace the exception indicator and the two elements below it with a single None (thus reducing the stack level by 2), *and* push the exception indicator and those two elements back onto the stack, in reverse order. To clarify, let's look at the four cases. 
I'm drawing the stack top on the right:

(return or continue; the int is WHY_RETURN or WHY_CONTINUE)
BEFORE: retval; int; __exit__
AFTER: retval; int; __exit__; None; None; None

(break; the int is WHY_BREAK)
BEFORE: int; __exit__
AFTER: int; __exit__; None; None; None

(no exception)
BEFORE: None; __exit__
AFTER: None; __exit__; None; None; None

(exception)
BEFORE: traceback; value; type; __exit__
AFTER: None; __exit__; type; value; traceback

The code generated in the finally clause looks as follows:

WITH_CLEANUP (this does the above transform)
CALL_FUNCTION 3 (calls __exit__ with three arguments)
POP_TOP (throws away the result)
END_FINALLY (interprets the int or None now on top appropriately)

Hope this helps (if not you, future generations :-). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Tue Feb 28 22:57:52 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 01 Mar 2006 07:57:52 +1000 Subject: [Python-Dev] with-statement heads-up In-Reply-To: References: Message-ID: <4404C760.3060809@gmail.com> Guido van Rossum wrote: > I just realized that there's a bug in the with-statement as currently > checked in. __exit__ is supposed to re-raise the exception if there > was one; if it returns normally, the finally clause is NOT to re-raise > it. The fix is relatively simple (I believe) but requires updating > lots of unit tests. It'll be a while. So does that mean with statements *will* be able to suppress exceptions now? (If I'm reading the PEP changes right it does, but I haven't finished my coffee yet. . .) I'm not complaining if that's so, as I think allowing it makes the operation of the statement both more useful and more intuitive, but you were originally concerned about the potential for hidden flow control if the context manager could suppress exceptions, as well as the need to remember to write "raise" in the except clauses of context managers.
If you changed your mind along the way, that should probably be explained in the PEP somewhere :) Cheers, Nick -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From guido at python.org Tue Feb 28 23:01:49 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Feb 2006 16:01:49 -0600 Subject: [Python-Dev] with-statement heads-up In-Reply-To: <4404C760.3060809@gmail.com> References: <4404C760.3060809@gmail.com> Message-ID: On 2/28/06, Nick Coghlan wrote: > Guido van Rossum wrote: > > I just realized that there's a bug in the with-statement as currently > > checked in. __exit__ is supposed to re-raise the exception if there > > was one; if it returns normally, the finally clause is NOT to re-raise > > it. The fix is relatively simple (I believe) but requires updating > > lots of unit tests. It'll be a while. > > So does that mean with statements *will* be able to suppress exceptions now? > (If I'm reading the PEP changes right it does, but I haven't finished my > coffee yet. . .) Yes. And unless there are peasants at the gate with pitchforks etc. it will stay that way. :-) > I'm not complaining if that's so, as I think allowing it makes the operation > of the statement both more useful and more intuitive, but you were originally > concerned about the potential for hidden flow control if the context manager > could suppress exceptions, as well as the need to remember to write "raise" in > the except clauses of context managers. Yes, I've changed my mind about that. > If you changed your mind along the way, that should probably be explained in > the PEP somewhere :) I don't know that PEPs benefit from too much "on the one hand, on the other hand, on the third hand" or "and then I changed my mind, and then I changed it back, and then I changed it again". 
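Guido's foo() example works exactly this way in the contextlib that eventually shipped: the exception is thrown into the generator at the yield, and a generator that swallows it suppresses it in the with block. A runnable version:

```python
from contextlib import contextmanager

@contextmanager
def foo():
    try:
        yield
    except Exception:
        pass  # catching here suppresses the error in the with block

with foo():
    1 / 0  # ZeroDivisionError is thrown into the generator and caught

print("reached: the exception was suppressed")
```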
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Tue Feb 28 23:12:50 2006 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 01 Mar 2006 08:12:50 +1000 Subject: [Python-Dev] with-statement heads-up In-Reply-To: References: <4404C760.3060809@gmail.com> Message-ID: <4404CAE2.2020305@gmail.com> Guido van Rossum wrote: >> If you changed your mind along the way, that should probably be explained in >> the PEP somewhere :) > > I don't know that PEPs benefit from too much "on the one hand, on the > other hand, on the third hand" or "and then I changed my mind, and > then I changed it back, and then I changed it again". Heh :) Plus the SVN history and the mailing list archive already provide that record. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From pje at telecommunity.com Tue Feb 28 23:29:16 2006 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 28 Feb 2006 16:29:16 -0600 Subject: [Python-Dev] with-statement heads-up In-Reply-To: References: <4404C760.3060809@gmail.com> Message-ID: <7.0.1.0.0.20060228161944.01ffc918@telecommunity.com> At 04:01 PM 2/28/2006, Guido van Rossum wrote: >On 2/28/06, Nick Coghlan wrote: > > Guido van Rossum wrote: > > > I just realized that there's a bug in the with-statement as currently > > > checked in. __exit__ is supposed to re-raise the exception if there > > > was one; if it returns normally, the finally clause is NOT to re-raise > > > it. The fix is relatively simple (I believe) but requires updating > > > lots of unit tests. It'll be a while. > > > > So does that mean with statements *will* be able to suppress > exceptions now? > > (If I'm reading the PEP changes right it does, but I haven't finished my > > coffee yet. . .) > >Yes. And unless there are peasants at the gate with pitchforks etc. it >will stay that way. 
:-) Notice that these semantics break some of the PEP examples. For example, the 'locked' and 'nested' classes now suppress exceptions, and it took a non-trivial study of their code to determine this. This seems to suggest that making suppression the default behavior is a bad idea. I was originally on the side of allowing suppression, but I wanted it to be done by explicitly returning some non-None value, so that suppression would not be the default, implicit behavior. I think I'd prefer not to be able to suppress the errors, than to have errors pass silently unless explicitly re-raised! I don't see a problem with having e.g. __exit__ have to return a flag to suppress the exception; it wouldn't need to change how @contextmanager functions are written. (Implicit suppression is only a problem for people writing __exit__ methods, in other words; all your reasoning about @contextmanager generators is valid, IMO.) From guido at python.org Tue Feb 28 23:36:26 2006 From: guido at python.org (Guido van Rossum) Date: Tue, 28 Feb 2006 16:36:26 -0600 Subject: [Python-Dev] with-statement heads-up In-Reply-To: <7.0.1.0.0.20060228161944.01ffc918@telecommunity.com> References: <4404C760.3060809@gmail.com> <7.0.1.0.0.20060228161944.01ffc918@telecommunity.com> Message-ID: On 2/28/06, Phillip J. Eby wrote: > Notice that these semantics break some of the PEP examples. For > example, the 'locked' and 'nested' classes now suppress exceptions, > and it took a non-trivial study of their code to determine > this. This seems to suggest that making suppression the default > behavior is a bad idea. I presume you're referring to example 4 (locked as a class), not example 1 (locked as a generator). I'll fix this, and rewrite nested() as a generator (just like what I checked in :-). > I was originally on the side of allowing suppression, but I wanted it > to be done by explicitly returning some non-None value, so that > suppression would not be the default, implicit behavior. 
I think I'd > prefer not to be able to suppress the errors, than to have errors > pass silently unless explicitly re-raised! I don't see a problem > with having e.g. __exit__ have to return a flag to suppress the > exception; it wouldn't need to change how @contextmanager functions > are written. (Implicit suppression is only a problem for people > writing __exit__ methods, in other words; all your reasoning about > @contextmanager generators is valid, IMO.) Thanks for the validation of the idea -- I ran into this when writing unittests for @contextmanager... I think that providing sufficient *correct* examples will avoid most of the problems. People will clone existing examples (I know *I* did when adding context managers to various modules :-). Changing the API to let __exit__() return something true to suppress the exception seems somewhat clumsy. Re-raising the exception is analogous to the throw() method in PEP 342. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
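Worth noting for the archive: the flag Phillip argues for here is what PEP 343 finally adopted — __exit__ returning a true value suppresses the exception, while returning None (the default) lets it propagate. A minimal sketch of that shipped protocol, with an illustrative manager class:

```python
class SuppressKeyError:
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        # A true return value tells the interpreter to swallow the
        # exception; None or False lets it re-raise.
        return exc_type is KeyError

with SuppressKeyError():
    raise KeyError("swallowed")
print("still running")

try:
    with SuppressKeyError():
        raise ValueError("not ours")
except ValueError:
    print("ValueError propagated")
```

Suppression thus never happens by accident: an __exit__ that simply records the exception and falls off the end returns None, and the error propagates.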