From nnorwitz at gmail.com Tue Apr 1 06:25:24 2008
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 31 Mar 2008 21:25:24 -0700
Subject: [Python-3000] refleaks
Message-ID:

The current refleaks for 3k are:

test_compile leaked [10, 10, 10] references, sum=30
test_io leaked [21, 21, 21] references, sum=63
test_itertools leaked [4, 4, 4] references, sum=12
test_queue leaked [995, 996, 996] references, sum=2987

When running the refleak hunter, 4 tests failed:
    test_codecs test_collections test_profile test_tcl

test_tcl can't run properly IIRC, but I think the other 3 should be
able to run with -R.

n

From nnorwitz at gmail.com Tue Apr 1 09:41:40 2008
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 1 Apr 2008 00:41:40 -0700
Subject: [Python-3000] refleaks
In-Reply-To:
References:
Message-ID:

I fixed the itertools refleak.

test_compile leaks due to code like this:

class J:
    def foo():
        class Bar: pass

I thought Amaury fixed that problem already?

n

On Mon, Mar 31, 2008 at 9:25 PM, Neal Norwitz wrote:
> The current refleaks for 3k are:
>
> test_compile leaked [10, 10, 10] references, sum=30
> test_io leaked [21, 21, 21] references, sum=63
> test_itertools leaked [4, 4, 4] references, sum=12
> test_queue leaked [995, 996, 996] references, sum=2987
>
> When running the refleak hunter, 4 tests failed:
>     test_codecs test_collections test_profile test_tcl
>
> test_tcl can't run properly IIRC, but I think the other 3 should be
> able to run with -R.
>
> n
>

From amauryfa at gmail.com Tue Apr 1 09:58:47 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 1 Apr 2008 09:58:47 +0200
Subject: [Python-3000] refleaks
In-Reply-To:
References:
Message-ID:

On Tue, Apr 1, 2008 at 9:41 AM, Neal Norwitz wrote:
> I fixed the itertools refleak.
>
> test_compile leaks due to code like this:
>
> class J:
>     def foo():
>         class Bar: pass
>
> I thought Amaury fixed that problem already?

http://mail.python.org/pipermail/python-3000-checkins/2008-March/003205.html
says:
> Blocked revisions 62015 via svnmerge
> This was apparently fixed in r54428 already

Which is not exact; the lines in compile.c (around line 1552)

    /* use the class name for name mangling */
    Py_INCREF(s->v.ClassDef.name);
    c->u->u_private = s->v.ClassDef.name;

should be changed to

    /* use the class name for name mangling */
    Py_INCREF(s->v.ClassDef.name);
    Py_XDECREF(c->u->u_private);
    c->u->u_private = s->v.ClassDef.name;

I agree that the merge was complicated: compile.c changed a lot during r54428.
Maybe an argument in favor of the MYOC pattern?
("Merge Your Own Code":
http://www.cmcrossroads.com/bradapp/acme/branching/branch-policy.html#MYOC )

--
Amaury Forgeot d'Arc

From nnorwitz at gmail.com Tue Apr 1 10:10:28 2008
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 1 Apr 2008 01:10:28 -0700
Subject: [Python-3000] refleaks
In-Reply-To:
References:
Message-ID:

On Tue, Apr 1, 2008 at 12:58 AM, Amaury Forgeot d'Arc wrote:
> On Tue, Apr 1, 2008 at 9:41 AM, Neal Norwitz wrote:
> > I fixed the itertools refleak.
> >
> > test_compile leaks due to code like this:
> >
> > class J:
> >     def foo():
> >         class Bar: pass
> >
> > I thought Amaury fixed that problem already?
>
> http://mail.python.org/pipermail/python-3000-checkins/2008-March/003205.html
> says:
> > Blocked revisions 62015 via svnmerge
> > This was apparently fixed in r54428 already

[solution]

This fixed the problem in test_compile. Committed revision 62089.

test_io and test_queue are still leaking.

n

From van.lindberg at gmail.com Tue Apr 1 13:54:45 2008
From: van.lindberg at gmail.com (VanL)
Date: Tue, 01 Apr 2008 06:54:45 -0500
Subject: [Python-3000] 3to2
Message-ID:

I know there has been some discussion of a 3to2 tool for easing porting.
The PyPy team has created at least the start of such a tool: "Under the hood, the 2to3 conversion tool operates as a graph transformer: it takes the graph of your program (in the form of Python 2.x source file) and returns a transformed graph of the same program (in the form of Python 3.0 source file). Since the entire translation toolchain of PyPy is based on graph transformations, we could reuse it to modify the behaviour of the 2to3 tool. We wrote a general graph-inverter algorithm which, as the name suggests, takes a graph transformation and build the inverse transformation; then, we applied the graph inverter to 2to3, getting something that we called 3to2: it is important to underline that 3to2 was built by automatically analysing 2to3 and reversing its operation with only the help of a few manual hints. For this reason and because we are not keeping generated files under version control, we do not need to maintain this new tool in the Subversion repository. Once we built 3to2, it was relatively easy to pipe its result to our interpreter, getting something that can run Python 3.0 programs." From http://morepypy.blogspot.com/2008/04/trying-to-get-pypy-to-run-on-python-30.html Thanks, Van From showell30 at yahoo.com Tue Apr 1 16:38:07 2008 From: showell30 at yahoo.com (Steve Howell) Date: Tue, 1 Apr 2008 07:38:07 -0700 (PDT) Subject: [Python-3000] problems with the 3to2 converter Message-ID: <553490.64821.qm@web33507.mail.mud.yahoo.com> I've written about 100,000 lines of Py3K code since it was released, mostly on evenings and weekends, so I was very excited to see Van release the new 3to2 tool today. I immediately ran it against my codebase, and it mostly works, but I got some strange diagnostics: line 673234: lambda cannot be renamed in ANY temporal dimension line 782121: grammar reduced to LL(0), turn on -0 flag for further simplification line 913975: parens not removed from print(), please use 3to4 converter instead Thoughts? 
From collinw at gmail.com Tue Apr 1 17:30:46 2008
From: collinw at gmail.com (Collin Winter)
Date: Tue, 1 Apr 2008 08:30:46 -0700
Subject: [Python-3000] problems with the 3to2 converter
In-Reply-To: <553490.64821.qm@web33507.mail.mud.yahoo.com>
References: <553490.64821.qm@web33507.mail.mud.yahoo.com>
Message-ID: <43aa6ff70804010830n59c055f6ha9e23eba23afbf25@mail.gmail.com>

On Tue, Apr 1, 2008 at 7:38 AM, Steve Howell wrote:
> I've written about 100,000 lines of Py3K code since it
> was released, mostly on evenings and weekends, so I
> was very excited to see Van release the new 3to2 tool
> today.
>
> I immediately ran it against my codebase, and it
> mostly works, but I got some strange diagnostics:
>
> line 673234: lambda cannot be renamed in ANY
> temporal dimension
>
> line 782121: grammar reduced to LL(0), turn on -0
> flag for further simplification
>
> line 913975: parens not removed from print(),
> please use 3to4 converter instead
>
> Thoughts?

Are you talking about the 3to2 written by the PyPy people
(http://morepypy.blogspot.com/2008/04/trying-to-get-pypy-to-run-on-python-30.html)?
You should ask them, since that's their project. (2to3 is the one
maintained by python-dev.)

Collin Winter

From amauryfa at gmail.com Tue Apr 1 18:04:58 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 1 Apr 2008 18:04:58 +0200
Subject: [Python-3000] TypeError: expected bytes, str found
In-Reply-To:
References:
Message-ID:

Vizcayno wrote:
> I am doing some testing using Python r30a3:61161 under command prompt
> of WinXp SP2.
> Is this possible to find an explanation about next error? I tried to
> find the error message in the web but no info exists and, can not
> isolate or reproduce it.
> Many, many thanks for your attention.
>
> Traceback (most recent call last):
>   File "testconn.py", line 112, in <module>
>     main(sys.argv)
>   File "testconn.py", line 100, in main
>     sap.sapinfo()
>   File "C:\os\sapconn\saprfc_py30\saprfc.py", line 142, in sapinfo
>     print("Aqui estoy")
>   File "C:\python30\lib\io.py", line 1248, in write
>     self.buffer.write(b)
>   File "C:\python30\lib\io.py", line 852, in write
>     if len(self._write_buf) > self.buffer_size:
> TypeError: expected bytes, str found

Did you by any chance redirect stdout or stderr to something?
cStringIO, for example.

--
Amaury Forgeot d'Arc

From van.lindberg at gmail.com Tue Apr 1 20:42:25 2008
From: van.lindberg at gmail.com (VanL)
Date: Tue, 01 Apr 2008 13:42:25 -0500
Subject: [Python-3000] problems with the 3to2 converter
In-Reply-To: <553490.64821.qm@web33507.mail.mud.yahoo.com>
References: <553490.64821.qm@web33507.mail.mud.yahoo.com>
Message-ID:

Steve Howell wrote:
> I've written about 100,000 lines of Py3K code since it
> was released, mostly on evenings and weekends, so I
> was very excited to see Van release the new 3to2 tool
> today.

A point of clarification: I did not release anything. I was simply
pointing out someone else's work that could be relevant to this list.
It is the PyPy team that has created this tool, and deserves all the
credit.

Thanks,

Van

From solipsis at pitrou.net Tue Apr 1 20:55:43 2008
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 1 Apr 2008 18:55:43 +0000 (UTC)
Subject: [Python-3000] problems with the 3to2 converter
References: <553490.64821.qm@web33507.mail.mud.yahoo.com>
Message-ID:

Steve Howell <showell30 at yahoo.com> writes:
>
> line 673234: lambda cannot be renamed in ANY
> temporal dimension
>
[...]
>
> line 913975: parens not removed from print(),
> please use 3to4 converter instead

Mmmh... "3to2" was released on April 1st right ?
:)

From musiccomposition at gmail.com Tue Apr 1 22:46:30 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Tue, 1 Apr 2008 15:46:30 -0500
Subject: [Python-3000] IO __all__
Message-ID: <1afaf6160804011346i4e3b4a49hf3e08e9f3a8461f2@mail.gmail.com>

Is there a reason io.open is in the __all__? It seems to me it would be
redundant and confusing to import a builtin.

--
Cheers,
Benjamin Peterson

From dbpokorny at gmail.com Tue Apr 1 23:24:34 2008
From: dbpokorny at gmail.com (David Pokorny)
Date: Tue, 1 Apr 2008 14:24:34 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
Message-ID:

Hi Py3k,

I just started using Python 3000 (for my own projects, nothing
production-y, and mostly for the class decorators and function
annotations) but ever since I noticed PEP 3106, the "spooky" behavior
of dict views has been bothering me...I wouldn't say it has been
keeping me up late at night, but I can't seem to shake this nagging
sensation that it is only a matter of time before I return
dict.items(), .keys(), or .values() from a function while not all of
my wits are about me and end up with some impossible-to-trace
algorithmic error because I wrote mydict.items() instead of
list(mydict.items()).

This is what I'm talking about:

>>> mydict = {1:2}
>>> myitems = mydict.items()
>>> [x for x in myitems]
[(1, 2)]
>>> mydict[3] = 'foo'
>>> [x for x in myitems]
[(1, 2), (3, 'foo')]

What really bugs me about this state of affairs is that I consider the
python 2 dict.items() to be safe and free of surprises, but I no
longer feel the same way about it in 3; this is really about the fact
that when you want to get the items, keys, or values of a dict, the
simplest thing is no longer the safest thing.
(I don't want to belabor this point too much since it isn't really my place to judge, but to me, dict views feel like they are a "special case that breaks the rules.") In python 2 we had dict.iteritems(), and I never had any negative feelings about it because I always considered it "more complicated" than dict.items(), and I almost never used it. Every time I see dict.iteritems() I think, "OK, this is a lightweight iterator, it better get used up before the dict changes." If dict.items() had stayed the same function but we lost dict.iteritems() and gained dict.itemsview(), I wouldn't have written this message. At any rate, my apologies for the long message, but I've been sitting on this for a while, and I just felt the need to get this out. Thanks for reading, David Pokorny From martin at v.loewis.de Wed Apr 2 00:15:42 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 02 Apr 2008 00:15:42 +0200 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: Message-ID: <47F2B40E.2080304@v.loewis.de> > What really bugs me about this state of affairs is that I consider the > python 2 dict.items() to be safe and free of surprises, but I no > longer feel the same way about it in 3; this is really about the fact > that when you want to get the items, keys, or values of a dict, the > simplest thing is no longer the safest thing. (I don't want to belabor > this point too much since it isn't really my place to judge, but to > me, dict views feel like they are a "special case that breaks the > rules.") I feel to the contrary. 2.x .keys() was not safe, but 3.x keys() is. When I iterate over the keys of a dictionary, I want all of them, and I want only the keys. With 2.x, it could always happen that the dictionary changes "behind me", and then I'd either iterate over not all of the keys, or see some keys that aren't actually in the dictionary anymore. With 3.x dictionary views, it's much safer now. 
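
A small sketch of the stale-snapshot point above, assuming Python 3.x; the
dictionary contents and variable names are made up for illustration and do
not come from this thread. It contrasts a list snapshot (what 2.x .keys()
returned) with a 3.x view, and shows that resizing a dict while iterating
over a view fails with an exception instead of silently skipping keys:

```python
d = {"a": 1, "b": 2}

snapshot = list(d.keys())  # static copy, like the list 2.x .keys() returned
view = d.keys()            # 3.x view, always reflects the dict

del d["a"]
d["c"] = 3

# The snapshot is now out of date in both directions.
stale = [k for k in snapshot if k not in d]    # listed, but gone from d
missing = [k for k in d if k not in snapshot]  # in d, but never listed

# The view tracks the dict.
current = sorted(view)

# Changing the dict's size while iterating over a view raises RuntimeError.
raised = False
try:
    for k in d.keys():
        d[k + "_copy"] = 0
except RuntimeError:
    raised = True

print(stale, missing, current, raised)
```

Whether that counts as "safer" is exactly what the thread is arguing about;
the sketch only shows the mechanics.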
Regards,
Martin

From p.f.moore at gmail.com Wed Apr 2 00:25:50 2008
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 1 Apr 2008 23:25:50 +0100
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To: <47F2B40E.2080304@v.loewis.de>
References: <47F2B40E.2080304@v.loewis.de>
Message-ID: <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com>

On 01/04/2008, "Martin v. Löwis" wrote:
> > What really bugs me about this state of affairs is that I consider the
> > python 2 dict.items() to be safe and free of surprises, but I no
> > longer feel the same way about it in 3; this is really about the fact
> > that when you want to get the items, keys, or values of a dict, the
> > simplest thing is no longer the safest thing. (I don't want to belabor
> > this point too much since it isn't really my place to judge, but to
> > me, dict views feel like they are a "special case that breaks the
> > rules.")
>
>
> I feel to the contrary. 2.x .keys() was not safe, but 3.x keys() is.
> When I iterate over the keys of a dictionary, I want all of them,
> and I want only the keys. With 2.x, it could always happen that the
> dictionary changes "behind me", and then I'd either iterate over
> not all of the keys, or see some keys that aren't actually in the
> dictionary anymore. With 3.x dictionary views, it's much safer now.

The oddity with the 3.x keys() is that it's plausible to retain a
reference to d.keys(), but if you do so it can change if you alter d.
In 2.x, d.keys() is static and d.iterkeys() is (for all practical
purposes) not something you retain. The 3.x d.keys() "action at a
distance" is unfamiliar - I can't think of an example of this type of
view semantics in 2.x (at least in the core - numpy has had this for
some time, I believe).

I suspect that view semantics will become less surprising over time,
but I think it's a fair point that it's something new to get used to.

Paul.
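
The "action at a distance" described above shows up most readily when a
view escapes from a function. A rough sketch, assuming Python 3.x;
live_items, frozen_items and config are hypothetical names used only for
illustration:

```python
def live_items(d):
    """Return the items view; it keeps tracking later changes to d."""
    return d.items()


def frozen_items(d):
    """Return a snapshot list; later changes to d do not affect it."""
    return list(d.items())


config = {"host": "localhost"}
live = live_items(config)
frozen = frozen_items(config)

config["port"] = 8080  # mutate after both results were handed out

live_now = sorted(live)  # the view sees the new entry
frozen_now = frozen      # the snapshot does not

# A set operation on a keys view builds a new, independent set.
common = config.keys() & {"port", "user"}
del config["port"]       # common is unaffected, live shrinks again

print(live_now, frozen_now, common, len(live))
```

Wrapping the view in list() is the explicit way to get the static 2.x
behaviour back, at the cost of a copy.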
From martin at v.loewis.de Wed Apr 2 00:47:56 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 02 Apr 2008 00:47:56 +0200 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> Message-ID: <47F2BB9C.1090903@v.loewis.de> > I suspect that view semantics will become less surprising over time, > but I think it's a fair point that it's something new to get used to. I don't doubt that it is surprising. I object to calling it unsafe. Regards, Martin From musiccomposition at gmail.com Wed Apr 2 00:53:01 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 1 Apr 2008 17:53:01 -0500 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> Message-ID: <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> On Tue, Apr 1, 2008 at 5:25 PM, Paul Moore wrote: > On 01/04/2008, "Martin v. L?wis" wrote: > > > What really bugs me about this state of affairs is that I consider the > > > python 2 dict.items() to be safe and free of surprises, but I no > > > longer feel the same way about it in 3; this is really about the fact > > > that when you want to get the items, keys, or values of a dict, the > > > simplest thing is no longer the safest thing. (I don't want to > belabor > > > this point too much since it isn't really my place to judge, but to > > > me, dict views feel like they are a "special case that breaks the > > > rules.") > > > > > > I feel to the contrary. 2.x .keys() was not safe, but 3.x keys() is. > > When I iterate over the keys of a dictionary, I want all of them, > > and I want only the keys. 
With 2.x, it could always happen that the
> > dictionary changes "behind me", and then I'd either iterate over
> > not all of the keys, or see some keys that aren't actually in the
> > dictionary anymore. With 3.x dictionary views, it's much safer now.
>
> The oddity with the 3.x keys() is that it's plausible to retain a
> reference to d.keys(), but if you do so it can change if you alter d.
> In 2.x, d.keys() is static and d.iterkeys() is (for all practical
> purposes) not something you retain. The 3.x d.keys() "action at a
> distance" is unfamiliar - I can't think of an example of this type of
> view semantics in 2.x (at least in the core - numpy has had this for
> some time, I believe).
>
> I suspect that view semantics will become less surprising over time,
> but I think it's a fair point that it's something new to get used to.

I personally find it less surprising. It seems logical to me that whatever
you get from dict.items() should reflect the current state of the dictionary.
When you want it static, it's better to be explicit and say
list(dict.keys()) or list(dict.values()).

> Paul.

--
Cheers,
Benjamin Peterson
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20080401/3282c730/attachment-0001.htm

From guido at python.org Wed Apr 2 01:40:57 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 Apr 2008 16:40:57 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To: <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com>
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com>
Message-ID:

I'm not sure that it's really a safe vs. non-safe issue. The OP's
concern is that the change affects behavior of keys() and friends that
people have internalized for the past 18 years. I certainly don't see
this as a reason to change it back (we knew this would be the case). I
do think it needs to be emphasized in every "What's New" doc. It's
already in the list of Common Stumbling Blocks, but the paragraph
there could be expanded a bit.

--Guido

On Tue, Apr 1, 2008 at 3:53 PM, Benjamin Peterson wrote:
>
> On Tue, Apr 1, 2008 at 5:25 PM, Paul Moore wrote:
> >
> > On 01/04/2008, "Martin v. Löwis" wrote:
> > > > What really bugs me about this state of affairs is that I consider the
> > > > python 2 dict.items() to be safe and free of surprises, but I no
> > > > longer feel the same way about it in 3; this is really about the fact
> > > > that when you want to get the items, keys, or values of a dict, the
> > > > simplest thing is no longer the safest thing. (I don't want to
> belabor
> > > > this point too much since it isn't really my place to judge, but to
> > > > me, dict views feel like they are a "special case that breaks the
> > > > rules.")
> > >
> > >
> > > I feel to the contrary. 2.x .keys() was not safe, but 3.x keys() is.
> > > When I iterate over the keys of a dictionary, I want all of them,
> > > and I want only the keys.
With 2.x, it could always happen that the > > > dictionary changes "behind me", and then I'd either iterate over > > > not all of the keys, or see some keys that aren't actually in the > > > dictionary anymore. With 3.x dictionary views, it's much safer now. > > > > The oddity with the 3.x keys() is that it's plausible to retain a > > reference to d.keys(), but if you do so it can change if you alter d. > > In 2.x, d.keys() is static and d.iterkeys() is (for all practical > > purposes) not something you retain. The 3.x d.keys() "action at a > > distance" is unfamiliar - I can't think of an example of this type of > > view semantics in 2.x (at least in the core - numpy has had this for > > some time, I believe). > > > > I suspect that view semantics will become less surprising over time, > > but I think it's a fair point that it's something new to get used to. > I personally find it less surprising. It seems logical to me that whatever > you get from dict.items should reflect the current state of the dictionary. > When you want it static, it's better to be explicit and say list(dict.keys > or dict.values). > > > > > > > Paul. 
>
> --
> Cheers,
> Benjamin Peterson

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Wed Apr 2 01:43:16 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 Apr 2008 16:43:16 -0700
Subject: [Python-3000] IO __all__
In-Reply-To: <1afaf6160804011346i4e3b4a49hf3e08e9f3a8461f2@mail.gmail.com>
References: <1afaf6160804011346i4e3b4a49hf3e08e9f3a8461f2@mail.gmail.com>
Message-ID:

Well, it *is* part of the public interface of io.py, and it *is* the
implementation of the built-in open() function. So I don't think this
should be changed. The module's name is so short that I hope people
won't import * from it.

On Tue, Apr 1, 2008 at 1:46 PM, Benjamin Peterson wrote:
> Is there a reason io.open is in the __all__? It seems to me it would be
> redundant and confusing to import a builtin.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Wed Apr 2 02:14:26 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 Apr 2008 17:14:26 -0700
Subject: [Python-3000] PEP 3102 question
In-Reply-To:
References:
Message-ID:

On Mon, Mar 31, 2008 at 12:12 PM, Alexander Belopolsky wrote:
> Do I understand correctly that with PEP 3102 implemented, keyword
> arguments can follow vararg in function definitions, but doing the
> same when calling the function is still a syntax error?
>
> With the latest py3k,
>
> >>> def f(a, *args, v=None):
> ...     pass
> ...
> >>> f(a, *args, v=None)
>   File "<stdin>", line 1
>     f(a, *args, v=None)
>                ^
> SyntaxError: invalid syntax
>
> Is this intentional?

Yes, in the sense that the PEP doesn't propose to fix this.

Thomas Wouters's changes for variable tuple packing might fix this, if
we can agree to add that feature.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From showell30 at yahoo.com Wed Apr 2 02:57:36 2008
From: showell30 at yahoo.com (Steve Howell)
Date: Tue, 1 Apr 2008 17:57:36 -0700 (PDT)
Subject: [Python-3000] problems with the 3to2 converter
In-Reply-To:
Message-ID: <662344.33000.qm@web33505.mail.mud.yahoo.com>

--- Antoine Pitrou wrote:

> Steve Howell <showell30 at yahoo.com> writes:
> >
> > line 673234: lambda cannot be renamed in ANY
> > temporal dimension
> >
> [...]
> >
> > line 913975: parens not removed from print(),
> > please use 3to4 converter instead
>
> Mmmh... "3to2" was released on April 1st right ? :)
>

Yep. :)

From dbpokorny at gmail.com Wed Apr 2 03:31:51 2008
From: dbpokorny at gmail.com (David Pokorny)
Date: Tue, 1 Apr 2008 18:31:51 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com>
Message-ID:

On Tue, Apr 1, 2008 at 4:40 PM, Guido van Rossum wrote:
> I'm not sure that it's really a safe vs. non-safe issue.
The OP's
> concern is that the change affects behavior of keys() and friends that
> people have internalized for the past 18 years. I certainly don't see
> this as a reason to change it back (we knew this would be the case). I
> do think it needs to be emphasized in every "What's New" doc. It's
> already in the list of Common Stumbling Blocks, but the paragraph
> there could be expanded a bit.

I agree that the 3.0 behavior is safe but surprising.

So unless I am misinterpreting this, it sounds like the burden of
proof now falls on the option to keep the status quo. The thing is
that it seems to me that if an outside observer were to look at
this situation, then they might ask why the names are being changed
when the current behavior is functional and no one is clamoring for
the change.

If you disagree, then I still don't understand the motivation of the
PEP, and the current motivation, "being able to do set operations on
keys and items without having to copy them" does not appear to pertain
to the issue of which names should correspond to which behavior. Once
viewitems() or itemview() gets backported to 2.6, then unless I am
missing something, the PEP is concerned solely with changing the
names.

FWIW, there is a distinction between "loud" changes such as, say,
"print" changing from a statement to a function or "except" blocks
rejecting the "except TypeError, exc" syntax in favor of "except
TypeError as exc" and "silent" changes such as the one under
discussion; the change under discussion alters the meaning of an
expression that is valid in both 2 and 3 whereas in the other cases
the code will not compile in 3.
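
The loud/silent distinction can be checked mechanically. The sketch below
assumes it is run under Python 3.x; compiles_in_py3 is a hypothetical
helper written for this illustration, not part of 2to3 or any tool
mentioned in this thread. A 2.x print statement is rejected at compile
time, while d.items() compiles under both versions and only its meaning
differs:

```python
def compiles_in_py3(source):
    """Return True if source compiles under the running (3.x) interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


loud_print = "print 'hello'\n"            # 2.x print statement
silent_items = "items = {1: 2}.items()\n"  # valid in 2.x and 3.x

results = {
    "print statement": compiles_in_py3(loud_print),
    "dict.items()": compiles_in_py3(silent_items),
}
print(results)
```

A compile check like this can only catch the loud cases; the silent one
needs a test that exercises the returned object.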
David From musiccomposition at gmail.com Wed Apr 2 04:36:37 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 1 Apr 2008 21:36:37 -0500 Subject: [Python-3000] IO __all__ In-Reply-To: References: <1afaf6160804011346i4e3b4a49hf3e08e9f3a8461f2@mail.gmail.com> Message-ID: <1afaf6160804011936l3101b794y68a74ec48c465c6b@mail.gmail.com> On Tue, Apr 1, 2008 at 6:43 PM, Guido van Rossum wrote: > Well, it *is* part of the public interface of io.py, and it *is* the > implementation of the built-in open() function. So I don't think this > should be changed. The module's name is so short that I hope people > won't import * from it. Ok. It just seems to me to be an accident waiting to happen which we could easily avoid. > > > On Tue, Apr 1, 2008 at 1:46 PM, Benjamin Peterson > wrote: > > Is there a reason io.open is in the __all__? It seems to me it would > > redundant and confusing to import a builtin. > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > -- Cheers, Benjamin Peterson -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20080401/e0c98bb9/attachment.htm From martin at v.loewis.de Wed Apr 2 04:37:05 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 02 Apr 2008 04:37:05 +0200 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> Message-ID: <47F2F151.3050809@v.loewis.de> > So unless I am misinterpreting this, it sounds like the burden of > proof now falls on the option to keep the status quo. 
The thing is
> that it seems to me that if an outside observer were to look at
> this situation, then they might ask why the names are being changed
> when the current behavior is functional and no one is clamoring for
> the change.

I think it's fairly obvious why the 2.x .keys() has to change. It's
just too wasteful to actually build the list of all keys of a dictionary
(or even of all values, as you have to create all the tuples as well),
if all you want to do is to iterate over it, and the most common
operation of .keys() is to iterate over it in a for loop (right?).

Applications that take a snapshot of the .keys() are rare (right?).
Even more uncommon are applications that take a snapshot of .keys(),
and then continue changing the dictionary. And yet more uncommon
are cases where you save a snapshot of .keys(), change the dictionary,
and then break (i.e. fail to function correctly) if the snapshot gets
"silently" updated.

> If you disagree, then I still don't understand the motivation on the
> PEP, and the current motivation, "being able to do set operations on
> keys and items without having to copy them" does not appear to pertain
> to the issue of which names should correspond to which behavior.

The most direct name should be used in the most common scenario, which
is the for loop. I.e. people who don't think about this issue at all
should likely do the right thing. For 2.x, this is not the case.

Regards,
Martin

From dbpokorny at gmail.com Wed Apr 2 08:00:41 2008
From: dbpokorny at gmail.com (David Pokorny)
Date: Tue, 1 Apr 2008 23:00:41 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To: <47F2F151.3050809@v.loewis.de>
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de>
Message-ID:

On Tue, Apr 1, 2008 at 7:37 PM, "Martin v.
Löwis" wrote:
> I think it's fairly obvious why the 2.x .keys() has to change. It's
> just too wasteful to actually build the list of all keys of a dictionary
> (or even of all values, as you have to create all the tuples as well),
> if all you want to do is to iterate over it, and the most common
> operation of .keys() is to iterate over it in a for loop (right?).

I agree that the most common operation/scenario is the one you
describe, but I don't understand why the behavior of the most common
name should be the most efficient implementation of the most common
scenario.

One could propose an alternate policy: the behavior of the most common
name should correspond to the most common (human) interpretation of
the name. According to this policy, I think there are valid arguments
to be made for .keys() to return either a list or set (set if you had
never used python 2 before, list if you had), but I don't think a
dict_keys object that is tied to the underlying dict is a common
interpretation of the meaning of .keys() (outside this list).

This is a good policy because it minimizes the mental housekeeping
required to understand a given piece of code; this is a real benefit
for the programmer. (And especially for the programmer just coming to
Python). With all due respect, the policy you
describe---a more efficient implementation in the common
case---optimizes the code of people who don't think about this issue
at all. In other words it facilitates premature optimization.
David

From martin at v.loewis.de Wed Apr 2 08:39:50 2008
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 02 Apr 2008 08:39:50 +0200
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To: <20080402005650.3c96033b@bhuda.mired.org>
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <20080402005650.3c96033b@bhuda.mired.org>
Message-ID: <47F32A36.6090209@v.loewis.de>

> I'd say not clear, for two reasons. One is that I pretty much never
> use keys() in a for loop, I just use the dictionary.

Ok. Consider items() then. Again, I claim that the common use of
items() is to iterate over it.

.keys() should clearly behave the same as .items().

>> Applications that take a snapshot of the .keys() are rare (right?).
>
> And the second is that I don't think it's rare to want to process the
> keys in sorted order. It's not exactly common, but
>
>     keys = mydict.keys()
>     keys.sort()
>     for key in keys:

That is indeed a frequent case in 2.x. Fortunately, it is what David
calls "loud" breakage:

py> keys = mydict.keys()
py> keys.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict_keys' object has no attribute 'sort'

> In fact, the 2.5 standard library turns up 3 occurrences of
> "keys.sort". Given that that's just the ones that used the obvious
> name for the list to be sorted
>
> Nowadays, I tend to write
>
>     keys = sorted(mydict.keys()) # Yeah, I know, .keys() is redundant...
>     for key in keys:
>
> or maybe
>
>     for key in sorted(mydict):
>
> both of which are probably slower than the original version unless
> sorted switches to an insertion sort if passed a generator.

Notice that this isn't affected by the "spookiness" of dict.keys()
at all - it just works fine.

Why do you think this version is slower?
It behaves exactly the same as the original code: a list is created
with all the keys, and then that list is sorted, with the list-sort
algorithm.

> I'd say the most direct name is to use the dictionary as an iterator
> directly. So if you don't think about it the way I don't think about
> it, you get the right thing in 2.x and 3.0.

Not in 2.0 - you couldn't iterate over a dictionary there.

Also, I claim that it is *not* obvious that iterating over a dictionary
iterates over the keys. It's a useful convention, but not obvious. I
recall having had to look it up for about a year or so until I
memorized it. Explicit is better than implicit, at least if you aim
for obviousness.

In any case, it's also common to use .items(), which you have to use
explicitly, and there the most common use is to iterate over it.
(the subject of this thread was about .items(), anyway)

Regards,
Martin

From jbarham at gmail.com Wed Apr 2 08:42:24 2008
From: jbarham at gmail.com (John Barham)
Date: Tue, 1 Apr 2008 23:42:24 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de>
Message-ID: <4f34febc0804012342t64117b3dj8d72287ab538f507@mail.gmail.com>

David Pokorny wrote:

> With all due respect, the policy you
> describe---a more efficient implementation in the common
> case---optimizes the code of people who don't think about this issue
> at all. In other words it facilitates premature optimization.

So automatically making the most common use-case more efficient for
the naïve user is a problem?
John

From brett at python.org Wed Apr 2 09:30:48 2008
From: brett at python.org (Brett Cannon)
Date: Wed, 2 Apr 2008 09:30:48 +0200
Subject: [Python-3000] IO __all__
In-Reply-To: <1afaf6160804011936l3101b794y68a74ec48c465c6b@mail.gmail.com>
References: <1afaf6160804011346i4e3b4a49hf3e08e9f3a8461f2@mail.gmail.com> <1afaf6160804011936l3101b794y68a74ec48c465c6b@mail.gmail.com>
Message-ID:

On Wed, Apr 2, 2008 at 4:36 AM, Benjamin Peterson wrote:
>
> On Tue, Apr 1, 2008 at 6:43 PM, Guido van Rossum wrote:
> > Well, it *is* part of the public interface of io.py, and it *is* the
> > implementation of the built-in open() function. So I don't think this
> > should be changed. The module's name is so short that I hope people
> > won't import * from it.
> Ok. It just seems to me to be an accident waiting to happen which we could
> easily avoid.
>

But people should not blindly do an ``import *``. I agree with Guido
it is better for __all__ to reflect the API of the library than to
worry about blocking a built-in which happens to be the exact same
object.

-Brett

> > On Tue, Apr 1, 2008 at 1:46 PM, Benjamin Peterson
> > wrote:
> > > Is there a reason io.open is in the __all__? It seems to me it would be
> > > redundant and confusing to import a builtin.
> > >
> > --
> > --Guido van Rossum (home page: http://www.python.org/~guido/)
> >
>
> --
> Cheers,
> Benjamin Peterson
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/brett%40python.org
>

From tnelson at onresolve.com Wed Apr 2 09:58:35 2008
From: tnelson at onresolve.com (Trent Nelson)
Date: Wed, 2 Apr 2008 00:58:35 -0700
Subject: [Python-3000] the release gods are angry at python
In-Reply-To:
References: <47EA72D4.8000709@cheimes.de>
Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E17E381BF@EXMBX04.exchhosting.com>

> > In the py3k branch I've assigned the audio resource to the winsound
> > tests. Only regrtest.py -uall or -uaudio runs the winsound test. Reason:
> > the test sound was freaking out my poor cat. :/
>
> I feel with your cat ;-).
> This would not help on the buildbot since it runs 'rt.bat -d -q -uall -rw'.

I feel for the poor NOC engineers at my colo that freak out when some
random server in a farm of thousands starts making bizarre sounds.

I detest test_winsound. There are so many corner cases you need to
account for that makes the test pointless as you end up wrapping
everything in except: pass blocks. Does the system have a legacy beep
driver? Is it enabled? Is it disabled? Is there a sound card? Is it
enabled or disabled? Pah!

+1 to removing audio out of -uall, if only for the sake of cats,
erroneously red buildbots, and poor ServerCentral NOC engineers.

Trent.
From amauryfa at gmail.com Wed Apr 2 11:07:03 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 2 Apr 2008 11:07:03 +0200
Subject: [Python-3000] the release gods are angry at python
In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E17E381BF@EXMBX04.exchhosting.com>
References: <47EA72D4.8000709@cheimes.de> <87D3F9C72FBF214DB39FA4E3FE618CDC6E17E381BF@EXMBX04.exchhosting.com>
Message-ID:

On Wed, Apr 2, 2008 at 9:58 AM, Trent Nelson wrote:
> > > In the py3k branch I've assigned the audio resource to the winsound
> > > tests. Only regrtest.py -uall or -uaudio runs the winsound test. Reason:
> > > the test sound was freaking out my poor cat. :/
> >
> > I feel with your cat ;-).
> > This would not help on the buildbot since it runs 'rt.bat -d -q -uall -rw'.
>
> I feel for the poor NOC engineers at my colo that freak out when some
> random server in a farm of thousands starts making bizarre sounds.
>
> I detest test_winsound. There are so many corner cases you need to
> account for that makes the test pointless as you end up wrapping
> everything in except: pass blocks. Does the system have a legacy beep
> driver? Is it enabled? Is it disabled? Is there a sound card? Is it
> enabled or disabled? Pah!
>
> +1 to removing audio out of -uall, if only for the sake of cats,
> erroneously red buildbots, and poor ServerCentral NOC engineers.

And I would not mind removing this module altogether, and providing a
ctypes implementation.

--
Amaury Forgeot d'Arc

From ncoghlan at gmail.com Wed Apr 2 15:30:25 2008
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 02 Apr 2008 23:30:25 +1000
Subject: [Python-3000] Method to populate tp_* slots via getattr()?
Message-ID: <47F38A71.1020803@gmail.com>

One of the issues with porting to Py3k is the problem that __getattr__
and __getattribute__ can't reliably provide special methods like __add__
the way __getattr__ could with classic classes.
(As first noted by Terry
Reedy years ago, and recently seeing some new activity on the bug
tracker [1])

The culprit here is the fact that __getattribute__ and its associated
machinery is typically never invoked for the methods with dedicated tp_*
slots in the C-level type structure.

What do people think of the idea of providing an extra method on type
objects that goes through all of the C-level special method slots, and
for each one that isn't currently set, does a getattr() on the
associated special name and stores the result (if any) on the current
type object?

When converting a proxy class that relies on __getattr__ from classic to
new-style, all that would then be needed is to invoke the new method on
the class object after defining the class (a class decorator or
metaclass could be provided somewhere to make this a bit tidier).

This seems a lot cleaner than expecting everyone that implements a proxy
object to maintain their own list of all of the relevant special
methods, and locates the implementation support in an area of the code
that already has plenty of infrastructure dedicated to keeping Python
visible attributes in sync with the C visible tp_* slots.

Thoughts? Alternative ideas? Howls of protest?

Cheers,
Nick.
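
A minimal sketch of the behaviour described above, written for Python 3.
The Proxy class is hypothetical and not taken from the thread; it only
illustrates that explicit attribute access falls back to __getattr__,
while the + operator looks for __add__ on the type and never calls
__getattr__. The last two lines rely on the standard behaviour that
assigning a special method on the class updates the matching slot.

```python
# Hypothetical proxy class, for illustration only (not code from the thread).
class Proxy:
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        return getattr(self._target, name)


p = Proxy(3)

# Explicit attribute access falls back to __getattr__ and finds int.__add__.
explicit_result = p.__add__(4)

# The + operator looks for __add__ on type(p), bypassing __getattr__.
try:
    p + 4
    operator_works = True
except TypeError:
    operator_works = False

# Assigning the special method on the class itself fills in the slot.
Proxy.__add__ = lambda self, other: self._target + other
patched_result = p + 4
```

This is why a proxy written for classic classes stops supporting
operators once it becomes a new-style class, unless each special method
is defined on the class explicitly.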
[1] http://bugs.python.org/issue643841

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From murman at gmail.com Wed Apr 2 15:51:40 2008
From: murman at gmail.com (Michael Urman)
Date: Wed, 2 Apr 2008 08:51:40 -0500
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To: <47F32A36.6090209@v.loewis.de>
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <20080402005650.3c96033b@bhuda.mired.org> <47F32A36.6090209@v.loewis.de>
Message-ID:

On Wed, Apr 2, 2008 at 1:39 AM, "Martin v. Löwis" wrote:
> > I'd say not clear, for two reasons. One is that I pretty much never
> > use keys() in a for loop, I just use the dictionary.
>
> Ok. Consider items() then. Again, I claim that the common use of
> items() is to iterate over it.
>
> .keys() should clearly behave the same as .items().

The biggest concern I have is over whether the following works:

    for i, k in enumerate(d.keys()):
        if i % 2: del d[k]

If this code works as is in py3k, I have no concerns over whether
keys(), etc., return snapshots or live views. If this code instead
requires the snapshot that list(d) or list(d.keys()) provides, then
I'm lightly worried that this will be a repeated source of error for
folks who have recently migrated from 2.x to 3.x and haven't really
internalized that keys() no longer returns a copy.

It's only a light worry as there are plenty of people who make that
mistake in 2.x by leaving off the keys() entirely. And I hardly think
this light worry is worth changing the behavior that was decided on
months ago.
--
Michael Urman

From facundobatista at gmail.com Wed Apr 2 16:08:03 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 2 Apr 2008 11:08:03 -0300
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de>
Message-ID:

2008/4/2, David Pokorny:

> describe, but I don't understand why the behavior of the most common
> name should be the most efficient implementation of the most common
> scenario. One could propose an alternate policy: the behavior of the

Half of the magic power of Python, IMHO, resides in that "the behavior
of the most common name should be the most efficient implementation of
the most common scenario".

--
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From solipsis at pitrou.net Wed Apr 2 16:43:33 2008
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 2 Apr 2008 14:43:33 +0000 (UTC)
Subject: [Python-3000] Spooky behavior of dict.items() and friends
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <20080402005650.3c96033b@bhuda.mired.org> <47F32A36.6090209@v.loewis.de>
Message-ID:

Michael Urman <murman at gmail.com> writes:
> The biggest concern I have is over whether the following works:
>
>     for i, k in enumerate(d.keys()):
>         if i % 2: del d[k]
>

Well:

Python 3.0a3+ (py3k, Mar 30 2008, 21:14:40)
[GCC 4.2.3 (4.2.3-5mnb1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> for i, k in enumerate(d.keys()):
...     if i % 2: del d[k]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

The "problem" here is that while d.keys() returns the view, enumerate()
in turn calls iter() on the view, and that iterator fails on you when
the dictionary changes size (as iterkeys() already did in 2.x).

Regards

Antoine.

From lists at cheimes.de Wed Apr 2 16:53:00 2008
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 02 Apr 2008 16:53:00 +0200
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <20080402005650.3c96033b@bhuda.mired.org> <47F32A36.6090209@v.loewis.de>
Message-ID:

Michael Urman wrote:
>     for i, k in enumerate(d.keys()):
>         if i % 2: del d[k]
>
> If this code works as is in py3k, I have no concerns over whether
> keys(), etc., return snapshots or live views. If this code instead
> requires the snapshot that list(d) or list(d.keys()) provides, then
> I'm lightly worried that this will be a repeated source of error for
> folks who have recently migrated from 2.x to 3.x and haven't really
> internalized that keys() no longer returns a copy.

The 2to3 fixer does the right thing. enumerate(d.keys()) does not have
the same effect in Python 3.0 as it has in Python 2.x.
--- test.py (original)
+++ test.py (refactored)
@@ -1,3 +1,3 @@
-for i, k in enumerate(d.keys()):
+for i, k in enumerate(list(d.keys())):
     if i % 2:
         del d[k]

Christian

From aleaxit at gmail.com Wed Apr 2 16:58:45 2008
From: aleaxit at gmail.com (Alex Martelli)
Date: Wed, 2 Apr 2008 07:58:45 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To: <47F32A36.6090209@v.loewis.de>
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <20080402005650.3c96033b@bhuda.mired.org> <47F32A36.6090209@v.loewis.de>
Message-ID:

On Tue, Apr 1, 2008 at 11:39 PM, "Martin v. Löwis" wrote:
   ...
> >     keys = mydict.keys()
> >     keys.sort()
> >     for key in keys:
>
> That is indeed a frequent case in 2.x. Fortunately, it is what David
> calls "loud" breakage:
>
> py> keys = mydict.keys()
> py> keys.sort()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'dict_keys' object has no attribute 'sort'
>
> > In fact, the 2.5 standard library turns up 3 occurrences of
> > "keys.sort". Given that that's just the ones that used the obvious
> > name for the list to be sorted
> >
> > Nowadays, I tend to write
> >
> >     keys = sorted(mydict.keys()) # Yeah, I know, .keys() is redundant...
> >     for key in keys:
> >
> > or maybe
> >
> >     for key in sorted(mydict):
> >
> > both of which are probably slower than the original version unless
> > sorted switches to an insertion sort if passed a generator.
>
> Notice that this isn't affected by the "spookiness" of dict.keys()
> at all - it just works fine.
>
> Why do you think this version is slower? It behaves exactly the
> same as the original code: a list is created with all the keys,
> and then that list is sorted, with the list-sort algorithm.
Indeed, at least with Python 2.5, any difference in performance is
more or less in the noise:

$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'k=d.keys(); k.sort()' 'for x in k: pass'
10000 loops, best of 3: 24 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'k=d.keys(); k.sort()' 'for x in k: pass'
10000 loops, best of 3: 21.9 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'for x in sorted(d): pass'
10000 loops, best of 3: 22.8 usec per loop
$ python -mtimeit -s'd=dict.fromkeys(range(99))' 'for x in sorted(d): pass'
10000 loops, best of 3: 22.6 usec per loop

So the "old" cumbersome idiom (though it still comes naturally to those
who started using Python before it had a `sorted' builtin) should IMHO
be discouraged -- the new one is compactly readable and higher-level,
yet roughly equivalent performance-wise. IOW, this use case counts as
a PLUS for d.keys() NOT returning a list!-)

Alex

From rhamph at gmail.com Wed Apr 2 18:07:34 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 2 Apr 2008 10:07:34 -0600
Subject: [Python-3000] Method to populate tp_* slots via getattr()?
In-Reply-To: <47F38A71.1020803@gmail.com>
References: <47F38A71.1020803@gmail.com>
Message-ID:

On Wed, Apr 2, 2008 at 7:30 AM, Nick Coghlan wrote:
> One of the issues with porting to Py3k is the problem that __getattr__
> and __getattribute__ can't reliably provide special methods like __add__
> the way __getattr__ could with classic classes. (As first noted by Terry
> Reedy years ago, and recently seeing some new activity on the bug
> tracker [1])
>
> The culprit here is the fact that __getattribute__ and its associated
> machinery is typically never invoked for the methods with dedicated tp_*
> slots in the C-level type structure.
>
> What do people think of the idea of providing an extra method on type
> objects that goes through all of the C-level special method slots, and
> for each one that isn't currently set, does a getattr() on the
> associated special name and stores the result (if any) on the current
> type object?
>
> When converting a proxy class that relies on __getattr__ from classic to
> new-style, all that would then be needed is to invoke the new method on
> the class object after defining the class (a class decorator or
> metaclass could be provided somewhere to make this a bit tidier).
>
> This seems a lot cleaner than expecting everyone that implements a proxy
> object to maintain their own list of all of the relevant special
> methods, and locates the implementation support in an area of the code
> that already has plenty of infrastructure dedicated to keeping Python
> visible attributes in sync with the C visible tp_* slots.
>
> Thoughts? Alternative ideas? Howls of protest?
>
> [1] http://bugs.python.org/issue643841

I've been wondering if we should provide a ProxyMixin that returned
all the special methods to their old lookup behaviour. I think that'd
be cleaner than providing a method to do it. Not sure how easy it'd
be to implement though.

--
Adam Olsen, aka Rhamphoryncus

From guido at python.org Wed Apr 2 20:26:17 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 11:26:17 -0700
Subject: [Python-3000] Method to populate tp_* slots via getattr()?
In-Reply-To: <47F38A71.1020803@gmail.com>
References: <47F38A71.1020803@gmail.com>
Message-ID:

On Wed, Apr 2, 2008 at 6:30 AM, Nick Coghlan wrote:
> One of the issues with porting to Py3k is the problem that __getattr__
> and __getattribute__ can't reliably provide special methods like __add__
> the way __getattr__ could with classic classes.
(As first noted by Terry
> Reedy years ago, and recently seeing some new activity on the bug
> tracker [1])
>
> The culprit here is the fact that __getattribute__ and its associated
> machinery is typically never invoked for the methods with dedicated tp_*
> slots in the C-level type structure.

Well, yes, this is all an intentional part of the new-style class design.

> What do people think of the idea of providing an extra method on type
> objects that goes through all of the C-level special method slots, and
> for each one that isn't currently set, does a getattr() on the
> associated special name and stores the result (if any) on the current
> type object?

Does a getattr on what? Since you seem to be thinking specifically of
proxies here, I'm thinking you're doing a getattr on an *instance* --
but it seems wrong to base the *type* slots on that.

> When converting a proxy class that relies on __getattr__ from classic

Can you show specific code for such a proxy class? I'm having a hard
time imagining how it would work (not having used proxies in a really
long time...).

> to new-style, all that would then be needed is to invoke the new method on
> the class object after defining the class (a class decorator or
> metaclass could be provided somewhere to make this a bit tidier).

Hm. So you are thinking of a proxy for a class?!?! Note that if you
set a class attribute corresponding to a special method (e.g.
C.__add__ = ...) the corresponding C slot is automatically updated, so
you should be able to write a class decorator or mixin or helper
function to do this in pure Python, unless I completely misunderstand
what you're after.

> This seems a lot cleaner than expecting everyone that implements a proxy
> object to maintain their own list of all of the relevant special
> methods, and locates the implementation support in an area of the code
> that already has plenty of infrastructure dedicated to keeping Python
> visible attributes in sync with the C visible tp_* slots.
How many proxy implementations does the world need? Maybe we should
add one to the stdlib?

> Thoughts? Alternative ideas? Howls of protest?

No, so far just a bit of confusion. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de Wed Apr 2 20:32:50 2008
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 02 Apr 2008 20:32:50 +0200
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <20080402005650.3c96033b@bhuda.mired.org> <47F32A36.6090209@v.loewis.de>
Message-ID: <47F3D152.7080307@v.loewis.de>

> The biggest concern I have is over whether the following works:
>
>     for i, k in enumerate(d.keys()):
>         if i % 2: del d[k]
>
> If this code works as is in py3k, I have no concerns over whether
> keys(), etc., return snapshots or live views.

Define "works". This code does not "work" in any version of Python ever
released, in any meaningful sense of "works" I could imagine (which all
include "if I run the same code twice, it produces the same results").

In 3k, it gives

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

> If this code instead
> requires the snapshot that list(d) or list(d.keys()) provides, then
> I'm lightly worried that this will be a repeated source of error for
> folks who have recently migrated from 2.x to 3.x and haven't really
> internalized that keys() no longer returns a copy.

See above: it's unlikely that this error will go unnoticed.
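
The two variants discussed in this thread can be compared side by side.
The following is only a sketch for Python 3; the helper name
delete_odd_positions and its snapshot parameter are made up for
illustration. Without a snapshot the loop fails loudly, as reported
above; with list(d.keys()), which is what the 2to3 fixer emits, it runs
to completion.

```python
def delete_odd_positions(d, snapshot):
    """Delete every second key of d, optionally iterating over a snapshot."""
    # list(d.keys()) copies the keys up front; d.keys() is a live view.
    keys = list(d.keys()) if snapshot else d.keys()
    for i, k in enumerate(keys):
        if i % 2:
            del d[k]
    return d


# Iterating over the live view: the deletion is detected on the next step.
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
try:
    delete_odd_positions(d, snapshot=False)
    error = None
except RuntimeError as exc:
    error = str(exc)

# Iterating over a snapshot: the dict can be modified freely.
survivors = delete_odd_positions({'a': 1, 'b': 2, 'c': 3, 'd': 4}, snapshot=True)
```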
Regards,
Martin

From guido at python.org Wed Apr 2 20:36:35 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 11:36:35 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de>
Message-ID:

On Tue, Apr 1, 2008 at 11:00 PM, David Pokorny wrote:
> On Tue, Apr 1, 2008 at 7:37 PM, "Martin v. Löwis" wrote:
> > I think it's fairly obvious why the 2.x .keys() has to change. It's
> > just too wasteful to actually build the list of all keys of a dictionary
> > (or even of all values, as you have to create all the tuples as well),
> > if all you want to do is to iterate over it, and the most common
> > operation of .keys() is to iterate over it in a for loop (right?).
>
> I agree that the most common operation/scenario is the one you
> describe, but I don't understand why the behavior of the most common
> name should be the most efficient implementation of the most common
> scenario. One could propose an alternate policy: the behavior of the
> most common name should correspond to the most common (human)
> interpretation of the name. According to this policy, I think there
> are valid arguments to be made for .keys() to return either a list or
> set (set if you had never used Python 2 before, list if you had), but
> I don't think a dict_keys object that is tied to the underlying dict
> is a common interpretation of the meaning of .keys() (outside this
> list). This is a good policy because it minimizes the mental
> housekeeping required to understand a given piece of code; this is a
> real benefit for the programmer. (And especially for the programmer
> just coming to Python).
With all due respect, the policy you
> describe---a more efficient implementation in the common
> case---optimizes the code of people who don't think about this issue
> at all. In other words it facilitates premature optimization.

The problem is that if you make the slow and fool-proof implementation
the common name, you'll have to invent another name for the fast (but
sometimes less convenient) method. This is what we ended up doing in
Python 2.2 with iterkeys() and friends. Unfortunately, despite your
assertion, most people think their code should run as fast as
possible, and hence we see a great proliferation of iterkeys() calls.
So the fast-but-requiring-care implementation becomes more popular
than the slow-but-simple version, and now we have a duplication of
APIs.

I'd much rather have a single API that can be made to serve everyone
equally. I predict that list(x.keys()) will remain a rarity (except in
code converted by 2to3). However sorted(x.keys()) will become a
well-known idiom, and it's a much better one than the old idiom

    keys = x.keys()
    keys.sort()

which doesn't lend itself easily to use in an expression.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From krstic at solarsail.hcs.harvard.edu Wed Apr 2 20:57:49 2008
From: krstic at solarsail.hcs.harvard.edu (=?UTF-8?Q?Ivan_Krsti=C4=87?=)
Date: Wed, 2 Apr 2008 11:57:49 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de>
Message-ID:

On Apr 2, 2008, at 11:36 AM, Guido van Rossum wrote:
> I predict that list(x.keys()) will remain a rarity (except in
> code converted by 2to3).
However sorted(x.keys()) will become a
> well-known idiom, and it's a much better one than the old idiom
>     keys = x.keys()
>     keys.sort()
> which doesn't lend itself easily to use in an expression.

Is there a particular rationale describing the use of function calls
vs. object properties in core Python?

When I see a function call required for something that could be
conveniently expressed as a property, it generally tells me "I'm
computing something. It might be expensive, and if you call me again,
I'll have to recompute."

This made sense with .keys() in 2.x, but is not true in 3.0. Is there
a good reason besides compatibility to keep the parentheses there?

    sorted(x.keys)

has a nice ring to it. Cheers,

--
Ivan Krstić | http://radian.org

From guido at python.org Wed Apr 2 21:26:08 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 12:26:08 -0700
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de>
Message-ID:

We went over this a few years ago when we first reviewed this design
from every POV. We decided that .keys() has been ingrained in the
collective mind of Python users for such a long time that it would be
a mistake to change it.

On Wed, Apr 2, 2008 at 11:57 AM, Ivan Krstić wrote:
> On Apr 2, 2008, at 11:36 AM, Guido van Rossum wrote:
>
> > I predict that list(x.keys()) will remain a rarity (except in
> > code converted by 2to3). However sorted(x.keys()) will become a
> > well-known idiom, and it's a much better one than the old idiom
> >     keys = x.keys()
> >     keys.sort()
> > which doesn't lend itself easily to use in an expression.
> >
>
> Is there a particular rationale describing the use of function calls vs.
> object properties in core Python?
>
> When I see a function call required for something that could be
> conveniently expressed as a property, it generally tells me "I'm computing
> something. It might be expensive, and if you call me again, I'll have to
> recompute."
>
> This made sense with .keys() in 2.x, but is not true in 3.0. Is there a
> good reason besides compatibility to keep the parentheses there?
>
>     sorted(x.keys)
>
> has a nice ring to it. Cheers,
>
> --
> Ivan Krstić | http://radian.org
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de Wed Apr 2 21:37:19 2008
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 02 Apr 2008 21:37:19 +0200
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To:
References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de>
Message-ID: <47F3E06F.1030302@v.loewis.de>

> Is there a particular rationale describing the use of function calls
> vs. object properties in core Python?
>
> When I see a function call required for something that could be
> conveniently expressed as a property, it generally tells me "I'm
> computing something. It might be expensive, and if you call me again,
> I'll have to recompute."
>
> This made sense with .keys() in 2.x, but is not true in 3.0. Is there
> a good reason besides compatibility to keep the parentheses there?
>
>     sorted(x.keys)
>
> has a nice ring to it. Cheers,

In the current implementation (3.0a3+),

    x.keys() is not x.keys()

i.e. it does indeed recompute something new each time. Now, the objects
it creates have the same state, so it technically wouldn't have to
create a fresh object each time. One issue is that by doing so, you
prevent cyclic references (the dict actually doesn't need to know what
views of it exist), so this works better for garbage collection.
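
A short sketch of the point above, assuming a current CPython 3 (the
variable names are illustrative only): each call to keys() returns a
distinct view object, yet every view reflects the same underlying
dictionary, including changes made after the view was created.

```python
x = {'a': 1}

first = x.keys()
second = x.keys()
distinct_objects = first is not second   # each call builds a new view object

x['b'] = 2                               # mutate the dict after taking the views
both_see_update = 'b' in first and 'b' in second
same_contents = set(first) == set(second) == {'a', 'b'}
```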
Implementation issues aside, I presume that the choice of interface
primarily comes from tradition.

Regards,
Martin

From alexander.belopolsky at gmail.com Wed Apr 2 21:39:46 2008
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 2 Apr 2008 15:39:46 -0400
Subject: [Python-3000] PEP 3102 question
In-Reply-To:
References:
Message-ID:

On Tue, Apr 1, 2008 at 8:14 PM, Guido van Rossum wrote:
..
> Thomas Wouters's changes for variable tuple packing might fix this, if
> we can agree to add that feature.
>

Do you mean http://bugs.python.org/issue2292 ? The description does not
seem to address function calls, and after applying that patch I still
see the same syntax error.

Is there some other relevant patch or discussion?

From guido at python.org Wed Apr 2 21:47:13 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 12:47:13 -0700
Subject: [Python-3000] PEP 3102 question
In-Reply-To:
References:
Message-ID:

On Wed, Apr 2, 2008 at 12:39 PM, Alexander Belopolsky wrote:
> On Tue, Apr 1, 2008 at 8:14 PM, Guido van Rossum wrote:
> ..
> > Thomas Wouters's changes for variable tuple packing might fix this, if
> > we can agree to add that feature.
> >
>
> Do you mean http://bugs.python.org/issue2292 ? The description does not
> seem to address function calls, and after applying that patch I still
> see the same syntax error.
>
> Is there some other relevant patch or discussion?

Thomas isn't finished yet.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexander.belopolsky at gmail.com Wed Apr 2 22:30:20 2008
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 2 Apr 2008 16:30:20 -0400
Subject: [Python-3000] PEP 3102 question
In-Reply-To:
References:
Message-ID:

On Wed, Apr 2, 2008 at 3:47 PM, Guido van Rossum wrote:
> > > Thomas Wouters's changes for variable tuple packing might fix this, if
> > > we can agree to add that feature.
..
> Thomas isn't finished yet.
The reason I am asking is that I've been looking into ways of fixing how instance methods report the number of arguments. It looks like some things may need to be rearranged in ceval to provide a fix, and I don't want to propose a patch that will conflict with someone else's work. From dbpokorny at gmail.com Wed Apr 2 23:22:33 2008 From: dbpokorny at gmail.com (David Pokorny) Date: Wed, 2 Apr 2008 14:22:33 -0700 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> Message-ID: On Wed, Apr 2, 2008 at 11:36 AM, Guido van Rossum wrote: > The problem is that if you make the slow and fool-proof implementation > the common name, you'll have to invent another name for the fast (but > sometimes less convenient) method. This is what we ended up doing in > Python 2.2 with iterkeys() and friends. Unfortunately, despite your > assertion, most people think their code should run as fast as > possible, and hence we see a great proliferation of iterkeys() calls. > So the fast-but-requiring-care implementation becomes more popular > than the slow-but-simple version, and now we have a duplication of > APIs. > > I'd much rather have a single API that can be made to serve everyone > equally. I predict that list(x.keys()) will remain a rarity (except in > code converted by 2to3). However sorted(x.keys()) will become a > well-known idiom, and it's a much better one than the old idiom > > keys = x.keys() > keys.sort() > > which doesn't lend itself easily to use in an expression. I agree that most people think their code should run as fast as possible, but in this particular case, common practice and common sense diverge.
If 80% of one's code makes only a negligible contribution to performance, then clearly there is no need to optimize it, but an average programmer will do it anyway. I imagine the best programmers would probably do it too under social pressure. This is not entirely fair, but one could say this change encourages average programmers to keep their bad habits. I understand the appeal of having a single API, but in this particular case, there are two arguably distinct use cases: getting the keys of a dict and iterating over the keys of the dict. One could change syntax so that "getting the keys" would be spelled "x.keys()" and "iterating over the keys" would look like for k in keys of x: ... or ','.join(k for k in keys of x) (This would make 'keys' both a keyword and an identifier, so my understanding is that this would entail a change in architecture of the tokenizer and maybe parser as well; this strikes me as a temporary but not intrinsic objection.) To elaborate on the point of the impact on the programmer new to Python, I find >>> x = {1:2} >>> x.keys() {1} much more appealing than >>> x.keys() I taught (scheme) programming to high school students once, and I know from experience that they try absolutely everything (most of it wrong of course) because there are a million other confusing things to learn. I think there is a real value in making a simple operation on a core type as straightforward as possible. David From amauryfa at gmail.com Wed Apr 2 23:47:58 2008 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 2 Apr 2008 23:47:58 +0200 Subject: [Python-3000] Are bytes object really immutable? Message-ID: Stop me if I'm wrong, but I thought that bytes objects are immutable (they are based on the PyStringType, after all) But I was surprised by this code in test_socket.py:: buf = b" "*1024 nbytes = self.cli_conn.recv_into(buf) And this in getargs.c:: case 'w': { /* memory buffer, read-write access */ ... 
((temp = (*pb->bf_getbuffer)(arg, &view, PyBUF_SIMPLE)) != 0) || (I'd expect PyBUF_READONLY) And this in stringobject.c:: static int string_buffer_getbuffer(PyStringObject *self, Py_buffer *view, int flags) { return PyBuffer_FillInfo(view, (void *)self->ob_sval, Py_SIZE(self), 0, flags); } (The zero is the "readonly" parameter) Is all of this wrong? -- Amaury Forgeot d'Arc From jason.orendorff at gmail.com Wed Apr 2 23:54:39 2008 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Wed, 2 Apr 2008 16:54:39 -0500 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: <47F2F151.3050809@v.loewis.de> References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> Message-ID: On Tue, Apr 1, 2008 at 9:37 PM, "Martin v. Löwis" wrote: > I think it's fairly obvious why the 2.x .keys() has to change. It's > just too wasteful to actually build the list of all keys of a dictionary > (or even of all values, as you have to create all the tuples as well), > if all you want to do is to iterate over it, and the most common > operation of .keys() is to iterate over it in a for loop (right?). I don't think so. Is this a use case for d.keys()? Why not just write "for k in d"? To me, framing the question as "iterate vs. copy" seems bogus. It's more like "view vs. copy". The thing is, copying provides the semantics I want (of *course* I don't want extra helpings of aliasing and spooky interaction between collections, are you nuts?), and the slowness has never bothered me--that I know of. Views would be faster, but with silently different semantics. I think I want copying.
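The "view vs. copy" distinction can be made concrete with a small sketch. It is written against the Python 3 semantics that eventually shipped, which is an assumption here since the thread concerns the 3.0 alphas; the variable names are made up for illustration.

```python
d = {"x": 1}

snapshot = list(d.items())  # copy: frozen at this moment
view = d.items()            # view: stays linked to d

d["y"] = 2

print(snapshot)             # the copy does not see the new key
print(sorted(view))         # the view does

# The aliasing also shows up when a dict is resized while a view
# over it is being iterated: CPython raises RuntimeError.
resize_error = None
try:
    for key in d.keys():
        d[key + "_copy"] = 0
except RuntimeError as exc:
    resize_error = exc

# Iterating over a copy of the keys is safe.
for key in list(d.keys()):
    d[key + "_seen"] = True
```

The copy is the safer default when the dict is modified inside the loop; the view avoids building a list when it is not.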
-j From paul at prescod.net Wed Apr 2 23:57:14 2008 From: paul at prescod.net (Paul Prescod) Date: Wed, 2 Apr 2008 14:57:14 -0700 Subject: [Python-3000] Types and classes Message-ID: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> Apologies if this has been discussed before. But does anyone else find it odd that the types of some things are classes and the classes of some things are types? >>> type(socket.socket()) >>> type("abc") >>> socket.socket().__class__ >>> "abc".__class__ In a recent talk I could only explain this as a historical quirk. As I understand, it is now possible to make types that behave basically exactly like classes and classes that behave exactly like types. Is there any important difference between them anymore? Paul Prescod From martin at v.loewis.de Wed Apr 2 23:59:06 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 02 Apr 2008 23:59:06 +0200 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> Message-ID: <47F401AA.5070801@v.loewis.de> Jason Orendorff wrote: > On Tue, Apr 1, 2008 at 9:37 PM, "Martin v. L?wis" wrote: >> I think it's fairly obvious why the 2.x .keys() has to change. It's >> just too wasteful to actually build the list of all keys of a dictionary >> (or even of all values, as you have to create all the tuples as well), >> if all you want to do is to iterate over it, and the most common >> operation of .keys() is to iterate over it in a for look (right?). > > I don't think so. Is this a use case for d.keys()? Why not just > write "for k in d"? See the subject. What do you say about d.items()? > To me, framing the question as "iterate vs. copy" seems bogus. It's > more like "view vs. copy". 
The thing is, copying provides the > semantics I want (of *course* I don't want extra helpings of aliasing > and spooky interaction between collections, are you nuts?), and the > slowness has never bothered me--that I know of. Views would be > faster, but with silently different semantics. I think I want > copying. I think there is zero chance to revert that decision now. Regards, Martin From barry at python.org Thu Apr 3 00:00:09 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 2 Apr 2008 18:00:09 -0400 Subject: [Python-3000] Building next alphas Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 This is a reminder that I am going to start building the next alpha releases for Python 2.6 and 3.0 now. Please, no checkins unless you get approval from me, and until you hear that the freeze is lifted. I am now on freenode #python-dev, IM, and Jabber if you need to contact me. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (Darwin) iQCVAwUBR/QB6XEjvBPtnXfVAQLn8QQArynMsNeeb6gqjUCSYuupM2XXAbwP5XOX LbTeGN+vM13uNK32fI47rDaPEfudfGnrd3Ttc1pg6/S/MOo5T41zs/TX2jdMEQ4g 6zCtk6xJiexGbExKioiTVdYgiqA8C6u+XY8aU2ogklD1h7kfEOWKw5urXkValFhG Iymq6mrEyJQ= =d/L3 -----END PGP SIGNATURE----- From amauryfa at gmail.com Thu Apr 3 00:03:58 2008 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Thu, 3 Apr 2008 00:03:58 +0200 Subject: [Python-3000] Types and classes In-Reply-To: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> Message-ID: Hello, On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod wrote: > Apologies if this has been discussed before. > > But does anyone else find it odd that the types of some things are > classes and the classes of some things are types? > > >>> type(socket.socket()) > > >>> type("abc") > > >>> socket.socket().__class__ > > >>> "abc".__class__ > > > In a recent talk I could only explain this as a historical quirk. 
As I > understand, it is now possible to make types that behave basically > exactly like classes and classes that behave exactly like types. Is > there any important difference between them anymore? I can find one difference: - types are written in C - classes are written in Python and there is a difference in behaviour: most types don't have a writable __dict__, and you cannot add members. classes are more flexible. -- Amaury Forgeot d'Arc From guido at python.org Thu Apr 3 00:10:45 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 2 Apr 2008 15:10:45 -0700 Subject: [Python-3000] Are bytes object really immutable? In-Reply-To: References: Message-ID: On Wed, Apr 2, 2008 at 2:47 PM, Amaury Forgeot d'Arc wrote: > Stop me if I'm wrong, but I thought that bytes objects are immutable > (they are based on the PyStringType, after all) Right. In 3.0a1 they were mutable, that's probably where these examples come from. > But I was surprised by this code in test_socket.py:: > > buf = b" "*1024 > nbytes = self.cli_conn.recv_into(buf) That shouldn't work. > And this in getargs.c:: > > case 'w': { /* memory buffer, read-write access */ > ... > ((temp = (*pb->bf_getbuffer)(arg, &view, > PyBUF_SIMPLE)) != 0) || > > (I'd expect PyBUF_READONLY) > > And this in stringobject.c:: > > static int > string_buffer_getbuffer(PyStringObject *self, Py_buffer *view, int flags) > { > return PyBuffer_FillInfo(view, (void *)self->ob_sval, Py_SIZE(self), > 0, flags); > } > > (The zero is the "readonly" parameter) > > Is all of this wrong? If it ever writes into bytes/PyString objects, yes, it is wrong! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From oliphant.travis at ieee.org Thu Apr 3 00:11:34 2008 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed, 02 Apr 2008 17:11:34 -0500 Subject: [Python-3000] Are bytes object really immutable? 
In-Reply-To: References: Message-ID: Amaury Forgeot d'Arc wrote: > Stop me if I'm wrong, but I thought that bytes objects are immutable > (they are based on the PyStringType, after all) > > But I was surprised by this code in test_socket.py:: > > buf = b" "*1024 > nbytes = self.cli_conn.recv_into(buf) > I'm not sure about this one... > And this in getargs.c:: > > case 'w': { /* memory buffer, read-write access */ > ... > ((temp = (*pb->bf_getbuffer)(arg, &view, > PyBUF_SIMPLE)) != 0) || > > (I'd expect PyBUF_READONLY) This one is O.K. because 'w' is requesting read-write access. An error will occur if the object does not allow it. > > And this in stringobject.c:: > > static int > string_buffer_getbuffer(PyStringObject *self, Py_buffer *view, int flags) > { > return PyBuffer_FillInfo(view, (void *)self->ob_sval, Py_SIZE(self), > 0, flags); > } > You are right that the 0 here should be a 1 for the immutable bytes object. Good job. -Travis O. From mike.klaas at gmail.com Thu Apr 3 00:08:53 2008 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 2 Apr 2008 15:08:53 -0700 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: <47F401AA.5070801@v.loewis.de> References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <47F401AA.5070801@v.loewis.de> Message-ID: On 2-Apr-08, at 2:59 PM, Martin v. L?wis wrote: > Jason Orendorff wrote: >> On Tue, Apr 1, 2008 at 9:37 PM, "Martin v. L?wis" >> wrote: >>> I think it's fairly obvious why the 2.x .keys() has to change. It's >>> just too wasteful to actually build the list of all keys of a >>> dictionary >>> (or even of all values, as you have to create all the tuples as >>> well), >>> if all you want to do is to iterate over it, and the most common >>> operation of .keys() is to iterate over it in a for look (right?). >> >> I don't think so. 
Is this a use case for d.keys()? Why not just >> write "for k in d"? > > See the subject. What do you say about d.items()? Exactly. Iterating over the items of a dictionary is one of the most common dict operations, especially in list/genexp contexts. $ pygrep '\.items' | wc -l 128 $ pygrep '\.iteritems' | wc -l 211 This may make it seem like iteration is only twice as common, but the majority of .items() use is in for loops where iteration is more appropriate. There are only 46 non-'for' uses of .items() in this codebase: $ pygrep '\.items' | grep -v for | wc -l 46 ...and the majority of these cases would work fine with views (input to sorted(), etc). Needing a physical list snapshot of the items that will be used independently of the dictionary is a much rarer use case. It is good that this will be distinguished from the iteration use-cases with the extra syntax list(d.items()). -Mike From guido at python.org Thu Apr 3 00:20:31 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 2 Apr 2008 15:20:31 -0700 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> Message-ID: On Wed, Apr 2, 2008 at 3:03 PM, Amaury Forgeot d'Arc wrote: > On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod wrote: > > But does anyone else find it odd that the types of some things are > > classes and the classes of some things are types? > > > > >>> type(socket.socket()) > > > > >>> type("abc") > > > > >>> socket.socket().__class__ > > > > >>> "abc".__class__ > > > > > > In a recent talk I could only explain this as a historical quirk. As I > > understand, it is now possible to make types that behave basically > > exactly like classes and classes that behave exactly like types. Is > > there any important difference between them anymore?
> > I can find one difference: > - types are written in C > - classes are written in Python > > and there is a difference in behaviour: > most types don't have a writable __dict__, and you cannot add members. > classes are more flexible. That's more correctly described as the difference between built-in types/classes and user-defined types/classes. I think it's still just a historical quirk; maybe we should bite the bullet and fix this in py3k. (Still, 'type' and 'class' will both be part of the language, one as a built-in function and metaclass, the other as a keyword.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jason.orendorff at gmail.com Thu Apr 3 00:33:26 2008 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Wed, 2 Apr 2008 17:33:26 -0500 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <47F401AA.5070801@v.loewis.de> Message-ID: On Wed, Apr 2, 2008 at 5:08 PM, Mike Klaas wrote: > ...and the majority of these cases would work fine with views (input > to sorted(), etc). Suppose "the majority" here means 36 of the 46 cases. Then what you're saying is, if I write .items() without thinking, there's about a 3% chance it won't work (10 out of 339 cases). Forgive me: the fact that you've gotten it down to 3%, e.g. by making items() return a view instead of an iterator, doesn't make me terrifically happy. I'm OK with the status quo. Maybe iteritems() is a wart, but I think views will be a much worse wart! If the only hard requirement is that dict lose *something* in Python 3.0, I suggest droping values() and itervalues(), as I never use them. 
;-) -j From musiccomposition at gmail.com Thu Apr 3 00:34:48 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 2 Apr 2008 17:34:48 -0500 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> Message-ID: <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> On Wed, Apr 2, 2008 at 5:20 PM, Guido van Rossum wrote: > On Wed, Apr 2, 2008 at 3:03 PM, Amaury Forgeot d'Arc > wrote: > > On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod wrote: > > > But does anyone else find it odd that the types of some things are > > > classes and the classes of some things are types? > > > > > > >>> type(socket.socket()) > > > > > > >>> type("abc") > > > > > > >>> socket.socket().__class__ > > > > > > >>> "abc".__class__ > > > > > > > > > In a recent talk I could only explain this as a historical quirk. As > I > > > understand, it is now possible to make types that behave basically > > > exactly like classes and classes that behave exactly like types. Is > > > there any important difference between them anymore? > > > > I can find one difference: > > - types are written in C > > - classes are written in Python > > > > and there is a difference in behaviour: > > most types don't have a writable __dict__, and you cannot add members. > > classes are more flexible. > > That's more correctly described as the difference between built-in > types/classes and user-defined types/classes. > > I think it's still just a historical quirk; maybe we should bite the > bullet and fix this in py3k. (Still, 'type' and 'class' will both be > part of the language, one as a built-in function and metaclass, the > other as a keyword.) Especially because of that I think we should change. list, dict, and set aren't metaclasses, so it would make since to fix it. 
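For reference, the built-in versus user-defined distinction Guido describes can be checked directly. This is a minimal sketch, assuming a modern Python 3 in which type(x) and x.__class__ agree; the class name UserDefined is made up for illustration.

```python
class UserDefined:
    pass

# type() and __class__ agree for built-in and user-defined objects alike.
agree = (type("abc") is "abc".__class__
         and type(UserDefined()) is UserDefined().__class__)

# 'type' is the metaclass of both kinds of class.
same_metaclass = type(str) is type and type(UserDefined) is type

# User-defined classes accept new attributes...
UserDefined.extra = 1

# ...while built-in types reject them.
try:
    str.extra = 1
    builtin_writable = True
except TypeError:
    builtin_writable = False

print(agree, same_metaclass, builtin_writable)
```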
> > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > -- Cheers, Benjamin Peterson From musiccomposition at gmail.com Thu Apr 3 00:37:13 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 2 Apr 2008 17:37:13 -0500 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <47F401AA.5070801@v.loewis.de> Message-ID: <1afaf6160804021537s192ed32x99b0dc9e28f9fe2c@mail.gmail.com> On Wed, Apr 2, 2008 at 5:33 PM, Jason Orendorff wrote: > On Wed, Apr 2, 2008 at 5:08 PM, Mike Klaas wrote: > > ...and the majority of these cases would work fine with views (input > > to sorted(), etc). > > Suppose "the majority" here means 36 of the 46 cases. Then what > you're saying is, if I write .items() without thinking, there's about > a 3% chance it won't work (10 out of 339 cases). Forgive me: the > fact that you've gotten it down to 3%, e.g. by making items() return a > view instead of an iterator, doesn't make me terrifically happy. It's so easy to do what you want in those cases, though. Just wrap the view in list(). > > > I'm OK with the status quo. Maybe iteritems() is a wart, but I think > views will be a much worse wart! > > If the only hard requirement is that dict lose *something* in Python > 3.0, I suggest dropping values() and itervalues(), as I never use them.
> ;-) > > -j > -- Cheers, Benjamin Peterson From martin at v.loewis.de Thu Apr 3 00:48:40 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 03 Apr 2008 00:48:40 +0200 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <47F401AA.5070801@v.loewis.de> Message-ID: <47F40D48.3030904@v.loewis.de> > On Wed, Apr 2, 2008 at 5:08 PM, Mike Klaas wrote: >> ...and the majority of these cases would work fine with views (input >> to sorted(), etc). > > Suppose "the majority" here means 36 of the 46 cases. What makes you suppose so? In the standard library of Python 2.5, I could not find a single case where using views would cause silent breakage: - the majority of uses are in for loops or list comprehensions. - of the remaining uses, the majority are with .sort(), which would cause an exception, to be rewritten as sorted(foo.items()) - of the then-remaining cases, the majority are immediately followed by an iteration, with no intermediate changes to the dictionary. - in some cases, the view is returned to the caller (i.e. outside of the standard library); whether this would break anything would depend on the application. In your code, how many (in absolute numbers) applications of .items() would break when .items() becomes a view?
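The cases listed above can be sketched as follows, assuming the view semantics of released Python 3; the dictionary contents are invented for illustration. The old list idioms fail with an exception instead of silently misbehaving, and the rewritten forms work on views.

```python
d = {"b": 2, "a": 1}

items = d.items()

# Old idiom: a view has no .sort(), so this fails loudly.
try:
    items.sort()
    sort_failed = False
except AttributeError:
    sort_failed = True

# Indexing a view fails as well.
try:
    items[0]
    index_failed = False
except TypeError:
    index_failed = True

# The rewritten form works directly on the view.
ordered = sorted(d.items())

# An explicit snapshot is still one call away.
snapshot = list(d.items())

print(sort_failed, index_failed, ordered)
```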
Regards, Martin From mike.klaas at gmail.com Thu Apr 3 00:56:33 2008 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 2 Apr 2008 15:56:33 -0700 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <47F401AA.5070801@v.loewis.de> Message-ID: <21E9BFB7-654C-4C01-BA19-C399F904CC37@gmail.com> On 2-Apr-08, at 3:33 PM, Jason Orendorff wrote: > On Wed, Apr 2, 2008 at 5:08 PM, Mike Klaas > wrote: >> ...and the majority of these cases would work fine with views (input >> to sorted(), etc). > > Suppose "the majority" here means 36 of the 46 cases. Then what > you're saying is, if I write .items() without thinking, there's about > a 3% chance it won't work (10 out of 339 cases). Forgive me: the > fact that you've gotten it down to 3%, e.g. by making items() return a > view instead of an iterator, doesn't make me terrifically happy. I apologize: I wasn't trying to make the point that programmers used to the old behaviour can continue willy-nilly using it without worrying about the consequences. Yes, programmers will have to learn the new behaviour; these are among the subtleties of the new language. I suspect that it will be mentioned prominently in every "python 3k for 2.X programmers" tutorial, and 2to3 can handle this translation safely. I suspect that most cases will not fail quietly, either: l = d.keys() l.sort() # exception l[0] # exception etc. 
(there may be other examples, too, like sliceability) -Mike From musiccomposition at gmail.com Thu Apr 3 00:58:00 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 2 Apr 2008 17:58:00 -0500 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> Message-ID: <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> On Wed, Apr 2, 2008 at 5:51 PM, Guido van Rossum wrote: > I have no idea what you are saying here (and I did s/since/sense/ :-). Another lesson to me, that I should proofread my Python impulses: :P Especially because of that I think we should do that. list, dict, and set aren't metaclasses, so it would make sense to make that name change. > > > On Wed, Apr 2, 2008 at 3:34 PM, Benjamin Peterson > wrote: > > > > > > > > > > On Wed, Apr 2, 2008 at 5:20 PM, Guido van Rossum > wrote: > > > > > > On Wed, Apr 2, 2008 at 3:03 PM, Amaury Forgeot d'Arc < > amauryfa at gmail.com> > > wrote: > > > > On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod > wrote: > > > > > > > > But does anyone else find it odd that the types of some things > are > > > > > classes and the classes of some things are types? > > > > > > > > > > >>> type(socket.socket()) > > > > > > > > > > >>> type("abc") > > > > > > > > > > >>> socket.socket().__class__ > > > > > > > > > > >>> "abc".__class__ > > > > > > > > > > > > > > > In a recent talk I could only explain this as a historical > quirk. As > > I > > > > > understand, it is now possible to make types that behave > basically > > > > > exactly like classes and classes that behave exactly like types. > Is > > > > > there any important difference between them anymore? 
> > > > > > > > I can find one difference: > > > > - types are written in C > > > > - classes are written in Python > > > > > > > > and there is a difference in behaviour: > > > > most types don't have a writable __dict__, and you cannot add > members. > > > > classes are more flexible. > > > > > > That's more correctly described as the difference between built-in > > > types/classes and user-defined types/classes. > > > > > > I think it's still just a historical quirk; maybe we should bite the > > > bullet and fix this in py3k. (Still, 'type' and 'class' will both be > > > part of the language, one as a built-in function and metaclass, the > > > other as a keyword.) > > Especially because of that I think we should change. list, dict, and set > > aren't metaclasses, so it would make since to fix it. > > > > > > > > > -- > > > > > > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > > > > > > > > > > > > > > > _______________________________________________ > > > Python-3000 mailing list > > > Python-3000 at python.org > > > http://mail.python.org/mailman/listinfo/python-3000 > > > Unsubscribe: > > > http://mail.python.org/mailman/options/python-3000/musiccomposition%40gmail.com > > > > > > > > > > > -- > > Cheers, > > Benjamin Peterson > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > -- Cheers, Benjamin Peterson -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-3000/attachments/20080402/4655665c/attachment-0001.htm From guido at python.org Thu Apr 3 00:51:44 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 2 Apr 2008 15:51:44 -0700 Subject: [Python-3000] Types and classes In-Reply-To: <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> Message-ID: I have no idea what you are saying here (and I did s/since/sense/ :-). On Wed, Apr 2, 2008 at 3:34 PM, Benjamin Peterson wrote: > > > > > On Wed, Apr 2, 2008 at 5:20 PM, Guido van Rossum wrote: > > > > On Wed, Apr 2, 2008 at 3:03 PM, Amaury Forgeot d'Arc > wrote: > > > On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod wrote: > > > > > > But does anyone else find it odd that the types of some things are > > > > classes and the classes of some things are types? > > > > > > > > >>> type(socket.socket()) > > > > > > > > >>> type("abc") > > > > > > > > >>> socket.socket().__class__ > > > > > > > > >>> "abc".__class__ > > > > > > > > > > > > In a recent talk I could only explain this as a historical quirk. As > I > > > > understand, it is now possible to make types that behave basically > > > > exactly like classes and classes that behave exactly like types. Is > > > > there any important difference between them anymore? > > > > > > I can find one difference: > > > - types are written in C > > > - classes are written in Python > > > > > > and there is a difference in behaviour: > > > most types don't have a writable __dict__, and you cannot add members. > > > classes are more flexible. > > > > That's more correctly described as the difference between built-in > > types/classes and user-defined types/classes. > > > > I think it's still just a historical quirk; maybe we should bite the > > bullet and fix this in py3k. 
(Still, 'type' and 'class' will both be > > part of the language, one as a built-in function and metaclass, the > > other as a keyword.) > Especially because of that I think we should change. list, dict, and set > aren't metaclasses, so it would make since to fix it. > > > > > > -- > > > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > > > > > > > > > _______________________________________________ > > Python-3000 mailing list > > Python-3000 at python.org > > http://mail.python.org/mailman/listinfo/python-3000 > > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/musiccomposition%40gmail.com > > > > > > -- > Cheers, > Benjamin Peterson -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Apr 3 01:01:04 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 2 Apr 2008 16:01:04 -0700 Subject: [Python-3000] Types and classes In-Reply-To: <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> Message-ID: I don't recall proposing a name change. And I still don't see what metaclasses have to do with it; I just mentioned them because 'type' is both usable as a built-in function to access an object's class, and as a metaclass (in fact it is the root metaclass). On Wed, Apr 2, 2008 at 3:58 PM, Benjamin Peterson wrote: > > > > On Wed, Apr 2, 2008 at 5:51 PM, Guido van Rossum wrote: > > I have no idea what you are saying here (and I did s/since/sense/ :-). > Another lesson to me, that I should proofread my Python impulses: :P > Especially because of that I think we should do that. list, dict, and set > aren't metaclasses, so it would make sense to make that name change. 
> > > > > > > > > > > > -- > > > > > > > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > Python-3000 mailing list > > > > Python-3000 at python.org > > > > http://mail.python.org/mailman/listinfo/python-3000 > > > > Unsubscribe: > > > > http://mail.python.org/mailman/options/python-3000/musiccomposition%40gmail.com > > > > > > > > > > > > > > > > -- > > > Cheers, > > > Benjamin Peterson > > > > > > > > -- > > > > > > > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > > > > > -- > Cheers, > Benjamin Peterson -- --Guido van Rossum (home page: http://www.python.org/~guido/) From musiccomposition at gmail.com Thu Apr 3 01:04:39 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 2 Apr 2008 18:04:39 -0500 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> Message-ID: <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> On Wed, Apr 2, 2008 at 6:01 PM, Guido van Rossum wrote: > I don't recall proposing a name change. And I still don't see what > metaclasses have to do with it; I just mentioned them because 'type' > is both usable as a built-in function to access an object's class, and > as a metaclass (in fact it is the root metaclass). Ah. You were referring to allowing types to have __dict__ attributes, right? I misread and thought you wanted to rename. > > > On Wed, Apr 2, 2008 at 3:58 PM, Benjamin Peterson > wrote: > > > > > > > > On Wed, Apr 2, 2008 at 5:51 PM, Guido van Rossum > wrote: > > > I have no idea what you are saying here (and I did s/since/sense/ :-). > > Another lesson to me, that I should proofread my Python impulses: :P > > Especially because of that I think we should do that. 
list, dict, and > set > > aren't metaclasses, so it would make sense to make that name change. > > > > > > > > > > > > > > > > > > > > > > On Wed, Apr 2, 2008 at 3:34 PM, Benjamin Peterson > > > wrote: > > > > > > > > > > > > > > > > > > > > On Wed, Apr 2, 2008 at 5:20 PM, Guido van Rossum > > wrote: > > > > > > > > > > On Wed, Apr 2, 2008 at 3:03 PM, Amaury Forgeot d'Arc > > > > > > wrote: > > > > > > On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod > > > wrote: > > > > > > > > > > > > But does anyone else find it odd that the types of some > things > > are > > > > > > > classes and the classes of some things are types? > > > > > > > > > > > > > > >>> type(socket.socket()) > > > > > > > > > > > > > > >>> type("abc") > > > > > > > > > > > > > > >>> socket.socket().__class__ > > > > > > > > > > > > > > >>> "abc".__class__ > > > > > > > > > > > > > > > > > > > > > In a recent talk I could only explain this as a historical > > quirk. As > > > > I > > > > > > > understand, it is now possible to make types that behave > > basically > > > > > > > exactly like classes and classes that behave exactly like > types. > > Is > > > > > > > there any important difference between them anymore? > > > > > > > > > > > > I can find one difference: > > > > > > - types are written in C > > > > > > - classes are written in Python > > > > > > > > > > > > and there is a difference in behaviour: > > > > > > most types don't have a writable __dict__, and you cannot add > > members. > > > > > > classes are more flexible. > > > > > > > > > > That's more correctly described as the difference between built-in > > > > > types/classes and user-defined types/classes. > > > > > > > > > > I think it's still just a historical quirk; maybe we should bite > the > > > > > bullet and fix this in py3k. (Still, 'type' and 'class' will both > be > > > > > part of the language, one as a built-in function and metaclass, > the > > > > > other as a keyword.) 
> > > > Especially because of that I think we should change. list, dict, and > set > > > > aren't metaclasses, so it would make since to fix it. > > > > > > > > > > > > > > > -- > > > > > > > > > > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Python-3000 mailing list > > > > > Python-3000 at python.org > > > > > http://mail.python.org/mailman/listinfo/python-3000 > > > > > Unsubscribe: > > > > > > > http://mail.python.org/mailman/options/python-3000/musiccomposition%40gmail.com > > > > > > > > > > > > > > > > > > > > > -- > > > > Cheers, > > > > Benjamin Peterson > > > > > > > > > > > > -- > > > > > > > > > > > > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > > > > > > > > > > > -- > > Cheers, > > Benjamin Peterson > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > -- Cheers, Benjamin Peterson -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20080402/f207f994/attachment.htm From mike.klaas at gmail.com Thu Apr 3 01:06:30 2008 From: mike.klaas at gmail.com (Mike Klaas) Date: Wed, 2 Apr 2008 16:06:30 -0700 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: <47F40D48.3030904@v.loewis.de> References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> <47F401AA.5070801@v.loewis.de> <47F40D48.3030904@v.loewis.de> Message-ID: <3B4ACAFE-854B-4CEE-AF0F-83A001D007BE@gmail.com> On 2-Apr-08, at 3:48 PM, Martin v. L?wis wrote: >> On Wed, Apr 2, 2008 at 5:08 PM, Mike Klaas >> wrote: >>> ...and the majority of these cases would work fine with views (input >>> to sorted(), etc). >> >> Suppose "the majority" here means 36 of the 46 cases. 
>
> What makes you suppose so. In the standard library of Python 2.5, I
> could not find a single case where using views would cause silent
> breakage:
> - the majority of uses is in for loops or list comprehensions.
> - of the remaining uses, the majority is with .sort(), which
>   would cause an exception, to be rewritten as sorted(foo.items())
> - of the then-remaining cases, the majority is immediately followed
>   by an iteration, with no intermediate changes to the dictionary.
> - in some cases, the view is returned to the caller (i.e. outside
>   of the standard library); whether this would break anything would
>   depend on the application.
>
> In your code, how many (in absolute numbers) applications of .items()
> would break when .items() becomes a view?

I assume you are asking Jason even though you attributed the quote to me.
However, a cursory examination of the 46 non-'for' uses I quoted above
results in a situation much like you describe: either silently working
fine with a view, or loudly breaking where a list was expected. There are
some that get passed out of functions which may fail silently, but I don't
have time to examine them in detail right now.

-Mike

From guido at python.org Thu Apr 3 01:09:36 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 16:09:36 -0700
Subject: [Python-3000] Types and classes
In-Reply-To: <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com>
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
    <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com>
    <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com>
    <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com>
Message-ID:

No. types *have* a __dict__, but it's readonly. Some type *instances*
don't have a __dict__, but that's up to the individual type.

All I really mean to fix is to standardize the terminology, especially
in repr().
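[Editorial sketch, not part of the original message.] The distinction drawn
above can be checked interactively. The snippet below is a minimal
illustration, assuming a released CPython 3.x interpreter (behaviour in the
2008 alphas may have differed); the class names Plain and Slotted are
made up for the example.

```python
# A type always has a __dict__, but it is exposed through a read-only proxy.
type_dict = int.__dict__
try:
    type_dict["answer"] = 42          # item assignment on the proxy
    type_dict_is_writable = True
except TypeError:
    type_dict_is_writable = False

# Instances of some types carry no per-instance __dict__ at all...
int_instance_has_dict = hasattr(5, "__dict__")


# ...while instances of an ordinary user-defined class do.
class Plain:
    pass


plain_instance_has_dict = hasattr(Plain(), "__dict__")


# "Up to the individual type": a class can opt out with __slots__.
class Slotted:
    __slots__ = ("x",)


slotted_instance_has_dict = hasattr(Slotted(), "__dict__")

print(type_dict_is_writable, int_instance_has_dict,
      plain_instance_has_dict, slotted_instance_has_dict)
```

Attributes on a user-defined class can still be set with ordinary attribute
assignment (Plain.x = 1); only direct item assignment on the __dict__ proxy
is refused.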
On Wed, Apr 2, 2008 at 4:04 PM, Benjamin Peterson wrote:
> On Wed, Apr 2, 2008 at 6:01 PM, Guido van Rossum wrote:
> > I don't recall proposing a name change. And I still don't see what
> > metaclasses have to do with it; I just mentioned them because 'type'
> > is both usable as a built-in function to access an object's class, and
> > as a metaclass (in fact it is the root metaclass).
> Ah. You were referring to allowing types to have __dict__ attributes, right?
> I misread and thought you wanted to rename.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net Thu Apr 3 01:13:33 2008
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 03 Apr 2008 01:13:33 +0200
Subject: [Python-3000] Types and classes
In-Reply-To:
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
Message-ID:

Guido van Rossum wrote:
> On Wed, Apr 2, 2008 at 3:03 PM, Amaury Forgeot d'Arc wrote:
>> On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod wrote:
>> > But does anyone else find it odd that the types of some things are
>> > classes and the classes of some things are types?
>> >
>> > >>> type(socket.socket())
>> >
>> > >>> type("abc")
>> >
>> > >>> socket.socket().__class__
>> >
>> > >>> "abc".__class__
>> >
>> >
>> > In a recent talk I could only explain this as a historical quirk. As I
>> > understand, it is now possible to make types that behave basically
>> > exactly like classes and classes that behave exactly like types. Is
>> > there any important difference between them anymore?
>>
>> I can find one difference:
>> - types are written in C
>> - classes are written in Python
>>
>> and there is a difference in behaviour:
>> most types don't have a writable __dict__, and you cannot add members.
>> classes are more flexible.
>
> That's more correctly described as the difference between built-in
> types/classes and user-defined types/classes.
>
> I think it's still just a historical quirk; maybe we should bite the
> bullet and fix this in py3k. (Still, 'type' and 'class' will both be
> part of the language, one as a built-in function and metaclass, the
> other as a keyword.)

+1 on the repr() change (to "class", preferably). That's exactly the kind
of simplification that Py3k is intended to make.

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From thomas at python.org Thu Apr 3 01:17:20 2008
From: thomas at python.org (Thomas Wouters)
Date: Thu, 3 Apr 2008 01:17:20 +0200
Subject: [Python-3000] PEP 3102 question
In-Reply-To:
References:
Message-ID: <9e804ac0804021617w1f983bfch5f8c663b9ca3ae11@mail.gmail.com>

On Wed, Apr 2, 2008 at 10:30 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> On Wed, Apr 2, 2008 at 3:47 PM, Guido van Rossum wrote:
> > > > Thomas Wouters's changes for variable tuple packing might fix this, if
> > > > we can agree to add that feature.
> ..
> > Thomas isn't finished yet.
>
> The reason I am asking is that I've been looking into ways of fixing
> the way instance methods are reporting the number of arguments
> and it looks like some things may
> need to be rearranged in ceval in order to provide a fix and I don't
> want to propose a patch that will conflict with someone else's work.
My work actually won't be changing any of the function-calling opcodes. The
current trick bound methods use (replacing the function value on the stack
with the 'self' argument and increasing the number of arguments by one) is
unaffected. I may change them very subtly: currently the _VAR* opcodes
expect positional arguments, then keyword arguments, then the *args argument
on the stack. It looks like we're going to have to change that order, but I
don't see how that would affect the number-of-arguments reporting. And if it
does, too bad for me; I'm pretty sure your change will go in before mine,
and I've got lots of experience merging things :-)

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From thomas at python.org Thu Apr 3 01:19:12 2008
From: thomas at python.org (Thomas Wouters)
Date: Thu, 3 Apr 2008 01:19:12 +0200
Subject: [Python-3000] PEP 3102 question
In-Reply-To:
References:
Message-ID: <9e804ac0804021619v195b2414id3ae53d10d5a3cad@mail.gmail.com>

On Wed, Apr 2, 2008 at 2:14 AM, Guido van Rossum wrote:
> Thomas Wouters's changes for variable tuple packing might fix this, if
> we can agree to add that feature.

In all fairness, liberating the argument-unpacking doesn't *require* the
variable sequence unpacking patch, although they do share a lot of code.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From guido at python.org Thu Apr 3 01:33:04 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 16:33:04 -0700
Subject: [Python-3000] PEP 3102 question
In-Reply-To: <9e804ac0804021619v195b2414id3ae53d10d5a3cad@mail.gmail.com>
References: <9e804ac0804021619v195b2414id3ae53d10d5a3cad@mail.gmail.com>
Message-ID:

On Wed, Apr 2, 2008 at 4:19 PM, Thomas Wouters wrote:
> On Wed, Apr 2, 2008 at 2:14 AM, Guido van Rossum wrote:
> > Thomas Wouters's changes for variable tuple packing might fix this, if
> > we can agree to add that feature.
> In all fairness, liberating the argument-unpacking doesn't *require* the
> variable sequence unpacking patch, although they do share a lot of code.

Well, I'd see no reason to change the call syntax unless we allow "x,
y, *z" elsewhere, and conversely once we allow the latter, not fixing
calls would be a mistake. So even if they don't share much code, they
go hand in hand for me.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fumanchu at aminus.org Thu Apr 3 01:39:12 2008
From: fumanchu at aminus.org (Robert Brewer)
Date: Wed, 2 Apr 2008 16:39:12 -0700
Subject: [Python-3000] Types and classes
In-Reply-To:
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
Message-ID:

Guido van Rossum wrote:
> On Wed, Apr 2, 2008 at 11:57 PM, Paul Prescod wrote:
> > But does anyone else find it odd that the types of some things
> > are classes and the classes of some things are types?
> >
> > >>> type(socket.socket())
> >
> > >>> type("abc")
> >
> > >>> socket.socket().__class__
> >
> > >>> "abc".__class__
> >
> >
> > In a recent talk I could only explain this as a historical quirk.
> > As I understand, it is now possible to make types that behave
> > basically exactly like classes and classes that behave exactly
> > like types. Is there any important difference between them anymore?
>
> I think it's still just a historical quirk; maybe we should bite the
> bullet and fix this in py3k. (Still, 'type' and 'class' will both be
> part of the language, one as a built-in function and metaclass, the
> other as a keyword.)

That's...grating, but livable. Maybe we should change "class" to
"classdef" and "type" to "class" so code like "isinstance(x, type)"
doesn't look so...wrong.

On the other hand, why is there no "function" builtin/metaclass to go
with the "def" keyword? The asymmetry implies a semantic conflict
somewhere (it doesn't *prove* that, just implies).


Robert Brewer
fumanchu at aminus.org

From paul at prescod.net Thu Apr 3 01:45:23 2008
From: paul at prescod.net (Paul Prescod)
Date: Wed, 2 Apr 2008 16:45:23 -0700
Subject: [Python-3000] Types and classes
In-Reply-To:
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
    <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com>
    <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com>
    <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com>
Message-ID: <1cb725390804021645t1604a6dck27038741c76b32ca@mail.gmail.com>

Also, could the types module be renamed "builtin_classes" or
"core_classes" or something like that? It was always a weird name
because it wasn't as if it contained all of the types in a Python
distribution. Just a set of core-to-the-implementation ones.

Just out of curiosity: why is the type(x) function valuable when
x.__class__ is a viable alternative and has been one for a long time?

Paul Prescod

From guido at python.org Thu Apr 3 01:51:21 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 16:51:21 -0700
Subject: [Python-3000] Types and classes
In-Reply-To:
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
Message-ID:

No, we're not renaming fundamentals like that. 3.0a4 goes out tomorrow
and we want stability.
On Wed, Apr 2, 2008 at 4:39 PM, Robert Brewer wrote:
> Guido van Rossum wrote:
> > I think it's still just a historical quirk; maybe we should bite the
> > bullet and fix this in py3k. (Still, 'type' and 'class' will both be
> > part of the language, one as a built-in function and metaclass, the
> > other as a keyword.)
>
> That's...grating, but livable. Maybe we should change "class" to
> "classdef" and "type" to "class" so code like "isinstance(x, type)"
> doesn't look so...wrong.
>
> On the other hand, why is there no "function" builtin/metaclass to go
> with the "def" keyword? The asymmetry implies a semantic conflict
> somewhere (it doesn't *prove* that, just implies).
>
> Robert Brewer
> fumanchu at aminus.org

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Thu Apr 3 01:52:26 2008
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 Apr 2008 16:52:26 -0700
Subject: [Python-3000] Types and classes
In-Reply-To: <1cb725390804021645t1604a6dck27038741c76b32ca@mail.gmail.com>
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
    <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com>
    <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com>
    <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com>
    <1cb725390804021645t1604a6dck27038741c76b32ca@mail.gmail.com>
Message-ID:

On Wed, Apr 2, 2008 at 4:45 PM, Paul Prescod wrote:
> Also, could the types module be renamed "builtin_classes" or
> "core_classes" or something like that? It was always a weird name
> because it wasn't as if it contained all of the types in a Python
> distribution. Just a set of core-to-the-implementation ones.

That's up to the stdlib reorg committee; my position has been for a
long time that there shouldn't be a types module at all.

> Just out of curiosity: why is the type(x) function valuable when
> x.__class__ is a viable alternative and has been one for a long time?

They aren't the same though. __class__ is overridable via __getattr__.
type() is not.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From amauryfa at gmail.com Thu Apr 3 02:31:38 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 3 Apr 2008 02:31:38 +0200
Subject: [Python-3000] Are bytes object really immutable?
In-Reply-To:
References:
Message-ID:

On Thu, Apr 3, 2008 at 12:10 AM, Guido van Rossum wrote:
> On Wed, Apr 2, 2008 at 2:47 PM, Amaury Forgeot d'Arc wrote:
> > Stop me if I'm wrong, but I thought that bytes objects are immutable
> > (they are based on the PyStringType, after all)
>
> Right. In 3.0a1 they were mutable, that's probably where these
> examples come from.
>
> > But I was surprised by this code in test_socket.py::
> >
> >     buf = b" "*1024
> >     nbytes = self.cli_conn.recv_into(buf)
>
> That shouldn't work.

Filed issue2538 (with a tentative patch) about this problem.

There aren't many tests around buffers. I'll try to write some more.
Oh, and some documentation.

--
Amaury Forgeot d'Arc

From tjreedy at udel.edu Thu Apr 3 05:15:15 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 2 Apr 2008 23:15:15 -0400
Subject: [Python-3000] Types and classes
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
    <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com>
    <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com>
    <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com>
Message-ID:

"Guido van Rossum" wrote in message
news:ca471dc20804021609j535c54c8h76c5d75ddf992e5a at mail.gmail.com...
| All I really mean to fix is to standardize the terminology,

I have recently been thinking about how to present/explain the basics of
Python3 to someone with no experience of Python1/2 or any need to know about
them. Having one word instead of two to collectively refer to objects that
have instances would make this easier. After thinking about the posts in
this thread, I believe 'classes' slightly wins over 'types'.

| especially in repr().

I think repr(int) == repr(type(0)) == "<class 'int'>" would be fine. (Yes,
it was jarring at first, but a half hour later, it almost seems normal ;-)
The absence of a module name in front of the class name signals that it is a
builtin class (or written in C?), for whatever difference that makes.

I do not think having the root metaclass named 'type' is any more
problematic than having the base class named 'object'. That keywords cannot
be identifiers must be explained and learned anyway. That 'type' doubles as
the class-revealer is a matter of economy.
Having repr(type) == "<class 'type'>" might even be clearer than the current
"<type 'type'>" since 'type' would only appear as a name (of a particularly
important class) rather than as both a name and a metacategory.

I could go with "<type 'int'>", as the other way of being consistent. But I
think 'class' works better both because it is a keyword, and not the name of
any object, and because it is the word most users use to create new classes,
even if type(x,y,z) is used internally.

Terry Jan Reedy

From barry at python.org Thu Apr 3 06:21:11 2008
From: barry at python.org (Barry Warsaw)
Date: Thu, 3 Apr 2008 00:21:11 -0400
Subject: [Python-3000] Building next alphas
In-Reply-To:
References:
Message-ID: <5C3C4C91-88B9-49A7-924C-21ACB5225A46@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Apr 2, 2008, at 6:00 PM, Barry Warsaw wrote:
>
> This is a reminder that I am going to start building the next alpha
> releases for Python 2.6 and 3.0 now. Please, no checkins unless you
> get approval from me, and until you hear that the freeze is lifted.
>
> I am now on freenode #python-dev, IM, and Jabber if you need to
> contact me.

I've been battling the flu and got distracted for a few hours tonight,
so I'm not quite done with the releases. However, I've tagged and
tar'd 'em so it should just be a matter of uploading the files and
updating the site. I should finish tomorrow.

I'm thawing the trees, so you can go ahead and start committing things
again, but /please/ be especially conservative over the next 24-48
hours. Make sure your changes don't break anything, just in case my
virus-addled brain screwed something up and I need to cut another
release.

- -Barry

P.S. Huge thanks to Benjamin Peterson, both for a quick last minute
fix to the 3.0 NEWS file via IRC, and his wonderful release.py
script. I've hacked it up a bit, but it was exactly what I was
looking for, and it made things go much smoother this time.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iQCVAwUBR/RbOHEjvBPtnXfVAQKCOwQArrYXC9X3lOvqRTwWQ9SYPH+6n1VN4MrT
WNm+jhsbiwZq8EuNslCBW3/52HP/wM7jlYizKZCL+cbcFaevNhWjjbPtwSTkJjVy
/uKG/NcDYQsPH3n4mET3/XlF5JrfS51avLSD7YebucTph9+otzI8LkK0Unvdbtq+
/86m3lEZAlY=
=q005
-----END PGP SIGNATURE-----

From martin at v.loewis.de Thu Apr 3 09:58:25 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 03 Apr 2008 09:58:25 +0200
Subject: [Python-3000] Types and classes
In-Reply-To:
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com>
    <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com>
    <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com>
    <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com>
Message-ID: <47F48E21.3070304@v.loewis.de>

> All I really mean to fix is to standardize the terminology, especially
> in repr().

So you don't want to be called a wimp anymore ?-)

------------------------------------------------------------------------
r23331 | gvanrossum | 2001-09-25 05:56:29 +0200 (Di, 25 Sep 2001) | 5 lines

Change repr() of a new-style class to say <class ...> rather than
<type ...>. Exception: if it's a built-in type or an extension type,
continue to call it <type ...>. Call me a wimp, but I don't want to
break more user code than necessary.
------------------------------------------------------------------------

Regards,
Martin

From ncoghlan at gmail.com Thu Apr 3 14:33:41 2008
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 03 Apr 2008 22:33:41 +1000
Subject: [Python-3000] Method to populate tp_* slots via getattr()?
In-Reply-To:
References: <47F38A71.1020803@gmail.com>
Message-ID: <47F4CEA5.8070200@gmail.com>

Guido van Rossum wrote:
> On Wed, Apr 2, 2008 at 6:30 AM, Nick Coghlan wrote:
>> One of the issues with porting to Py3k is the problem that __getattr__
>> and __getattribute__ can't reliably provide special methods like __add__
>> the way __getattr__ could with classic classes.
>> (As first noted by Terry Reedy years ago, and recently seeing some new
>> activity on the bug tracker [1])
>>
>> The culprit here is the fact that __getattribute__ and its associated
>> machinery is typically never invoked for the methods with dedicated tp_*
>> slots in the C-level type structure.
>
> Well, yes, this is all an intentional part of the new-style class design.

Not complaining, just trying to provide some background for those that
may not be quite as familiar with the inner workings of typeobject.c :)

>> What do people think of the idea of providing an extra method on type
>> objects that goes through all of the C-level special method slots, and
>> for each one that isn't currently set, does a getattr() on the
>> associated special name and stores the result (if any) on the current
>> type object?
>
> Does a getattr on what? Since you seem to be thinking specifically of
> proxies here, I'm thinking you're doing a getattr on an *instance* --
> but it seems wrong to base the *type* slots on that.

D'oh, you're right - the specific proxying example I am thinking of (see
below) does indeed grab bound methods directly from the underlying
instance. However, I think the idea is salvageable (whether or not it is
*worth* salvaging is of course a completely different question!).

>> When converting a proxy class that relies on __getattr__ from classic
>
> Can you show specific code for such a proxy class? I'm having a hard
> time imagining how it would work (not having used proxies in a really
> long time...).

From tempfile._TemporaryFileWrapper, which aims to delegate as many
operations as it can automatically to the underlying file object:

    def __getattr__(self, name):
        # Attribute lookups are delegated to the underlying file
        # and cached for non-numeric results
        # (i.e. methods are cached, closed and friends are not)
        file = self.__dict__['file']
        a = getattr(file, name)
        if not issubclass(type(a), type(0)):
            setattr(self, name, a)
        return a

For 2.x, the only methods that need to be overridden explicitly are
those where this bound method caching does the wrong thing (__exit__ and
__enter__ needed to be on that list, which is what first brought this
class to my attention).

For 3.0, it was also necessary to add:

    def __iter__(self):
        return iter(self.file)

It wasn't too bad in this case since file doesn't implement many tp_*
slots, but the 3.0 version of classes that delegate a lot of operations
to a specific member variable will be a lot more verbose in any cases
where the underlying type being delegated to implements some of the
number or container protocols.

>> to new-style, all that would then be needed is to invoke the new method on
>> the class object after defining the class (a class decorator or
>> metaclass could be provided somewhere to make this a bit tidier).
>
> Hm. So you are thinking of a proxy for a class?!?!

Sort of - I'm thinking mainly of classes like _TemporaryFileWrapper that
delegate most operations to a specific member variable, and expect that
member variable to always be of a specific type.

> Note that if you set a class attribute corresponding to a special
> method (e.g. C.__add__ = ...) the corresponding C slot is
> automatically updated, so you should be able to write a class
> decorator or mixin or helper function to do this in pure Python,
> unless I completely misunderstand what you're after.

Yeah, doing it in typeobject was mostly an easy way of getting at the
complete list of special methods with tp_* slots without having to
maintain two copies of that list.
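[Editorial sketch, not part of the original message.] The behaviour under
discussion can be reproduced in a few lines. This is a minimal
illustration, assuming CPython 3.x; the Proxy class is a made-up example,
not the tempfile wrapper quoted above.

```python
class Proxy:
    """Delegates attribute access to a wrapped object via __getattr__."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when the normal attribute lookup fails.
        return getattr(self._target, name)


p = Proxy(3)

# Explicit attribute access falls through to __getattr__ and finds
# the bound int.__add__ of the wrapped object.
explicit_result = p.__add__(4)

# The + operator looks __add__ up on the type, bypassing __getattr__.
try:
    p + 4
    implicit_works = True
except TypeError:
    implicit_works = False

# Assigning the special method on the class updates the type's slot,
# so the operator works afterwards.
Proxy.__add__ = lambda self, other: self._target + other
after_assignment = p + 4

print(explicit_result, implicit_works, after_assignment)
```

The last step is the pure-Python route mentioned in the quoted reply:
setting C.__add__ on the class is enough for the operator to find it.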
>> This seems a lot cleaner than expecting everyone that implements a proxy
>> object to maintain their own list of all of the relevant special
>> methods, and locates the implementation support in an area of the code
>> that already has plenty of infrastructure dedicated to keeping Python
>> visible attributes in sync with the C visible tp_* slots.
>
> How many proxy implementations does the world need? Maybe we should
> add one to the stdlib?

I don't know enough about the different ways people proxy or otherwise delegate special methods to know if it is feasible to provide a one-size-fits-most implementation in the standard library.

That said, maybe it would be enough if a type instance could be queried for the list of special method names it implements that the interpreter can access without going through __getattribute__?

Then the slots of a class delegating to a specific type could be initialised appropriately by doing something like:

    for name in delegate_type.special_methods():
        if not hasattr(cls, name):
            def delegation(*args, _name=name, **kwds):
                # _name is bound once per iteration; a bare 'name' would be
                # looked up at call time and always be the last one
                self, *args = args # +1 on arbitrary tuple unpacking ;)
                return getattr(self.delegate, _name)(*args, **kwds)
            setattr(cls, name, delegation)

The approach I suggested in my original email would instead look more like this:

    class Foo: ...

    Foo.delegate_special_methods('delegate', delegate_type)

where delegate_special_methods is basically just a C level implementation of the loop described above (except that the 'delegation' callable could be a lot more efficient than the given Python function).

Another option would be to provide an explicit list in the documentation of the slot names for the tp_* methods which the interpreter may access without going through __getattr__ and __getattribute__.
The discussion in the bug report that got me thinking about this topic commented on the fact that quite a few magic methods were added during the 2.x development cycle - I think the key point I missed at the time is the fact that most of those *didn't* have corresponding tp_* slots, so __getattr__ and __getattribute__ (particularly the latter) can intercept them just fine.

That said, the documentation approach would probably be too limiting for alternate interpreters - why should other implementations be restricted from providing optimised access to special methods just because we haven't done so in certain cases in CPython?

If we don't make any changes at all, the delegation loop shown above can actually already be written as follows:

    for name in dir(delegate_type):
        if (name.startswith('__')
            and name.endswith('__')
            and not hasattr(cls, name)):
            def delegation(*args, _name=name, **kwds):
                # _name is bound once per iteration; a bare 'name' would be
                # looked up at call time and always be the last one
                self, *args = args # +1 on arbitrary tuple unpacking ;)
                return getattr(self.delegate, _name)(*args, **kwds)
            setattr(cls, name, delegation)

>> Thoughts? Alternative ideas? Howls of protest?
>
> No, so far just a bit of confusion. :-)

Hopefully the above makes my concerns a bit clearer.

I'm actually hoping to hear from some more people that would benefit from having better support for this kind of delegation - my interest in the matter is fairly academic (based solely on the tempfile bugs arising from the initial conversion to Py3k), so my personal inclination is actually to put a stronger note in the documentation about the fact that the lookup of special methods may bypass __getattribute__ entirely and leave it at that.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From guido at python.org Thu Apr 3 20:14:46 2008
From: guido at python.org (Guido van Rossum)
Date: Thu, 3 Apr 2008 11:14:46 -0700
Subject: [Python-3000] Method to populate tp_* slots via getattr()?
In-Reply-To: <47F4CEA5.8070200@gmail.com> References: <47F38A71.1020803@gmail.com> <47F4CEA5.8070200@gmail.com> Message-ID: I'll wait for others to jump on this bandwagon... IMO the tempfile object would be better off not to bother with caching at all... On Thu, Apr 3, 2008 at 5:33 AM, Nick Coghlan wrote: > Guido van Rossum wrote: > > > On Wed, Apr 2, 2008 at 6:30 AM, Nick Coghlan wrote: > > > > > One of the issues with porting to Py3k is the problem that __getattr__ > > > and __getattribute__ can't reliably provide special methods like > __add__ > > > the way __getattr__ could with classic classes. (As first noted by > Terry > > > Reedy years ago, and recently seeing some new activity on the bug > > > tracker [1]) > > > > > > The culprit here is the fact that __getattribute__ and its associated > > > machinery is typically never invoked for the methods with dedicated > tp_* > > > slots in the C-level type structure. > > > > > > > Well, yes, this is all an intentional part of the new-style class design. > > > > Not complaining, just trying to provide some background for those that may > not be quite as familiar with the inner workings of typeobject.c :) > > > > > > > > What do people think of the idea of providing an extra method on type > > > objects that goes through all of the C-level special method slots, and > > > for each one that isn't currently set, does a getattr() on the > > > associated special name and stores the result (if any) on the current > > > type object? > > > > > > > Does a getattr on what? Since you seem to be thinking specifically of > > proxies here, I'm thinking you're doing a getattr on an *instance* -- > > but it seems wrong to base the *type* slots on that. > > > > D'oh, you're right - the specific proxying example I am thinking of (see > below) does indeed grab bound methods directly from the underlying instance. 
> However, I think the idea is salvageable (whether or not it is *worth* > salvaging is of course a completely different question!). > > > > > > > > When converting a proxy class that relies on __getattr__ from classic > > > > > > > Can you show specific code for such a proxy class? I'm having a hard > > time imagining how it would work (not having used proxies in a really > > long time...). > > > > From tempfile._TemporaryFileWrapper, which aims to delegate as many > operations as it can automatically to the underlying file object: > > def __getattr__(self, name): > # Attribute lookups are delegated to the underlying file > # and cached for non-numeric results > # (i.e. methods are cached, closed and friends are not) > file = self.__dict__['file'] > a = getattr(file, name) > if not issubclass(type(a), type(0)): > setattr(self, name, a) > return a > > For 2.x, the only methods that need to be overridden explicitly are those > where this bound method caching does the wrong thing (__exit__ and __enter__ > needed to be on that list, which is what first brought this class to my > attention). For 3.0, it was also necessary to add: > > def __iter__(self): > return iter(self.file) > > It wasn't too bad in this case since file doesn't implement many tp_* > slots, but the 3.0 version of classes that delegate a lot of operations to a > specific member variable will be a lot more verbose in any cases where the > underlying type being delegated to implements some of the number or > container protocols. > > > > > > > > to new-style, all that would then be needed is to invoke the new method > on > > > the class object after defining the class (a class decorator or > > > metaclass could be provided somewhere to make this a bit tidier). > > > > > > > Hm. So you are thinking of a proxy for a class?!?! 
> > > > Sort of - I'm thinking mainly of classes like _TemporaryFileWrapper that > delegate most operations to a specific member variable, and expect that > member variable to always be of a specific type. > > > > > Note that if you set a class attribute corresponding to a special > > method (e.g. C.__add__ = ...) the corresponding C slot is > > automatically updated, so you should be able to write a class > > decorator or mixin or helper function to do this in pure Python, > > unless I completely misunderstand what you're after. > > > > Yeah, doing it in typeobject was mostly an easy way of getting at the > complete list of special methods with tp_* slots without having to maintain > two copies of that list. > > > > > > > > This seems a lot cleaner than expecting everyone that implements a > proxy > > > object to maintain there own list of all of the relevant special > > > methods, and locates the implementation support in an area of the code > > > that already has plenty of infrastructure dedicated to keeping Python > > > visible attributes in sync with the C visible tp_* slots. > > > > > > > How many proxy implementations does the world need? Maybe we should > > add one to the stdlib? > > > > I don't know enough about the different ways people proxy or otherwise > delegate special methods to know if it is feasible to provide a > one-size-fits-most implementation in the standard library. > > That said, maybe it would be enough if a type instance could be queried for > the list of special method names it implements that the interpreter can > access without going through __getattribute__? 
> > Then the slots of a class delegating to a specific type could be > initialised appropriately by doing something like: > > for name in delegate_type.special_methods(): > if not hasattr(cls, name): > def delegation(*args, **kwds): > self, *args = args # +1 on arbitrary tuple unpacking ;) > getattr(self.delegate, name)(*args, **kwds) > setattr(cls, name, delegation) > > The approach I suggested in my original email would instead look more like > this: > > class Foo: ... > > Foo.delegate_special_methods('delegate', delegate_type) > > where delegate_special_methods is basically just a C level implementation > of the loop described above (except that the 'delegation' callable could be > a lot more efficient than the given Python function). > > Another option would be to provide an explicit list in the documentation of > the slot names for the tp_* methods which the interpreter may access without > going through __getattr__ and __getattribute__. > > The discussion in the bug report that got me thinking about this topic > commented on the fact that quite a few magic methods were added during the > 2.x development cycle - I think the key point I missed at the time is the > fact that most of those *didn't* have corresponding tp_* slots, so > __getattr__ and __getattribute__ (particularly the latter) can intercept > them just fine. > > That said, the documentation approach would probably be too limiting on > alternate interpreters though - why should other implementations be > restricted from providing optimised access to special methods just because > we haven't done so certain cases in CPython? 
> > If we don't make any changes at all, the delegation loop shown above can > actually already be written as follows: > > for name in dir(delegate_type): > if (name.startswith('__') > and name.endswith('__') > and not hasattr(cls, name)): > def delegation(*args, **kwds): > self, *args = args # +1 on arbitrary tuple unpacking ;) > getattr(self.delegate, name)(*args, **kwds) > setattr(cls, name, delegation) > > > > > > > > Thoughts? Altenative ideas? Howls of protest? > > > > > > > No, so far just a bit of confusion. :-) > > > > Hopefully the above makes my concerms a bit clearer. > > I'm actually hoping to hear from some more people that would benefit from > having better support for this kind of delegation - my interest in the > matter is fairly academic (based solely on the tempfile bugs arising from > the initial conversion to Py3k), so my personal inclination is actually to > put a stronger note in the documentation about the fact that the lookup of > special methods may bypass __getattribute__ entirely and leave it at that. > > Cheers, > Nick. > > > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > http://www.boredomandlaziness.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Apr 3 20:18:48 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 3 Apr 2008 11:18:48 -0700 Subject: [Python-3000] Types and classes In-Reply-To: <47F48E21.3070304@v.loewis.de> References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de> Message-ID: On Thu, Apr 3, 2008 at 12:58 AM, "Martin v. L?wis" wrote: > > All I really mean to fix is to standardize the terminology, especially > > in repr(). 
> > So you don't want to be called a wimp anymore ?-) Indeed. > ------------------------------------------------------------------------ > r23331 | gvanrossum | 2001-09-25 05:56:29 +0200 (Di, 25 Sep 2001) | 5 lines > > Change repr() of a new-style class to say rather > than . Exception: if it's a built-in type or an > extension type, continue to call it . Call me a > wimp, but I don't want to break more user code than necessary. > > ------------------------------------------------------------------------ Well, if we're going to break user code, 3.0 is the time to do it. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Fri Apr 4 03:47:01 2008 From: barry at python.org (Barry Warsaw) Date: Thu, 3 Apr 2008 21:47:01 -0400 Subject: [Python-3000] RELEASED Python 2.6a2 and 3.0a4 Message-ID: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On behalf of the Python development team and the Python community, I'm happy to announce the second alpha release of Python 2.6, and the fourth alpha release of Python 3.0. Please note that these are alpha releases, and as such are not suitable for production environments. We continue to strive for a high degree of quality, but there are still some known problems and the feature sets have not been finalized. These alphas are being released to solicit feedback and hopefully discover bugs, as well as allowing you to determine how changes in 2.6 and 3.0 might impact you. If you find things broken or incorrect, please submit a bug report at http://bugs.python.org For more information and downloadable distributions, see the Python 2.6 web site: http://www.python.org/download/releases/2.6/ and the Python 3.0 web site: http://www.python.org/download/releases/3.0/ We are planning one more alpha release of each version, followed by two beta releases, with the final releases planned for August 2008. 
See PEP 361 for release details: http://www.python.org/dev/peps/pep-0361/ Enjoy, - -Barry Barry Warsaw barry at python.org Python 2.6/3.0 Release Manager (on behalf of the entire python-dev team) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (Darwin) iQCVAwUBR/WImHEjvBPtnXfVAQJmoQP+MzqNDI+Xt8zua/FE7Ca4TVXoIIy2uoOm I1i3+vmevZ9vtAb9hcGwfEgPY4LSwb9Js4KnJJWMPaMuFJK4NgGoiMdj+t42zDbQ bEzfBUOCoVkejLRxIQnWeJf1Hu8JocYyCHIRffv57/QdKpHuiSs8aE8GIT3STo3o I88H5NY1GgI= =WT2z -----END PGP SIGNATURE----- From tony.meyer at gmail.com Tue Apr 1 21:51:42 2008 From: tony.meyer at gmail.com (Tony Meyer) Date: Wed, 2 Apr 2008 08:51:42 +1300 Subject: [Python-3000] u'text' as an alias for 'text'? In-Reply-To: <20080323204906.0d540bc1@bhuda.mired.org> References: <319e029f0803200048s768262d1g3805873e4e646e0c@mail.gmail.com> <319e029f0803200928sd9b5d03ud4966c70d6acd080@mail.gmail.com> <9e804ac0803200950p3d0a190cj1d9463581106b00b@mail.gmail.com> <319e029f0803200955l747b7e8ey11ba828adcc4ea2c@mail.gmail.com> <47E2A69B.9080509@v.loewis.de> <319e029f0803201251h2ec05b1fk50f9629627a6b07d@mail.gmail.com> <20080320180239.79ee5b64@bhuda.mired.org> <319e029f0803231320t3f01d99fp4cf9d6890774603@mail.gmail.com> <47E6D111.1030607@v.loewis.de> <319e029f0803231512s16e9f56bo9a19d4eb8d03a98e@mail.gmail.com> <20080323204906.0d540bc1@bhuda.mired.org> Message-ID: On 24/03/2008, at 1:49 PM, Mike Meyer wrote: > How many programs that used set.Set in 2.3 broke in 2.4 > when the set module vanished? I presume you're referring to the "sets" module, and it has not gone anywhere in 2.x: Python 2.5.1 (r251:54863, Jan 17 2008, 19:35:17) [GCC 4.0.1 (Apple Inc. build 5465)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
>>> from sets import Set >>> Cheers, Tony From mwm at mired.org Wed Apr 2 06:56:50 2008 From: mwm at mired.org (Mike Meyer) Date: Wed, 2 Apr 2008 00:56:50 -0400 Subject: [Python-3000] Spooky behavior of dict.items() and friends In-Reply-To: <47F2F151.3050809@v.loewis.de> References: <47F2B40E.2080304@v.loewis.de> <79990c6b0804011525s5784da0ch1604cbf7393f6160@mail.gmail.com> <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com> <47F2F151.3050809@v.loewis.de> Message-ID: <20080402005650.3c96033b@bhuda.mired.org> On Wed, 02 Apr 2008 04:37:05 +0200 "Martin v. L?wis" wrote: > > So unless I am misinterpreting this, it sounds like the burden of > > proof now falls on the option to keep the status quo. The thing is > > that it seems to me that if that an outside observer were to look at > > this situation, then they might ask why the names are being changed > > when the current behavior is functional and no one is clamoring for > > the change. > > I think it's fairly obvious why the 2.x .keys() has to change. It's > just too wasteful to actually build the list of all keys of a dictionary > (or even of all values, as you have to create all the tuples as well), > if all you want to do is to iterate over it, and the most common > operation of .keys() is to iterate over it in a for look (right?). I'd say not clear, for two reasons. One is that I pretty much never use keys() in a for loop, I just use the dictionary. > Applications that take a snapshot of the .keys() are rare (right?). And the second is that I don't think it's rare to want to process the keys in sorted order. It's not exactly common, but keys = mydict.keys() keys.sort() for key in keys: In fact, the 2.5 standard library turns up 3 occurrences of "keys.sort". Given that that's just the ones that used the obvious name for the list to be sorted Nowdays, I tend to write keys = sorted(mydict.keys()) # Yeah, I know, .keys() is redundant... 
for key in keys:

or maybe

    for key in sorted(mydict):

both of which are probably slower than the original version unless sorted switches to an insertion sort if passed a generator.

> The most direct name should be used in the most common scenario,
> which is the for loop. I.e. people who don't think about this
> issue at all should likely do the right thing. For 2.x, this is
> not the case.

I'd say the most direct spelling is to iterate over the dictionary itself. So if you don't think about it the way I don't think about it, you get the right thing in 2.x and 3.0.

Given that the dictionary itself is iterable, what are the use cases for wanting the keys() method to return an iterator where you can't use the dictionary itself? I thought assignment might be one, but a second reference to the dictionary will behave the same way in all cases you'd use the dict_keys (hmm - dict_keys.__xor__???)

http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

From szport at gmail.com Fri Apr 4 09:49:30 2008
From: szport at gmail.com (Zaur Shibzoukhov)
Date: Fri, 4 Apr 2008 11:49:30 +0400
Subject: [Python-3000] A new member for contextlib?
Message-ID:

I suggest a context manager for property defining/redefining.
There is a prototype and illustrative example: -------------------------------------------------------------------------------------------------------------- import sys null = object() _property = property class property(_property): # def __init__(self, *args, **kw): _property.__init__(self, *args, **kw) self.__enter__ = self.__enter self.__exit__ = self.__exit # @classmethod def __enter__(self): return null # @classmethod def __exit__(self, type, value, tb): if tb is None: frame = sys._getframe(1) self.__exitHandler(self, frame.f_locals) # def __enter(self): return null # def __exit(self, type, value, tb): if tb is None: frame = sys._getframe(1) self.__exitHandler(self, frame.f_locals) # @staticmethod def __exitHandler(self, _locals): propName = "_" PropName = str(id(self)) for key, value in _locals.items(): if value is null: propName = key PropName = key.capitalize() break if type(self) == type(property): getFunc = _locals.pop('get', None) setFunc = _locals.pop('set', None) delFunc = _locals.pop('delete', None) doc = _locals.pop('doc', None) else: getFunc = _locals.pop('get', None) or self.fget setFunc = _locals.pop('set', None) or self.fset delFunc = _locals.pop('delete', None) or self.fdel doc = _locals.pop('doc', None) or self.__doc__ if getFunc: funcName = "_get"+PropName getFunc.__name__ = funcName _locals[funcName] = getFunc if setFunc: funcName = "_set"+PropName setFunc.__name__ = funcName _locals[funcName] = setFunc if delFunc: funcName = "_del"+PropName delFunc.__name__ = funcName _locals[funcName] = delFunc prop = property(getFunc, setFunc, delFunc) prop.__doc__ = doc _locals[propName] = prop ----------------------------------------------------------------------------------------------------------- def testPropertyMaker(): class AAA: _v = 1 with property as v: doc = "Example of making property with *with* statement" def get(self): return self._v def set(self, v): self._v = v class BBB(AAA): with AAA.v as v: doc = "Example of modified property" def 
get(self): return [self._v] a=AAA() a.v = 10 print(a.v) print(AAA.v.__doc__) b=BBB() b.v = 100 print(b.v) print(BBB.v.__doc__) This code is also in attachment. Is it suitable for contextlib.py? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20080404/61e5bb44/attachment.htm -------------- next part -------------- A non-text attachment was scrubbed... Name: withProperty.py Type: text/x-python Size: 2725 bytes Desc: not available Url : http://mail.python.org/pipermail/python-3000/attachments/20080404/61e5bb44/attachment.py From musiccomposition at gmail.com Fri Apr 4 15:25:45 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Fri, 4 Apr 2008 08:25:45 -0500 Subject: [Python-3000] A new member for contextlib? In-Reply-To: References: Message-ID: <1afaf6160804040625o12c71877w11671c3440a10b6d@mail.gmail.com> On Fri, Apr 4, 2008 at 2:49 AM, Zaur Shibzoukhov wrote: > I suggest a context manager for property defining/redefining. 
There is a > prototype and illustrative example: > > > > -------------------------------------------------------------------------------------------------------------- > import sys > > null = object() > _property = property > class property(_property): > # > def __init__(self, *args, **kw): > _property.__init__(self, *args, **kw) > self.__enter__ = self.__enter > self.__exit__ = self.__exit > # > @classmethod > def __enter__(self): > return null > # > @classmethod > def __exit__(self, type, value, tb): > if tb is None: > frame = sys._getframe(1) > self.__exitHandler(self, frame.f_locals) > # > def __enter(self): > return null > # > def __exit(self, type, value, tb): > if tb is None: > frame = sys._getframe(1) > self.__exitHandler(self, frame.f_locals) > # > @staticmethod > def __exitHandler(self, _locals): > propName = "_" > PropName = str(id(self)) > for key, value in _locals.items(): > if value is null: > propName = key > PropName = key.capitalize() > break > > if type(self) == type(property): > getFunc = _locals.pop('get', None) > setFunc = _locals.pop('set', None) > delFunc = _locals.pop('delete', None) > doc = _locals.pop('doc', None) > else: > getFunc = _locals.pop('get', None) or self.fget > setFunc = _locals.pop('set', None) or self.fset > delFunc = _locals.pop('delete', None) or self.fdel > doc = _locals.pop('doc', None) or self.__doc__ > > if getFunc: > funcName = "_get"+PropName > getFunc.__name__ = funcName > _locals[funcName] = getFunc > if setFunc: > funcName = "_set"+PropName > setFunc.__name__ = funcName > _locals[funcName] = setFunc > if delFunc: > funcName = "_del"+PropName > delFunc.__name__ = funcName > _locals[funcName] = delFunc > > prop = property(getFunc, setFunc, delFunc) > prop.__doc__ = doc > _locals[propName] = prop > > > ----------------------------------------------------------------------------------------------------------- > > def testPropertyMaker(): > class AAA: > _v = 1 > with property as v: > doc = "Example of making property 
with *with* statement" > def get(self): > return self._v > def set(self, v): > self._v = v > > class BBB(AAA): > with AAA.v as v: > doc = "Example of modified property" > def get(self): > return [self._v] > > a=AAA() > a.v = 10 > print(a.v) > print(AAA.v.__doc__) > > b=BBB() > b.v = 100 > print(b.v) > print(BBB.v.__doc__) > > This code is also in attachment. > > Is it suitable for contextlib.py? I don't really see how this is better/easier than: class AAA: def get_x(): pass def set_x(): pass x = property(get_x, set_x, None, "The x property") > > > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/musiccomposition%40gmail.com > > -- Cheers, Benjamin Peterson -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20080404/a143cc3e/attachment.htm From lists at cheimes.de Fri Apr 4 15:39:25 2008 From: lists at cheimes.de (Christian Heimes) Date: Fri, 04 Apr 2008 15:39:25 +0200 Subject: [Python-3000] A new member for contextlib? In-Reply-To: References: Message-ID: Zaur Shibzoukhov schrieb: > I suggest a context manager for property defining/redefining. There is a > prototype and illustrative example: Python 2.6 and 3.0 already have a new way to modify properties: class C(object): @property def x(self): return self._x @x.setter def x(self, value): self._x = value @x.deleter def x(self): del self._x Christian From szport at gmail.com Fri Apr 4 20:12:17 2008 From: szport at gmail.com (Zaur Shibzoukhov) Date: Fri, 4 Apr 2008 22:12:17 +0400 Subject: [Python-3000] A new member for contextlib? 
Message-ID:

Benjamin Peterson:
> I don't really see how this is better/easier than:
> class AAA:
>     def get_x(): pass
>     def set_x(): pass
>     x = property(get_x, set_x, None, "The x property")

Perhaps it's better because :)

    @classmethod
    def func(self): pass

is better than

    def func(self): pass
    func = classmethod(func)

Christian Heimes:
> Python 2.6 and 3.0 already have a new way to modify properties:
>
> class C(object):
>     @property
>     def x(self): return self._x
>     @x.setter
>     def x(self, value): self._x = value
>     @x.deleter
>     def x(self): del self._x

Certainly! It isn't intended to replace this way of defining/modifying properties. First, it is an example of applying the "with" statement. Second, the suggested approach allows your example to be written in the following way:

    class C(object):
        with property as x:
            def get(self): return self._x
            def set(self, value): self._x = value
            def delete(self): del self._x   # 'del' is a keyword; the prototype looks for 'delete'

IMHO it's quite readable too because of the additional indentation.

From p.f.moore at gmail.com Fri Apr 4 20:40:40 2008
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 4 Apr 2008 19:40:40 +0100
Subject: [Python-3000] A new member for contextlib?
In-Reply-To:
References:
Message-ID: <79990c6b0804041140l783b155fo6e82829b5601b87e@mail.gmail.com>

On 04/04/2008, Zaur Shibzoukhov wrote:
> Certainly! It isn't intended to replace this way of defining/modifying
> properties. First, it is an example of applying the "with" statement.
> Second, the suggested approach allows your example to be written in the
> following way:
>
> class C(object):
>     with property as x:
>         def get(self): return self._x
>         def set(self, value): self._x = value
>         def delete(self): del self._x
>
> IMHO it's quite readable too because of the additional indentation.

It does look reasonably nice. But I'd suggest submitting it as a recipe in the Python cookbook - it doesn't seem to me that it needs to go in the core. I'm not entirely sure about it, as I don't think "with property as x" reads right.
And you shouldn't call it "property" - that name is already builtin (and it doesn't read right, see above). I can't think of a good name that *does* read right in the context of "with ... as x", either. Paul. From barry at python.org Fri Apr 4 20:57:55 2008 From: barry at python.org (Barry Warsaw) Date: Fri, 4 Apr 2008 14:57:55 -0400 Subject: [Python-3000] A new member for contextlib? In-Reply-To: <79990c6b0804041140l783b155fo6e82829b5601b87e@mail.gmail.com> References: <79990c6b0804041140l783b155fo6e82829b5601b87e@mail.gmail.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Apr 4, 2008, at 2:40 PM, Paul Moore wrote: > On 04/04/2008, Zaur Shibzoukhov wrote: >> Certainly! It don't intent to replace this way of defining/modifining >> properties. First, it is an example of "with" statement application. >> Second, suggested approach allow to write your example in the >> following way: >> >> class C(object): >> with property as x: >> def get(self): return self._x >> def set(self, value): self._x = value >> def del(self): del self._x >> >> IMHO it's quite readable too because of additional identation. > > It does look reasonably nice. But I'd suggest submitting it as a > recipe in the Python cookbook - it doesn't seem to me that it needs to > go in the core. I'm not entirely sure about it, as I don't think "with > property as x" reads right. It looks nice to me. I think it's exactly right because it's entirely clear what the intent is. Very clever too, so I would be +1 on extending property to handle this. > And you shouldn't call it "property" - that name is already builtin > (and it doesn't read right, see above). I can't think of a good name > that *does* read right in the context of "with ... as x", either. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (Darwin) iQCVAwUBR/Z6M3EjvBPtnXfVAQIyMgP+LPEX2EU1Pyqbnj0cASgK/iWy6j3ljdVc KNMVhwI60jiYOK1JBv8knanTjwpwburI88hbOL1nvJrmgtd8QmKkapaRihdjcFt1 Tu5k0jqH9gScilMec0Fl5aVt/YKjTgyJmxBL8EnKNV+UZ3emzbYda2fTvz6Egz5u /tU5aJTGyUU= =gVxJ -----END PGP SIGNATURE----- From gregor.lingl at aon.at Fri Apr 4 22:07:07 2008 From: gregor.lingl at aon.at (Gregor Lingl) Date: Fri, 04 Apr 2008 22:07:07 +0200 Subject: [Python-3000] [Python-Dev] RELEASED Python 2.6a2 and 3.0a4 In-Reply-To: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org> References: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org> Message-ID: <47F68A6B.9060901@aon.at> Hi, something doesn't work as usual, at least for me: When I try to download the Python 2.6a2 release for Windows by clicking * Windows x86 MSI Installer (2.6a2) (sig) instead of the usual download dialog I get an Error 404: File not found for the url http://www.python.org/ftp/python/2.6/python-2.6a2.msi . The same is true when trying to download Python 3.0 a4. Embarassed, Gregor Barry Warsaw schrieb: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On behalf of the Python development team and the Python community, I'm > happy to announce the second alpha release of Python 2.6, and the > fourth alpha release of Python 3.0. > > Please note that these are alpha releases, and as such are not > suitable for production environments. We continue to strive for a > high degree of quality, but there are still some known problems and > the feature sets have not been finalized. These alphas are being > released to solicit feedback and hopefully discover bugs, as well as > allowing you to determine how changes in 2.6 and 3.0 might impact > you. 
If you find things broken or incorrect, please submit a bug > report at > > http://bugs.python.org > > For more information and downloadable distributions, see the Python > 2.6 web > site: > > http://www.python.org/download/releases/2.6/ > > and the Python 3.0 web site: > > http://www.python.org/download/releases/3.0/ > > We are planning one more alpha release of each version, followed by > two beta releases, with the final releases planned for August 2008. > See PEP 361 for release details: > > http://www.python.org/dev/peps/pep-0361/ > > Enjoy, > - -Barry > > Barry Warsaw > barry at python.org > Python 2.6/3.0 Release Manager > (on behalf of the entire python-dev team) > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.8 (Darwin) > > iQCVAwUBR/WImHEjvBPtnXfVAQJmoQP+MzqNDI+Xt8zua/FE7Ca4TVXoIIy2uoOm > I1i3+vmevZ9vtAb9hcGwfEgPY4LSwb9Js4KnJJWMPaMuFJK4NgGoiMdj+t42zDbQ > bEzfBUOCoVkejLRxIQnWeJf1Hu8JocYyCHIRffv57/QdKpHuiSs8aE8GIT3STo3o > I88H5NY1GgI= > =WT2z > -----END PGP SIGNATURE----- > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/gregor.lingl%40aon.at > > > From martin at v.loewis.de Fri Apr 4 22:21:45 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 04 Apr 2008 22:21:45 +0200 Subject: [Python-3000] A new member for contextlib? In-Reply-To: <79990c6b0804041140l783b155fo6e82829b5601b87e@mail.gmail.com> References: <79990c6b0804041140l783b155fo6e82829b5601b87e@mail.gmail.com> Message-ID: <47F68DD9.30900@v.loewis.de> > And you shouldn't call it "property" - that name is already builtin > (and it doesn't read right, see above). I can't think of a good name > that *does* read right in the context of "with ... as x", either. 

Of course, we non-native speakers are completely ignorant of "reads
right" (even to the extent of spelling that "reads write"), and are
thus unable to propose anything that sounds "natural". To us, it's
all foreign, and identifiers just mean something with respect to
the programming language, but not in real life.

So

    with property as x:

is not any better or worse than

    for attribute named bar:

except that we can see how the latter can't work in the Python
syntax as-is.

Regards,
Martin

From martin at v.loewis.de Fri Apr 4 22:23:37 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 04 Apr 2008 22:23:37 +0200
Subject: [Python-3000] [Python-Dev] RELEASED Python 2.6a2 and 3.0a4
In-Reply-To: <47F68A6B.9060901@aon.at>
References: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org>
 <47F68A6B.9060901@aon.at>
Message-ID: <47F68E49.30306@v.loewis.de>

> Error 404: File not found

That has a simple explanation: the file is not there because it
just doesn't exist yet, which in turn is because I have problems
creating it (which is in turn due to switching to Visual Studio 2008).

Regards,
Martin

From barry at python.org Fri Apr 4 22:39:09 2008
From: barry at python.org (Barry Warsaw)
Date: Fri, 4 Apr 2008 16:39:09 -0400
Subject: [Python-3000] [Python-Dev] Python source code on Bazaar vcs
In-Reply-To: <18409.16093.799920.286191@montanaro-dyndns-org.local>
References: <20C0AC37-D748-450E-B690-FBCA2ACFFC4E@python.org>
 <18408.27695.339064.649345@montanaro-dyndns-org.local>
 <08E8188C-AEA2-4E78-B74C-AF213757DEA0@python.org>
 <18409.16093.799920.286191@montanaro-dyndns-org.local>
Message-ID: <99AE707E-C992-42B9-A238-F2751A691DFA@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mar 25, 2008, at 2:05 PM, skip at pobox.com wrote:

>
>>> Did I misread the directions or do I really need the --create-prefix
>>> arg?
>
>     Barry> You do, the first time you push a user branch because users/skip
>     Barry> doesn't exist yet. It's mentioned in the docs, but it's pretty
>     Barry> easy to overlook ;).
>
> Well, I noticed the mention in .../dev/bazaar, where it reads, "the first
> time you do this, you might need to add --create-prefix". Perhaps that
> should read "... you will need to ...".
>
> It pushed fine with --create-prefix.

Thanks Skip. Fixed.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iQCVAwUBR/aR7XEjvBPtnXfVAQIoXgQAsN6Hfs1JjdFMOI1/Ef2kLeeBfTPxf+Ys
K9y81yEHUNonaQEIF9ptnyOIEyic5uX+Ig4cYO20i1LgvGEIIiCg191EJtYFc9jr
s1dTgmE3PQfiR7J2m2SWS06bYMsanBdAAW/ZnMpgmUMZixYEX43z7Q+kjFibwTn+
UbGz2uLeW+o=
=Fx0S
-----END PGP SIGNATURE-----

From jjb5 at cornell.edu Fri Apr 4 21:28:44 2008
From: jjb5 at cornell.edu (Joel Bender)
Date: Fri, 04 Apr 2008 15:28:44 -0400
Subject: [Python-3000] A new member for contextlib?
In-Reply-To:
References: <79990c6b0804041140l783b155fo6e82829b5601b87e@mail.gmail.com>
Message-ID: <47F6816C.7070508@cornell.edu>

How about reversing the order?

    class C(object):
        with x as property:
            ...

Joel

From jason.orendorff at gmail.com Fri Apr 4 23:02:01 2008
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Fri, 4 Apr 2008 16:02:01 -0500
Subject: [Python-3000] Spooky behavior of dict.items() and friends
In-Reply-To: <47F40D48.3030904@v.loewis.de>
References: <1afaf6160804011553q13b4bd36yc1a7b867d61fd13@mail.gmail.com>
 <47F2F151.3050809@v.loewis.de>
 <47F401AA.5070801@v.loewis.de>
 <47F40D48.3030904@v.loewis.de>
Message-ID:

On Wed, Apr 2, 2008 at 5:48 PM, "Martin v. Löwis" wrote:
> In your code, how many (in absolute numbers) applications of .items()
> would break when .items() becomes a view?

Sorry for the slow response. Good question. In 25k lines of code (not
mine but mostly written by people "like me"), I found only 4, all
"loud".

This is pretty convincing. However, I do want to search again for
possible bugs introduced by calls to .keys(), which is used much more
often in this code.
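
For readers who have not run into this, here is a minimal sketch of the kind of breakage being counted, assuming Python 3 semantics where .items() and .keys() return live views instead of lists. Reading "loud" as "fails with an exception rather than silently misbehaving" is an assumption, and the dictionary contents are made up for illustration.

```python
# Hypothetical data, for illustration only.
d = {"a": 1, "b": 2}

# A view tracks the dict; a 2.x-style list snapshot would not.
items = d.items()
d["c"] = 3
view_sees_change = ("c", 3) in items

# "Loud" breakage: deleting while iterating over a view raises
# RuntimeError on the next step of the loop.
try:
    for key in d.keys():
        if key == "a":
            del d[key]
    loud = False
except RuntimeError:
    loud = True

# Taking an explicit snapshot restores the old list behaviour.
for key in list(d.keys()):
    if d[key] == 2:
        del d[key]

print(view_sees_change, loud, d)
```

Wrapping the call in list() is the usual mechanical fix for code that mutates the dictionary while looping over it.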

-j

From musiccomposition at gmail.com Sat Apr 5 00:49:36 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Fri, 4 Apr 2008 17:49:36 -0500
Subject: [Python-3000] A new member for contextlib?
In-Reply-To: <47F6816C.7070508@cornell.edu>
References: <79990c6b0804041140l783b155fo6e82829b5601b87e@mail.gmail.com>
 <47F6816C.7070508@cornell.edu>
Message-ID: <1afaf6160804041549n43064745ma3fe78e7673eb1d0@mail.gmail.com>

On Fri, Apr 4, 2008 at 2:28 PM, Joel Bender wrote:
> How about reversing the order?
>
>     class C(object):
>         with x as property:
>             ...

That is nicer but is not easy to implement.

--
Cheers,
Benjamin Peterson

From nnorwitz at gmail.com Sat Apr 5 08:18:49 2008
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 4 Apr 2008 23:18:49 -0700
Subject: [Python-3000] raw strings and \u
Message-ID:

I just checked in r62163 with this change:

-        rc = os.system(r"ml64 -c -Foms\uptable.obj ms\uptable.asm")
+        rc = os.system("ml64 -c -Foms\\uptable.obj ms\\uptable.asm")

What should happen with raw unicode strings that contain a \u? The
old code above was generating:

    SyntaxError: (unicode error) truncated \uXXXX

Is that correct? Or should the \u be translated literally?

n

From martin at v.loewis.de Sat Apr 5 09:34:05 2008
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 05 Apr 2008 09:34:05 +0200
Subject: [Python-3000] raw strings and \u
In-Reply-To:
References:
Message-ID: <47F72B6D.9000209@v.loewis.de>

> I just checked in r62163 with this change:
> -        rc = os.system(r"ml64 -c -Foms\uptable.obj ms\uptable.asm")
> +        rc = os.system("ml64 -c -Foms\\uptable.obj ms\\uptable.asm")
>
> What should happen with raw unicode strings that contain a \u? The
> old code above was generating:
>     SyntaxError: (unicode error) truncated \uXXXX
>
> Is that correct? Or should the \u be translated literally?

The intention is that the file ms\uptable.asm is compiled to
ms\uptable.obj. So the change is correct. (not sure what
alternatives you might have considered)

Regards,
Martin

From amauryfa at gmail.com Sat Apr 5 14:25:11 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Sat, 5 Apr 2008 14:25:11 +0200
Subject: [Python-3000] raw strings and \u
In-Reply-To: <47F72B6D.9000209@v.loewis.de>
References: <47F72B6D.9000209@v.loewis.de>
Message-ID:

Martin v. Löwis wrote:
> > I just checked in r62163 with this change:
> > -        rc = os.system(r"ml64 -c -Foms\uptable.obj ms\uptable.asm")
> > +        rc = os.system("ml64 -c -Foms\\uptable.obj ms\\uptable.asm")
> >
> > What should happen with raw unicode strings that contain a \u? The
> > old code above was generating:
> >     SyntaxError: (unicode error) truncated \uXXXX
> >
> > Is that correct? Or should the \u be translated literally?
>
> The intention is that the file ms\uptable.asm is compiled to
> ms\uptable.obj. So the change is correct. (not sure what
> alternatives you might have considered)

I use raw strings when there are backslashes in the text, and I still
want it to be readable::

    r"C:\Documents and Settings\User"

But this is now invalid!
This kills the usefulness of it IMO.
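
For reference, a minimal sketch of the behaviour in question, assuming a current Python 3 interpreter in which \u and \U carry no special meaning inside raw literals. The helper name literal_compiles is made up for this example; it only wraps the built-in compile().

```python
# The two literals from this thread. As raw strings they keep every
# backslash, including the ones followed by 'u' or 'U'.
cmd = r"ml64 -c -Foms\uptable.obj ms\uptable.asm"
path = r"C:\Documents and Settings\User"

same_as_escaped = cmd == "ml64 -c -Foms\\uptable.obj ms\\uptable.asm"
keeps_backslash_U = path.endswith("\\User")
two_chars = len(r"\u") == 2


def literal_compiles(source):
    """Return True if `source` is accepted as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# The source text handed to compile() is shown in the trailing comment.
raw_ok = literal_compiles('r"ms\\uptable.asm"')    # r"ms\uptable.asm"
plain_ok = literal_compiles('"ms\\uptable.asm"')   # "ms\uptable.asm"

print(same_as_escaped, keeps_backslash_U, two_chars, raw_ok, plain_ok)
```

The non-raw spelling is rejected because \upta is not a complete \uXXXX escape, which is the same "truncated \uXXXX" complaint quoted above.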

--
Amaury Forgeot d'Arc

From musiccomposition at gmail.com Sat Apr 5 16:07:53 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Sat, 5 Apr 2008 09:07:53 -0500
Subject: [Python-3000] raw strings and \u
In-Reply-To:
References: <47F72B6D.9000209@v.loewis.de>
Message-ID: <1afaf6160804050707i6f601098o7e7faa9523241871@mail.gmail.com>

On Sat, Apr 5, 2008 at 7:25 AM, Amaury Forgeot d'Arc wrote:
> I use raw strings when there are backslashes in the text, and I still
> want it to be readable::
>
>     r"C:\Documents and Settings\User"
>
> But this is now invalid!
> This kills the usefulness of it IMO.

I agree. I think we have three choices:

1. Don't allow Unicode escapes in raw mode.
2. Introduce a new mode which has unicode escapes and raw mode.
3. Deal with it.

I don't like any of them...

--
Cheers,
Benjamin Peterson

From guido at python.org Sat Apr 5 16:44:10 2008
From: guido at python.org (Guido van Rossum)
Date: Sat, 5 Apr 2008 07:44:10 -0700
Subject: [Python-3000] raw strings and \u
In-Reply-To:
References:
Message-ID:

On Fri, Apr 4, 2008 at 11:18 PM, Neal Norwitz wrote:
> I just checked in r62163 with this change:
> -        rc = os.system(r"ml64 -c -Foms\uptable.obj ms\uptable.asm")
> +        rc = os.system("ml64 -c -Foms\\uptable.obj ms\\uptable.asm")
>
> What should happen with raw unicode strings that contain a \u? The
> old code above was generating:
>     SyntaxError: (unicode error) truncated \uXXXX
>
> Is that correct? Or should the \u be translated literally?

Oops, there's a regression!!!

In 2.x, \uDDDD and \UDDDDDDDD are interpreted as Unicode escapes in
raw Unicode strings. That was a mistake, but we can't fix it (except
when using "from __future__ import unicode_literals"). In 3.0, \u or
\U in a raw string should have no special meaning -- it's just a
backslash followed by 'u' or 'U'.

This was fixed in 3.0a3. It seems to have reverted to the old (2.x)
behavior in 3.0a4.

THIS MUST BE FIXED!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From musiccomposition at gmail.com Sat Apr 5 16:50:09 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Sat, 5 Apr 2008 09:50:09 -0500
Subject: [Python-3000] raw strings and \u
In-Reply-To:
References:
Message-ID: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>

On Sat, Apr 5, 2008 at 9:44 AM, Guido van Rossum wrote:
> This was fixed in 3.0a3. It seems to have reverted to the old (2.x)
> behavior in 3.0a4.
>
> THIS MUST BE FIXED!

Done in r62165.

--
Cheers,
Benjamin Peterson

From guido at python.org Sat Apr 5 16:58:48 2008
From: guido at python.org (Guido van Rossum)
Date: Sat, 5 Apr 2008 07:58:48 -0700
Subject: [Python-3000] raw strings and \u
In-Reply-To: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
Message-ID:

Thanks -- that was quick!

On Sat, Apr 5, 2008 at 7:50 AM, Benjamin Peterson wrote:
> Done in r62165.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From musiccomposition at gmail.com Sat Apr 5 16:59:34 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Sat, 5 Apr 2008 09:59:34 -0500
Subject: [Python-3000] raw strings and \u
In-Reply-To:
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
Message-ID: <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>

On Sat, Apr 5, 2008 at 9:58 AM, Guido van Rossum wrote:
> Thanks -- that was quick!

Well, I was the guilty party...

--
Cheers,
Benjamin Peterson

From guido at python.org Sat Apr 5 17:00:30 2008
From: guido at python.org (Guido van Rossum)
Date: Sat, 5 Apr 2008 08:00:30 -0700
Subject: [Python-3000] raw strings and \u
In-Reply-To: <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
 <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>
Message-ID:

:-)

So did the broken version make it into the 3.0a4 release, or did you
break it after the release?

On Sat, Apr 5, 2008 at 7:59 AM, Benjamin Peterson wrote:
> Well, I was the guilty party...

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From musiccomposition at gmail.com Sat Apr 5 17:03:45 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Sat, 5 Apr 2008 10:03:45 -0500
Subject: [Python-3000] raw strings and \u
In-Reply-To:
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
 <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>
Message-ID: <1afaf6160804050803t171f9b79qc260954acaca4fdd@mail.gmail.com>

On Sat, Apr 5, 2008 at 10:00 AM, Guido van Rossum wrote:
> :-)
>
> So did the broken version make it into the 3.0a4 release, or did you
> break it after the release?

Missed it by 7 revisions! I'm going to add a test for that, though, so it
doesn't happen again.

--
Cheers,
Benjamin Peterson

From jmillikin at gmail.com Sat Apr 5 19:41:30 2008
From: jmillikin at gmail.com (John Millikin)
Date: Sat, 5 Apr 2008 09:41:30 -0800
Subject: [Python-3000] raw strings and \u
In-Reply-To: <1afaf6160804050803t171f9b79qc260954acaca4fdd@mail.gmail.com>
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
 <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>
 <1afaf6160804050803t171f9b79qc260954acaca4fdd@mail.gmail.com>
Message-ID: <3283f7fe0804051041o2f07e67cs47b3977acea7616e@mail.gmail.com>

If this is the case, could the regex library be modified to support \u
and \U escapes as suggested by Martin v. Löwis[1]? Otherwise, the only
way to use non-ASCII characters in a regex will be to avoid raw strings.

[1] http://mail.python.org/pipermail/python-dev/2007-May/073074.html

From martin at v.loewis.de Sat Apr 5 19:48:56 2008
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 05 Apr 2008 19:48:56 +0200
Subject: [Python-3000] raw strings and \u
In-Reply-To: <3283f7fe0804051041o2f07e67cs47b3977acea7616e@mail.gmail.com>
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
 <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>
 <1afaf6160804050803t171f9b79qc260954acaca4fdd@mail.gmail.com>
 <3283f7fe0804051041o2f07e67cs47b3977acea7616e@mail.gmail.com>
Message-ID: <47F7BB88.3050403@v.loewis.de>

John Millikin wrote:
> If this is the case, could the regex library be modified to support \u
> and \U escapes

+1 (not surprisingly). Would you like to work on a patch?

Regards,
Martin

From nnorwitz at gmail.com Sat Apr 5 19:58:02 2008
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sat, 5 Apr 2008 10:58:02 -0700
Subject: [Python-3000] raw strings and \u
In-Reply-To: <1afaf6160804050803t171f9b79qc260954acaca4fdd@mail.gmail.com>
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
 <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>
 <1afaf6160804050803t171f9b79qc260954acaca4fdd@mail.gmail.com>
Message-ID:

On Sat, Apr 5, 2008 at 8:03 AM, Benjamin Peterson wrote:
> On Sat, Apr 5, 2008 at 10:00 AM, Guido van Rossum wrote:
> > :-)
> >
> > So did the broken version make it into the 3.0a4 release, or did you
> > break it after the release?
>
> Missed it by 7 revisions! I'm going to add a test for that, though, so it
> doesn't happen again.

Are there more tests for raw strings that should be added? When I
looked in the C code it looked like there were several things going on
IIRC. It seems we ought to have a fairly comprehensive set of tests
for raw strings (among other things) to verify we always do the right
thing in the future.
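
As a rough sketch of what such tests might look like, here is a small unittest case written against a modern Python 3 (3.4 or later is assumed, for re.fullmatch). The class and method names are invented for illustration and this is not the test that was added to the test suite. The last case touches on the regex question raised above: since Python 3.3 the re module interprets \uXXXX itself, so a raw pattern can still name a non-ASCII character.

```python
import re
import unittest


class RawLiteralTests(unittest.TestCase):
    """Illustrative checks only; names are hypothetical."""

    def test_backslash_u_is_literal(self):
        # \u and \U are just a backslash plus a letter in raw strings.
        self.assertEqual(r"\u0041", "\\u0041")
        self.assertEqual(len(r"\u0041"), 6)
        self.assertEqual(len(r"\U0001F600"), 10)

    def test_escaped_quote_keeps_backslash(self):
        # A backslash may precede a quote, but it stays in the value.
        self.assertEqual(r"\"", '\\"')

    def test_trailing_backslash_rejected(self):
        # Source text is r"abc\" which never closes the literal.
        with self.assertRaises(SyntaxError):
            compile('r"abc\\"', "<test>", "eval")

    def test_re_handles_unicode_escape(self):
        # The re module expands \u00e9 in the pattern (Python 3.3+).
        self.assertIsNotNone(re.fullmatch(r"caf\u00e9", "caf\u00e9"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RawLiteralTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the suite through a loader and runner, rather than unittest.main(), keeps the snippet from exiting the interpreter when pasted into a session.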

Thanks,
n

From musiccomposition at gmail.com Sat Apr 5 20:39:44 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Sat, 5 Apr 2008 13:39:44 -0500
Subject: [Python-3000] raw strings and \u
In-Reply-To:
References: <1afaf6160804050750t691785fewcf056ea53d115ea3@mail.gmail.com>
 <1afaf6160804050759n3ddc5011wa08edef0c6565bcc@mail.gmail.com>
 <1afaf6160804050803t171f9b79qc260954acaca4fdd@mail.gmail.com>
Message-ID: <1afaf6160804051139g787cfd6t367fe2ca7b779b50@mail.gmail.com>

On Sat, Apr 5, 2008 at 12:58 PM, Neal Norwitz wrote:
> Are there more tests for raw strings that should be added? When I
> looked in the C code it looked like there were several things going on
> IIRC. It seems we ought to have a fairly comprehensive set of tests
> for raw strings (among other things) to verify we always do the right
> thing in the future.

I think we do. Maybe I'm missing some in other files, but I only see a
couple tests with raw strings in test_unicode.

--
Cheers,
Benjamin Peterson

From qrczak at knm.org.pl Sun Apr 6 13:17:21 2008
From: qrczak at knm.org.pl (Marcin =?UTF-8?Q?=E2=80=98Qrczak=E2=80=99?= Kowalczyk)
Date: Sun, 06 Apr 2008 13:17:21 +0200
Subject: [Python-3000] RELEASED Python 2.6a2 and 3.0a4
In-Reply-To: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org>
References: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org>
Message-ID: <1207480641.7624.11.camel@qrnik>

3.0a4 doesn't build for me.
A python process hangs spinning the CPU, and ^C yields:

Fatal Python error: Py_Initialize: can't initialize sys standard streams
Traceback (most recent call last):
  File "/home/users/qrczak/Python-3.0a4/Lib/io.py", line 37, in <module>
    import warnings
  File "/home/users/qrczak/Python-3.0a4/Lib/warnings.py", line 7, in <module>
    import linecache
  File "/home/users/qrczak/Python-3.0a4/Lib/linecache.py", line 10, in <module>
    import re
  File "/home/users/qrczak/Python-3.0a4/Lib/re.py", line 223, in <module>
    _pattern_type = type(sre_compile.compile("", 0))
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_compile.py", line 498, in compile
    p = sre_parse.parse(p, flags)
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 685, in parse
    p = _parse_sub(source, pattern, 0)
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 321, in _parse_sub
    if sourcematch("|"):
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 209, in match
    if skip:
KeyboardInterrupt
Abort (core dumped)
make: *** [sharedmods] Interrupt

or elsewhere:

[...]
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 685, in parse
    p = _parse_sub(source, pattern, 0)
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 320, in _parse_sub
    itemsappend(_parse(source, state))
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 393, in _parse
    subpattern = SubPattern(state)
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 96, in __init__
    def __init__(self, pattern, data=None):
KeyboardInterrupt

or elsewhere:

  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 685, in parse
    p = _parse_sub(source, pattern, 0)
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 320, in _parse_sub
    itemsappend(_parse(source, state))
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 409, in _parse
    this = sourceget()
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 215, in get
    self.__next()
  File "/home/users/qrczak/Python-3.0a4/Lib/sre_parse.py", line 189, in __next
    if self.index >= len(self.string):
KeyboardInterrupt

etc.

--
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

From martin at v.loewis.de Sun Apr 6 15:23:45 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 06 Apr 2008 15:23:45 +0200
Subject: [Python-3000] RELEASED Python 2.6a2 and 3.0a4
In-Reply-To: <1207480641.7624.11.camel@qrnik>
References: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org> <1207480641.7624.11.camel@qrnik>
Message-ID: <47F8CEE1.40309@v.loewis.de>

> 3.0a4 doesn't build for me. A python process hangs spinning the CPU, and
> ^C yields:

What operating system and compiler? Does the behaviour change if you
compile without optimization?
Regards,
Martin

From musiccomposition at gmail.com Sun Apr 6 15:46:33 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Sun, 6 Apr 2008 08:46:33 -0500
Subject: [Python-3000] readinto annotation
Message-ID: <1afaf6160804060646q592dc1cdy39b3d9e48f770201@mail.gmail.com>

While working on the io module docs, I noticed the annotation for readinto methods is bytes. This should be bytearray, right?

--
Cheers,
Benjamin Peterson

From qrczak at knm.org.pl Sun Apr 6 15:58:47 2008
From: qrczak at knm.org.pl (Marcin ‘Qrczak’ Kowalczyk)
Date: Sun, 06 Apr 2008 15:58:47 +0200
Subject: [Python-3000] RELEASED Python 2.6a2 and 3.0a4
In-Reply-To: <47F8CEE1.40309@v.loewis.de>
References: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org> <1207480641.7624.11.camel@qrnik> <47F8CEE1.40309@v.loewis.de>
Message-ID: <1207490327.7624.23.camel@qrnik>

On Sun, 06-04-2008 at 15:23 +0200, "Martin v. Löwis" wrote:
> What operating system and compiler? Does the behaviour change if you
> compile without optimization?

You are right, this is a gcc bug. Python builds correctly when OPT=-O0 is passed to configure.

This is Linux/athlon, gcc-4.3.0 snapshot from 2008-03-13. Since I recently encountered a particular gcc-4.3.0 bug, I confirmed that it is the cause here too: Python-3.0a4 builds with "-O3 -fwrapv -fno-tree-vrp" but hangs with "-O3 -fwrapv". The bug is already fixed in gcc: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35833

The bug is present in gcc-4.3.0 released a month ago, so someone with vanilla gcc-4.3.0 might check this. Adding -fno-tree-vrp is a workaround for gcc-4.3.0.
--
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

From martin at v.loewis.de Sun Apr 6 16:29:28 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 06 Apr 2008 16:29:28 +0200
Subject: [Python-3000] RELEASED Python 2.6a2 and 3.0a4
In-Reply-To: <1207490327.7624.23.camel@qrnik>
References: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org> <1207480641.7624.11.camel@qrnik> <47F8CEE1.40309@v.loewis.de> <1207490327.7624.23.camel@qrnik>
Message-ID: <47F8DE48.1000008@v.loewis.de>

> The bug is present in gcc-4.3.0 released a month ago, so someone with
> vanilla gcc-4.3.0 might check this. Adding -fno-tree-vrp is a workaround
> for gcc-4.3.0.

Depending on how much you care about this issue, feel free to contribute a patch. The alternatives I see are
a) do nothing (obviously), trusting that Google will find this thread
b) add something to the README
c) detect the case in configure, and work around appropriately

The last one sounds like overkill, given that the bug will probably be fixed soon in a gcc bugfix release. I'd still apply such a patch provided it was contributed.

Regards,
Martin

From qrczak at knm.org.pl Sun Apr 6 17:02:20 2008
From: qrczak at knm.org.pl (Marcin ‘Qrczak’ Kowalczyk)
Date: Sun, 06 Apr 2008 17:02:20 +0200
Subject: [Python-3000] RELEASED Python 2.6a2 and 3.0a4
In-Reply-To: <47F8DE48.1000008@v.loewis.de>
References: <3C3C0150-ED65-4381-9F54-BB437DD6DFB9@python.org> <1207480641.7624.11.camel@qrnik> <47F8CEE1.40309@v.loewis.de> <1207490327.7624.23.camel@qrnik> <47F8DE48.1000008@v.loewis.de>
Message-ID: <1207494140.7624.28.camel@qrnik>

On Sun, 06-04-2008 at 16:29 +0200, "Martin v. Löwis" wrote:
> Depending on how much you care about this issue, feel free to contribute
> a patch.
> The alternatives I see are
> a) do nothing (obviously), trusting that Google will find this thread
> b) add something to the README
> c) detect the case in configure, and work around appropriately

Here is what I used in my code which triggered that bug:

AC_DEFUN([KO_CHECK_GCC_TREE_VRP_BUG], [
AC_MSG_CHECKING([whether $CC has a particular -ftree-vrp bug])
ko_save_CFLAGS=$CFLAGS
CFLAGS="$CFLAGS -O -ftree-vrp"
AC_RUN_IFELSE([AC_LANG_PROGRAM([[
#include <stdlib.h>
struct S {struct S *field;};
struct S True, False, Z;
static inline int f(void) {return 1;}
static inline int g(struct S **obj) {
   return f() && *obj == &Z;
}
struct S **h(struct S **x) {
   if (x)
      return g(x) ? &True.field : &False.field;
   else
      return &True.field;
}
struct S **(*hptr)(struct S **x);
]], [[
struct S obj;
obj.field = 0;
hptr = h;
return hptr(&obj.field) == &True.field ? EXIT_SUCCESS : EXIT_FAILURE;
]])],
[ko_gcc_tree_vrp_bug=yes],
[ko_gcc_tree_vrp_bug=no],
[ko_gcc_tree_vrp_bug=unknown])
CFLAGS=$ko_save_CFLAGS
AC_MSG_RESULT($ko_gcc_tree_vrp_bug)
if test $ko_gcc_tree_vrp_bug = yes; then
   CFLAGS="$CFLAGS -fno-tree-vrp"
fi
])

--
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

From greg at krypto.org Mon Apr 7 02:00:31 2008
From: greg at krypto.org (Gregory P. Smith)
Date: Sun, 6 Apr 2008 17:00:31 -0700
Subject: [Python-3000] readinto annotation
In-Reply-To: <1afaf6160804060646q592dc1cdy39b3d9e48f770201@mail.gmail.com>
References: <1afaf6160804060646q592dc1cdy39b3d9e48f770201@mail.gmail.com>
Message-ID: <52dc1c820804061700q6abf765fob38abdc9d6011b93@mail.gmail.com>

yes bytearray makes more sense to me given that it's hard to read into an immutable bytes object ;)

On Sun, Apr 6, 2008 at 6:46 AM, Benjamin Peterson <musiccomposition at gmail.com> wrote:
> While working on the io module docs, I noticed the annotation for readinto
> methods is bytes. This should be bytearray, right?
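The point about readinto needing a mutable target can be sketched as follows. This is only an illustration under assumptions: it uses a current Python 3 interpreter rather than 3.0a4, and io.BytesIO and the variable names are stand-ins chosen for the example, not anything from the io docs being discussed.

```python
import io

# Assumption: a current Python 3 interpreter; io.BytesIO stands in for any binary stream.
stream = io.BytesIO(b"hello world")

buf = bytearray(5)                # mutable, so readinto() can fill it in place
count = stream.readinto(buf)      # number of bytes actually written into buf

# Other writable buffers work as well, e.g. a memoryview over a bytearray.
backing = bytearray(6)
view_count = stream.readinto(memoryview(backing))

# An immutable bytes object is not a writable buffer and is rejected.
try:
    io.BytesIO(b"hello").readinto(bytes(5))
except TypeError:
    bytes_rejected = True
else:
    bytes_rejected = False
```

The caller owns the buffer and can reuse it across reads, which is why an annotation naming a mutable type describes the method better than bytes does.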
From tjreedy at udel.edu Mon Apr 7 02:34:02 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 6 Apr 2008 20:34:02 -0400
Subject: [Python-3000] Types and classes
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de>
Message-ID:

"Guido van Rossum" wrote in message news:ca471dc20804031118l30eae131i1487e46c19e55b56 at mail.gmail.com...
On Thu, Apr 3, 2008 at 12:58 AM, "Martin v. Löwis" wrote:
> > All I really mean to fix is to standardize the terminology, especially
> > in repr().
>
> So you don't want to be called a wimp anymore ?-)

Indeed.

> ------------------------------------------------------------------------
> r23331 | gvanrossum | 2001-09-25 05:56:29 +0200 (Di, 25 Sep 2001) | 5 lines
>
> Change repr() of a new-style class to say rather
> than . Exception: if it's a built-in type or an
> extension type, continue to call it . Call me a
> wimp, but I don't want to break more user code than necessary.
>
> ------------------------------------------------------------------------

Well, if we're going to break user code, 3.0 is the time to do it.
:-)

===============================
Could not find this in tracker, so http://bugs.python.org/issue2565

From amauryfa at gmail.com Mon Apr 7 09:52:00 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Mon, 7 Apr 2008 09:52:00 +0200
Subject: [Python-3000] readinto annotation
In-Reply-To: <52dc1c820804061700q6abf765fob38abdc9d6011b93@mail.gmail.com>
References: <1afaf6160804060646q592dc1cdy39b3d9e48f770201@mail.gmail.com> <52dc1c820804061700q6abf765fob38abdc9d6011b93@mail.gmail.com>
Message-ID:

On Mon, Apr 7, 2008 at 2:00 AM, Gregory P. Smith wrote:
> yes bytearray makes more sense to me given that it's hard to read into an
> immutable bytes object ;)

Not so hard: http://bugs.python.org/issue2538
Some time ago, bytes were mutable...

--
Amaury Forgeot d'Arc

From solipsis at pitrou.net Mon Apr 7 15:11:50 2008
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 Apr 2008 13:11:50 +0000 (UTC)
Subject: [Python-3000] readinto annotation
References: <1afaf6160804060646q592dc1cdy39b3d9e48f770201@mail.gmail.com> <52dc1c820804061700q6abf765fob38abdc9d6011b93@mail.gmail.com>
Message-ID:

Gregory P. Smith <greg at krypto.org> writes:
>
> yes bytearray makes more sense to me given that it's hard to read into an
> immutable bytes object ;)

It seems to me that readinto accepts any object providing a writeable buffer interface. I don't know how to express that as an annotation, though.

From guido at python.org Mon Apr 7 17:11:37 2008
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Apr 2008 08:11:37 -0700
Subject: [Python-3000] readinto annotation
In-Reply-To:
References: <1afaf6160804060646q592dc1cdy39b3d9e48f770201@mail.gmail.com> <52dc1c820804061700q6abf765fob38abdc9d6011b93@mail.gmail.com>
Message-ID:

On Mon, Apr 7, 2008 at 6:11 AM, Antoine Pitrou wrote:
> Gregory P. Smith <greg at krypto.org> writes:
> >
> > yes bytearray makes more sense to me given that it's hard to read into an
> > immutable bytes object ;)
>
> It seems to me that readinto accepts any object providing a writeable buffer
> interface. I don't know how to express that as an annotation, though.

Don't worry too much about it. The annotation is just documentation anyway. I'd be fine with using bytearray as the annotation, and explaining in the docstring that other mutable bytes buffers are okay too.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From lists at cheimes.de Mon Apr 7 17:42:17 2008
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 07 Apr 2008 17:42:17 +0200
Subject: [Python-3000] r62195 - in python/trunk: Doc/c-api/file.rst Include/fileobject.h Lib/test/test_file.py Misc/NEWS Objects/fileobject.c
In-Reply-To: <20080406231118.1A1961E400C@bag.python.org>
References: <20080406231118.1A1961E400C@bag.python.org>
Message-ID: <47FA40D9.5090806@cheimes.de>

gregory.p.smith wrote:
> Author: gregory.p.smith
> Date: Mon Apr 7 01:11:17 2008
> New Revision: 62195
>
> Modified:
>    python/trunk/Doc/c-api/file.rst
>    python/trunk/Include/fileobject.h
>    python/trunk/Lib/test/test_file.py
>    python/trunk/Misc/NEWS
>    python/trunk/Objects/fileobject.c
> Log:
> Make file objects as thread safe as the underlying libc FILE* implementation.
> close() will now raise an IOError if any operations on the file object
> are currently in progress in other threads.
>
> Most code was written by Antoine Pitrou (pitrou). Additional testing,
> documentation and test suite cleanup done by me (gregory.p.smith).
>
> Fixes issue 815646 and 595601 (as well as many other bugs and
> references to this problem dating back to the dawn of Python).

How much of the code needs to go into Python 3000? Python 3000 exposes only file descriptors and not wrapped FILE*. It should be safe without the patch, shouldn't it?
Christian

From guido at python.org Mon Apr 7 19:21:43 2008
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Apr 2008 10:21:43 -0700
Subject: [Python-3000] r62195 - in python/trunk: Doc/c-api/file.rst Include/fileobject.h Lib/test/test_file.py Misc/NEWS Objects/fileobject.c
In-Reply-To: <47FA40D9.5090806@cheimes.de>
References: <20080406231118.1A1961E400C@bag.python.org> <47FA40D9.5090806@cheimes.de>
Message-ID:

Right, this doesn't apply to py3k files at all. (Not that py3k files are all that thread-safe. :-)

On Mon, Apr 7, 2008 at 8:42 AM, Christian Heimes wrote:
> gregory.p.smith wrote:
>
> > Author: gregory.p.smith
> > Date: Mon Apr 7 01:11:17 2008
> > New Revision: 62195
> >
> > Modified:
> >    python/trunk/Doc/c-api/file.rst
> >    python/trunk/Include/fileobject.h
> >    python/trunk/Lib/test/test_file.py
> >    python/trunk/Misc/NEWS
> >    python/trunk/Objects/fileobject.c
> > Log:
> > Make file objects as thread safe as the underlying libc FILE* implementation.
> > close() will now raise an IOError if any operations on the file object
> > are currently in progress in other threads.
> >
> > Most code was written by Antoine Pitrou (pitrou). Additional testing,
> > documentation and test suite cleanup done by me (gregory.p.smith).
> >
> > Fixes issue 815646 and 595601 (as well as many other bugs and
> > references to this problem dating back to the dawn of Python).
>
> How much of the code needs to go into Python 3000? Python 3000 exposes
> only file descriptors and not wrapped FILE*. It should be safe without
> the patch, shouldn't it?
>
> Christian

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Mon Apr 7 19:27:41 2008
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Apr 2008 10:27:41 -0700
Subject: [Python-3000] Types and classes
In-Reply-To:
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de>
Message-ID:

> Could not find this in tracker, so http://bugs.python.org/issue2565

And thanks to Martin for making it so.

As a follow-up, what do people think of making the str() of a class return just the thing between '...' in the repr()? This is much shorter and in many cases enough.

(This actually inverts the rule that the repr() ought to resemble an expression and the str() could be anything convenient, but I'd rather see the repr() be unambiguous.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tjreedy at udel.edu Mon Apr 7 20:11:29 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 7 Apr 2008 14:11:29 -0400
Subject: [Python-3000] Types and classes
References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de>
Message-ID:

"Guido van Rossum" wrote in message news:ca471dc20804071027h3f138cd7m345ed196c32ab0c7 at mail.gmail.com...
|> Could not find this in tracker, so http://bugs.python.org/issue2565
|
| And thanks to Martin for making it so.
|
| As a follow-up, what do people think of making the str() of a class
| return just the thing between '...' in the repr()? This is much
| shorter and in many cases enough.

As in
>>> print(type(3))
int # instead of ?

Looks good to me.

| (This actually inverts the rule that the repr() ought to resemble an
| expression and the str() could be anything convenient, but I'd rather
| see the repr() be unambiguous.)

From musiccomposition at gmail.com Mon Apr 7 23:10:01 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Mon, 7 Apr 2008 16:10:01 -0500
Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module
Message-ID: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com>

After a long conversation on the stdlib-sig list, I'd like to bring this before you. For those of you not on the peps mailing list, Guido has expressed lukewarmness (well -0.5) to the idea. However, I'd still like your comments on my first PEP.

PEP: XXX
Title: Cleaning out sys and the "interpreter" module
Version: $Revision$
Last-Modified: $Date$
Author: Benjamin Peterson
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-April-2008
Python-Version: 3.0

Abstract
========

This PEP proposes a new low-level module for CPython-specific interpreter functions in order to clean out the sys module and separate general Python functionality from implementation details.

Rationale
=========

The sys module currently contains functions and data that can be put into two major groups:

1. Data and functions that are available in all Python implementations and deal with the general running of a Python virtual machine.
   - argv
   - byteorder
   - path, path_hooks, meta_path, path_importer_cache, and modules
   - copyright, hexversion, version, and version_info
   - displayhook, __displayhook__
   - excepthook, __excepthook__, exc_info, and exc_clear
   - exec_prefix and prefix
   - executable
   - exit
   - flags, py3kwarning, dont_write_bytecode, and warn_options
   - getfilesystemencoding
   - get/setprofile
   - get/settrace, call_tracing
   - getwindowsversion
   - maxint and maxunicode
   - platform
   - ps1 and ps2
   - stdin, stderr, stdout, __stdin__, __stderr__, __stdout__
   - tracebacklimit

2. Data and functions that affect the CPython interpreter.

   - get/setrecursionlimit
   - get/setcheckinterval
   - _getframe and _current_frames
   - getrefcount
   - get/setdlopenflags
   - settscdump
   - api_version
   - winver
   - dllhandle
   - float_info
   - _compact_freelists
   - _clear_type_cache
   - subversion
   - builtin_module_names
   - callstats
   - intern

The second collection of items has been steadily increasing over the years, causing clutter in sys. Guido has even said he doesn't recognize some of the things in it [#bug-1522]_!

Other implementations have clearly struggled with what to do about the contents of sys they can't implement but must provide to retain compatibility. For example, Jython's sys module has dud set/getrecursionlimit functions. Moving these items off to another module would send a clear message about what functions need and need not be implemented.

It has also been proposed that the contents of the types module be distributed across the standard library [#types-removal]_; the interpreter module would provide an excellent resting place for internal types like frames and code objects.

Specification
=============

A new builtin module named "interpreter" (see `Naming`_) will be added.
The second list of items above will be split into the stdlib as follows:

The interpreter module
   - get/setrecursionlimit
   - get/setcheckinterval
   - _getframe and _current_frames
   - get/setdlopenflags
   - settscdump
   - api_version
   - winver
   - dllhandle
   - float_info
   - _clear_type_cache
   - subversion
   - builtin_module_names
   - callstats
   - intern

The gc module:
   - getrefcount
   - _compact_freelists

Transition Plan
===============

Once implemented in 3.x, the interpreter module will be back-ported to 2.6. Py3k warnings will be added to the sys functions it replaces.

Open Issues
===========

What should move?
-----------------

dont_write_bytecode
^^^^^^^^^^^^^^^^^^^

Some believe that the writing of bytecode is an implementation detail and should be moved [#bytecode-issue]_. The counterargument is that all current, complete Python implementations do write some sort of bytecode, so it is valuable to be able to disable it. Also, if it is moved, some wish to put it in the imp module.

Move some to imp?
-----------------

It was noted that dont_write_bytecode or maybe builtin_module_names might fit nicely in the imp module.

Naming
------

The author proposes the name "interpreter" for the new module. "pyvm" has also been suggested [#pyvm-name]_. The name "cpython" was well liked [#cpython-name]_.

References
==========

.. [#bug-1522] http://bugs.python.org/issue1522

.. [#types-removal] http://mail.python.org/pipermail/stdlib-sig/2008-April/000172.html

.. [#bytecode-issue] http://mail.python.org/pipermail/stdlib-sig/2008-April/000217.html

.. [#pyvm-name] http://mail.python.org/pipermail/python-3000/2007-November/011351.html

.. [#cpython-name] http://mail.python.org/pipermail/stdlib-sig/2008-April/000223.html

Copyright
=========

This document has been placed in the public domain.
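The split the PEP describes can be probed from a running interpreter. The sketch below is an illustration only: the subset of names and the helper available_in_sys are hypothetical choices made for the example, it assumes a current Python 3 interpreter, and it does not use the proposed "interpreter" module.

```python
import sys

# Names taken from the PEP's second (CPython-specific) group. The subset and
# the helper below are illustrative, not part of the proposal.
CPYTHON_SPECIFIC = [
    "getrecursionlimit", "setrecursionlimit", "_getframe", "_current_frames",
    "getrefcount", "api_version", "float_info", "builtin_module_names", "intern",
]

def available_in_sys(names):
    """Map each name to whether the running interpreter's sys module provides it."""
    return {name: hasattr(sys, name) for name in names}

availability = available_in_sys(CPYTHON_SPECIFIC)
missing = sorted(name for name, present in availability.items() if not present)
```

Which names end up in missing depends on the implementation running the code, which is the kind of difference the PEP wants the module layout to make explicit.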
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

--
Cheers,
Benjamin Peterson

From ncoghlan at gmail.com Mon Apr 7 23:41:21 2008
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 08 Apr 2008 07:41:21 +1000
Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module
In-Reply-To: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com>
References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com>
Message-ID: <47FA9501.6080408@gmail.com>

Benjamin Peterson wrote:
> After a long conversation on the stdlib-sig list, I'd like to bring this
> before you. For those of you not on the peps mailing list, Guido has
> expressed lukewarmness (well -0.5) to the idea. However, I'd still like
> your comments on my first PEP.

+1 from me.

> The gc module:
> - getrefcount
> - _compact_freelists

These are very specific to CPython's style of garbage collection - they don't make sense in the context of something like the native GC in Jython or IronPython.

So -1 on moving these to gc - put them in the new interpreter-specific module along with everything else.

> Move to some to imp?
> --------------------
>
> It was noted that dont_write_bytecode or maybe builtin_module_names
> might fit
> nicely in the imp module.

I wouldn't bother moving these two - there is lots of import related stuff in sys already (path, path_hooks, meta_path, etc) and it isn't worth the hassle of trying to move all of it.

> Naming
> ------
>
> The author proposes the name "interpreter" for the new module. "pyvm"
> has also
> been suggested [#pyvm-name]_. The name "cpython" was well liked
> [#cpython-name]_.

'interpreter' seems unnecessarily long.
'_pyvm' would get my vote, although I'd also be fine with pyvm or cpython (it will be necessary to get Guido up to at least +0 just so he can choose the name!).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From guido at python.org Mon Apr 7 23:45:52 2008
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Apr 2008 14:45:52 -0700
Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module
In-Reply-To: <47FA9501.6080408@gmail.com>
References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com>
Message-ID:

-0.5 from me. For half of the names that the PEP proposes to move most users wouldn't be able to guess in which module to find them.

On Mon, Apr 7, 2008 at 2:41 PM, Nick Coghlan wrote:
> Benjamin Peterson wrote:
> > After a long conversation on the stdlib-sig list, I'd like to bring this
> > before you. For those of you not on the peps mailing list, Guido has
> > expressed lukewarmness (well -0.5) to the idea. However, I'd still like
> > your comments on my first PEP.
>
> +1 from me.
>
> > The gc module:
> > - getrefcount
> > - _compact_freelists
>
> These are very specific to CPython's style of garbage collection - they
> don't make sense in the context of something like the native GC in
> Jython or Ironpython.
>
> So -1 on moving these to gc - put them in the new interpreter-specific
> module along with everything else.
>
> > Move to some to imp?
> > --------------------
> >
> > It was noted that dont_write_bytecode or maybe builtin_module_names
> > might fit
> > nicely in the imp module.
>
> I wouldn't bother moving these two - there is lots of import related
> stuff in sys already (path, path_hooks, metapath, etc) and it isn't
> worth the hassle of trying to move all of it.
>
> > Naming
> > ------
> >
> > The author proposes the name "interpreter" for the new module.
> > "pyvm" has also
> > been suggested [#pyvm-name]_. The name "cpython" was well liked
> > [#cpython-name]_.
>
> 'interpreter' seems unnecessarily long. '_pyvm' would get my vote,
> although I'd also be fine with pyvm or cpython (it will be necessary to
> get Guido up to at least +0 just so he can choose the name!).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> ---------------------------------------------------------------
> http://www.boredomandlaziness.org

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From musiccomposition at gmail.com Tue Apr 8 00:04:13 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Mon, 7 Apr 2008 17:04:13 -0500
Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module
In-Reply-To:
References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com>
Message-ID: <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com>

On Mon, Apr 7, 2008 at 4:45 PM, Guido van Rossum wrote:
> -0.5 from me. For half of the names that the PEP proposes to move most
> users wouldn't be able to guess in which module to find them.

If they're in *one* (maybe two; we'll see.) other module, it'd be hard to guess where they are? At the top of the sys docs, we'll put "sys: Generic Python interpreter services. For CPython specific tools, see the cpython module" I don't see why people have to be able to "guess" where a given object is. (It should be reasonably placed, of course.)

> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

--
Cheers,
Benjamin Peterson
From fumanchu at aminus.org Tue Apr 8 00:10:10 2008
From: fumanchu at aminus.org (Robert Brewer)
Date: Mon, 7 Apr 2008 15:10:10 -0700
Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module
In-Reply-To:
References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com>
Message-ID:

Guido van Rossum wrote:
> Benjamin Peterson wrote:
> > After a long conversation on the stdlib-sig list, I'd like to
> > bring this before you. For those of you not on the peps mailing
> > list, Guido has expressed lukewarmness (well -0.5) to the idea.
> > However, I'd still like your comments on my first PEP.
>
> -0.5 from me. For half of the names that the PEP proposes to move most
> users wouldn't be able to guess in which module to find them.

Knowing which attributes are specific to a certain implementation is handy. I'd be equally happy if the Library Reference just included that in a standard way, in the same vein as the "Availability: Unix" declarations.

Robert Brewer
fumanchu at aminus.org

From musiccomposition at gmail.com Tue Apr 8 00:28:11 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Mon, 7 Apr 2008 17:28:11 -0500
Subject: [Python-3000] readinto annotation
In-Reply-To:
References: <1afaf6160804060646q592dc1cdy39b3d9e48f770201@mail.gmail.com> <52dc1c820804061700q6abf765fob38abdc9d6011b93@mail.gmail.com>
Message-ID: <1afaf6160804071528t22307177re4eb8088af259194@mail.gmail.com>

On Mon, Apr 7, 2008 at 10:11 AM, Guido van Rossum wrote:
> On Mon, Apr 7, 2008 at 6:11 AM, Antoine Pitrou wrote:
> > Gregory P. Smith <greg at krypto.org> writes:
> > >
> > > yes bytearray makes more sense to me given that it's hard to read into an
> > > immutable bytes object ;)
> >
> > It seems to me that readinto accepts any object providing a writeable buffer
> > interface. I don't know how to express that as an annotation, though.
>
> Don't worry too much about it. The annotation is just documentation
> anyway. I'd be fine with using bytearray as the annotation, and
> explaining in the docstring that other mutable bytes buffers are okay
> too.

Ok. I changed it in r62218.

> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

--
Cheers,
Benjamin Peterson

From guido at python.org Tue Apr 8 01:19:50 2008
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 Apr 2008 16:19:50 -0700
Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module
In-Reply-To: <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com>
References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com>
Message-ID:

On Mon, Apr 7, 2008 at 3:04 PM, Benjamin Peterson wrote:
> On Mon, Apr 7, 2008 at 4:45 PM, Guido van Rossum wrote:
> > -0.5 from me. For half of the names that the PEP proposes to move most
> > users wouldn't be able to guess in which module to find them.
> If they're in *one* (maybe two; we'll see.) other module, it'd be hard to
> guess where they are? At the top of the sys docs, we'll put "sys: Generic
> Python interpreter services. For CPython specific tools, see the cpython
> module" I don't see why people have to be able to "guess" where a given
> object is. (It should be reasonably placed, of course.)

Yes, it will be hard, because most CPython users have no idea what other Python implementations can or cannot do.

E.g.
i was surprised to learn that Jython doesn't support a recursion limit, or that frame objects are not universal (in fact I think *you* are mistaken there). OTOH I would guess that "executable" may not be meaningful in Jython, as you'd have to invoke the JVM first. Other examples: I'm not at all sure that all Python implementations should be expected to support tracing and profiling. And I don't get why builtin_module_names can't be universal. Enough examples; I hope my point is clear. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From musiccomposition at gmail.com Tue Apr 8 02:08:00 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Mon, 7 Apr 2008 19:08:00 -0500 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> Message-ID: <1afaf6160804071708l6a75bcd1u5a99a730d1dc7816@mail.gmail.com> On Mon, Apr 7, 2008 at 6:19 PM, Guido van Rossum wrote: > > On Mon, Apr 7, 2008 at 3:04 PM, Benjamin Peterson > wrote: > > On Mon, Apr 7, 2008 at 4:45 PM, Guido van Rossum wrote: > > > -0.5 from me. For half of the names that the PEP proposes to move most > > > users wouldn't be able to guess in which module to find them. > > > If they're in *one* (maybe two; we'll see.) other module, it'd be hard to > > guess where they are? At the top of the sys docs, we'll put "sys: Generic > > Python interpreter services. For CPython specific tools, see the cpython > > module" I don't see why people have to be able to "guess" where a given > > object is. (It should be reasonably placed, of course.) > > Yes, it will be hard, because most CPython users have no idea what > other Python implementations can or cannot do. > > E.g. 
i was surprised to learn that Jython doesn't support a recursion > limit, or that frame objects are not universal (in fact I think *you* > are mistaken there). On further examination, I see that you win on both counts. Jython does support _getframe and recursion limits (although, I can't seem to get it to work). > > > OTOH I would guess that "executable" may not be meaningful in Jython, > as you'd have to invoke the JVM first. Other examples: I'm not at all > sure that all Python implementations should be expected to support > tracing and profiling. And I don't get why builtin_module_names can't > be universal. executable in Jython is an empty string. I don't see how tracing and profiling aren't "universal" when frames are. I put builtin_module_names in interpreter because not all implementations would have the concept of "compiled" in. However, I can see both ways. > > > Enough examples; I hope my point is clear. Crystal. > > > -- > > > > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- Cheers, Benjamin Peterson From guido at python.org Tue Apr 8 02:09:01 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Apr 2008 17:09:01 -0700 Subject: [Python-3000] python-safethread project status In-Reply-To: <47E062A3.4030702@canterbury.ac.nz> References: <1205867606.31138.11.camel@qrnik> <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> Message-ID: [catching up on old threads] On Tue, Mar 18, 2008 at 5:47 PM, Greg Ewing wrote: > Adam Olsen wrote: > > I'd tend to assume only *purely* functional languages should have > > asynchronous interrupts. Any imperative language with them is > > suspect. > > Yet there are situations where *not* having any such thing > can be extremely inconvenient. > > If I'm performing some background calculation that only > munges on its own data, and doesn't touch anything shared, > it's quite safe to kill it at any point and throw away > everything it was working on. 
Maybe it should be a forked subprocess then, if it doesn't touch anything shared? > Being unable to do that from outside means that I have > to sprinkle explicit tests through it for an abort flag, > which is a horrible thing to have to do from a software > engineering standpoint for many reasons. > > In the consenting-adults environment of Python, I don't > like having a useful facility withheld from me just > because it's possible to misuse it. Huh? We do that all the time. We won't let you control when memory is deallocated. We won't let you call __hash__ when you've overridden __eq__ but not __hash__; there are zillions of examples. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Apr 8 03:50:47 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 08 Apr 2008 13:50:47 +1200 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de> Message-ID: <47FACF77.9040401@canterbury.ac.nz> Guido van Rossum wrote: > what do people think of making the str() of a class > return just the thing between '...' in the repr()? Are you talking about the class itself, or instances of the class? If the latter, I'm not sure I like that idea. Very often I write things like 'print "foo =", foo' as debugging statements, relying on the fact that str(foo) will give me a repr, since it doesn't have a __str__ of its own. If that changes, I'll have to print repr(foo) or "%r" % foo a lot more often instead, which would be tedious.
-- Greg From greg.ewing at canterbury.ac.nz Tue Apr 8 03:53:06 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 08 Apr 2008 13:53:06 +1200 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de> Message-ID: <47FAD002.8080306@canterbury.ac.nz> Terry Reedy wrote: > As in > >>>>print(type(3)) > > int # instead of I have the same feeling there -- the only time I'm likely to be deliberately printing a class is for debugging, and then I want unambiguity. -- Greg From greg.ewing at canterbury.ac.nz Tue Apr 8 04:40:36 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 08 Apr 2008 14:40:36 +1200 Subject: [Python-3000] python-safethread project status In-Reply-To: References: <1205867606.31138.11.camel@qrnik> <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> Message-ID: <47FADB24.20007@canterbury.ac.nz> Guido van Rossum wrote: > Maybe it should be a forked subprocess then, if it doesn't touch > anything shared? It might be taking and returning large data structures that it would be tedious to transfer between processes. Pickling them might not be straightforward if they contain references to objects that you don't want to transfer, but you want to maintain the references. > Huh? We do that all the time. We won't let you control when memory is > deallocated. I hardly think that being able to kill threads is anywhere near as dangerous as being able to scribble all over memory. And I *can* actually do that if I really want, using ctypes. 
:-) -- Greg From rhamph at gmail.com Tue Apr 8 05:01:03 2008 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 7 Apr 2008 21:01:03 -0600 Subject: [Python-3000] python-safethread project status In-Reply-To: <47FADB24.20007@canterbury.ac.nz> References: <1205867606.31138.11.camel@qrnik> <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> <47FADB24.20007@canterbury.ac.nz> Message-ID: On Mon, Apr 7, 2008 at 8:40 PM, Greg Ewing wrote: > Guido van Rossum wrote: > > Huh? We do that all the time. We won't let you control when memory is > > deallocated. > > I hardly think that being able to kill threads is > anywhere near as dangerous as being able to scribble > all over memory. And I *can* actually do that if I > really want, using ctypes. :-) Killing threads at arbitrary points really is that dangerous. You can do it too, if you know the right APIs to access using ctypes. I'd love a magic solution to cleanly exiting a thread, but I don't think one is possible. You need some way to contain the insanity. Using a process is one. Using a side-effect-free language is another. Cancellation is a third option, and what I think will be the most convenient for python. -- Adam Olsen, aka Rhamphoryncus From tjreedy at udel.edu Tue Apr 8 05:04:12 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 7 Apr 2008 23:04:12 -0400 Subject: [Python-3000] Types and classes References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com><1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com><1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com><1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com><47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> Message-ID: "Greg Ewing" wrote in message news:47FAD002.8080306 at canterbury.ac.nz... 
| Terry Reedy wrote: | | > As in | > | >>>>print(type(3)) | > | > int # instead of | | I have the same feeling there -- the only time I'm | likely to be deliberately printing a class is for | debugging, and then I want unambiguity. Unfortunately, *any* text printed for any object *could* have been the value of a string object. With str(), ambiguity is rife: >>> a = '1' >>> b = 1 >>> print(a,b) 1 1 So if you want unambiguity for debugging, repr() is better since strings and only strings are surrounded by quotes. >>> print(repr(a),repr(b)) '1' 1 Guido only suggested the possibility of a more-friendly abbreviation for str, not for repr. When one calls type(x), one *knows* the answer is a class, so the boilerplate template is often redundant and unnecessary. Consider the current >>> print('Expected', type(a), '; got', type(b)) Expected <type 'str'> ; got <type 'int'> -or future- Expected <class 'str'> ; got <class 'int'> I would like to have the option of getting more normal looking text like Expected str ; got int without having to parse away the added boilerplate. If I wanted 'class' in the output, I might prefer to put it in the strings to get Expected class str ; got class int without the brackets. Should print() have an option to convert with repr instead of str? Terry Jan Reedy From guido at python.org Tue Apr 8 06:40:16 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Apr 2008 21:40:16 -0700 Subject: [Python-3000] Types and classes In-Reply-To: <47FACF77.9040401@canterbury.ac.nz> References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FACF77.9040401@canterbury.ac.nz> Message-ID: On Mon, Apr 7, 2008 at 6:50 PM, Greg Ewing wrote: > Guido van Rossum wrote: > > what do people think of making the str() of a class > > return just the thing between '...' in the repr()?
> > Are you talking about the class itself, or instances > of the class? No, the class itself. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Apr 8 06:41:55 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Apr 2008 21:41:55 -0700 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> Message-ID: On Mon, Apr 7, 2008 at 8:04 PM, Terry Reedy wrote: > Should print() have an option to convert with repr instead of str? I don't think so -- just write the repr() call. Or write print(*map(repr, (a, b, c))) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jyasskin at gmail.com Tue Apr 8 07:43:46 2008 From: jyasskin at gmail.com (Jeffrey Yasskin) Date: Mon, 7 Apr 2008 22:43:46 -0700 Subject: [Python-3000] python-safethread project status In-Reply-To: <47FADB24.20007@canterbury.ac.nz> References: <1205867606.31138.11.camel@qrnik> <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> <47FADB24.20007@canterbury.ac.nz> Message-ID: <5d44f72f0804072243t4f62b05ala0027f1731abf820@mail.gmail.com> On Mon, Apr 7, 2008 at 7:40 PM, Greg Ewing wrote: > Guido van Rossum wrote: > > > Maybe it should be a forked subprocess then, if it doesn't touch > > anything shared? > > It might be taking and returning large data structures > that it would be tedious to transfer between processes. > Pickling them might not be straightforward if they > contain references to objects that you don't want to > transfer, but you want to maintain the references. > > > > Huh? We do that all the time. We won't let you control when memory is > > deallocated. > > I hardly think that being able to kill threads is > anywhere near as dangerous as being able to scribble > all over memory. 
And I *can* actually do that if I > really want, using ctypes. :-) I see three levels of thread interruption. First, you might want to poke a thread just to wake up a single system call, but the thread might get back to work afterwards. This resembles Java's Thread.interrupt. Second, you might want to cancel the thread, but only in ways that let the user clean up afterward. This is vaguely like pthread_cancel, or like Thread.interrupt with no way to clear the interrupted status. Third, you might want to really abort the thread, like Java's Thread.stop. There are uses for all three levels, but for a first implementation, I think we should pick just one. Because aborting is unsafe in most situations, it's out. And I vaguely remember Josh Block saying that if he had to do Java over again, he'd make it impossible to clear a Thread's interrupted status, turning it into cancellation, but I need to check with him to make sure I got that right. I'm not opposed in theory to providing the really violent option in, say, the version after we provide plain cooperative cancellation, but: 1) Any given library can simulate it by calling threading.cancellation_point() (or whatever its name turns out to be) occasionally within its inner loop, and 2) Judging from other systems with violent interruptions like posix signals and Haskell asynchronous exceptions, we'll need a way of blocking aborts in a scope, and unblocking them in a sub-scope. 
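[Editorial sketch, not part of the original message.] The cooperative cancellation described in point 1 can be approximated today with a shared flag that the worker polls inside its inner loop. This is only a minimal illustration under stated assumptions: `threading.cancellation_point()` did not exist when this was written, so the `cancellation_point` helper and the `Cancelled` exception below are hypothetical names, and a plain `threading.Event` stands in for whatever mechanism the real API would use.

```python
import threading


class Cancelled(Exception):
    """Hypothetical exception raised when a cancellation request is seen."""


def cancellation_point(flag):
    # Hypothetical helper: raise only at a point the worker chose itself,
    # so the worker is never interrupted in the middle of an operation.
    if flag.is_set():
        raise Cancelled()


def worker(flag, started, log):
    try:
        started.set()
        while True:
            cancellation_point(flag)  # polled once per loop iteration
            flag.wait(0.01)           # stand-in for one unit of real work
    except Cancelled:
        log.append("cancelled")
    finally:
        log.append("cleaned up")      # cleanup always gets a chance to run


flag = threading.Event()
started = threading.Event()
log = []
t = threading.Thread(target=worker, args=(flag, started, log))
t.start()
started.wait()      # make sure the worker is inside its loop
flag.set()          # request cancellation
t.join(timeout=5)
```

The worker only stops at points it has chosen, which is what distinguishes this level from the "violent" abort: a loop that never reaches a cancellation point cannot be stopped this way.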
-- Namasté, Jeffrey Yasskin From guido at python.org Tue Apr 8 07:58:18 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 7 Apr 2008 22:58:18 -0700 Subject: [Python-3000] python-safethread project status In-Reply-To: <5d44f72f0804072243t4f62b05ala0027f1731abf820@mail.gmail.com> References: <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> <47FADB24.20007@canterbury.ac.nz> <5d44f72f0804072243t4f62b05ala0027f1731abf820@mail.gmail.com> Message-ID: We have a way to raise an exception in a thread asynchronously, *but* we don't have a way to interrupt either system calls or blocked lock acquisitions. I suppose that system calls can be made interruptible with suitable tweaking of various signal-related settings (at least on Unix -- and I expect Windows has an equivalent). But I don't know if mutex acquisitions are interruptible. --Guido On Mon, Apr 7, 2008 at 10:43 PM, Jeffrey Yasskin wrote: > On Mon, Apr 7, 2008 at 7:40 PM, Greg Ewing wrote: > > Guido van Rossum wrote: > > > > > Maybe it should be a forked subprocess then, if it doesn't touch > > > anything shared? > > > > It might be taking and returning large data structures > > that it would be tedious to transfer between processes. > > Pickling them might not be straightforward if they > > contain references to objects that you don't want to > > transfer, but you want to maintain the references. > > > > > > > Huh? We do that all the time. We won't let you control when memory is > > > deallocated. > > > > I hardly think that being able to kill threads is > > anywhere near as dangerous as being able to scribble > > all over memory. And I *can* actually do that if I > > really want, using ctypes. :-) > > I see three levels of thread interruption. First, you might want to > poke a thread just to wake up a single system call, but the thread > might get back to work afterwards. This resembles Java's > Thread.interrupt.
Second, you might want to cancel the thread, but > only in ways that let the user clean up afterward. This is vaguely > like pthread_cancel, or like Thread.interrupt with no way to clear the > interrupted status. Third, you might want to really abort the thread, > like Java's Thread.stop. > > There are uses for all three levels, but for a first implementation, I > think we should pick just one. Because aborting is unsafe in most > situations, it's out. And I vaguely remember Josh Block saying that if > he had to do Java over again, he'd make it impossible to clear a > Thread's interrupted status, turning it into cancellation, but I need > to check with him to make sure I got that right. > > I'm not opposed in theory to providing the really violent option in, > say, the version after we provide plain cooperative cancellation, but: > 1) Any given library can simulate it by calling > threading.cancellation_point() (or whatever its name turns out to be) > occasionally within its inner loop, and > 2) Judging from other systems with violent interruptions like posix > signals and Haskell asynchronous exceptions, we'll need a way of > blocking aborts in a scope, and unblocking them in a sub-scope. 
> > -- > Namasté, > Jeffrey Yasskin > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Tue Apr 8 08:21:10 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 8 Apr 2008 00:21:10 -0600 Subject: [Python-3000] python-safethread project status In-Reply-To: References: <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> <47FADB24.20007@canterbury.ac.nz> <5d44f72f0804072243t4f62b05ala0027f1731abf820@mail.gmail.com> Message-ID: On Mon, Apr 7, 2008 at 11:58 PM, Guido van Rossum wrote: > We have a way to raise an exception in a thread asynchronously, *but* > we don't have a way to interrupt either system calls or blocked lock > acquisitions. I suppose that system calls can be made interruptible > with suitable tweaking of various signal-related settings (at least on > Unix -- and I expect Windows has an equivalent). But I don't know if > mutex acquisitions are interruptible. I'm working on APIs for interrupting syscalls. I also have interruptible (I'm now calling it cancellable) conditions in my monitors, which'd make it quite easy to implement cancellable Lock objects. -- Adam Olsen, aka Rhamphoryncus From abpillai at gmail.com Tue Apr 8 14:12:55 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 8 Apr 2008 17:42:55 +0530 Subject: [Python-3000] Is this a bug ? Message-ID: <8548c5f30804080512i2b1a439dgd1d6da5f2a89a1c3@mail.gmail.com> While playing around with true & floor division in Py3k... Python 3.0a4+ (py3k:62126, Apr 3 2008, 16:28:40) [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 Type "help", "copyright", "credits" or "license" for more information.
>>> x=2+0j >>> y=3+0j >>> x / y (0.66666666666666663+0j) >>> x//y Traceback (most recent call last): File "", line 1, in TypeError: can't take floor of complex number. >>> x.__floordiv__(y) Traceback (most recent call last): File "", line 1, in TypeError: can't take floor of complex number. >>> x.__divmod__(y) Traceback (most recent call last): File "", line 1, in TypeError: can't take floor or mod of complex number. In Python2.5, [anand at localhost py3k]$ python2.5 Python 2.5.1 (r251:54863, Sep 6 2007, 17:27:08) [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> x=2+0j >>> y=3+0j >>> x/y (0.66666666666666663+0j) >>> x//y __main__:1: DeprecationWarning: complex divmod(), // and % are deprecated 0j >>> x.__floordiv__(y) 0j >>> x.__divmod__(y) (0j, (2+0j)) Shouldn't Py3k also return 0j for floor division ? If it does not want to do floor division/divmod for complex numbers, shouldn't the exception error be more descriptive ? Or is this the expected behavior ? Thanks -- -Anand From abpillai at gmail.com Tue Apr 8 14:28:55 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 8 Apr 2008 17:58:55 +0530 Subject: [Python-3000] Bug in pickling range objects ? Message-ID: <8548c5f30804080528g79994875h1c04625a4d5db39@mail.gmail.com> Found this behavior in py3k, a4... Python 3.0a4+ (py3k:62126, Apr 3 2008, 16:28:40) [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> r=range(10) >>> import pickle >>> pickle.dumps(r) b'\x80\x03cbuiltins\nrange\nq\x00)\x81q\x01.' 
>>> pickle.loads(pickle.dumps(r)) Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python3.0/pickle.py", line 1341, in loads return Unpickler(file).load() File "/usr/local/lib/python3.0/pickle.py", line 823, in load dispatch[key[0]](self) File "/usr/local/lib/python3.0/pickle.py", line 1055, in load_newobj obj = cls.__new__(cls, *args) TypeError: range expected 1 arguments, got 0 >>> Looks like a bug in unpickling range objects. Should I report this ? Thanks -- -Anand From abpillai at gmail.com Tue Apr 8 14:40:21 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 8 Apr 2008 18:10:21 +0530 Subject: [Python-3000] Bug in pickling range objects ? In-Reply-To: <8548c5f30804080528g79994875h1c04625a4d5db39@mail.gmail.com> References: <8548c5f30804080528g79994875h1c04625a4d5db39@mail.gmail.com> Message-ID: <8548c5f30804080540v1dbbc982p9b4cf9adefed01a6@mail.gmail.com> Issue created. Sorry if this list is not meant for posting bugs. Just trying to help out :) http://bugs.python.org/issue2582 Thanks --Anand On Tue, Apr 8, 2008 at 5:58 PM, Anand Balachandran Pillai wrote: > Found this behavior in py3k, a4... > > Python 3.0a4+ (py3k:62126, Apr 3 2008, 16:28:40) > [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> r=range(10) > >>> import pickle > >>> pickle.dumps(r) > b'\x80\x03cbuiltins\nrange\nq\x00)\x81q\x01.' > >>> pickle.loads(pickle.dumps(r)) > Traceback (most recent call last): > File "", line 1, in > File "/usr/local/lib/python3.0/pickle.py", line 1341, in loads > return Unpickler(file).load() > File "/usr/local/lib/python3.0/pickle.py", line 823, in load > dispatch[key[0]](self) > File "/usr/local/lib/python3.0/pickle.py", line 1055, in load_newobj > obj = cls.__new__(cls, *args) > TypeError: range expected 1 arguments, got 0 > >>> > > Looks like a bug in unpickling range objects. Should I report this ? 
> > Thanks > -- > -Anand > -- -Anand From amauryfa at gmail.com Tue Apr 8 15:20:33 2008 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 8 Apr 2008 15:20:33 +0200 Subject: [Python-3000] Is this a bug ? In-Reply-To: <8548c5f30804080512i2b1a439dgd1d6da5f2a89a1c3@mail.gmail.com> References: <8548c5f30804080512i2b1a439dgd1d6da5f2a89a1c3@mail.gmail.com> Message-ID: Hello, Anand Balachandran Pillai wrote: > While playing around with true & floor division in Py3k... > > Python 3.0a4+ (py3k:62126, Apr 3 2008, 16:28:40) > [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> x=2+0j > >>> y=3+0j > >>> x / y > (0.66666666666666663+0j) > >>> x//y > Traceback (most recent call last): > File "", line 1, in > TypeError: can't take floor of complex number. > >>> x.__floordiv__(y) > Traceback (most recent call last): > File "", line 1, in > TypeError: can't take floor of complex number. > >>> x.__divmod__(y) > Traceback (most recent call last): > File "", line 1, in > TypeError: can't take floor or mod of complex number. > > In Python2.5, > > [anand at localhost py3k]$ python2.5 > Python 2.5.1 (r251:54863, Sep 6 2007, 17:27:08) > [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> x=2+0j > >>> y=3+0j > >>> x/y > (0.66666666666666663+0j) > >>> x//y > __main__:1: DeprecationWarning: complex divmod(), // and % are deprecated > 0j > >>> x.__floordiv__(y) > 0j > >>> x.__divmod__(y) > (0j, (2+0j)) > > Shouldn't Py3k also return 0j for floor division ? If it does not want to do > floor division/divmod for complex numbers, shouldn't the exception > error be more descriptive ? Or is this the expected behavior ? Yes, the DeprecationWarning has turned into a real error. This is the normal evolution of python 3.0. Then, I find the message quite descriptive: >>> divmod(x,y) TypeError: can't take floor or mod of complex number. 
What message would you want in this case? -- Amaury Forgeot d'Arc From abpillai at gmail.com Tue Apr 8 15:25:18 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 8 Apr 2008 18:55:18 +0530 Subject: [Python-3000] Equality of range objects Message-ID: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> Hi, There seems to be inconsistency in the way the new range(...) type implements equality and inequality operators. In Python 2.x, range(...) of course returns lists and when you equate lhs of two range(...) functions over the same range, you get True, since we are comparing equal lists. Python 2.5.1 (r251:54863, Sep 6 2007, 17:27:08) [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> range(5,10)==range(5,10) True >>> In Py3k, however I see the following behavior. Python 3.0a4+ (py3k:62126, Apr 3 2008, 16:28:40) [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> range(5,10)==range(5,10) False >>> r1=range(5,10) >>> r2=range(5,10) >>> r1==r2 False >>> r1 != r2 True Won't this be quite confusing for people who carry forward their code from 2.x to 3.0 ? Though the range(...) is no longer a function, but a type, the semantics should not change so much that two range objects over the same range cannot be equated. It seems __eq__ is not implemented for range. >>> r1.__eq__(r2) NotImplemented Perhaps this is the problem ? I could not find much documentation on the range type, so posting the question here. Thanks --Anand -- -Anand From abpillai at gmail.com Tue Apr 8 15:34:22 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 8 Apr 2008 19:04:22 +0530 Subject: [Python-3000] Is this a bug ? 
In-Reply-To: References: <8548c5f30804080512i2b1a439dgd1d6da5f2a89a1c3@mail.gmail.com> Message-ID: <8548c5f30804080634g4eed9c60kad96d9fb21fff5a6@mail.gmail.com> Hi, On Tue, Apr 8, 2008 at 6:50 PM, Amaury Forgeot d'Arc wrote: > Hello, > > > > Anand Balachandran Pillai wrote: > > While playing around with true & floor division in Py3k... > > > > Python 3.0a4+ (py3k:62126, Apr 3 2008, 16:28:40) > > [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. > > >>> x=2+0j > > >>> y=3+0j > > >>> x / y > > (0.66666666666666663+0j) > > >>> x//y > > Traceback (most recent call last): > > File "", line 1, in > > TypeError: can't take floor of complex number. > > >>> x.__floordiv__(y) > > Traceback (most recent call last): > > File "", line 1, in > > TypeError: can't take floor of complex number. > > >>> x.__divmod__(y) > > Traceback (most recent call last): > > File "", line 1, in > > TypeError: can't take floor or mod of complex number. > > > > In Python2.5, > > > > [anand at localhost py3k]$ python2.5 > > Python 2.5.1 (r251:54863, Sep 6 2007, 17:27:08) > > [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. > > >>> x=2+0j > > >>> y=3+0j > > >>> x/y > > (0.66666666666666663+0j) > > >>> x//y > > __main__:1: DeprecationWarning: complex divmod(), // and % are deprecated > > 0j > > >>> x.__floordiv__(y) > > 0j > > >>> x.__divmod__(y) > > (0j, (2+0j)) > > > > Shouldn't Py3k also return 0j for floor division ? If it does not want to do > > floor division/divmod for complex numbers, shouldn't the exception > > error be more descriptive ? Or is this the expected behavior ? > > Yes, the DeprecationWarning has turned into a real error. > This is the normal evolution of python 3.0. Thanks for the clarification. > > Then, I find the message quite descriptive: > >>> divmod(x,y) > > TypeError: can't take floor or mod of complex number. 
> What message would you want in this case? The message "can't take floor..." is slightly confusing since it could mean the floor or mod cannot be taken in this context, instead of conveying the (correct) information that this operation is invalid in any context. A message like "can't convert complex to float" or "Invalid operation, can't perform floor or mod on complex number", would be more informative. > > -- > Amaury Forgeot d'Arc > Thanks -- -Anand From amauryfa at gmail.com Tue Apr 8 15:35:04 2008 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 8 Apr 2008 15:35:04 +0200 Subject: [Python-3000] Bug in pickling range objects ? In-Reply-To: <8548c5f30804080540v1dbbc982p9b4cf9adefed01a6@mail.gmail.com> References: <8548c5f30804080528g79994875h1c04625a4d5db39@mail.gmail.com> <8548c5f30804080540v1dbbc982p9b4cf9adefed01a6@mail.gmail.com> Message-ID: Anand Balachandran Pillai wrote: > Issue created. Sorry if this list is not meant for posting bugs. > Just trying to help out :) > > http://bugs.python.org/issue2582 Most core developers also subscribe to a python-bugs-list mailing list, and receive any new issue. Thanks for helping, -- Amaury Forgeot d'Arc From steven.bethard at gmail.com Tue Apr 8 18:39:03 2008 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 8 Apr 2008 10:39:03 -0600 Subject: [Python-3000] Equality of range objects In-Reply-To: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 7:25 AM, Anand Balachandran Pillai wrote: > Hi, > > There seems to be inconsistency in the way the new range(...) > type implements equality and inequality operators. > > In Python 2.x, range(...) of course returns lists and when you > equate lhs of two range(...) functions over the same range, you > get True, since we are comparing equal lists. 
> > Python 2.5.1 (r251:54863, Sep 6 2007, 17:27:08) > [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> range(5,10)==range(5,10) > True > >>> > > In Py3k, however I see the following behavior. > Python 3.0a4+ (py3k:62126, Apr 3 2008, 16:28:40) > [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> range(5,10)==range(5,10) > False > >>> r1=range(5,10) > >>> r2=range(5,10) > >>> r1==r2 > False > >>> r1 != r2 > True > > Won't this be quite confusing for people who carry forward their > code from 2.x to 3.0 ? People carrying code from 2.x to 3.0 should be using xrange, not range:: ActivePython 2.5.1.1 (ActiveState Software Inc.) based on Python 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> xrange(5, 10) == xrange(5, 10) False Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From ncoghlan at gmail.com Tue Apr 8 18:45:54 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 09 Apr 2008 02:45:54 +1000 Subject: [Python-3000] Method to populate tp_* slots via getattr()? In-Reply-To: References: <47F38A71.1020803@gmail.com> <47F4CEA5.8070200@gmail.com> Message-ID: <47FBA142.6090006@gmail.com> Guido van Rossum wrote: > I'll wait for others to jump on this bandwagon... IMO the tempfile > object would be better off not to bother with caching at all... I may have found a slightly more convincing example after spending a fairly enlightening evening browsing through the source code for weakref.proxy. The way that code works is to define every slot, delegating to the proxied object to handle each call (wrapping and unwrapping the proxied object as needed). 
This is normally transparent to the user due to the fact that __getattribute__ is one of the proxied methods (and at the C level, the delegated slot invocations return NotImplemented or set the appropriate exceptions). The only way it shows through is the fact that operator.isNumberType and operator.isMappingType will always return True for the proxy instance, and operator.isSequenceType will always return False - this is due to the proxy type filling in the number and mapping slots, but not the sequence slots. (Are isMappingType, isNumberType and isSequenceType slated for the chopping block in the stdlib reorg? If they aren't yet, they probably should be)

The weakref.proxy function actually goes to some additional effort to get callable() to return the right answer by using a different type (one with an empty tp_call slot) when the object being weak referenced doesn't define __call__ (the two separate weakref proxy types actually still exist in Py3k, despite the removal of callable).

However, all this prompted me to try an experiment (Python 2.5.1), and the results didn't fill me with confidence regarding the approach of expecting 3rd party developers to explicitly delegate all of the special methods themselves:

  >>> class Demo:
  ...     def __index__(self):
  ...         return 1
  ...
  >>> a = Demo()
  >>> b = weakref.proxy(a)
  >>> operator.index(a)
  1
  >>> operator.index(b)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: 'weakproxy' object cannot be interpreted as an index

Oops, we didn't even catch that missing delegation in our *own* proxy implementation when __index__ was added, let alone anyone else's.
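[Editorial sketch] The missing-delegation problem described above can be reproduced in pure Python on a current Python 3, where special methods are looked up on the type rather than through __getattr__. The class names GetattrProxy and ExplicitProxy and the helper index_or_error are hypothetical and exist only for this illustration; this is not the weakref.proxy implementation.

```python
import operator


class Demo:
    def __index__(self):
        return 1


class GetattrProxy:
    """Hypothetical proxy that only forwards ordinary attribute lookups."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Called only when normal lookup on the proxy instance fails.
        return getattr(self._target, name)


class ExplicitProxy(GetattrProxy):
    """Same proxy, plus one explicitly delegated special method."""

    def __index__(self):
        return operator.index(self._target)


def index_or_error(obj):
    """Return operator.index(obj), or the TypeError message if it fails."""
    try:
        return operator.index(obj)
    except TypeError as exc:
        return "TypeError: %s" % exc


# The __getattr__ hook is bypassed for the implicit special-method lookup.
print(index_or_error(GetattrProxy(Demo())))
# Explicit delegation on the proxy type makes the lookup succeed.
print(index_or_error(ExplicitProxy(Demo())))
```

Every special method the proxy should support needs the same explicit treatment as __index__ here, which is why a reference implementation to compare against would be useful.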
(I've now raised this missing delegation as issue 2592) With the 2.x approach of using a classic class to implement delegation via __getattr__ (since classic classes are special-cased everywhere to go through that hook, even from C code) being removed in Py3k, it would probably be a worthwhile project to take the weakref.proxy code and come up with an equivalent version that retained a strong reference to the original object and was able to be subclassed (weakref.proxy doesn't permit subclasses). I'd expect including a pure Python version of this would be better than writing it in C: - It will be a lot easier to write and maintain - It will still be faster than using a classic class would have been - Being written in Python allows it to be a pure mixin class that won't restrict a subclass's ability to inherit from a class written in C - By having a full proxy implementation in the standard library, third party developers can check that their own proxy implementations are explicitly delegating at least the same range of special methods as the standard library implementation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From brett at python.org Tue Apr 8 21:38:57 2008 From: brett at python.org (Brett Cannon) Date: Tue, 8 Apr 2008 12:38:57 -0700 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> Message-ID: On Mon, Apr 7, 2008 at 4:19 PM, Guido van Rossum wrote: > On Mon, Apr 7, 2008 at 3:04 PM, Benjamin Peterson > wrote: > > On Mon, Apr 7, 2008 at 4:45 PM, Guido van Rossum wrote: > > > -0.5 from me. For half of the names that the PEP proposes to move most > > > users wouldn't be able to guess in which module to find them. 
> > > If they're in *one* (maybe two; we'll see.) other module, it'd be hard to > > guess where they are? At the top of the sys docs, we'll put "sys: Generic > > Python interpreter services. For CPython specific tools, see the cpython > > module" I don't see why people have to be able to "guess" where a given > > object is. (It should be reasonably placed, of course.) > > Yes, it will be hard, because most CPython users have no idea what > other Python implementations can or cannot do. > > E.g. i was surprised to learn that Jython doesn't support a recursion > limit, or that frame objects are not universal (in fact I think *you* > are mistaken there). > But I am pretty sure IronPython does not support frames access. > OTOH I would guess that "executable" may not be meaningful in Jython, > as you'd have to invoke the JVM first. Other examples: I'm not at all > sure that all Python implementations should be expected to support > tracing and profiling. And I don't get why builtin_module_names can't > be universal. Perhaps we should start this with a discussion of what exactly other VMs are expected to implement? At the bare minimum this can be documented in the sys documentation even if no new module is created or certain attributes are moved. I will start a new thread for this. -Brett From guido at python.org Tue Apr 8 21:46:50 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Apr 2008 12:46:50 -0700 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 12:38 PM, Brett Cannon wrote: > But I am pretty sure IronPython does not support frames access. Really? I thought you had to pay for it (your code runs slower if they see you use it), but that they bent over backwards to provide it. 
> > OTOH I would guess that "executable" may not be meaningful in Jython, > > as you'd have to invoke the JVM first. Other examples: I'm not at all > > sure that all Python implementations should be expected to support > > tracing and profiling. And I don't get why builtin_module_names can't > > be universal. > > Perhaps we should start this with a discussion of what exactly other > VMs are expected to implement? At the bare minimum this can be > documented in the sys documentation even if no new module is created > or certain attributes are moved. Perhaps it may be better to first make an inventory of what they *do* provide? > I will start a new thread for this. Great! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From musiccomposition at gmail.com Tue Apr 8 22:27:15 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 8 Apr 2008 15:27:15 -0500 Subject: [Python-3000] Equality of range objects In-Reply-To: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> Message-ID: <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> Is there a reason this is not implemented, though? It's seems to me they should be equivalent. [snip] -- Cheers, Benjamin Peterson From musiccomposition at gmail.com Tue Apr 8 22:34:02 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 8 Apr 2008 15:34:02 -0500 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> Message-ID: <1afaf6160804081334u23aa154ahebfb5d16c279819@mail.gmail.com> On Mon, Apr 7, 2008 at 6:19 PM, Guido van Rossum wrote: > On Mon, Apr 7, 2008 at 3:04 PM, Benjamin Peterson > wrote: > > On Mon, Apr 7, 2008 at 4:45 PM, Guido van Rossum wrote: > > > -0.5 from me. 
For half of the names that the PEP proposes to move most > > > users wouldn't be able to guess in which module to find them. > > > If they're in *one* (maybe two; we'll see.) other module, it'd be hard to > > guess where they are? At the top of the sys docs, we'll put "sys: Generic > > Python interpreter services. For CPython specific tools, see the cpython > > module" I don't see why people have to be able to "guess" where a given > > object is. (It should be reasonably placed, of course.) > > Yes, it will be hard, because most CPython users have no idea what > other Python implementations can or cannot do. > > E.g. i was surprised to learn that Jython doesn't support a recursion > limit, or that frame objects are not universal (in fact I think *you* > are mistaken there). Another thought: Even if other implementations provide these functions, it doesn't really mean they are compatible. Allowing each implementation to have their own interpreter module can clear up confusion regarding how much they support what is returned. > > OTOH I would guess that "executable" may not be meaningful in Jython, > as you'd have to invoke the JVM first. Other examples: I'm not at all > sure that all Python implementations should be expected to support > tracing and profiling. And I don't get why builtin_module_names can't > be universal. > > Enough examples; I hope my point is clear. 
> > -- > > > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- Cheers, Benjamin Peterson From guido at python.org Tue Apr 8 23:07:37 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Apr 2008 14:07:37 -0700 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: <1afaf6160804081334u23aa154ahebfb5d16c279819@mail.gmail.com> References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> <1afaf6160804081334u23aa154ahebfb5d16c279819@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 1:34 PM, Benjamin Peterson > Another thought: Even if other implementations provide these > functions, it doesn't really mean they are compatible. Allowing each > implementation to have their own interpreter module can clear up > confusion regarding how much they support what is returned. That's not the Python spirit. The spirit is that *if* they support similar enough functionality the APIs should be named the same, in the same module, and have the same signature. E.g. the os module is built on this principle. Many APIs there are optional, but if they exist, they have a known name and spec. (The posix/nt underlying modules are implementation details that most users never need to know about.) 
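[Editorial sketch] The convention described above (optional APIs that, when present, live under a known name in a known module) lets callers feature-detect instead of guessing at an implementation-specific module. A minimal illustration, assuming a current Python 3; the helper name optional_os_api is hypothetical.

```python
import os


def optional_os_api(name):
    """Return os.<name> if this platform provides it, else None."""
    return getattr(os, name, None)


fork = optional_os_api("fork")
if fork is None:
    print("os.fork is not available on this platform")
else:
    print("os.fork is available under its usual name")
```

The caller never needs to know whether the function ultimately comes from posix, nt, or something else.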
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From musiccomposition at gmail.com Tue Apr 8 23:25:07 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 8 Apr 2008 16:25:07 -0500 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> <1afaf6160804081334u23aa154ahebfb5d16c279819@mail.gmail.com> Message-ID: <1afaf6160804081425h6c8ea47dhfd76fc736bafe1fe@mail.gmail.com> On Tue, Apr 8, 2008 at 4:07 PM, Guido van Rossum wrote: > On Tue, Apr 8, 2008 at 1:34 PM, Benjamin Peterson > > Another thought: Even if other > > implementations provide these > > functions, it doesn't really mean they are compatible. Allowing each > > implementation to have their own interpreter module can clear up > > confusion regarding how much they support what is returned. > > That's not the Python spirit. The spirit is that *if* they support > similar enough functionality the APIs should be named the same, in the > same module, and have the same signature. E.g. the os module is built > on this principle. Many APIs there are optional, but if they exist, > they have a known name and spec. (The posix/nt underlying modules are > implementation details that most users never need to know about.) You can't expect people to write the same implementation as you, though. Take an implementation (imaginary for the moment) that has a frame-like object, but is barred from exposing it because it doesn't have the same API as the CPython one. You could argue too that exposing an internal object with the ugly name _getframe is hardly pythonic to begin with. 
;) > > -- > > > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- Cheers, Benjamin Peterson From guido at python.org Tue Apr 8 23:29:14 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Apr 2008 14:29:14 -0700 Subject: [Python-3000] Equality of range objects In-Reply-To: <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 1:27 PM, Benjamin Peterson wrote: > Is there a reason this is not implemented, though? It's seems to me > they should be equivalent. Where's the use case? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From musiccomposition at gmail.com Tue Apr 8 23:34:27 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 8 Apr 2008 16:34:27 -0500 Subject: [Python-3000] Equality of range objects In-Reply-To: References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> Message-ID: <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> On Tue, Apr 8, 2008 at 4:29 PM, Guido van Rossum wrote: > On Tue, Apr 8, 2008 at 1:27 PM, Benjamin Peterson > wrote: > > Is there a reason this is not implemented, though? It's seems to me > > they should be equivalent. > > Where's the use case? Education. the range object describes a set of integers from one point to another, so to a new Python student having them not equivalent can't be helpful. 
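[Editorial sketch] Whatever == does for range objects in a given version, code that wants "same integers in the same order" can say so explicitly. A minimal illustration, assuming a current Python 3; the helper name same_sequence is hypothetical.

```python
def same_sequence(a, b):
    """Compare two ranges (or other sized iterables) element by element."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


print(same_sequence(range(5, 10), range(5, 10)))
print(same_sequence(range(5, 10), range(5, 11)))
```

This avoids building two full lists just to compare them, at the cost of a linear scan.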
> > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- Cheers, Benjamin Peterson From guido at python.org Tue Apr 8 23:28:22 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Apr 2008 14:28:22 -0700 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: <1afaf6160804081425h6c8ea47dhfd76fc736bafe1fe@mail.gmail.com> References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> <1afaf6160804081334u23aa154ahebfb5d16c279819@mail.gmail.com> <1afaf6160804081425h6c8ea47dhfd76fc736bafe1fe@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 2:25 PM, Benjamin Peterson wrote: > > On Tue, Apr 8, 2008 at 4:07 PM, Guido van Rossum wrote: > > On Tue, Apr 8, 2008 at 1:34 PM, Benjamin Peterson > > > Another thought: Even if other > > > > implementations provide these > > > functions, it doesn't really mean they are compatible. Allowing each > > > implementation to have their own interpreter module can clear up > > > confusion regarding how much they support what is returned. > > > > That's not the Python spirit. The spirit is that *if* they support > > similar enough functionality the APIs should be named the same, in the > > same module, and have the same signature. E.g. the os module is built > > on this principle. Many APIs there are optional, but if they exist, > > they have a known name and spec. (The posix/nt underlying modules are > > implementation details that most users never need to know about.) > You can't expect people to write the same implementation as you, > though. Take an implementation (imaginary for the moment) that has a > frame-like object, but is barred from exposing it because it doesn't > have the same API as the CPython one. You could argue too that > exposing an internal object with the ugly name _getframe is hardly > pythonic to begin with. ;) Eh? 
They provide compatible APIs using different implementations all the time. Really, this is getting exasperating. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Tue Apr 8 23:43:42 2008 From: brett at python.org (Brett Cannon) Date: Tue, 8 Apr 2008 14:43:42 -0700 Subject: [Python-3000] Equality of range objects In-Reply-To: <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 2:34 PM, Benjamin Peterson wrote: > On Tue, Apr 8, 2008 at 4:29 PM, Guido van Rossum wrote: > > On Tue, Apr 8, 2008 at 1:27 PM, Benjamin Peterson > > wrote: > > > Is there a reason this is not implemented, though? It's seems to me > > > they should be equivalent. > > > > Where's the use case? > Education. the range object describes a set of integers from one point > to another, so to a new Python student having them not equivalent > can't be helpful. That's not good enough. You could say that for almost anything. Plus it just becomes that much more code and feature-set to maintain. 
-Brett From musiccomposition at gmail.com Tue Apr 8 23:49:38 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 8 Apr 2008 16:49:38 -0500 Subject: [Python-3000] Equality of range objects In-Reply-To: References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> Message-ID: <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> On Tue, Apr 8, 2008 at 4:43 PM, Brett Cannon wrote: > On Tue, Apr 8, 2008 at 2:34 PM, Benjamin Peterson > > wrote: > > On Tue, Apr 8, 2008 at 4:29 PM, Guido van Rossum wrote: > > > On Tue, Apr 8, 2008 at 1:27 PM, Benjamin Peterson > > > wrote: > > > > Is there a reason this is not implemented, though? It's seems to me > > > > they should be equivalent. > > > > > > Where's the use case? > > Education. the range object describes a set of integers from one point > > to another, so to a new Python student having them not equivalent > > can't be helpful. > > That's not good enough. You could say that for almost anything. Plus > it just becomes that much more code and feature-set to maintain. range is one of the first functions introduced in teaching Python. 
How about this similar implemented behavior: >>> {"1":2}.keys() == {"1":2}.keys() True > > -Brett > -- Cheers, Benjamin Peterson From guido at python.org Wed Apr 9 00:18:37 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Apr 2008 15:18:37 -0700 Subject: [Python-3000] Equality of range objects In-Reply-To: <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 2:49 PM, Benjamin Peterson > range is one of the first functions introduced in teaching Python. That's only because educators were raised on Pascal for loops. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Wed Apr 9 00:46:16 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 09 Apr 2008 10:46:16 +1200 Subject: [Python-3000] python-safethread project status In-Reply-To: References: <1205867606.31138.11.camel@qrnik> <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> <47FADB24.20007@canterbury.ac.nz> Message-ID: <47FBF5B8.9060300@canterbury.ac.nz> Adam Olsen wrote: > Killing threads at arbitrary points really is that dangerous. I'm not talking about killing an arbitrary thread, but a particular thread that I've designed with the idea of killing it in mind. And I'm not really talking about killing it, either, just having a way of tapping it on the shoulder and getting its attention whatever it happens to be doing. I don't believe that's an outrageously unreasonable thing to want to be able to do. To be precise, what I have in mind is this: 1) A way of causing an exception to be raised asynchronously in another thread. 
2) Such exceptions would be automatically blocked in a finally clause. 3) There would be a way of explicitly blocking them around a section of code, e.g. using a context manager. 4) If it makes anyone feel any better, they could be blocked by default until explicitly enabled by the thread concerned. 5) When a thread dies, either: a) Any locks it is holding are automatically released, or b) An exception is raised in the main thread if it dies while holding any locks (since this indicates a programming error, i.e. the thread failed to clean up after itself when receiving an asynchronous exception). Can anyone point to a reason it would be difficult to write well-behaved threaded code in the presence of these features? -- Greg From greg.ewing at canterbury.ac.nz Wed Apr 9 00:55:53 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 09 Apr 2008 10:55:53 +1200 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> Message-ID: <47FBF7F9.4020608@canterbury.ac.nz> Terry Reedy wrote: > Unfortunately, *any* text printed for any object *could* have been the > value of a string object. That's true, but it's sufficiently unlikely that a string such as "" could have accidentally arisen some other way that I don't lose any sleep over it. If weird things seem to be happening in some particular case, I'll put a repr() in to find out exactly what's going on. Most of the time it's not needed, though. There's another reason it bothers me. If a string like "" turns up in otherwise normal output, it's a fairly clear indication that I've somehow ended up printing something that was never meant to be printed. 
Whereas if it just comes out as "foo", it could easily go unnoticed. There's something reassuring about the fact that things with no "obvious" textual representation stick out like a sore digit when you try to print them. I wouldn't like to lose that. -- Greg From facundobatista at gmail.com Wed Apr 9 01:49:13 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 8 Apr 2008 20:49:13 -0300 Subject: [Python-3000] Types and classes In-Reply-To: <47FBF7F9.4020608@canterbury.ac.nz> References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: 2008/4/8, Greg Ewing : > That's true, but it's sufficiently unlikely that a string > such as "" could have accidentally arisen some > other way that I don't lose any sleep over it. If weird > things seem to be happening in some particular case, I'll > put a repr() in to find out exactly what's going on. Most > of the time it's not needed, though. > > There's another reason it bothers me. If a string like > "" turns up in otherwise normal output, it's > a fairly clear indication that I've somehow ended up > printing something that was never meant to be printed. > Whereas if it just comes out as "foo", it could easily > go unnoticed. I'm with Greg here, but I'll put it in another way: I don't want repr() to be nice, I want it to be as explicit as possible. I want to be able to trust repr(), and never doubt of what it's showing to me. Regards, -- . 
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From rhamph at gmail.com Wed Apr 9 02:05:59 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 8 Apr 2008 18:05:59 -0600 Subject: [Python-3000] python-safethread project status In-Reply-To: <47FBF5B8.9060300@canterbury.ac.nz> References: <1205870652.31138.38.camel@qrnik> <1205877872.10732.28.camel@qrnik> <47E062A3.4030702@canterbury.ac.nz> <47FADB24.20007@canterbury.ac.nz> <47FBF5B8.9060300@canterbury.ac.nz> Message-ID: On Tue, Apr 8, 2008 at 4:46 PM, Greg Ewing wrote: > Adam Olsen wrote: > > Killing threads at arbitrary points really is that dangerous. > > I'm not talking about killing an arbitrary thread, but > a particular thread that I've designed with the idea of > killing it in mind. > > And I'm not really talking about killing it, either, > just having a way of tapping it on the shoulder and > getting its attention whatever it happens to be doing. > > I don't believe that's an outrageously unreasonable > thing to want to be able to do. > > To be precise, what I have in mind is this: > > 1) A way of causing an exception to be raised asynchronously > in another thread. > > 2) Such exceptions would be automatically blocked in > a finally clause. > > 3) There would be a way of explicitly blocking them > around a section of code, e.g. using a context manager. > > 4) If it makes anyone feel any better, they could be > blocked by default until explicitly enabled by the > thread concerned. > > 5) When a thread dies, either: > > a) Any locks it is holding are automatically released, > or > > b) An exception is raised in the main thread if it dies > while holding any locks (since this indicates a programming > error, i.e. the thread failed to clean up after itself > when receiving an asynchronous exception). > > Can anyone point to a reason it would be difficult to write > well-behaved threaded code in the presence of these features? 
I think what bothers me is I want the restrictions to be enforced upfront, so any violation produces a clear exception, whereas your proposal does not diagnose them until later. As a counter proposal, you could either forbid acquiring locks, or have them implicitly block the asynchronous exception. It's also worth noting that we have very little info on how a given operation is subdivided. I personally think that's a good thing - if you need it, you're doing something wrong. I have only two use cases in mind: 1. CPU bound tasks that can't be subdivided, such as 10**10**10 2. Arbitrary code in the interactive interpreter The latter can't be done sanely. It's simply accepting that between a hung interpreter and a possibly (but unlikely) corrupted user program, we'd rather risk corrupting the user program. -- Adam Olsen, aka Rhamphoryncus From guido at python.org Wed Apr 9 02:17:39 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Apr 2008 17:17:39 -0700 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: On Tue, Apr 8, 2008 at 4:49 PM, Facundo Batista wrote: > 2008/4/8, Greg Ewing : > > That's true, but it's sufficiently unlikely that a string > > such as "" could have accidentally arisen some > > other way that I don't lose any sleep over it. If weird > > things seem to be happening in some particular case, I'll > > put a repr() in to find out exactly what's going on. Most > > of the time it's not needed, though. > > > > There's another reason it bothers me. If a string like > > "" turns up in otherwise normal output, it's > > a fairly clear indication that I've somehow ended up > > printing something that was never meant to be printed. > > Whereas if it just comes out as "foo", it could easily > > go unnoticed. 
> > I'm with Greg here, but I'll put it in another way: I don't want > repr() to be nice, I want it to be as explicit as possible. I want to > be able to trust repr(), and never doubt of what it's showing to me. Seems to be mass confusion all around. My proposal is: repr(int) == "<class 'int'>" str(int) == 'int' For user-defined classes, a module name will always be present, e.g. for class C defined in __main__: repr(C) == "<class '__main__.C'>" str(C) == '__main__.C' -- --Guido van Rossum (home page: http://www.python.org/~guido/) From facundobatista at gmail.com Wed Apr 9 02:25:19 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 8 Apr 2008 21:25:19 -0300 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: 2008/4/8, Guido van Rossum : > Seems to be mass confusion all around. My proposal is: > > repr(int) == "<class 'int'>" > str(int) == 'int' > > For user-defined classes, a module name will always be present, e.g. > for class C defined in __main__: > > repr(C) == "<class '__main__.C'>" > str(C) == '__main__.C' +1 -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From greg.ewing at canterbury.ac.nz Wed Apr 9 02:34:37 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 09 Apr 2008 12:34:37 +1200 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: <47FC0F1D.7020004@canterbury.ac.nz> Guido van Rossum wrote: > Seems to be mass confusion all around. My proposal is: > > repr(int) == "<class 'int'>" > str(int) == 'int' > > repr(C) == "<class '__main__.C'>" > str(C) == '__main__.C' Can I take a step back and ask why exactly we're considering doing this? In what use cases is the current result of str() considered too verbose?
-- Greg From guido at python.org Wed Apr 9 04:00:16 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 8 Apr 2008 19:00:16 -0700 Subject: [Python-3000] Types and classes In-Reply-To: <47FC0F1D.7020004@canterbury.ac.nz> References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <47FC0F1D.7020004@canterbury.ac.nz> Message-ID: On Tue, Apr 8, 2008 at 5:34 PM, Greg Ewing wrote: > Guido van Rossum wrote: > > > Seems to be mass confusion all around. My proposal is: > > > > repr(int) == "<class 'int'>" > > str(int) == 'int' > > > > > repr(C) == "<class '__main__.C'>" > > str(C) == '__main__.C' > > Can I take a step back and ask why exactly we're considering > doing this? In what use cases is the current result > of str() considered too verbose? In error messages. I've written more code than I'd like to admit that spits out errors of the form "Method frumble() expected a joojoo or geegee argument, but got a %s instead". Using type(arg).__name__ omits the module name, which can be ambiguous in some contexts (some apps have lots of different but related classes with the same name defined in different modules). Using repr(type(arg)) makes the message ugly (and longer, which matters). Making it pretty requires something like type(arg).__module__ + "." + type(arg).__name__ with an exception if __module__ is empty or '__builtin__', which is usually not worth the code. A simpler str() for classes would simplify life. And don't tell me that I shouldn't be using isinstance().
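[Editorial sketch] The error-message formatting described above can be written as a small helper. This is a minimal illustration, assuming a current Python 3 (where the built-in module is named "builtins"; the 2.x name '__builtin__' is also handled). The function names qualified_type_name and frumble are hypothetical, taken from or invented for the message above.

```python
def qualified_type_name(obj):
    """Return 'module.Name' for obj's type, or just 'Name' for built-ins."""
    cls = type(obj)
    module = getattr(cls, "__module__", None)
    if not module or module in ("builtins", "__builtin__"):
        return cls.__name__
    return module + "." + cls.__name__


def frumble(arg):
    """Toy method that rejects anything that is not an int."""
    if not isinstance(arg, int):
        raise TypeError(
            "Method frumble() expected an int argument, but got a %s instead"
            % qualified_type_name(arg)
        )
    return arg
```

A shorter str() for classes would make the helper unnecessary, which is the point of the proposal.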
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at gmail.com Wed Apr 9 04:42:46 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 8 Apr 2008 19:42:46 -0700 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <47FC0F1D.7020004@canterbury.ac.nz> Message-ID: On Tue, Apr 8, 2008 at 7:00 PM, Guido van Rossum wrote: ... > And don't tell me that I shouldn't be using isinstance(). Of course you shouldn't, obviously you just don't really _get_ Python...! (!-) Alex From tjreedy at udel.edu Wed Apr 9 06:31:42 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 9 Apr 2008 00:31:42 -0400 Subject: [Python-3000] Types and classes References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com><1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com><1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com><1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com><47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: "Greg Ewing" wrote in message news:47FBF7F9.4020608 at canterbury.ac.nz... | There's another reason it bothers me. If a string like | "" turns up in otherwise normal output, it's | a fairly clear indication that I've somehow ended up | printing something that was never meant to be printed. Which to me is precisely why str() should *not* look like an accident when intentionally printed in normal output -- as in my example or in Guido's. 
From abpillai at gmail.com Wed Apr 9 08:45:38 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Wed, 9 Apr 2008 12:15:38 +0530 Subject: [Python-3000] Equality of range objects In-Reply-To: References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> Message-ID: <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> Still this seems like a bad thing to break backward compatibility with. However I cannot really provide a use-case apart from what Benjamin has said -> Teaching. It is not a common use-case to equate ranges in code and that is bad coding anyway. Hopefully, this will be well documented at 3.0 release. Currently that "whats new" page does not mention anything about the range type and how it breaks backward compatibility. The NEWS page for 3.0 a4 does say this however. "range() now returns an iterator rather than a list. Floats are not allowed. xrange() is no longer defined." I guess more information can be added here to actually specify that range() returns not just any iterator, but an iterator which is a new type and how it is different. As regarding education, the following example can be used to illustrate why things are different now. >>> l=[1,2,3,4,5] >>> l2=[1,2,3,4,5] >>> iter(l2) >>> iter(l2)==iter(l) False >>> Since range() is an iterator type, this should explain why we cannot equate two range objects anymore. Thanks --Anand On Wed, Apr 9, 2008 at 3:48 AM, Guido van Rossum wrote: > On Tue, Apr 8, 2008 at 2:49 PM, Benjamin Peterson > > range is one of the first functions > introduced in teaching Python. > > That's only because educators were raised on Pascal for loops. 
> > -- > > --Guido van Rossum (home page: http://www.python.org/~guido/) > > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/abpillai%40gmail.com > -- -Anand From ncoghlan at gmail.com Wed Apr 9 09:03:16 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 09 Apr 2008 17:03:16 +1000 Subject: [Python-3000] Equality of range objects In-Reply-To: <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> Message-ID: <47FC6A34.5080305@gmail.com> Anand Balachandran Pillai wrote: > Still this seems like a bad thing to break backward compatibility with. > However I cannot really provide a use-case apart from what Benjamin > has said -> Teaching. It is not a common use-case to equate ranges > in code and that is bad coding anyway. > > Hopefully, this will be well documented at 3.0 release. Currently > that "whats new" page does not mention anything about the range > type and how it breaks backward compatibility. > > The NEWS page for 3.0 a4 does say this however. > > "range() now returns an iterator rather than a list. Floats are not allowed. > xrange() is no longer defined." That's actually wrong. xrange objects (and Py3k's range objects) aren't iterators, they're only iterables (they don't provide a next/__next__ method). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From g.brandl at gmx.net Wed Apr 9 09:35:16 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 09 Apr 2008 09:35:16 +0200 Subject: [Python-3000] Equality of range objects In-Reply-To: <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> Message-ID: Anand Balachandran Pillai schrieb: > Still this seems like a bad thing to break backward compatibility with. > However I cannot really provide a use-case apart from what Benjamin > has said -> Teaching. It is not a common use-case to equate ranges > in code and that is bad coding anyway. > > Hopefully, this will be well documented at 3.0 release. Currently > that "whats new" page does not mention anything about the range > type and how it breaks backward compatibility. It says "xrange() renamed to range()". While this is everything one'd need to know, I agree that it could use a clarifying sentence, so I added one now. 
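The iterable-versus-iterator distinction raised in this thread can be checked directly. This is a small sketch assuming a Python 3 interpreter, where range plays the role xrange had in 2.x; the variable names are illustrative only.

```python
r = range(3)

# An iterable only has to provide __iter__; an iterator also provides __next__.
is_iterable = hasattr(r, "__iter__")
is_iterator = hasattr(r, "__next__")

# Because a range object is not an iterator, it can be walked more than once.
first_pass = list(r)
second_pass = list(r)

# iter() hands back a separate iterator, which is exhausted after one pass.
it = iter(r)
consumed = list(it)
leftover = list(it)

print(is_iterable, is_iterator)   # True False
print(first_pass == second_pass)  # True
print(consumed, leftover)         # [0, 1, 2] []
```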
Georg From arnodel at googlemail.com Wed Apr 9 10:11:08 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Wed, 9 Apr 2008 09:11:08 +0100 Subject: [Python-3000] Equality of range objects In-Reply-To: <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> Message-ID: <9bfc700a0804090111r57f837d5vbd344021981c5444@mail.gmail.com> On 09/04/2008, Anand Balachandran Pillai wrote: > "range() now returns an iterator rather than a list... No: range() returns an iterable. -- Arnaud From abpillai at gmail.com Wed Apr 9 10:57:54 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Wed, 9 Apr 2008 14:27:54 +0530 Subject: [Python-3000] Equality of range objects In-Reply-To: <9bfc700a0804090111r57f837d5vbd344021981c5444@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> <9bfc700a0804090111r57f837d5vbd344021981c5444@mail.gmail.com> Message-ID: <8548c5f30804090157k6adaf5f3i1391f302eb1713a@mail.gmail.com> I was quoting from the 3.0 a4 docs. It needs to be fixed then. Thanks --Anand On Wed, Apr 9, 2008 at 1:41 PM, Arnaud Delobelle wrote: > On 09/04/2008, Anand Balachandran Pillai wrote: > > > "range() now returns an iterator rather than a list... > No: range() returns an iterable. 
> > -- > Arnaud > > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/abpillai%40gmail.com > -- -Anand From lists at cheimes.de Wed Apr 9 12:57:17 2008 From: lists at cheimes.de (Christian Heimes) Date: Wed, 09 Apr 2008 12:57:17 +0200 Subject: [Python-3000] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <20080409091116.E014A1E4005@bag.python.org> References: <20080409091116.E014A1E4005@bag.python.org> Message-ID: <47FCA10D.2020303@cheimes.de> buildbot at python.org schrieb: > The Buildbot has detected a new failure of ppc Debian unstable 3.0. > Full details are available at: > http://www.python.org/dev/buildbot/all/ppc%20Debian%20unstable%203.0/builds/771 > > Buildbot URL: http://www.python.org/dev/buildbot/all/ > > Buildslave for this Build: klose-debian-ppc > > Build Reason: > Build Source Stamp: [branch branches/py3k] HEAD > Blamelist: christian.heimes > > BUILD FAILED: failed test > > Excerpt from the test logfile: > 1 test failed: > test_ssl > > ====================================================================== > ERROR: testEcho (test.test_ssl.ThreadedTests) > ---------------------------------------------------------------------- > > Traceback (most recent call last): > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 743, in testEcho > chatty=True, connectionchatty=True) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 644, in serverParamsTest > s.write(indata) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/ssl.py", line 178, in write > return self._sslobj.write(data) > TypeError: write() argument 1 must be bytes or read-only buffer, not str > > ====================================================================== > ERROR: testProtocolSSL2 (test.test_ssl.ThreadedTests) > 
---------------------------------------------------------------------- > > Traceback (most recent call last): > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 813, in testProtocolSSL2 > tryProtocolCombo(ssl.PROTOCOL_SSLv2, ssl.PROTOCOL_SSLv2, True) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 686, in tryProtocolCombo > chatty=False, connectionchatty=False) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 644, in serverParamsTest > s.write(indata) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/ssl.py", line 178, in write > return self._sslobj.write(data) > TypeError: write() argument 1 must be bytes or read-only buffer, not str > > ====================================================================== > ERROR: testProtocolSSL23 (test.test_ssl.ThreadedTests) > ---------------------------------------------------------------------- > > Traceback (most recent call last): > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 824, in testProtocolSSL23 > tryProtocolCombo(ssl.PROTOCOL_SSLv23, ssl.PROTOCOL_SSLv2, True) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 686, in tryProtocolCombo > chatty=False, connectionchatty=False) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 644, in serverParamsTest > s.write(indata) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/ssl.py", line 178, in write > return self._sslobj.write(data) > TypeError: write() argument 1 must be bytes or read-only buffer, not str > > ====================================================================== > ERROR: testProtocolSSL3 (test.test_ssl.ThreadedTests) > ---------------------------------------------------------------------- > > Traceback (most recent call last): > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 846, in 
testProtocolSSL3 > tryProtocolCombo(ssl.PROTOCOL_SSLv3, ssl.PROTOCOL_SSLv3, True) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 686, in tryProtocolCombo > chatty=False, connectionchatty=False) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 644, in serverParamsTest > s.write(indata) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/ssl.py", line 178, in write > return self._sslobj.write(data) > TypeError: write() argument 1 must be bytes or read-only buffer, not str > > ====================================================================== > ERROR: testProtocolTLS1 (test.test_ssl.ThreadedTests) > ---------------------------------------------------------------------- > > Traceback (most recent call last): > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 856, in testProtocolTLS1 > tryProtocolCombo(ssl.PROTOCOL_TLSv1, ssl.PROTOCOL_TLSv1, True) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 686, in tryProtocolCombo > chatty=False, connectionchatty=False) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 644, in serverParamsTest > s.write(indata) > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/ssl.py", line 178, in write > return self._sslobj.write(data) > TypeError: write() argument 1 must be bytes or read-only buffer, not str > > ====================================================================== > ERROR: testSocketServer (test.test_ssl.ThreadedTests) > ---------------------------------------------------------------------- > > Traceback (most recent call last): > File "/home/pybot/buildarea/3.0.klose-debian-ppc/build/Lib/test/test_ssl.py", line 931, in testSocketServer > server = AsyncoreHTTPSServer(CERTFILE) > NameError: global name 'AsyncoreHTTPSServer' is not defined I can't run the ssl tests with -unetwork on my machine. 
Could somebody please fix the test for me? Christian From solipsis at pitrou.net Wed Apr 9 13:10:38 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 09 Apr 2008 13:10:38 +0200 Subject: [Python-3000] r62195 - in python/trunk: Doc/c-api/file.rst Include/fileobject.h Lib/test/test_file.py Misc/NEWS Objects/fileobject.c In-Reply-To: 20080406231118.1A1961E400C@bag.python.org Message-ID: <1207739438.5774.9.camel@fsol> Christian wrote: > > Make file objects as thread safe as the underlying libc FILE* implementation. > > close() will now raise an IOError if any operations on the file object > > are currently in progress in other threads. > > > > Most code was written by Antoine Pitrou (pitrou). Additional testing, > > documentation and test suite cleanup done by me (gregory.p.smith). > > > > Fixes issue 815646 and 595601 (as well as many other bugs and > > references to this problem dating back to the dawn of Python). > > How much of the code needs to go into Python 3000? Python 3000 exposes > only file descriptors and not wrapped FILE*. It should be safe without > the patch, shouldn't it? If you are curious you could port the unit tests to py3k and see in which kinds of ways they fail :) Then we can debate whether, and how, we should make FileIO objects thread-safe. cheers Antoine. From tnelson at onresolve.com Wed Apr 9 14:23:48 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Wed, 9 Apr 2008 05:23:48 -0700 Subject: [Python-3000] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <47FCA10D.2020303@cheimes.de> References: <20080409091116.E014A1E4005@bag.python.org> <47FCA10D.2020303@cheimes.de> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEB634@EXMBX04.exchhosting.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEB634@EXMBX04.exchhosting.com> > I can't run the ssl tests with -unetwork on my machine. Could > somebody please fix the test for me? Oeer. The write() issues are easily fixed w/ b''.
Investigating the lack of AsyncoreHTTPSServer, though, yielded quite significant differences between the trunk and py3k versions of test_ssl.py. If no-one beats me to it I'll look at fixing it over lunch (in an hour or so). Trent. From tnelson at onresolve.com Wed Apr 9 17:20:12 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Wed, 9 Apr 2008 08:20:12 -0700 Subject: [Python-3000] [Python-checkins] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEB634@EXMBX04.exchhosting.com> References: <20080409091116.E014A1E4005@bag.python.org> <47FCA10D.2020303@cheimes.de> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEB634@EXMBX04.exchhosting.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEB815@EXMBX04.exchhosting.com> Regarding the recent test_ssl.py failures in py3k: I'm at a loss as to what the py3k version should look like in comparison to trunk. At the moment, there are pretty significant differences. I played around with just copying the trunk version into py3k and running 2to3, but that yielded significantly more errors than are currently being thrown. Trying to review the svn logs for both the trunk and py3k versions was difficult to say the least -- the mass of svnmerge information certainly made it hard to figure out the point at which the two files diverged so much, at least in the 10-15m I spent on it. Bill, can you offer any insight? Are the two versions meant to have diverged so much? > -----Original Message----- > From: > python-checkins-bounces+tnelson=onresolve.com at python.org > [mailto:python-checkins-bounces+tnelson=onresolve.com at python.o > rg] On Behalf Of Trent Nelson > Sent: 09 April 2008 13:24 > To: Christian Heimes; python-checkins at python.org; Python 3000 > Subject: Re: [Python-checkins] [Python-3000] buildbot failure > in ppc Debian unstable 3.0 > > > > I can't run the ssl tests with -unetwork on my machine. > Could somebody > > please fix the test for me? > > Oeer.
The write() issues are easily fixed w/ b''. > Investigating the lack of AsyncoreHTTPSServer, though, > yielded quite significant differences between the trunk and > py3k versions of test_ssl.py. If no-one beats me to it I'll > look at fixing it over lunch (in an hour or so). > > Trent. > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > From nas at arctrix.com Wed Apr 9 17:47:07 2008 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 9 Apr 2008 15:47:07 +0000 (UTC) Subject: [Python-3000] Types and classes References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <1afaf6160804021534y5ef9636dnf22a2a06d450e294@mail.gmail.com> <1afaf6160804021558l484185dfsa603591a4bc6eb35@mail.gmail.com> <1afaf6160804021604t1f0a0514q41621479d3c33172@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > There's something reassuring about the fact that things > with no "obvious" textual representation stick out like > a sore digit when you try to print them. I wouldn't like > to lose that. I agree with this and support the status quo (i.e. repr(int) == str(int) == "<class 'int'>"). I think str(int) == 'int' could lead to confusion if you have a bug in your program.
Neil From guido at python.org Wed Apr 9 18:20:56 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Apr 2008 09:20:56 -0700 Subject: [Python-3000] Types and classes In-Reply-To: References: <1cb725390804021457j76179af5y89c84335f65aa454@mail.gmail.com> <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: On Wed, Apr 9, 2008 at 8:47 AM, Neil Schemenauer wrote: > Greg Ewing wrote: > > There's something reassuring about the fact that things > > with no "obvious" textual representation stick out like > > a sore digit when you try to print them. I wouldn't like > > to lose that. > > I agree with this and support the status quo (i.e. repr(int) == > str(int) == "<class 'int'>"). I think str(int) == 'int' could lead > to confusion if you have a bug in your program. So could str(3) == str('3'). I don't see why printing a type is considered something so unusual that it ought to look weird. We already have repr() if you want unambiguous output; str() is for pretty output. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Apr 9 18:31:39 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Apr 2008 09:31:39 -0700 Subject: [Python-3000] Equality of range objects In-Reply-To: <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> Message-ID: On Tue, Apr 8, 2008 at 11:45 PM, Anand Balachandran Pillai wrote: > Still this seems like a bad thing to break backward compatibility with. That's not a very strong argument for Py3k. > Hopefully, this will be well documented at 3.0 release.
Currently > that "whats new" page does not mention anything about the range > type and how it breaks backward compatibility. > > The NEWS page for 3.0 a4 does say this however. > > "range() now returns an iterator rather than a list. Floats are not allowed. > xrange() is no longer defined." > > I guess more information can be added here to actually specify > that range() returns not just any iterator, but an iterator which is a new > type and how it is different. Please submit a doc patch! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at krypto.org Wed Apr 9 21:00:16 2008 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 9 Apr 2008 12:00:16 -0700 Subject: [Python-3000] r62195 - in python/trunk: Doc/c-api/file.rst Include/fileobject.h Lib/test/test_file.py Misc/NEWS Objects/fileobject.c In-Reply-To: <1207739438.5774.9.camel@fsol> References: <1207739438.5774.9.camel@fsol> Message-ID: <52dc1c820804091200y7d60f79eoa04f027b6e870394@mail.gmail.com> On Wed, Apr 9, 2008 at 4:10 AM, Antoine Pitrou wrote: > > Christian wrote: > > > Make file objects as thread safe as the underlying libc FILE* > implementation. > > > close() will now raise an IOError if any operations on the file object > > > are currently in progress in other threads. > > > > > > Most code was written by Antoine Pitrou (pitrou). Additional testing, > > > documentation and test suite cleanup done by me (gregory.p.smith). > > > > > > Fixes issue 815646 and 595601 (as well as many other bugs and > > > references to this problem dating back to the dawn of Python). > > > > How much of the code needs to go into Python 3000? Python 3000 exposes > > only file descriptors and not wrapped FILE*. It should be safe without > > the patch, shouldn't it? > > If you are curious you could port the unit tests to py3k and see in > which kinds of ways they fail :) > > Then we can debate whether, and how, we should make FileIO objects > thread-safe. > > cheers > > Antoine.
> Agreed, port the tests and watch things fail. Since we claim file objects are as thread safe as the underlying C library FILE* implementation in 2.6 which turns out to be pretty darn thread safe, we should try to match that behavior in 3.0. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20080409/767137a0/attachment-0001.htm From janssen at parc.com Thu Apr 10 01:05:42 2008 From: janssen at parc.com (Bill Janssen) Date: Wed, 9 Apr 2008 16:05:42 PDT Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> <1afaf6160804081334u23aa154ahebfb5d16c279819@mail.gmail.com> Message-ID: <08Apr9.160545pdt."58696"@synergy1.parc.xerox.com> > That's not the Python spirit. The spirit is that *if* they support > similar enough functionality the APIs should be named the same, in the > same module, and have the same signature. E.g. the os module is built > on this principle. Many APIs there are optional, but if they exist, > they have a known name and spec. (The posix/nt underlying modules are > implementation details that most users never need to know about.) Does that mean I can fix the signature issues with socket.gethostname()? 
http://bugs.python.org/issue1049 Bill From guido at python.org Thu Apr 10 01:08:06 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Apr 2008 16:08:06 -0700 Subject: [Python-3000] PEP: Cleaning out sys and the "interpreter" module In-Reply-To: <-4078378830460022880@unknownmsgid> References: <1afaf6160804071410s101c16a9pb6cbf5493300cc4e@mail.gmail.com> <47FA9501.6080408@gmail.com> <1afaf6160804071504j342aa48nb76f7ce10887e8a@mail.gmail.com> <1afaf6160804081334u23aa154ahebfb5d16c279819@mail.gmail.com> <-4078378830460022880@unknownmsgid> Message-ID: On Wed, Apr 9, 2008 at 4:05 PM, Bill Janssen wrote: > > That's not the Python spirit. The spirit is that *if* they support > > similar enough functionality the APIs should be named the same, in the > > same module, and have the same signature. E.g. the os module is built > > on this principle. Many APIs there are optional, but if they exist, > > they have a known name and spec. (The posix/nt underlying modules are > > implementation details that most users never need to know about.) > > Does that mean I can fix the signature issues with socket.gethostname()? > > http://bugs.python.org/issue1049 I think in that case the spec includes the possibility of raising an exception, but not the possibility of returning None, so, no. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Thu Apr 10 01:13:20 2008 From: janssen at parc.com (Bill Janssen) Date: Wed, 9 Apr 2008 16:13:20 PDT Subject: [Python-3000] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEB634@EXMBX04.exchhosting.com> References: <20080409091116.E014A1E4005@bag.python.org> <47FCA10D.2020303@cheimes.de> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEB634@EXMBX04.exchhosting.com> Message-ID: <08Apr9.161320pdt."58696"@synergy1.parc.xerox.com> > > I can't run the ssl tests with -unetwork on my machine. Could > > somebody please fix the test for me? > > Oeer. 
The write() issues are easily fixed w/ b''. Investigating the lack of AsyncoreHTTPSServer, though, yielded quite significant differences between the trunk and py3k versions of test_ssl.py. If no-one beats me to it I'll look at fixing it over lunch (in an hour or so). The "trunk" is out-of-date. The 3K code needs to be back-ported to 2.6. I'm working on it (now). Bill From janssen at parc.com Thu Apr 10 01:21:52 2008 From: janssen at parc.com (Bill Janssen) Date: Wed, 9 Apr 2008 16:21:52 PDT Subject: [Python-3000] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <47FCA10D.2020303@cheimes.de> References: <20080409091116.E014A1E4005@bag.python.org> <47FCA10D.2020303@cheimes.de> Message-ID: <08Apr9.162155pdt."58696"@synergy1.parc.xerox.com> > I can't run the ssl tests with -unetwork on my machine. Could somebody > please fix the test for me? > > Christian I see at least three "merges from the trunk" since I last touched it in December. If you revert these (to Lib/ssl.py, Modules/_ssl.c, and Lib/test/test_ssl.py), I'll bet things will work again. The trunk of ssl is out-of-date with regard to 3k. We need to merge backward, not forward. Time to fix this (in the trunk). Why not revert these, and I understand there's some way of marking files to not be merged, now? Please so mark these three files, and when it's working in 2.6 again, we can remove the marks. Bill From eric+python-dev at trueblade.com Thu Apr 10 01:54:31 2008 From: eric+python-dev at trueblade.com (Eric Smith) Date: Wed, 09 Apr 2008 19:54:31 -0400 Subject: [Python-3000] Implementing % formatting in terms of str.format() Message-ID: <47FD5737.4060001@trueblade.com> I'm working on issue 2416, adding %b to % formatting (http://bugs.python.org/issue2416). It's really quite a pain, especially in 2.6 with int and long and str and unicode. I'm contemplating just making % formatting compute a new format string and call str.format (or obj.__format__, or something appropriate). 
But before I proceed, I thought I'd ask and see if this really offends anyone. By implementing % in terms of str.format, I hope to be able to delete a lot of the duplication in the formatting code, but I haven't checked yet to see what's possible. The real impetus is issue 2416, though. About the only downside I see is that str.format is somewhat slower than %, but I can probably get around most of this by directly calling int.__format__, float.__format__, etc. Other than misleading microbenchmarks, I've never really compared the difference, though. From guido at python.org Thu Apr 10 02:14:58 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 9 Apr 2008 17:14:58 -0700 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <47FD5737.4060001@trueblade.com> References: <47FD5737.4060001@trueblade.com> Message-ID: I think there are too many risks with this approach, especially given that we're keeping % formatting mainly for backwards compatibility reasons. There will inevitably be corner cases where the conversion doesn't work exactly the same way as the old code or where the conversion is wrong for whatever reason, and it would be quite painful to change back. If 2.6 can't support %b, so be it. On Wed, Apr 9, 2008 at 4:54 PM, Eric Smith wrote: > I'm working on issue 2416, adding %b to % formatting > (http://bugs.python.org/issue2416). It's really quite a pain, > especially in 2.6 with int and long and str and unicode. > > I'm contemplating just making % formatting compute a new format string > and call str.format (or obj.__format__, or something appropriate). But > before I proceed, I thought I'd ask and see if this really offends > anyone. By implementing % in terms of str.format, I hope to be able to > delete a lot of the duplication in the formatting code, but I haven't > checked yet to see what's possible. The real impetus is issue 2416, though. 
> > About the only downside I see is that str.format is somewhat slower than > %, but I can probably get around most of this by directly calling > int.__format__, float.__format__, etc. Other than misleading > microbenchmarks, I've never really compared the difference, though. > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From eric+python-dev at trueblade.com Thu Apr 10 02:22:49 2008 From: eric+python-dev at trueblade.com (Eric Smith) Date: Wed, 09 Apr 2008 20:22:49 -0400 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: References: <47FD5737.4060001@trueblade.com> Message-ID: <47FD5DD9.6000705@trueblade.com> Understood. Maybe I'll just use this technique to implement %b, and leave everything else alone. I'll investigate. Guido van Rossum wrote: > I think there are too many risks with this approach, especially given > that we're keeping % formatting mainly for backwards compatibility > reasons. There will inevitably be corner cases where the conversion > doesn't work exactly the same way as the old code or where the > conversion is wrong for whatever reason, and it would be quite painful > to change back. > > If 2.6 can't support %b, so be it. > > On Wed, Apr 9, 2008 at 4:54 PM, Eric Smith > wrote: >> I'm working on issue 2416, adding %b to % formatting >> (http://bugs.python.org/issue2416). It's really quite a pain, >> especially in 2.6 with int and long and str and unicode. >> >> I'm contemplating just making % formatting compute a new format string >> and call str.format (or obj.__format__, or something appropriate). But >> before I proceed, I thought I'd ask and see if this really offends >> anyone. 
By implementing % in terms of str.format, I hope to be able to >> delete a lot of the duplication in the formatting code, but I haven't >> checked yet to see what's possible. The real impetus is issue 2416, though. >> >> About the only downside I see is that str.format is somewhat slower than >> %, but I can probably get around most of this by directly calling >> int.__format__, float.__format__, etc. Other than misleading >> microbenchmarks, I've never really compared the difference, though. >> >> _______________________________________________ >> Python-3000 mailing list >> Python-3000 at python.org >> http://mail.python.org/mailman/listinfo/python-3000 >> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org >> > > > From janssen at parc.com Thu Apr 10 02:42:08 2008 From: janssen at parc.com (Bill Janssen) Date: Wed, 9 Apr 2008 17:42:08 PDT Subject: [Python-3000] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <08Apr9.162155pdt."58696"@synergy1.parc.xerox.com> References: <20080409091116.E014A1E4005@bag.python.org> <47FCA10D.2020303@cheimes.de> <08Apr9.162155pdt."58696"@synergy1.parc.xerox.com> Message-ID: <08Apr9.174216pdt."58696"@synergy1.parc.xerox.com> > > I can't run the ssl tests with -unetwork on my machine. Could somebody > > please fix the test for me? > > > > Christian > > I see at least three "merges from the trunk" since I last touched it > in December. If you revert these (to Lib/ssl.py, Modules/_ssl.c, and > Lib/test/test_ssl.py), I'll bet things will work again. The trunk of > ssl is out-of-date with regard to 3k. We need to merge backward, not > forward. Time to fix this (in the trunk). > > Why not revert these, and I understand there's some way of marking > files to not be merged, now? Please so mark these three files, and > when it's working in 2.6 again, we can remove the marks. Looking at these three, it seems that only Lib/test/test_ssl.py is broken. 
Could this be the effect of Trent's earlier work on port selection in the testing framework? Perhaps working from an earlier revision, and checking it in on top of later work? Bill From musiccomposition at gmail.com Thu Apr 10 04:07:13 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 9 Apr 2008 21:07:13 -0500 Subject: [Python-3000] properties on IOBase Message-ID: <1afaf6160804091907sfd64bdaoffbdb09be25ad3d6@mail.gmail.com> Should IOBase's writeable, readable, and seekable methods have decorators like the closed method? -- Cheers, Benjamin Peterson From skip at pobox.com Thu Apr 10 04:20:34 2008 From: skip at pobox.com (skip at pobox.com) Date: Wed, 9 Apr 2008 21:20:34 -0500 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <47FD5DD9.6000705@trueblade.com> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> Message-ID: <18429.31090.933499.512334@montanaro-dyndns-org.local> Is there a 2-to-3 fixer for % format? I scanned the fixes directly quickly but didn't see anything obvious. Skip From musiccomposition at gmail.com Thu Apr 10 04:23:30 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 9 Apr 2008 21:23:30 -0500 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <18429.31090.933499.512334@montanaro-dyndns-org.local> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> Message-ID: <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> On Wed, Apr 9, 2008 at 9:20 PM, wrote: > > Is there a 2-to-3 fixer for % format? I scanned the fixes directly quickly > but didn't see anything obvious. I believe the only reason that % is even in 3.0 is that a 2to3 fixer couldn't be easily written for it. 
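For the simple cases raised in this thread (a literal format string with a tuple or dict on the right-hand side), the two spellings correspond roughly as below. This is only an illustrative sketch, assuming a Python with str.format() available (2.6, or 3.0 and later). The sample strings and values are made up, and it says nothing about how a 2to3 fixer would cope with non-literal format strings.

```python
# Hypothetical sample values; only the format spellings matter here.
args = ('cart', 3, 12.5)

# Tuple on the right-hand side of %.
old_tuple = '%s has %d items (%.2f%%)' % args
new_tuple = '{0} has {1:d} items ({2:.2f}%)'.format(*args)

# Dict on the right-hand side of %.
fields = {'name': 'spam', 'count': 7}
old_dict = '%(name)s x %(count)03d' % fields
new_dict = '{name} x {count:03d}'.format(**fields)

# Binary output has no % code, but str.format() and format() support it.
binary = '{0:b}'.format(10)
binary_builtin = format(10, 'b')

print(old_tuple == new_tuple, old_dict == new_dict, binary, binary_builtin)
```

The binary case is the one that motivated issue 2416: with the 'b' presentation type available through format(), nothing has to be added to % itself.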
> > Skip > > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/musiccomposition%40gmail.com > -- Cheers, Benjamin Peterson From skip at pobox.com Thu Apr 10 04:53:06 2008 From: skip at pobox.com (skip at pobox.com) Date: Wed, 9 Apr 2008 21:53:06 -0500 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> Message-ID: <18429.33042.990861.658382@montanaro-dyndns-org.local> >> Is there a 2-to-3 fixer for % format? I scanned the fixes directly >> quickly but didn't see anything obvious. Benjamin> I believe the only reason that % is even in 3.0 is that a 2to3 Benjamin> fixer couldn't be easily written for it. I find that kind of hard to believe (that it should be terribly difficult to write a fixer, at least given a % operator with a string literal LHS and either a tuple or dict RHS or a call to locals() or globals()). Skip From tnelson at onresolve.com Thu Apr 10 15:20:55 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Thu, 10 Apr 2008 06:20:55 -0700 Subject: [Python-3000] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <08Apr9.174216pdt."58696"@synergy1.parc.xerox.com> References: <20080409091116.E014A1E4005@bag.python.org> <47FCA10D.2020303@cheimes.de> <08Apr9.162155pdt."58696"@synergy1.parc.xerox.com> <08Apr9.174216pdt."58696"@synergy1.parc.xerox.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEBF99@EXMBX04.exchhosting.com> > > Why not revert these, and I understand there's some way of marking > > files to not be merged, now? 
Please so mark these three files, and > > when it's working in 2.6 again, we can remove the marks. > > Looking at these three, it seems that only > Lib/test/test_ssl.py is broken. Could this be the effect of > Trent's earlier work on port selection in the testing > framework? Perhaps working from an earlier revision, and > checking it in on top of later work? Ah, no, my commit to trunk's test_ssl.py was definitely the latest (Subversion would have prevented the commit otherwise). You mentioned in an email somewhere else though that the py3k version of test_ssl.py was far more up to date than the trunk version. That's the problem. The trunk version was svnmerge'd over the more-up-to-date version in py3k. Seems like we should revert r62242 test_ssl.py in py3k, commit that, then copy it back to trunk, manually 3to2 it, then check that in, then block that particular revision. Then, going forward, if test_ssl.py changes need to be made, make them against trunk, and they'll get picked up in the regular merges to py3k. Sound like a plan? Trent. From guido at python.org Thu Apr 10 17:04:05 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 10 Apr 2008 08:04:05 -0700 Subject: [Python-3000] properties on IOBase In-Reply-To: <1afaf6160804091907sfd64bdaoffbdb09be25ad3d6@mail.gmail.com> References: <1afaf6160804091907sfd64bdaoffbdb09be25ad3d6@mail.gmail.com> Message-ID: On Wed, Apr 9, 2008 at 7:07 PM, Benjamin Peterson wrote: > Should IOBase's writeable, readable, and seekable methods have > decorators like the closed method? No, read the PEP. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Thu Apr 10 18:35:07 2008 From: janssen at parc.com (Bill Janssen) Date: Thu, 10 Apr 2008 09:35:07 PDT Subject: [Python-3000] buildbot failure in ppc Debian unstable 3.0 In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEBF99@EXMBX04.exchhosting.com> References: <20080409091116.E014A1E4005@bag.python.org> <47FCA10D.2020303@cheimes.de> <08Apr9.162155pdt."58696"@synergy1.parc.xerox.com> <08Apr9.174216pdt."58696"@synergy1.parc.xerox.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22BEBF99@EXMBX04.exchhosting.com> Message-ID: <08Apr10.093517pdt."58696"@synergy1.parc.xerox.com> > Seems like we should revert r62242 test_ssl.py in py3k, commit that, then c= > opy it back to trunk, manually 3to2 it, then check that in, then block that= > particular revision. Then, going forward, if test_ssl.py changes need to = > be made, make them against trunk, and they'll get picked up in the regular = > merges to py3k. > > Sound like a plan? Yep. Thanks for looking through this. Bill From amauryfa at gmail.com Thu Apr 10 19:42:53 2008 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Thu, 10 Apr 2008 19:42:53 +0200 Subject: [Python-3000] [Python-3000-checkins] r62269 - in python/branches/py3k: Lib/test/test_getargs2.py Objects/abstract.c Python/getargs.c Message-ID: Hello, > Log: > Issue 2440: fix the handling of %n in Python/getargs.c's convertsimple(), > extend Objects/abstract.c's PyNumber_Index() to accept PyObjects that have nb_int slots, > and update test_getargs2 to test that an exception is thrown when __int__() returns a non-int object. Does this mean that floats can now be used as list indexes? Preventing this was the motivation for introducing the nb_index slot. from http://www.python.org/dev/peps/pep-0357 :: The biggest example of why using nb_int would be a bad thing is that float objects already define the nb_int method, but float objects *should not* be used as indexes in a sequence. 
-- Amaury Forgeot d'Arc From eric+python-dev at trueblade.com Thu Apr 10 19:50:05 2008 From: eric+python-dev at trueblade.com (Eric Smith) Date: Thu, 10 Apr 2008 13:50:05 -0400 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: References: <47FD5737.4060001@trueblade.com> Message-ID: <47FE534D.10005@trueblade.com> Guido van Rossum wrote: > I think there are too many risks with this approach, especially given > that we're keeping % formatting mainly for backwards compatibility > reasons. There will inevitably be corner cases where the conversion > doesn't work exactly the same way as the old code or where the > conversion is wrong for whatever reason, and it would be quite painful > to change back. > > If 2.6 can't support %b, so be it. It would really be easiest to just say that if you want binary formatting in both 2.6 and 3.0, use str.format. I don't think expanding the functionality of % formatting is what anyone should be spending their time on. I'd be happy to update the PEP to drop %b. From musiccomposition at gmail.com Thu Apr 10 22:27:58 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Thu, 10 Apr 2008 15:27:58 -0500 Subject: [Python-3000] properties on IOBase In-Reply-To: References: <1afaf6160804091907sfd64bdaoffbdb09be25ad3d6@mail.gmail.com> Message-ID: <1afaf6160804101327n38954fc7na72130f397ecd7b5@mail.gmail.com> On Thu, Apr 10, 2008 at 10:04 AM, Guido van Rossum wrote: > On Wed, Apr 9, 2008 at 7:07 PM, Benjamin Peterson > wrote: > > Should IOBase's writeable, readable, and seekable methods have > > decorators like the closed method? > > No, read the PEP. I did. It doesn't mention closed at all, so I though I'd ask. 
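The split being asked about can be seen on any concrete stream. A small sketch, assuming the io module as it ships in current Python 3 (where the method is spelled writable(), not writeable()); it is not taken from the 2008 py3k branch:

```python
import io

stream = io.StringIO("spam")

# closed is read as a plain attribute (a property); it is not called.
closed_before = stream.closed

# readable(), writable() and seekable() are methods and must be called.
capabilities = (stream.readable(), stream.writable(), stream.seekable())

stream.close()
closed_after = stream.closed

print(closed_before, capabilities, closed_after)
```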
> > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- Cheers, Benjamin Peterson From guido at python.org Thu Apr 10 23:28:08 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 10 Apr 2008 14:28:08 -0700 Subject: [Python-3000] properties on IOBase In-Reply-To: <1afaf6160804101327n38954fc7na72130f397ecd7b5@mail.gmail.com> References: <1afaf6160804091907sfd64bdaoffbdb09be25ad3d6@mail.gmail.com> <1afaf6160804101327n38954fc7na72130f397ecd7b5@mail.gmail.com> Message-ID: On Thu, Apr 10, 2008 at 1:27 PM, Benjamin Peterson wrote: > On Thu, Apr 10, 2008 at 10:04 AM, Guido van Rossum wrote: > > On Wed, Apr 9, 2008 at 7:07 PM, Benjamin Peterson > > wrote: > > > Should IOBase's writeable, readable, and seekable methods have > > > decorators like the closed method? > > > > No, read the PEP. > I did. It doesn't mention closed at all, so I though I'd ask. Well closed is a property (always has been) and the others are methods. There are good reasons why the design is this way. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tnelson at onresolve.com Fri Apr 11 02:24:19 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Thu, 10 Apr 2008 17:24:19 -0700 Subject: [Python-3000] [Python-3000-checkins] r62269 - in python/branches/py3k: Lib/test/test_getargs2.py Objects/abstract.c Python/getargs.c In-Reply-To: References: Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F94D@EXMBX04.exchhosting.com> > > Issue 2440: fix the handling of %n in Python/getargs.c's > convertsimple(), > > extend Objects/abstract.c's PyNumber_Index() to accept PyObjects that > have nb_int slots, > > and update test_getargs2 to test that an exception is thrown when > __int__() returns a non-int object. > > Does this mean that floats can now be used as list indexes? > Preventing this was the motivation for introducing the nb_index slot. It sure did! At least, between r62269 and r62279 ;-) Ben pointed out my error, which I fixed in r62280. Trent. 
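The distinction PEP 357 draws between nb_int and nb_index can be sketched at the Python level. This is a minimal illustration assuming a current Python 3 interpreter; the class names OnlyInt and Indexable are hypothetical and do not come from the test suite.

```python
import operator


class OnlyInt:
    """Hypothetical type that can be coerced with int() but has no __index__."""

    def __int__(self):
        return 99


class Indexable:
    """Hypothetical type that declares itself a true integer via __index__."""

    def __index__(self):
        return 2


seq = ['a', 'b', 'c']


def try_index(obj):
    """Return seq[obj], or the string 'TypeError' if obj is not a valid index."""
    try:
        return seq[obj]
    except TypeError:
        return 'TypeError'


results = {
    'float': try_index(1.0),             # floats define __int__ only
    'only_int': try_index(OnlyInt()),    # coercible, but not an index
    'indexable': try_index(Indexable()),
}

coerced = int(OnlyInt())                 # explicit coercion still works
as_index = operator.index(Indexable())   # Python-level spelling of PyNumber_Index

print(results, coerced, as_index)
```

Only the object that defines __index__ is accepted as a subscript; the float and the __int__-only object are rejected even though int() accepts both.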
From tnelson at onresolve.com Fri Apr 11 02:53:03 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Thu, 10 Apr 2008 17:53:03 -0700 Subject: [Python-3000] [Python-3000-checkins] r62269 - in python/branches/py3k: Lib/test/test_getargs2.py Objects/abstract.c Python/getargs.c In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F94D@EXMBX04.exchhosting.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F94D@EXMBX04.exchhosting.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F951@EXMBX04.exchhosting.com> > > Does this mean that floats can now be used as list indexes? > > Preventing this was the motivation for introducing the nb_index slot. > > > from http://www.python.org/dev/peps/pep-0357 :: > > > > The biggest example of why using nb_int would be a bad > > thing is that float objects already define the nb_int method, but > > float objects *should not* be used as indexes in a sequence. > It sure did! At least, between r62269 and r62279 ;-) Ben pointed out > my error, which I fixed in r62280. > > Trent. Hrrm. I just re-read that PEP. This stuck out: It is not possible to use the nb_int (and __int__ special method) for this purpose because that method is used to *coerce* objects to integers. It would be inappropriate to allow every object that can be coerced to an integer to be used as an integer everywhere Python expects a true integer. For example, if __int__ were used to convert an object to an integer in slicing, then float objects would be allowed in slicing and x[3.2:5.8] would not raise an error as it should. I think I've pretty much violated the first few sentences with my change to PyNumber_Index(). Even with the change in r62280 which checks that we're not dealing with a float, it's still permitting anything else with an __int__ representation to pass through just fine. Note that all of this originated from the following in test_args2: class Long: def __int__(self): return 99 class Signed_TestCase(unittest.TestCase): ... 
def test_n(self): ... self.failUnlessEqual(99, getargs_n(Long())) Before the change, %n was passing through to %l unless sizeof(long) != sizeof(size_t) (in convertsimple() -- Python/getargs.c). Windows x64 is the only platform where this assertion holds true, which drew my attention to the problem. The PEP's take on the situation would be that sequence[Long()] should fail (which isn't currently the case with my latest PyNumber_Index() changes). If we want to adhere to the behaviour prescribed in the PEP, then it seems like PyNumber_Index() should be reverted back to its original state, and the handling of %n in convertsimple() should be be done without calling PyNumber_Index(). (I assume we *do* want to support `'%n' % Long()` though right, or should the test be done away with?) Note that there's all sorts of problems with PyLong_AsSize_t() on Windows x64 when it comes to handling numbers close, equal or surpassing negative maximums. (See first posting to issue 2440 for examples.) Trent. From phd at phd.pp.ru Wed Apr 9 18:30:38 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 9 Apr 2008 20:30:38 +0400 Subject: [Python-3000] Recursive str (was: Types and classes) In-Reply-To: References: <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> Message-ID: <20080409163038.GC12902@phd.pp.ru> On Wed, Apr 09, 2008 at 09:20:56AM -0700, Guido van Rossum wrote: > We > already have repr() if you want unambiguous output; str() is for > pretty output. BTW, does Python 3000 fix the problem that str(container) calls repr() instead of str() for elements in the container? Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
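The behaviour asked about here is easy to reproduce. A short sketch, assuming a current Python 3 interpreter and a made-up list of items, comparing str() on a list with a custom loop that applies str() to each element:

```python
items = ['1', 1, '1, [1]']

# list does not define its own __str__, so str() falls back to repr(),
# which in turn calls repr() on each element.
as_str = str(items)
as_repr = repr(items)

# A custom loop that applies str() to each element instead.
joined = '[' + ', '.join(str(item) for item in items) + ']'

print(as_str)
print(joined)
```

In the second form the quotes are gone, so the string '1', the integer 1 and the string '1, [1]' can no longer be told apart.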
From thomas at python.org Fri Apr 11 14:44:34 2008 From: thomas at python.org (Thomas Wouters) Date: Fri, 11 Apr 2008 14:44:34 +0200 Subject: [Python-3000] Recursive str (was: Types and classes) In-Reply-To: <20080409163038.GC12902@phd.pp.ru> References: <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> Message-ID: <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> On Wed, Apr 9, 2008 at 6:30 PM, Oleg Broytmann wrote: > On Wed, Apr 09, 2008 at 09:20:56AM -0700, Guido van Rossum wrote: > > We > > already have repr() if you want unambiguous output; str() is for > > pretty output. > > BTW, does Python 3000 fix the problem that str(container) calls repr() > instead of str() for elements in the container? > No, because there is no sensible way to fix it. If a container defines __str__, it can do whatever it wants with items inside itself. If the container doesn't define __str__ (or defines it as an alias to __repr__), then __repr__ will be used, and the only sensible thing to do is call repr() on the elements inside it. If you want containers to have a 'prettier' format when passed to str(), give them a __str__ that does the pretty thing. Me, I don't see the point of having a 'pretty' format for lists that is ambiguous. If I want to print a list, 'repr' does what I expect. Or, I loop over the list and print each element how I expect it to print. I don't see the value in str(['1', 1, '1, [1]', '1]', '\n[1']) giving hard to understand output. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
From phd at phd.pp.ru Fri Apr 11 14:55:22 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Fri, 11 Apr 2008 16:55:22 +0400 Subject: [Python-3000] Recursive str In-Reply-To: <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> References: <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> Message-ID: <20080411125521.GE25461@phd.pp.ru> On Fri, Apr 11, 2008 at 02:44:34PM +0200, Thomas Wouters wrote: > On Wed, Apr 9, 2008 at 6:30 PM, Oleg Broytmann wrote: > > > On Wed, Apr 09, 2008 at 09:20:56AM -0700, Guido van Rossum wrote: > > > We > > > already have repr() if you want unambiguous output; str() is for > > > pretty output. > > > > BTW, does Python 3000 fix the problem that str(container) calls repr() > > instead of str() for elements in the container? > > No, because there is no sensible way to fix it. If a container defines > __str__, it can do whatever it wants with items inside itself. If the > container doesn't define __str__ (or defines it as an alias to __repr__), > then __repr__ will be used, and the only sensible thing to do is call repr() > on the elements inside it. I see. Thank you! > If you want containers to have a 'prettier' > format when passed to str(), give them a __str__ that does the pretty thing. > Me, I don't see the point of having a 'pretty' format for lists that is > ambiguous. If I want to print a list, 'repr' does what I expect. Or, I loop > over the list and print each element how I expect it to print. I don't see > the value in str(['1', 1, '1, [1]', '1]', '\n[1']) giving hard to understand > output. str([a, b, c]) currently does a wrong thing if items are non-ascii strings - calling repr() on them produces '\XXX' escapes instead of a readable representation. Oleg.
-- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From guido at python.org Fri Apr 11 15:57:47 2008 From: guido at python.org (Guido van Rossum) Date: Fri, 11 Apr 2008 06:57:47 -0700 Subject: [Python-3000] Recursive str In-Reply-To: <20080411125521.GE25461@phd.pp.ru> References: <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> Message-ID: On Fri, Apr 11, 2008 at 5:55 AM, Oleg Broytmann wrote: > str([a, b, c]) currently does a wrong thing if items are non-ascii > strings - calling repr() on them produces '\XXX' escapes instead of > a readable representation. But merely calling str() on the items instead of repr() isn't good enough here: we don't want str(['1, 2']) to return '[1, 2]'. We'd need a third form (eek!) that would preserve the string quotes but be more lenient about non-ASCII. Personally, I think some custom loop to print the values is good enough. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Apr 11 19:42:37 2008 From: guido at python.org (Guido van Rossum) Date: Fri, 11 Apr 2008 10:42:37 -0700 Subject: [Python-3000] [Python-3000-checkins] r62269 - in python/branches/py3k: Lib/test/test_getargs2.py Objects/abstract.c Python/getargs.c In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F951@EXMBX04.exchhosting.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F94D@EXMBX04.exchhosting.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F951@EXMBX04.exchhosting.com> Message-ID: I think you're right, the whole thing ought to be rolled back. The whole point of __index__ was that __int__ cannot be trusted not to truncate floats or float-like types. (Or do other conversions e.g. from string.) On Thu, Apr 10, 2008 at 5:53 PM, Trent Nelson wrote: > > > Does this mean that floats can now be used as list indexes? 
> > > Preventing this was the motivation for introducing the nb_index slot. > > > > > > from http://www.python.org/dev/peps/pep-0357 :: > > > > > > The biggest example of why using nb_int would be a bad > > > thing is that float objects already define the nb_int method, but > > > float objects *should not* be used as indexes in a sequence. > > > > It sure did! At least, between r62269 and r62279 ;-) Ben pointed out > > my error, which I fixed in r62280. > > > > Trent. > > Hrrm. I just re-read that PEP. This stuck out: > > It is not possible to use the nb_int (and __int__ special method) > for this purpose because that method is used to *coerce* objects > to integers. It would be inappropriate to allow every object that > can be coerced to an integer to be used as an integer everywhere > Python expects a true integer. For example, if __int__ were used > to convert an object to an integer in slicing, then float objects > would be allowed in slicing and x[3.2:5.8] would not raise an error > as it should. > > I think I've pretty much violated the first few sentences with my change to PyNumber_Index(). Even with the change in r62280 which checks that we're not dealing with a float, it's still permitting anything else with an __int__ representation to pass through just fine. > > Note that all of this originated from the following in test_args2: > > class Long: > > def __int__(self): > return 99 > > class Signed_TestCase(unittest.TestCase): > ... > def test_n(self): > ... > self.failUnlessEqual(99, getargs_n(Long())) > > Before the change, %n was passing through to %l unless sizeof(long) != sizeof(size_t) (in convertsimple() -- Python/getargs.c). Windows x64 is the only platform where this assertion holds true, which drew my attention to the problem. > > The PEP's take on the situation would be that sequence[Long()] should fail (which isn't currently the case with my latest PyNumber_Index() changes). 
If we want to adhere to the behaviour prescribed in the PEP, then it seems like PyNumber_Index() should be reverted back to its original state, and the handling of %n in convertsimple() should be be done without calling PyNumber_Index(). > > (I assume we *do* want to support `'%n' % Long()` though right, or should the test be done away with?) > > Note that there's all sorts of problems with PyLong_AsSize_t() on Windows x64 when it comes to handling numbers close, equal or surpassing negative maximums. (See first posting to issue 2440 for examples.) > > > > > Trent. > _______________________________________________ > Python-3000-checkins mailing list > Python-3000-checkins at python.org > http://mail.python.org/mailman/listinfo/python-3000-checkins > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at python.org Fri Apr 11 22:49:03 2008 From: thomas at python.org (Thomas Wouters) Date: Fri, 11 Apr 2008 22:49:03 +0200 Subject: [Python-3000] [Python-3000-checkins] r62269 - in python/branches/py3k: Lib/test/test_getargs2.py Objects/abstract.c Python/getargs.c In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F951@EXMBX04.exchhosting.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F94D@EXMBX04.exchhosting.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F951@EXMBX04.exchhosting.com> Message-ID: <9e804ac0804111349i1aa1870bk960495622bdb1743@mail.gmail.com> On Fri, Apr 11, 2008 at 2:53 AM, Trent Nelson wrote: > > > Does this mean that floats can now be used as list indexes? > > > Preventing this was the motivation for introducing the nb_index slot. > > > > > from http://www.python.org/dev/peps/pep-0357 :: > > > > > > The biggest example of why using nb_int would be a bad > > > thing is that float objects already define the nb_int method, but > > > float objects *should not* be used as indexes in a sequence. > > > It sure did! At least, between r62269 and r62279 ;-) Ben pointed out > > my error, which I fixed in r62280. > > > > Trent. 
> > Hrrm. I just re-read that PEP. This stuck out: > > It is not possible to use the nb_int (and __int__ special method) > for this purpose because that method is used to *coerce* objects > to integers. It would be inappropriate to allow every object that > can be coerced to an integer to be used as an integer everywhere > Python expects a true integer. For example, if __int__ were used > to convert an object to an integer in slicing, then float objects > would be allowed in slicing and x[3.2:5.8] would not raise an error > as it should. > > I think I've pretty much violated the first few sentences with my change > to PyNumber_Index(). Even with the change in r62280 which checks that we're > not dealing with a float, it's still permitting anything else with an > __int__ representation to pass through just fine. > > Note that all of this originated from the following in test_args2: > > class Long: > def __int__(self): > return 99 > > class Signed_TestCase(unittest.TestCase): > ... > def test_n(self): > ... > self.failUnlessEqual(99, getargs_n(Long())) > > Before the change, %n was passing through to %l unless sizeof(long) != > sizeof(size_t) (in convertsimple() -- Python/getargs.c). Windows x64 is the > only platform where this assertion holds true, which drew my attention to > the problem. > > The PEP's take on the situation would be that sequence[Long()] should fail > (which isn't currently the case with my latest PyNumber_Index() changes). > If we want to adhere to the behaviour prescribed in the PEP, then it seems > like PyNumber_Index() should be reverted back to its original state, and the > handling of %n in convertsimple() should be be done without calling > PyNumber_Index(). > > (I assume we *do* want to support `'%n' % Long()` though right, or should > the test be done away with?) You keep talking about '%n', but the code is used for Py_BuildValue and PyArg_Parse* and such, not for string formatting (unless you are working to change that for some reason?) 
The 'n' argument to PyArg_Parse* is meant to be used for indices (like most uses of Py_ssize_t), so the change to PyNumber_Index makes no sense, and the test above is actually broken (IMHO.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tnelson at onresolve.com Sat Apr 12 00:23:15 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Fri, 11 Apr 2008 15:23:15 -0700 Subject: [Python-3000] [Python-3000-checkins] r62269 - in python/branches/py3k: Lib/test/test_getargs2.py Objects/abstract.c Python/getargs.c In-Reply-To: <9e804ac0804111349i1aa1870bk960495622bdb1743@mail.gmail.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F94D@EXMBX04.exchhosting.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1F951@EXMBX04.exchhosting.com> <9e804ac0804111349i1aa1870bk960495622bdb1743@mail.gmail.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D1FFE0@EXMBX04.exchhosting.com> Agreed, in the middle of reverting the changes made in 62269 and 62279 now. I've also figured out why getargs_n() is broken for Windows x64 for negative values. I'll post a patch for review to python-3000@ shortly. ________________________________ From: thomaswout at gmail.com [mailto:thomaswout at gmail.com] On Behalf Of Thomas Wouters Sent: 11 April 2008 21:49 To: Trent Nelson Cc: Amaury Forgeot d'Arc; python-3000-checkins at python.org; Python 3000 Subject: Re: [Python-3000] [Python-3000-checkins] r62269 - in python/branches/py3k: Lib/test/test_getargs2.py Objects/abstract.c Python/getargs.c On Fri, Apr 11, 2008 at 2:53 AM, Trent Nelson wrote: > > Does this mean that floats can now be used as list indexes? > > Preventing this was the motivation for introducing the nb_index slot.
> > > from http://www.python.org/dev/peps/pep-0357 :: > > > > The biggest example of why using nb_int would be a bad > > thing is that float objects already define the nb_int method, but > > float objects *should not* be used as indexes in a sequence. > It sure did! At least, between r62269 and r62279 ;-) Ben pointed out > my error, which I fixed in r62280. > > Trent. Hrrm. I just re-read that PEP. This stuck out: It is not possible to use the nb_int (and __int__ special method) for this purpose because that method is used to *coerce* objects to integers. It would be inappropriate to allow every object that can be coerced to an integer to be used as an integer everywhere Python expects a true integer. For example, if __int__ were used to convert an object to an integer in slicing, then float objects would be allowed in slicing and x[3.2:5.8] would not raise an error as it should. I think I've pretty much violated the first few sentences with my change to PyNumber_Index(). Even with the change in r62280 which checks that we're not dealing with a float, it's still permitting anything else with an __int__ representation to pass through just fine. Note that all of this originated from the following in test_args2: class Long: def __int__(self): return 99 class Signed_TestCase(unittest.TestCase): ... def test_n(self): ... self.failUnlessEqual(99, getargs_n(Long())) Before the change, %n was passing through to %l unless sizeof(long) != sizeof(size_t) (in convertsimple() -- Python/getargs.c). Windows x64 is the only platform where this assertion holds true, which drew my attention to the problem. The PEP's take on the situation would be that sequence[Long()] should fail (which isn't currently the case with my latest PyNumber_Index() changes). 
If we want to adhere to the behaviour prescribed in the PEP, then it seems like PyNumber_Index() should be reverted back to its original state, and the handling of %n in convertsimple() should be be done without calling PyNumber_Index(). (I assume we *do* want to support `'%n' % Long()` though right, or should the test be done away with?) You keep talking about '%n', but the code is used for Py_BuildValue and PyArg_Parse* and such, not for string formatting (unless you are working to change that for some reason?) The 'n' argument to PyArg_Parse* is meant to be used for indices (like most uses of Py_ssize_t), so the change to PyNumber_Index makes no sense, and the test above is actually broken (IMHO.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From eric+python-dev at trueblade.com Sat Apr 12 00:21:28 2008 From: eric+python-dev at trueblade.com (Eric Smith) Date: Fri, 11 Apr 2008 18:21:28 -0400 Subject: [Python-3000] Phasing out % string formatting Message-ID: <47FFE468.2040605@trueblade.com> I've proposed on another thread the no new features are added to % formatting, specifically the PEP 3127 (Integer Literal Support and Syntax) '%b' formatting. It didn't generate any discussion, so I thought I'd bring it up in its own thread. I'd like to see us take the position that % formatting is being phased out, if not actually starting the deprecation process. If it's being phased out, then I think it makes sense to add no new features to it. Early in the PEP 3101 (Advanced String Formatting) discussion, Guido said that he'd like to deprecate % formatting, but that it was too late in the 3.0 stage to actually do that[1] yet. I'm not sure if the thinking has changed since then, but I'm hoping that it's still desirable to do that deprecation eventually. And if it's going to be deprecated, let's not add features to it. 
[1] http://mail.python.org/pipermail/python-3000/2007-August/009621.html From brett at python.org Sat Apr 12 01:05:01 2008 From: brett at python.org (Brett Cannon) Date: Fri, 11 Apr 2008 16:05:01 -0700 Subject: [Python-3000] Phasing out % string formatting In-Reply-To: <47FFE468.2040605@trueblade.com> References: <47FFE468.2040605@trueblade.com> Message-ID: On Fri, Apr 11, 2008 at 3:21 PM, Eric Smith wrote: > I've proposed on another thread the no new features are added to % > formatting, specifically the PEP 3127 (Integer Literal Support and > Syntax) '%b' formatting. It didn't generate any discussion, so I > thought I'd bring it up in its own thread. > > I'd like to see us take the position that % formatting is being phased > out, if not actually starting the deprecation process. If it's being > phased out, then I think it makes sense to add no new features to it. +1 from me. -Brett From tnelson at onresolve.com Sat Apr 12 03:31:08 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Fri, 11 Apr 2008 18:31:08 -0700 Subject: [Python-3000] getargs_n(), take two. Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20055@EXMBX04.exchhosting.com> I reverted the changes from r62269 and r62279 in r62292. Any issues with the following patch? Note the removal of the guards around case 'n'; this will be the first time a lot of platforms will see this particular code path, as we're not falling back to 'l' anymore. 
Index: Python/getargs.c
===================================================================
--- Python/getargs.c (revision 62292)
+++ Python/getargs.c (working copy)
@@ -663,7 +663,6 @@
        }
 
        case 'n': /* Py_ssize_t */
-#if SIZEOF_SIZE_T != SIZEOF_LONG
        {
                PyObject *iobj;
                Py_ssize_t *p = va_arg(*p_va, Py_ssize_t *);
@@ -672,14 +671,12 @@
                        return converterr("integer", arg, msgbuf, bufsize);
                iobj = PyNumber_Index(arg);
                if (iobj != NULL)
-                       ival = PyLong_AsSsize_t(arg);
+                       ival = PyLong_AsSsize_t(iobj);
                if (ival == -1 && PyErr_Occurred())
                        return converterr("integer", arg, msgbuf, bufsize);
                *p = ival;
                break;
        }
-#endif
-       /* Fall through from 'n' to 'l' if Py_ssize_t is int */
        case 'l': {/* long int */
                long *p = va_arg(*p_va, long *);
                long ival;
Index: Lib/test/test_getargs2.py
===================================================================
--- Lib/test/test_getargs2.py (revision 62292)
+++ Lib/test/test_getargs2.py (working copy)
@@ -187,8 +187,8 @@
         # n returns 'Py_ssize_t', and does range checking
         # (PY_SSIZE_T_MIN ... PY_SSIZE_T_MAX)
         self.assertRaises(TypeError, getargs_n, 3.14)
-        self.failUnlessEqual(99, getargs_n(Long()))
-        self.failUnlessEqual(99, getargs_n(Int()))
+        self.assertRaises(TypeError, getargs_n, Long())
+        self.assertRaises(TypeError, getargs_n, Int())
 
         self.assertRaises(OverflowError, getargs_n, PY_SSIZE_T_MIN-1)
         self.failUnlessEqual(PY_SSIZE_T_MIN, getargs_n(PY_SSIZE_T_MIN))

From greg.ewing at canterbury.ac.nz Sat Apr 12 05:14:32 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 Apr 2008 15:14:32 +1200
Subject: [Python-3000] Recursive str
In-Reply-To: <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com>
References: <47F48E21.3070304@v.loewis.de> <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com>
Message-ID: <48002918.7020105@canterbury.ac.nz>

Thomas Wouters wrote:
> I don't see the value in str(['1', 1, '1, [1]', '1]',
> '\n[1']) giving hard to understand output.

Random data point: Being forced to do some Ruby programming recently, I found that Ruby does in fact produce just this sort of ambiguous output when you print a list, and it's very annoying! I'm very happy that Python doesn't do this.

--
Greg

From greg at krypto.org Sat Apr 12 06:15:38 2008
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 11 Apr 2008 21:15:38 -0700
Subject: [Python-3000] [Python-Dev] Need help for SWIG's Python 3.0 backend
In-Reply-To: <47FF9363.7060005@gmail.com>
References: <47FF9363.7060005@gmail.com>
Message-ID: <52dc1c820804112115sc2a270cueb6c7159913b916b@mail.gmail.com>

-cc: python-dev
+cc: python-3000

Hi Haoyu,

I'm glad someone wants to work on updating swig for python 3.x. A better mailing list for python 3.x internals questions as you work on this is the python-3000 at python.org list.
The first place I suggest looking when you have a question is in the Python trunk vs Python py3k branch source trees themselves (see http://python.org/dev/ for instructions on how to check them out from subversion). Take a look at how functions were used internally in the Python/ and Objects/ subdirectories in trunk and take another look at how similar stuff works in the py3k branch and maybe it'll give you hints about what to do. It'd be ideal if swig could do its thing in the future without using any undocumented/private APIs. -gps On Fri, Apr 11, 2008 at 9:35 AM, Haoyu Bai wrote: > Hello, > > I am a Google Summer of Code student who preparing a SWIG's Python 3.0 > support proposal. Here's detail of my proposal: > > http://www.dabeaz.com/cgi-bin/wiki.pl?GSoCPython3Proposal > > And abstract shown below for convenient: > > This project adds Python 3.0 support for SWIG. We will add a "-3" option > to SWIG's current backend, which indicates SWIG to generate wrapper for > Python 3. We also make SWIG generate more efficient code and more clear > proxy by utilizing Python 3's new features. > > The considered features are as follows: > > * Function Annotations > > * Mutable Buffer Support > > * Abstract Base Classes > > > I have read PEPs and Python 3's document, then did some experiment on > the API. I have modified a SWIG generated wrapper code by hand so it can > running with Python 3.0. > > However, there still some API changes I can't handle. SWIG used some > undocumented C API, for example the _PyInstance_Lookup(). And some API > disappeared, I can't found the alternative of them, for example > PyInstance_NewRaw(). > > I think I will need a lot of help from Python developers if my proposal > is accepted. So I post this here to make sure if I can get help when > doing this project. And I really appreciate if you can give any advice > about how to solve the problems I mentioned before. > > Thank you! 
> > > Best regards, > > Haoyu Bai > 4/12/2008 From aleaxit at gmail.com Sat Apr 12 06:53:43 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 11 Apr 2008 21:53:43 -0700 Subject: [Python-3000] Phasing out % string formatting In-Reply-To: References: <47FFE468.2040605@trueblade.com> Message-ID: On Fri, Apr 11, 2008 at 4:05 PM, Brett Cannon wrote: > On Fri, Apr 11, 2008 at 3:21 PM, Eric Smith > wrote: > > I've proposed on another thread the no new features are added to % > > formatting, specifically the PEP 3127 (Integer Literal Support and > > Syntax) '%b' formatting. It didn't generate any discussion, so I > > thought I'd bring it up in its own thread. > > > > I'd like to see us take the position that % formatting is being phased > > out, if not actually starting the deprecation process. If it's being > > phased out, then I think it makes sense to add no new features to it. > > +1 from me.
+1 here too -- adding features to a soon-to-be-obsolete subsystem would be an "attractive nuisance"!-) Alex From ncoghlan at gmail.com Sat Apr 12 15:49:43 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 12 Apr 2008 23:49:43 +1000 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <18429.33042.990861.658382@montanaro-dyndns-org.local> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> Message-ID: <4800BDF7.1000205@gmail.com> skip at pobox.com wrote: > >> Is there a 2-to-3 fixer for % format? I scanned the fixes directly > >> quickly but didn't see anything obvious. > > Benjamin> I believe the only reason that % is even in 3.0 is that a 2to3 > Benjamin> fixer couldn't be easily written for it. > > I find that kind of hard to believe (that it should be terribly difficult to > write a fixer, at least given a % operator with a string literal LHS and > either a tuple or dict RHS or a call to locals() or globals()). That's exactly the problem though - while a 2to3 fixer can be written for a tiny subset of formatting calls (those that meet the constraints you gave), the vast majority are out of luck without some major type inferencing additions to 2to3. Given the expression "x % y", 2to3 hasn't got a clue whether it needs to do anything unless it somehow knows the types of x and y. So my understanding matches Benjamin's: while string %-formatting is definitely a 'second way' to do something for which str.format will be the preferred approach, getting rid of it for Py3k just isn't worth the staggering amount of breakage that would result. Cheers, Nick. 
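Nick's point about "x % y" can be illustrated with a minimal sketch. It assumes a modern Python 3 (3.8 or later, for ast.Constant) and uses the standard ast module rather than 2to3's own machinery, so it is an illustration of the problem and not how 2to3 works. The helper name classify_mod_ops is hypothetical. A purely syntactic tool can recognise the case where the left operand is a string literal, and has to give up on everything else:

```python
import ast


def classify_mod_ops(source):
    """Return a sorted list of (line, verdict) pairs, one per % expression.

    Hypothetical helper: the verdict is "string formatting" only when the
    left operand is a string literal; without type information every other
    case has to be reported as "unknown".
    """
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mod):
            left = node.left
            if isinstance(left, ast.Constant) and isinstance(left.value, str):
                verdict = "string formatting"
            else:
                verdict = "unknown"
            results.append((node.lineno, verdict))
    return sorted(results)


SAMPLE = 'a = "%s items" % n\nb = x % y\n'
```

Calling classify_mod_ops(SAMPLE) reports line 1 as string formatting and line 2 as unknown, since x could be a str, an int, or anything that defines __mod__.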
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Sat Apr 12 15:51:17 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 12 Apr 2008 23:51:17 +1000 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <47FE534D.10005@trueblade.com> References: <47FD5737.4060001@trueblade.com> <47FE534D.10005@trueblade.com> Message-ID: <4800BE55.4030206@gmail.com> Eric Smith wrote: > Guido van Rossum wrote: >> If 2.6 can't support %b, so be it. > > It would really be easiest to just say that if you want binary > formatting in both 2.6 and 3.0, use str.format. I don't think expanding > the functionality of % formatting is what anyone should be spending > their time on. > > I'd be happy to update the PEP to drop %b. I didn't even realise a %b formatter was in the PEP: +1 for dropping it and leaving the % formatting implementation alone. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From skip at pobox.com Sat Apr 12 16:26:32 2008 From: skip at pobox.com (skip at pobox.com) Date: Sat, 12 Apr 2008 09:26:32 -0500 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <4800BDF7.1000205@gmail.com> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> Message-ID: <18432.50840.556518.954678@montanaro-dyndns-org.local> Nick> That's exactly the problem though - while a 2to3 fixer can be Nick> written for a tiny subset of formatting calls (those that meet the Nick> constraints you gave)... 
In my personal experience, either the LHS will be a string literal or the RHS will be locals(), globals() or a tuple. Yes, you will have a hard, if not impossible, time with the general x % y. Still, I think a fixer that only addresses the "tiny subset" would go a long ways to converting existing code. For the rest it could insert special comments so the programmer can grep for the problematic cases. As a quick back-of-the-envelope check, I counted the number of occurrences of ' % ' in the 2.5 source and found 4555 instances. Of those, either a single quote or a double quote preceded the leading space in 3943 cases. In 2051 cases (obviously overlapping with the preceding count) the trailing space was followed by a left paren or left brace. In 21 cases the trailing space was followed by a call to locals() or globals(). I think something like 85-90% of the uses in the Python core should be able to be converted mechanically. Nick> So my understanding matches Benjamin's: while string %-formatting Nick> is definitely a 'second way' to do something for which str.format Nick> will be the preferred approach, getting rid of it for Py3k just Nick> isn't worth the staggering amount of breakage that would result. True, I'm not implying anything should be broken, just that much of the work can be mechanical conversion. Also, for some of us, % formatting will remain the "first way" of generating formatted string output as long as it exists in the language. 
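As a rough illustration of what such a mechanical conversion would produce for the easy cases (string literal on the left, tuple, dict or single value on the right), here is a minimal sketch. The variable names are made up for the example, and the translations were written by hand, not generated by any existing fixer:

```python
# Tuple on the right-hand side maps to positional arguments.
name, count = "spam", 3
old_tuple = "%s has %d items" % (name, count)
new_tuple = "{0} has {1:d} items".format(name, count)

# Dict on the right-hand side maps to keyword arguments.
values = {"name": "eggs", "count": 7}
old_dict = "%(name)s has %(count)d items" % values
new_dict = "{name} has {count:d} items".format(**values)

# A single non-tuple value keeps its width and precision.
old_single = "%5.2f" % 3.14159
new_single = "{0:5.2f}".format(3.14159)
```

Each pair yields the same string, which is what makes these cases candidates for automatic rewriting.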
Skip From rbp at isnomore.net Sat Apr 12 16:35:23 2008 From: rbp at isnomore.net (Rodrigo Bernardo Pimentel) Date: Sat, 12 Apr 2008 11:35:23 -0300 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <4800BDF7.1000205@gmail.com> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> Message-ID: <20080412143136.GE10294@isnomore.net> On Sat, Apr 12 2008 at 10:49:43AM BRT, Nick Coghlan wrote: > skip at pobox.com wrote: > > >> Is there a 2-to-3 fixer for % format? I scanned the fixes directly > > >> quickly but didn't see anything obvious. > > > > Benjamin> I believe the only reason that % is even in 3.0 is that a 2to3 > > Benjamin> fixer couldn't be easily written for it. > > > > I find that kind of hard to believe (that it should be terribly difficult to > > write a fixer, at least given a % operator with a string literal LHS and > > either a tuple or dict RHS or a call to locals() or globals()). > > That's exactly the problem though - while a 2to3 fixer can be written > for a tiny subset of formatting calls (those that meet the constraints > you gave), the vast majority are out of luck without some major type > inferencing additions to 2to3. Given the expression "x % y", 2to3 hasn't > got a clue whether it needs to do anything unless it somehow knows the > types of x and y. I have submitted a GSoC proposal which might make a % fixer somewhat useful. It was suggested by Collin Winter and, in a nutshell, is giving 2to3 fixers a way to say how confident they are on a certain fix, and then users may specify a confidence threshold below which they want to manually intervene to make a decision on whether to apply, skip or edit the fix. The full proposal is at http://isnomore.net/2to3 . 
With that in place, a % filter wouldn't need to be exact, but it could apply some heuristics to rank how confident it is that "x % y" bears translating. I'll try to implement that even if my GSoC isn't approved, and, from this thread, I think a confidence-ranked % fixer would be a nice usage example, so I'll probably try to write one. rbp -- Rodrigo Bernardo Pimentel | GPG: <0x0DB14978> From martin at v.loewis.de Sat Apr 12 17:15:33 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 12 Apr 2008 17:15:33 +0200 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <18432.50840.556518.954678@montanaro-dyndns-org.local> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> Message-ID: <4800D215.4050106@v.loewis.de> > True, I'm not implying anything should be broken, just that much of the work > can be mechanical conversion. Also, for some of us, % formatting will > remain the "first way" of generating formatted string output as long as it > exists in the language. For that reason, I wouldn't want 2to3 to convert it for me, at least not by default.
Regards, Martin From rbp at isnomore.net Sat Apr 12 17:35:21 2008 From: rbp at isnomore.net (Rodrigo Bernardo Pimentel) Date: Sat, 12 Apr 2008 12:35:21 -0300 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <4800D215.4050106@v.loewis.de> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> <4800D215.4050106@v.loewis.de> Message-ID: <20080412153520.GA7411@isnomore.net> On Sat, Apr 12 2008 at 12:15:33PM BRT, "Martin v. Löwis" wrote: > > True, I'm not implying anything should be broken, just that much of the work > > can be mechanical conversion. Also, for some of us, % formatting will > > remain the "first way" of generating formatted string output as long as it > > exists in the language. > > For that reason, I wouldn't want 2to3 to convert it for me, at least not > by default. If you don't want % conversion at all, you can always simply skip that specific fixer with 2to3 -f fix1 -f fix2 (...). Come to think of it, it would be nice to have a -F/--no-fix option to indicate fixers *not* to run.
rbp -- Rodrigo Bernardo Pimentel | GPG: <0x0DB14978> From steven.bethard at gmail.com Sat Apr 12 19:49:18 2008 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat, 12 Apr 2008 11:49:18 -0600 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <18432.50840.556518.954678@montanaro-dyndns-org.local> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> Message-ID: On Sat, Apr 12, 2008 at 8:26 AM, wrote: > > Nick> That's exactly the problem though - while a 2to3 fixer can be > Nick> written for a tiny subset of formatting calls (those that meet the > Nick> constraints you gave)... > > In my personal experience, either the LHS will be a string literal or the > RHS will be locals(), globals() or a tuple. Yes, you will have a hard, if > not impossible, time with the general x % y. Still, I think a fixer that > only addresses the "tiny subset" would go a long ways to converting existing > code. For the rest it could insert special comments so the programmer can > grep for the problematic cases. Rather than inserting special comments, why don't we just introduce a -3 warning for using % string formatting? Then, you can use 2to3 to convert as much as it can, and you can use the -3 warning to identify any other places you're using % string formatting and fix them by hand. Adding such a -3 warning shouldn't be much more than a couple of lines at the beginning of stringobject.c:PyString_Format. Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. 
--- Bucky Katt, Get Fuzzy From g.brandl at gmx.net Sat Apr 12 20:26:21 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 12 Apr 2008 20:26:21 +0200 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> Message-ID: Steven Bethard schrieb: > On Sat, Apr 12, 2008 at 8:26 AM, wrote: >> >> Nick> That's exactly the problem though - while a 2to3 fixer can be >> Nick> written for a tiny subset of formatting calls (those that meet the >> Nick> constraints you gave)... >> >> In my personal experience, either the LHS will be a string literal or the >> RHS will be locals(), globals() or a tuple. Yes, you will have a hard, if >> not impossible, time with the general x % y. Still, I think a fixer that >> only addresses the "tiny subset" would go a long ways to converting existing >> code. For the rest it could insert special comments so the programmer can >> grep for the problematic cases. > > Rather than inserting special comments, why don't we just introduce a > -3 warning for using % string formatting? Then, you can use 2to3 to > convert as much as it can, and you can use the -3 warning to identify > any other places you're using % string formatting and fix them by > hand. Adding such a -3 warning shouldn't be much more than a couple of > lines at the beginning of stringobject.c:PyString_Format. Please don't -- a Py3k warning makes no sense if the feature isn't really going away in Py3k. Py3k warnings really should only warn about things that are going to break in 3.0. If the decision is reached that such a warning makes sense, I'd propose to only warn in an "extended Py3k warning mode" activated with -33. 
Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From steven.bethard at gmail.com Sat Apr 12 21:01:58 2008 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat, 12 Apr 2008 13:01:58 -0600 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> Message-ID: On Sat, Apr 12, 2008 at 12:26 PM, Georg Brandl wrote: > Steven Bethard schrieb: > > > On Sat, Apr 12, 2008 at 8:26 AM, wrote: > >> > >> Nick> That's exactly the problem though - while a 2to3 fixer can be > >> Nick> written for a tiny subset of formatting calls (those that meet the > >> Nick> constraints you gave)... > >> > >> In my personal experience, either the LHS will be a string literal or the > >> RHS will be locals(), globals() or a tuple. Yes, you will have a hard, if > >> not impossible, time with the general x % y. Still, I think a fixer that > >> only addresses the "tiny subset" would go a long ways to converting existing > >> code. For the rest it could insert special comments so the programmer can > >> grep for the problematic cases. > > > > Rather than inserting special comments, why don't we just introduce a > > -3 warning for using % string formatting? Then, you can use 2to3 to > > convert as much as it can, and you can use the -3 warning to identify > > any other places you're using % string formatting and fix them by > > hand. 
Adding such a -3 warning shouldn't be much more than a couple of > > lines at the beginning of stringobject.c:PyString_Format. > > Please don't -- a Py3k warning makes no sense if the feature isn't really > going away in Py3k. Py3k warnings really should only warn about things > that are going to break in 3.0. My understanding is that we'd break % string formatting in Py3k if we could. The only reason we won't is that it's not generally possible to write a 2to3 fixer. So while it won't actually break, I think conveying our intention to break it (using a -3 warning) wouldn't be unreasonable. Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From alexandre at peadrop.com Sat Apr 12 21:05:39 2008 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Sat, 12 Apr 2008 15:05:39 -0400 Subject: [Python-3000] Inclusion of the optimized C version of io.BytesIO into Py3K's trunk Message-ID: Hello, Since I am free today, I would like to merge my work on io.BytesIO into Py3K's trunk. Antoine Pitrou reviewed my patch (see http://bugs.python.org/issue1751) and concluded that the new module looked fine. However, he couldn't say much about my changes to io.py and _fileio.c (i.e., the semantic change of truncate to imply a seek). So, I would appreciate it if someone would check io and _fileio fixes, before I commit the patch. 
Thank you, -- Alexandre From musiccomposition at gmail.com Sat Apr 12 21:30:23 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Sat, 12 Apr 2008 14:30:23 -0500 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> Message-ID: <1afaf6160804121230s62eef58dkf5a1b31f0204ba01@mail.gmail.com> > Please don't -- a Py3k warning makes no sense if the feature isn't really > going away in Py3k. Py3k warnings really should only warn about things > that are going to break in 3.0. > > If the decision is reached that such a warning makes sense, I'd propose > to only warn in an "extended Py3k warning mode" activated with -33. A Py3k warning is already an extended DeprecationWarning! Why don't we just give it a DeprecationWarning in 3.0? > > Georg -- Cheers, Benjamin Peterson From musiccomposition at gmail.com Sat Apr 12 23:07:32 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Sat, 12 Apr 2008 16:07:32 -0500 Subject: [Python-3000] Equality of range objects In-Reply-To: References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> Message-ID: <1afaf6160804121407o379aacabm61b34da9dc7e07e8@mail.gmail.com> If you're interested, I've implemented equality for range in issue 2603.
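For context, a minimal sketch of what range equality looks like in later Python 3 releases (3.3 onward, as far as I know), where two ranges compare equal when they describe the same sequence of values. Whether the patch in issue 2603 behaved exactly this way is not stated in this thread, so treat the example as an illustration of the idea only:

```python
# Same start, stop and step: equal.
same_args = range(0, 3) == range(0, 3)

# Different stop values, but both describe the sequence 0, 2.
same_values = range(0, 3, 2) == range(0, 4, 2)

# All empty ranges describe the same (empty) sequence.
both_empty = range(0) == range(2, 1, 3)

# Different sequences: not equal.
different = range(0, 3) == range(0, 4)
```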
-- Cheers, Benjamin Peterson From greg.ewing at canterbury.ac.nz Sun Apr 13 02:17:55 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 13 Apr 2008 12:17:55 +1200 Subject: [Python-3000] Recursive str In-Reply-To: References: <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> Message-ID: <48015133.4020105@canterbury.ac.nz> Guido van Rossum wrote: > We'd > need a third form (eek!) that would preserve the string quotes but be > more lenient about non-ASCII. Personally, I think some custom loop to > print the values is good enough. It might not be a serious problem when most of the chars in the string are ascii, but what about e.g. a Japanese user whose strings consist almost entirely of non-ascii, but are for the most part what constitutes perfectly readable text to them? They will have no straightforward way to display a list of strings in a readable form. I'm not sure what to do about that, though. Maybe some sort of locale setting that makes repr() of a string not escape chars that fall into some kind of "normal" set according to the user's native language? 
-- Greg From ncoghlan at gmail.com Sun Apr 13 08:05:16 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Apr 2008 16:05:16 +1000 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <1afaf6160804121230s62eef58dkf5a1b31f0204ba01@mail.gmail.com> References: <47FD5737.4060001@trueblade.com> <47FD5DD9.6000705@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> <1afaf6160804121230s62eef58dkf5a1b31f0204ba01@mail.gmail.com> Message-ID: <4801A29C.3020806@gmail.com> Benjamin Peterson wrote: >> Please don't -- a Py3k warning makes no sense if the feature isn't really >> going away in Py3k. Py3k warnings really should only warn about things >> that are going to break in 3.0. >> >> If the decision is reached that such a warning makes sense, I'd propose >> to only warn in an "extended Py3k warning mode" activated with -33. > A Py3k warning is already an extended DeprecationWarning! Why don't we > just give it a DeprecationWarning in 3.0? PendingDeprecationWarning: maybe. DeprecationWarning: no. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From divinekid at gmail.com Sun Apr 13 15:48:51 2008 From: divinekid at gmail.com (Haoyu Bai) Date: Sun, 13 Apr 2008 21:48:51 +0800 Subject: [Python-3000] [Python-Dev] Need help for SWIG's Python 3.0 backend In-Reply-To: <52dc1c820804112115sc2a270cueb6c7159913b916b@mail.gmail.com> References: <47FF9363.7060005@gmail.com> <52dc1c820804112115sc2a270cueb6c7159913b916b@mail.gmail.com> Message-ID: <48020F43.1050000@gmail.com> Gregory P.
Smith wrote: > -cc: python-dev > +cc: python-3000 > > Hi Haoyu, > > I'm glad someone wants to work on updating swig for python 3.x. A > better mailing list for python 3.x internals questions as you work on > this is the python-3000 at python.org list. > > The first place I suggest looking when you have a question is in the > Python trunk vs Python py3k branch source trees themselves (see > http://python.org/dev/ for instructions on how to check them out from > subversion). Take a look at how functions were used internally in the > Python/ and Objects/ subdirectories in trunk and take another look at > how similar stuff works in the py3k branch and maybe it'll give you > hints about what to do. > > It'd be ideal if swig could do its thing in the future without using any > undocumented/private APIs. > > -gps > Thanks to Gregory for pointing out my mistake and forwarding this mail to python-3000. I am really sorry for my mistake. I have already checked out the py3k branch and done some comparison with the Python 2.5.2 release. My knowledge of Python internals comes from the source code, and I will continue to study the documentation. But I am still worried about the lack of documentation, so I would like to make sure that some people are willing to help me whenever I encounter a really hard problem relating to Python internals. Thanks a lot! Best regards, Haoyu Bai 4/13/2008 From solipsis at pitrou.net Sun Apr 13 16:10:40 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 13 Apr 2008 14:10:40 +0000 (UTC) Subject: [Python-3000] Recursive str References: <47FAD002.8080306@canterbury.ac.nz> <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> Message-ID: Greg Ewing canterbury.ac.nz> writes: > > It might not be a serious problem when most of the chars in > the string are ascii, but what about e.g.
a Japanese user > whose strings consist almost entirely of non-ascii, but are > for the most part what constitutes perfectly readable text > to them? They will have no straightforward way to display > a list of strings in a readable form. How about print ",".join(mylist) ? > I'm not sure what to do about that, though. Maybe some > sort of locale setting that makes repr() of a string not > escape chars that fall into some kind of "normal" set > according to the user's native language? If it's only a problem with the interactive interpreter, perhaps it's just a matter of converting back \uXXXX and \xYY codes if possible (according to the detected terminal encoding) when outputting the result of an expression? From tnelson at onresolve.com Mon Apr 14 12:02:18 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Mon, 14 Apr 2008 03:02:18 -0700 Subject: [Python-3000] longobject.c and Windows x64 Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> On Windows x64, sizeof(size_t) > sizeof(long), so the existing PyLong_FromSsize_t and PyLong_FromSize_t implementations in longobject.c are just plain wrong. I've patched it as follows, but as I'm not well versed in the many intricacies of longobject.c, I'd appreciate input from others. As far as I can tell, we can use FromLong or FromLongLong (and their unsigned counterparts) exclusively; I don't see why we'd need to rely on _PyLong_FromByteArray as our incoming ival is never going to be larger than an ssize_t|size_t. (The _PyLong_FromByteArray is intended for incoming numbers that we don't know the size of up front, right?) 
Index: longobject.c
===================================================================
--- longobject.c (revision 62292)
+++ longobject.c (working copy)
@@ -1099,13 +1099,13 @@
 PyObject *
 PyLong_FromSsize_t(Py_ssize_t ival)
 {
-       Py_ssize_t bytes = ival;
-       int one = 1;
-       if (ival < PyLong_BASE)
-               return PyLong_FromLong(ival);
-       return _PyLong_FromByteArray(
-                       (unsigned char *)&bytes,
-                       SIZEOF_SIZE_T, IS_LITTLE_ENDIAN, 1);
+#if SIZEOF_SIZE_T == SIZEOF_LONG_LONG
+       return PyLong_FromLongLong(ival);
+#elif SIZEOF_SIZE_T == SIZEOF_LONG
+       return PyLong_FromLong(ival);
+#else
+#error "Expected SIZEOF_SIZE_T to equal SIZEOF_LONG_LONG or SIZEOF_LONG"
+#endif
 }
 
 /* Create a new long int object from a C size_t. */
@@ -1113,13 +1113,13 @@
 PyObject *
 PyLong_FromSize_t(size_t ival)
 {
-       size_t bytes = ival;
-       int one = 1;
-       if (ival < PyLong_BASE)
-               return PyLong_FromLong(ival);
-       return _PyLong_FromByteArray(
-                       (unsigned char *)&bytes,
-                       SIZEOF_SIZE_T, IS_LITTLE_ENDIAN, 0);
+#if SIZEOF_SIZE_T == SIZEOF_LONG_LONG
+       return PyLong_FromUnsignedLongLong(ival);
+#elif SIZEOF_SIZE_T == SIZEOF_LONG
+       return PyLong_FromUnsignedLong(ival);
+#else
+#error "Expected SIZEOF_SIZE_T to equal SIZEOF_LONG_LONG or SIZEOF_LONG"
+#endif
 }
 
 /* Get a C PY_LONG_LONG int from a long int object.

From ishimoto at gembook.org Mon Apr 14 12:33:49 2008
From: ishimoto at gembook.org (atsuo ishimoto)
Date: Mon, 14 Apr 2008 19:33:49 +0900
Subject: [Python-3000] Recursive str
In-Reply-To: <48015133.4020105@canterbury.ac.nz>
References: <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz>
Message-ID: <797440730804140333ha0f5262i9f8b191ecd78cc4c@mail.gmail.com>

2008/4/13, Greg Ewing :
> I'm not sure what to do about that, though.
Maybe some > sort of locale setting that makes repr() of a string not > escape chars that fall into some kind of "normal" set > according to the user's native language? > Here's my idea. repr() cannot convert 'unprintable characters', since repr() doesn't know which characters are printable or not. So, repr() should convert only non-printable "ASCII" characters and special "ASCII" characters such as '\n". Other non-ASCII characters are converted by output file. Output files can convert non-printable characters to \uXXXX. Such conversions could be implemented by backslashreplace error handler. I wrote a quick patch. Please take a look at http://bugs.python.org/issue2630 . From tnelson at onresolve.com Mon Apr 14 12:19:55 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Mon, 14 Apr 2008 03:19:55 -0700 Subject: [Python-3000] longobject.c and Windows x64 In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D2014C@EXMBX04.exchhosting.com> > On Windows x64, sizeof(size_t) > sizeof(long), so the > existing PyLong_FromSsize_t and PyLong_FromSize_t > implementations in longobject.c are just plain wrong. There are a whole bunch of other areas where longobject.c could do with casts to silence Windows x64 compiler warnings (where it's safe to cast), and a few areas where we're going to be losing bits where we shouldn't be (i.e. long_getN() calling PyLong_FromLong() against Py_intptr_t). Should I just go through and commit fixes where I find issues, or would people prefer something in tracker or discussion on the list beforehand? Trent. 
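The size mismatch described in the longobject.c messages can be checked from Python itself. What follows is only a sketch using the ctypes module: it compares the width of size_t with long and long long on the running platform, in the same order as the #if/#elif selection in the proposed patch. The helper name pick_converter is made up for this illustration and is not part of longobject.c.

```python
import ctypes

def pick_converter():
    """Name the PyLong_From* function whose C type is as wide as size_t.

    Mirrors the preprocessor test order in the proposed patch:
    long long is tried first, then long.
    """
    size_t_bytes = ctypes.sizeof(ctypes.c_size_t)
    if size_t_bytes == ctypes.sizeof(ctypes.c_longlong):
        return "PyLong_FromLongLong"
    if size_t_bytes == ctypes.sizeof(ctypes.c_long):
        return "PyLong_FromLong"
    # Corresponds to the #error branch of the patch.
    raise RuntimeError("size_t matches neither long nor long long")

print("sizeof(long)      =", ctypes.sizeof(ctypes.c_long))
print("sizeof(long long) =", ctypes.sizeof(ctypes.c_longlong))
print("sizeof(size_t)    =", ctypes.sizeof(ctypes.c_size_t))
print("converter         =", pick_converter())
```

On a platform where long is narrower than size_t (the Windows x64 case), the first branch is the one taken.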
From guido at python.org Mon Apr 14 20:05:45 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 14 Apr 2008 11:05:45 -0700 Subject: [Python-3000] Recursive str In-Reply-To: <48015133.4020105@canterbury.ac.nz> References: <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> Message-ID: On Sat, Apr 12, 2008 at 5:17 PM, Greg Ewing wrote: > Guido van Rossum wrote: > > We'd > > need a third form (eek!) that would preserve the string quotes but be > > more lenient about non-ASCII. Personally, I think some custom loop to > > print the values is good enough. > > It might not be a serious problem when most of the chars in > the string are ascii, but what about e.g. a Japanese user > whose strings consist almost entirely of non-ascii, but are > for the most part what constitutes perfectly readable text > to them? They will have no straightforward way to display > a list of strings in a readable form. A complaint about this would carry more weight when it came from someone who actually has to deal with the issue than coming from a purely theoretical perspective (unless I'm wrong and you actually read Japanese). Another issue is that repr() is supposed to return an 8-bit string. I don't think we should put non-ASCII characters in the output in some encoding. > I'm not sure what to do about that, though. Maybe some > sort of locale setting that makes repr() of a string not > escape chars that fall into some kind of "normal" set > according to the user's native language? That would be worse. Making repr() non-predictable and locale-specific? Eeeek! In Py3k we may be able to do something else though -- instead of insisting on ASCII we could allow a much larger set of characters to be unescaped. 
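To make the two approaches mentioned in this thread concrete (joining the strings directly, and escaping only at output time with the backslashreplace error handler), here is a small sketch. It runs on a present-day Python 3, not on the interpreter state under discussion, and the sample strings are arbitrary.

```python
# Arbitrary sample data: one non-ASCII string, one ASCII string with a tab.
sample = ["\u77f3\u672c", "tab\there"]

# Joining the strings directly keeps every character unescaped.
joined = ",".join(sample)

# Escaping at the output boundary: backslashreplace rewrites only the
# characters that the target encoding (ASCII here) cannot represent.
# ASCII control characters such as the tab pass through untouched.
escaped = joined.encode("ascii", "backslashreplace").decode("ascii")

print(escaped)
```

Note that this handler leaves ASCII control characters alone, so it covers only the non-ASCII half of the escaping question.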
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
From guido at python.org Mon Apr 14 21:05:48 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 14 Apr 2008 12:05:48 -0700 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <4801A29C.3020806@gmail.com> References: <47FD5737.4060001@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> <1afaf6160804121230s62eef58dkf5a1b31f0204ba01@mail.gmail.com> <4801A29C.3020806@gmail.com> Message-ID: I thought I had a reasonable proposal: deprecate in 3.1, remove in 3.3. Adding a PendingDeprecationWarning in 3.0 would be fine. Doing anything in 2.6 would not be fine, except perhaps making it a PendingDeprecationWarning when -3 is given. On Sat, Apr 12, 2008 at 11:05 PM, Nick Coghlan wrote: > Benjamin Peterson wrote: > >> Please don't -- a Py3k warning makes no sense if the feature isn't really > >> going away in Py3k. Py3k warnings really should only warn about things > >> that are going to break in 3.0. > >> > >> If the decision is reached that such a warning makes sense, I'd propose > >> to only warn in an "extended Py3k warning mode" activated with -33. > > A Py3k warning is already an extended DeprecationWarning! Why don't we > > just give it a DeprecationWarning in 3.0? > > PendingDeprecationWarning: maybe. > DeprecationWarning: no. > > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > http://www.boredomandlaziness.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/)
From musiccomposition at gmail.com Mon Apr 14 23:26:48 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Mon, 14 Apr 2008 16:26:48 -0500 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: References: <47FD5737.4060001@trueblade.com> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> <1afaf6160804121230s62eef58dkf5a1b31f0204ba01@mail.gmail.com> <4801A29C.3020806@gmail.com> Message-ID: <1afaf6160804141426v55fdac73u441a73fff936fa9b@mail.gmail.com> On Mon, Apr 14, 2008 at 2:05 PM, Guido van Rossum wrote: > I thought I had a reasonable proposal: deprecate in 3.1, remove in > 3.3. Adding a PendingDeprecationWarning in 3.0 would be fine. Doing > anything in 2.6 would not be fine, except perhaps making it a > PendingDeprecationWarning when -3 is given. I'm working on a patch for this. However, the exception system must call PyString_Format, because a warning can cause infinite recursion. What do you recommend?
-- Cheers, Benjamin Peterson
From dickinsm at gmail.com Tue Apr 15 00:38:59 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 14 Apr 2008 18:38:59 -0400 Subject: [Python-3000] longobject.c and Windows x64 In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> Message-ID: <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> On Mon, Apr 14, 2008 at 6:02 AM, Trent Nelson wrote: > > On Windows x64, sizeof(size_t) > sizeof(long), so the existing > PyLong_FromSsize_t and PyLong_FromSize_t implementations in longobject.c are > just plain wrong. I've patched it as follows, but as I'm not well versed in > the many intricacies of longobject.c, I'd appreciate input from others. > I'm missing something: in what way are the existing implementations wrong? I see that the test (ival < PyLong_BASE) in PyLong_FromSsize_t should be something like: (ival < PyLong_BASE && ival > -PyLong_BASE), but PyLong_FromSize_t looks okay to me. (Apart from the unused "int one = 1;", that is.) I agree that it's a little odd to go via _PyLong_FromByteArray. Couldn't PyLong_FromSsize_t be written to exactly mimic PyLong_FromLongLong? It means duplication of code, I know, but it also means not relying on ssize_t being equal to either long or long long. By the way, I don't much like the handling of negative values in PyLong_FromLong and PyLong_FromLongLong: these functions use code like:

    if (ival < 0) {
        ival = -ival;
        negative = 1;
    }

which looks to me as though it might mishandle the case where ival = LONG_MIN. Should this be fixed? Mark
From tnelson at onresolve.com Tue Apr 15 02:01:04 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Mon, 14 Apr 2008 17:01:04 -0700 Subject: [Python-3000] longobject.c and Windows x64 In-Reply-To: <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> > > On Mon, Apr 14, 2008 at 6:02 AM, Trent Nelson wrote: > > > > On Windows x64, sizeof(size_t) > sizeof(long), so the existing > > PyLong_FromSsize_t and PyLong_FromSize_t implementations in longobject.c > > are just plain wrong. I've patched it as follows, but as I'm not well > > versed in the many intricacies of longobject.c, I'd appreciate input > > from others. > > I'm missing something: in what way are the existing implementations > wrong? I see that the test (ival < PyLong_BASE) in > PyLong_FromSsize_t should be something like: > (ival < PyLong_BASE && ival > -PyLong_BASE), Yeah, that's the 'wrong' part I was referring to. I guess I wanted to bring that issue up as well as question the actual implementation. For example, if we fixed the if statement, we'd have something looking like:

    PyObject *
    PyLong_FromSsize_t(Py_ssize_t ival)
    {
        Py_ssize_t bytes = ival;
        int one = 1;
        if (ival < PyLong_BASE && ival > -PyLong_BASE)
            return PyLong_FromLong(ival);
        return _PyLong_FromByteArray(
            (unsigned char *)&bytes,
            SIZEOF_SIZE_T, IS_LITTLE_ENDIAN, 1);
    }

I don't understand why we'd be interested in testing for the PyLong_FromLong() shortcut, then reverting to _PyLong_FromByteArray(), when we know we're always going to be dealing with a Py_ssize_t. Speed? Safety? Cut and paste that went awry? Why not just call the correct PyLong_FromLong(Long)() depending on sizeof(size_t) and be done with it?
> Couldn't PyLong_FromSsize_t be written to exactly mimic > PyLong_FromLongLong? It means duplication of code, I know, > but it also means not relying on ssize_t being equal to either > long or long long. Surely, if we're guarding properly with #error in the case where sizeof(size_t) not in (sizeof(long), sizeof(Py_LONG_LONG)), reusing existing methods that do exactly what we want to do would be better than mimicking them? > By the way, I don't much like the handling of negative > values in PyLong_FromLong and PyLong_FromLongLong: > these functions use code like: > > if (ival < 0) { > ival = -ival; > negative = 1; > } > > which looks to me as though it might mishandle the case > where ival = LONG_MIN. Should this be fixed? Ah, interesting. Stepped through PyLong_FromLong via the following: Python 3.0a4+ (py3k, Apr 14 2008, 18:44:17) [MSC v.1500 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import _testcapi as t >>> t.getargs_l(t.LONG_MIN) -2147483648 >>> t.getargs_l(t.LONG_MIN+1) -2147483647 When ival == LONG_MIN, the 'ival = -ival' statement doesn't have any effect on the value of ival, it stays as LONG_MIN. (With LONG_MIN+1 though, ival does correctly get cast back into the positive realm...) This isn't causing a problem (at least not on Windows) as ival's cast to an unsigned long later in the method -- I wonder if all platforms/compilers silently ignore 'ival = -ival' when at LONG_MIN though... Trent. 
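The LONG_MIN case can be modelled in plain Python. This is a sketch under stated assumptions: a 32-bit two's-complement long (as on Windows x64), with wrap_signed, naive_negate and safe_magnitude as illustrative names rather than CPython functions. In C the naive negation is a signed overflow, so the wraparound modelled here is what common hardware does, not something the language guarantees.

```python
BITS = 32  # assumed width of C long (true on Windows x64)
LONG_MIN = -(1 << (BITS - 1))
LONG_MAX = (1 << (BITS - 1)) - 1

def wrap_signed(value, bits=BITS):
    """Reduce value modulo 2**bits into the signed two's-complement range."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):      # top bit set means negative
        value -= 1 << bits
    return value

def naive_negate(ival):
    """Model 'ival = -ival' stored back into a signed long."""
    return wrap_signed(-ival)

def safe_magnitude(ival):
    """Magnitude of ival as an unsigned value, without ever negating LONG_MIN.

    For negative ival, -1 - ival lies in [0, LONG_MAX], so it fits in a
    signed long; the final +1 is then done in the unsigned type.
    """
    if ival < 0:
        return (-1 - ival) + 1
    return ival

print(naive_negate(LONG_MIN))      # stays at LONG_MIN
print(naive_negate(LONG_MIN + 1))  # becomes LONG_MAX
print(safe_magnitude(LONG_MIN))
```

naive_negate(LONG_MIN) reproduces the unchanged value described for the stepped-through session, while safe_magnitude never forms an intermediate outside the signed range.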
From dickinsm at gmail.com Tue Apr 15 02:29:31 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 14 Apr 2008 20:29:31 -0400 Subject: [Python-3000] longobject.c and Windows x64 In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> Message-ID: <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> On Mon, Apr 14, 2008 at 8:01 PM, Trent Nelson wrote: > > Yeah, that's the 'wrong' part I was referring to. I guess I wanted to > bring that issue up as well as question the actual implementation. For > example, if we fixed the if statement, we'd having something looking like: > > PyObject * > PyLong_FromSsize_t(Py_ssize_t ival) > { > Py_ssize_t bytes = ival; > int one = 1; > if (ival < PyLong_BASE && ival > -PyLong_BASE) > return PyLong_FromLong(ival); > return _PyLong_FromByteArray( > (unsigned char *)&bytes, > SIZEOF_SIZE_T, IS_LITTLE_ENDIAN, 1); > } > > I don't understand why we'd be interested in testing for the > PyLong_FromLong() shortcut, then reverting to _PyLong_FromByteArray(), when > we know we're always going to be dealing with a Py_ssize_t. Speed? Safety? > Cut and paste that went awry? Why not just call the correct > PyLong_FromLong(Long)() depending on sizeof(size_t) and be done with it? > The extra tests aren't in the trunk version of longobject.c; It looks to me as though they're the result of merging the 2.x longobject.c and intobject.c to produce the 3.0 longobject.c. I also notice that _PyLong_FromByteArray doesn't do a CHECK_SMALL_INT, while PyLong_FromLong does. Perhaps this is the reason for the extra test? I agree it would be simpler to just use PyLong_FromLong or PyLong_FromLongLong. 
> > Surely, if we're guarding properly with #error in the case where > sizeof(size_t) not in (sizeof(long), sizeof(Py_LONG_LONG)), reusing existing > methods that do exactly what we want to do would be better than mimicking > them? > Fair enough. My twisted mind was trying to find ways that size_t might be something other than long or long long, but that seems unlikely... > > Ah, interesting. Stepped through PyLong_FromLong via the following: > > Python 3.0a4+ (py3k, Apr 14 2008, 18:44:17) [MSC v.1500 64 bit (AMD64)] on > win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import _testcapi as t > >>> t.getargs_l(t.LONG_MIN) > -2147483648 > >>> t.getargs_l(t.LONG_MIN+1) > -2147483647 > > When ival == LONG_MIN, the 'ival = -ival' statement doesn't have any > effect on the value of ival, it stays as LONG_MIN. (With LONG_MIN+1 though, > ival does correctly get cast back into the positive realm...) This isn't > causing a problem (at least not on Windows) as ival's cast to an unsigned > long later in the method -- I wonder if all platforms/compilers silently > ignore 'ival = -ival' when at LONG_MIN though... > Right: I think it's technically undefined behaviour (a signed arithmetic overflow) that nevertheless ends up doing the right thing on most (all?) compilers. I think it should be fixed. Something like (untested) if (ival < 0) { t = (unsigned long)(-1-ival) + 1; } else { t = (unsigned long)ival; } should be safe from overflow (including on machines with a ones' complement or sign-magnitude representation of negative integers---do any such machines exist any more?). Shall I check in a fix? Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-3000/attachments/20080414/0114d8e8/attachment.htm From greg.ewing at canterbury.ac.nz Tue Apr 15 03:23:56 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 15 Apr 2008 13:23:56 +1200 Subject: [Python-3000] Recursive str In-Reply-To: References: <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> Message-ID: <480403AC.7080100@canterbury.ac.nz> Guido van Rossum wrote: > A complaint about this would carry more weight when it came from > someone who actually has to deal with the issue It's not a complaint, just something I thought of. If Japanese programmers aren't actually bothered by this, then I'm not either. > Another issue is that repr() is supposed to return an 8-bit string. If that's still true by definition in the new unicode-only world, then I guess there's no problem. But what do you mean by an "8-bit string" in py3k? A unicode string with all char codes <= 255, or a byte array? If the former, what's the rationale for making that 8 bits and not 7? I'm just trying to understand how the old rules and conventions translate to the new world. -- Greg From dickinsm at gmail.com Tue Apr 15 03:58:23 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 14 Apr 2008 21:58:23 -0400 Subject: [Python-3000] Should int() and float() accept bytes? Message-ID: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> This is a repeat of a question that came up on the "Decimal(unicode)" thread a little while ago. I think it needs an answer, so I'm reposting it in its own thread. I couldn't find any other previous discussion of this; apologies if I'm rehashing old issues. Currently, int() and float() accept bytes instances. 
For example:

>>> int(bytes([49, 50, 51]))
123
[40381 refs]
>>> int(b'123')
123
[40381 refs]

Philosophically, this seems wrong: it's not clear why bytes([49, 50, 51]) should represent an integer, or even which integer it should represent; if it's intended that the bytes sequence be thought of as an ascii string then really it should be explicitly decoded as such first:

>>> int(b'123'.decode('ascii'))
123

On the other hand, there's at least some sense in which bytes already acts as a sort of poor-man's string: witness bytes.lower and friends. Maybe practicality beats purity here? What do people think about changing the int() and float() constructors so that they don't accept bytes? I experimented with removing int(bytes) and int(bytearray) support in longobject.c's long_new and in PyNumber_Long in abstract.c, to see how much breakage occurred. The results: 11 tests failed: test_email test_httplib test_io test_mimetools test_pickle test_pickletools test_random test_smtplib test_sqlite test_tarfile test_uu (random.py needed some patching to get the test-suite to run in the first place.) None of the breakage looks particularly serious or difficult to fix. I haven't tried removing float(bytes) support yet. See also http://bugs.python.org/issue2483 Mark
From ishimoto at gembook.org Tue Apr 15 04:07:05 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Tue, 15 Apr 2008 11:07:05 +0900 Subject: [Python-3000] Recursive str In-Reply-To: <480403AC.7080100@canterbury.ac.nz> References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <480403AC.7080100@canterbury.ac.nz> Message-ID: <797440730804141907x367a1198vdc85b84ed603a1ec@mail.gmail.com> 2008/4/15, Greg Ewing : > Guido van Rossum wrote: > > > A complaint about this would carry more weight when it came from > > someone who actually has to deal with the issue > > > It's not a complaint, just something I thought of. If > Japanese programmers aren't actually bothered by this, > then I'm not either. > I'm Japanese, and this has bothered me for years! For a long time I maintained a patch to print Japanese strings correctly. In Japan (and China, Korea and many other countries), most text data consists of non-Latin characters. For example, my name 'Atsuo Ishimoto' is written in four Kanji characters ("\u77f3\u672c\u6566\u592b"), and my address 'Koshigaya city, Saitama' is also written in kanji characters. A custom for-loop works for debugging, but I definitely prefer a Unicode-friendly repr().
From ishimoto at gembook.org Tue Apr 15 05:04:38 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Tue, 15 Apr 2008 12:04:38 +0900 Subject: [Python-3000] Recursive str In-Reply-To: References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <797440730804140333ha0f5262i9f8b191ecd78cc4c@mail.gmail.com> Message-ID: <797440730804142004t389e885bwbb0a5aea41611b41@mail.gmail.com> 2008/4/14, Michael Urman : > This theory sounds good to me.
Should it perhaps also convert Unicode > whitespace and control characters (categories Z* and C*)? While these > will often still be printable, like \n and \t they may not be > distinguishable from some number of ASCII spaces in printed form. It would be nice, but make result of repr() less predictable a bit, since result of repr() depends on version of Unicode spec, not Python language. I'm not sure it is harmful or not, but having a list of characters converted in repr() (e.g. sys.nonprintablechars) might help. From guido at python.org Tue Apr 15 05:12:26 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 14 Apr 2008 20:12:26 -0700 Subject: [Python-3000] Recursive str In-Reply-To: <797440730804142004t389e885bwbb0a5aea41611b41@mail.gmail.com> References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <797440730804140333ha0f5262i9f8b191ecd78cc4c@mail.gmail.com> <797440730804142004t389e885bwbb0a5aea41611b41@mail.gmail.com> Message-ID: On Mon, Apr 14, 2008 at 8:04 PM, atsuo ishimoto wrote: > 2008/4/14, Michael Urman : > > > This theory sounds good to me. Should it perhaps also convert Unicode > > whitespace and control characters (categories Z* and C*)? While these > > will often still be printable, like \n and \t they may not be > > distinguishable from some number of ASCII spaces in printed form. > > It would be nice, but make result of repr() less predictable a bit, > since result > of repr() depends on version of Unicode spec, not Python language. > I'm not sure it is harmful or not, but having a list of characters converted > in repr() (e.g. sys.nonprintablechars) might help. I wouldn't worry too much about the version of the Unicode standard. We have to do real work to start using a new version of the standard anyway (like generating new data files) so this is unlikely to be causing surprise failures. 
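One way to picture the suggestion about categories Z* and C* is the sketch below, which uses the standard unicodedata module. The function name, the choice to leave the plain ASCII space alone, and the escape spellings (\xNN, \uNNNN, \UNNNNNNNN, which differ from repr's \n and \t) are all assumptions made for illustration. The result depends on the Unicode database shipped with the interpreter, which is the predictability concern raised in this exchange.

```python
import unicodedata

def escape_separators_and_controls(text):
    """Escape characters whose Unicode category starts with Z or C.

    Z* covers space, line and paragraph separators; C* covers control,
    format, surrogate, private-use and unassigned code points.
    The ordinary ASCII space is kept readable (an assumption of this sketch).
    """
    pieces = []
    for ch in text:
        category = unicodedata.category(ch)
        if ch != " " and category[0] in "ZC":
            code = ord(ch)
            if code <= 0xFF:
                pieces.append("\\x%02x" % code)
            elif code <= 0xFFFF:
                pieces.append("\\u%04x" % code)
            else:
                pieces.append("\\U%08x" % code)
        else:
            pieces.append(ch)
    return "".join(pieces)

# U+3000 IDEOGRAPHIC SPACE is category Zs, so it is escaped;
# the surrounding ASCII letters are left as they are.
print(escape_separators_and_controls("a\u3000b"))
```

Letters such as kanji (category Lo) pass through unchanged, so text made mostly of non-ASCII characters stays readable.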
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From nnorwitz at gmail.com Tue Apr 15 05:59:43 2008 From: nnorwitz at gmail.com (Neal Norwitz) Date: Mon, 14 Apr 2008 20:59:43 -0700 Subject: [Python-3000] [Python-Dev] Need help for SWIG's Python 3.0 backend In-Reply-To: <48020F43.1050000@gmail.com> References: <47FF9363.7060005@gmail.com> <52dc1c820804112115sc2a270cueb6c7159913b916b@mail.gmail.com> <48020F43.1050000@gmail.com> Message-ID: On Sun, Apr 13, 2008 at 6:48 AM, Haoyu Bai wrote: > Gregory P. Smith wrote: > > -cc: python-dev > > +cc: python-3000 > > Thanks for Gregory to point out my mistake and forward this mail to > python-3000. I really feel sorry for my mistake. Don't worry about it. > I have already checked out py3k branch and done some comparison with the > Python 2.5.2 release. My knowledge about Python internal is fetched from > the source code. And I will continue to study the document. But I am That's the best way to learn! > still afraid of the lacking of document. Yeah, I'm sure we should have more documentation. You could really help the situation by documenting all the things you learn. That way everyone else will be able to learn from you. > So I would like to make sure that some people is willing to help me > whenever I encountered a really hard problem relating to Python internal. Sure. It would be best to start by asking your mentor(s). Sometimes a question will be appropriate to ask on comp.lang.python. The mentors should be able to help you decide where to post. If you send a message here that others feel is not appropriate, you will be pointed at comp.lang.python. Cheers, n From martin at v.loewis.de Tue Apr 15 07:56:56 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 15 Apr 2008 07:56:56 +0200 Subject: [Python-3000] Should int() and float() accept bytes? 
In-Reply-To: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> References: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> Message-ID: <480443A8.9050507@v.loewis.de> > Philosophically, this seems wrong I agree. Regards, Martin
From tnelson at onresolve.com Tue Apr 15 08:59:55 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Mon, 14 Apr 2008 23:59:55 -0700 Subject: [Python-3000] longobject.c and Windows x64 In-Reply-To: <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C93@EXMBX04.exchhosting.com> > Shall I check in a fix? Be my guest (I take it you'll address the original PyLong_FromSsize_t/FromSize_t issue that I posted a patch for, right?). Trent.
From abpillai at gmail.com Tue Apr 15 10:04:10 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 15 Apr 2008 13:34:10 +0530 Subject: [Python-3000] Equality of range objects In-Reply-To: <1afaf6160804121407o379aacabm61b34da9dc7e07e8@mail.gmail.com> References: <8548c5f30804080625h5a6d35f8ha0b1abc76008f5f7@mail.gmail.com> <1afaf6160804081327y5f9a16cbta48f4399c4888fcb@mail.gmail.com> <1afaf6160804081434y52c3dec4yf3baf54b70bfd306@mail.gmail.com> <1afaf6160804081449m17e0fbbbo921151f1bc292d2b@mail.gmail.com> <8548c5f30804082345s71e592dby3262554f904bae10@mail.gmail.com> <1afaf6160804121407o379aacabm61b34da9dc7e07e8@mail.gmail.com> Message-ID: <8548c5f30804150104o1a6dcbe8nc2fd640394ca6f46@mail.gmail.com> Good to see this. Thanks! On Sun, Apr 13, 2008 at 2:37 AM, Benjamin Peterson wrote: > If you're interested, I've implemented equality for range in issue 2603. > > > -- > Cheers, > Benjamin Peterson > -- -Anand
From abpillai at gmail.com Tue Apr 15 10:43:03 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 15 Apr 2008 14:13:03 +0530 Subject: [Python-3000] Trunk broken ? Message-ID: <8548c5f30804150143j60300e2bgc7baa60628619671@mail.gmail.com> Hi list, After updating py3k trunk to the most recent svn version (62349) and building Python, I am getting the following error while trying to start the interpreter.
[anand at localhost ~]$ /usr/local/bin/python Fatal Python error: Py_Initialize: can't initialize sys standard streams Traceback (most recent call last): File "/usr/local/lib/python3.0/io.py", line 63, in import warnings File "/usr/local/lib/python3.0/warnings.py", line 280, in bytes_warning = sys.flags.bytes_warning AttributeError: 'sys.flags' object has no attribute 'bytes_warning' Aborted I am running Fedora Core 6, with kernel 2.6.22.7-57.fc6 on an Intel i686 SMP box. Python was compiled with gcc version 4.1.2 (Redhat - 4.1.2-13). Python was built with no additional ./configure flags . Thanks -- -Anand From abpillai at gmail.com Tue Apr 15 10:44:28 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 15 Apr 2008 14:14:28 +0530 Subject: [Python-3000] Trunk broken ? In-Reply-To: <8548c5f30804150143j60300e2bgc7baa60628619671@mail.gmail.com> References: <8548c5f30804150143j60300e2bgc7baa60628619671@mail.gmail.com> Message-ID: <8548c5f30804150144l6510be36m537e188e66f262cd@mail.gmail.com> When I said "trunk" I meant the py3k branch i.e http://svn.python.org/projects/python/branches/py3k . Thanks On Tue, Apr 15, 2008 at 2:13 PM, Anand Balachandran Pillai wrote: > Hi list, > > After updating py3k trunk to the most recent svn version > (62349) and building Python, I am getting the following > error while trying to start the interpreter. > > [anand at localhost ~]$ /usr/local/bin/python > Fatal Python error: Py_Initialize: can't initialize sys standard streams > Traceback (most recent call last): > File "/usr/local/lib/python3.0/io.py", line 63, in > import warnings > File "/usr/local/lib/python3.0/warnings.py", line 280, in > bytes_warning = sys.flags.bytes_warning > AttributeError: 'sys.flags' object has no attribute 'bytes_warning' > Aborted > > I am running Fedora Core 6, with kernel 2.6.22.7-57.fc6 on > an Intel i686 SMP box. Python was compiled with gcc > version 4.1.2 (Redhat - 4.1.2-13). Python was built with > no additional ./configure flags . 
> > Thanks > > -- > -Anand > -- -Anand From eric+python-dev at trueblade.com Tue Apr 15 10:55:04 2008 From: eric+python-dev at trueblade.com (Eric Smith) Date: Tue, 15 Apr 2008 04:55:04 -0400 Subject: [Python-3000] Trunk broken ? In-Reply-To: <8548c5f30804150143j60300e2bgc7baa60628619671@mail.gmail.com> References: <8548c5f30804150143j60300e2bgc7baa60628619671@mail.gmail.com> Message-ID: <48046D68.1090302@trueblade.com> Anand Balachandran Pillai wrote: > Hi list, > > After updating py3k trunk to the most recent svn version > (62349) and building Python, I am getting the following > error while trying to start the interpreter. > > [anand at localhost ~]$ /usr/local/bin/python > Fatal Python error: Py_Initialize: can't initialize sys standard streams > Traceback (most recent call last): > File "/usr/local/lib/python3.0/io.py", line 63, in > import warnings > File "/usr/local/lib/python3.0/warnings.py", line 280, in > bytes_warning = sys.flags.bytes_warning > AttributeError: 'sys.flags' object has no attribute 'bytes_warning' > Aborted > > I am running Fedora Core 6, with kernel 2.6.22.7-57.fc6 on > an Intel i686 SMP box. Python was compiled with gcc > version 4.1.2 (Redhat - 4.1.2-13). Python was built with > no additional ./configure flags . I don't have this problem on my 2.6.20-1.2962.fc6 box, also Intel SMP, also using gcc-4.1.2-13. It works with either './configure' or './configure --with-pydebug'. $ ./python Python 3.0a4+ (py3k:62349M, Apr 15 2008, 04:48:00) [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.flags.bytes_warning 0 Eric. From abpillai at gmail.com Tue Apr 15 11:06:14 2008 From: abpillai at gmail.com (Anand Balachandran Pillai) Date: Tue, 15 Apr 2008 14:36:14 +0530 Subject: [Python-3000] Trunk broken ? 
In-Reply-To: <48046D68.1090302@trueblade.com> References: <8548c5f30804150143j60300e2bgc7baa60628619671@mail.gmail.com> <48046D68.1090302@trueblade.com> Message-ID: <8548c5f30804150206n565e1515l4ccfb66f4a157301@mail.gmail.com> Thanks. Perhaps it is a false alarm. I will test again. Regards On Tue, Apr 15, 2008 at 2:25 PM, Eric Smith wrote: > > Anand Balachandran Pillai wrote: > > > Hi list, > > > > After updating py3k trunk to the most recent svn version > > (62349) and building Python, I am getting the following > > error while trying to start the interpreter. > > > > [anand at localhost ~]$ /usr/local/bin/python > > Fatal Python error: Py_Initialize: can't initialize sys standard streams > > Traceback (most recent call last): > > File "/usr/local/lib/python3.0/io.py", line 63, in > > import warnings > > File "/usr/local/lib/python3.0/warnings.py", line 280, in > > bytes_warning = sys.flags.bytes_warning > > AttributeError: 'sys.flags' object has no attribute 'bytes_warning' > > Aborted > > > > I am running Fedora Core 6, with kernel 2.6.22.7-57.fc6 on > > an Intel i686 SMP box. Python was compiled with gcc > > version 4.1.2 (Redhat - 4.1.2-13). Python was built with > > no additional ./configure flags . > > > > I don't have this problem on my 2.6.20-1.2962.fc6 box, also Intel SMP, also > using gcc-4.1.2-13. It works with either './configure' or './configure > --with-pydebug'. > > $ ./python > Python 3.0a4+ (py3k:62349M, Apr 15 2008, 04:48:00) > [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import sys > >>> sys.flags.bytes_warning > 0 > > Eric. 
> -- -Anand From solipsis at pitrou.net Tue Apr 15 11:26:49 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 15 Apr 2008 09:26:49 +0000 (UTC) Subject: [Python-3000] =?utf-8?b?c2l6ZW9mKHNpemVfdCkgPCBzaXplb2YobG9uZyk=?= References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> Message-ID: Mark Dickinson gmail.com> writes: > Fair enough. My twisted mind was trying to find ways that size_t > might be something other than long or long long, but that > seems unlikely... There has been a report where sizeof(size_t) < sizeof(long). It breaks things in the dict implementation: http://bugs.python.org/issue1646068 ? On the system I'm porting to, ints and pointers (and ssize_t) are 32-bit, but longs and long longs are 64-bit. ? From ncoghlan at gmail.com Tue Apr 15 11:42:01 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 15 Apr 2008 19:42:01 +1000 Subject: [Python-3000] Should int() and float() accept bytes? In-Reply-To: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> References: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> Message-ID: <48047869.1060101@gmail.com> Mark Dickinson wrote: > On the other hand, there's at least some sense in which bytes already > acts as a sort of poor-man's string: witness bytes.lower and friends. > Maybe practicality beats purity here? From PEP 358 (describing what is now bytearray): """Note the conspicuous absence of .isupper(), .upper(), and friends. (But see "Open Issues" below.) 
There is no .__hash__() because the object is mutable.""" And the open issue: """A case could even be made for supporting .islower(), .isupper(), .isspace(), .isalpha(), .isalnum(), .isdigit() and the corresponding conversions (.lower() etc.), using the ASCII definitions for letters, digits and whitespace. If this is accepted, the cases for .ljust(), .rjust(), .center() and .split() become much stronger, and they should have default arguments as well, using an ASCII space or all ASCII whitespace (for .split()).""" PEP 3137 resolved that open issue as follows: """This is exactly the set of methods present on the str type in Python 2.x, with the exclusion of .encode(). The signatures and semantics are the same too. However, whenever character classes like letter, whitespace, lower case are used, the ASCII definitions of these classes are used.""" That seems fairly explicit to me in saying that a bytes or bytearray object should be considered to be ASCII encoded when treated as a string. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From tnelson at onresolve.com Tue Apr 15 13:04:31 2008 From: tnelson at onresolve.com (Trent Nelson) Date: Tue, 15 Apr 2008 04:04:31 -0700 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> Message-ID: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> > Mark Dickinson gmail.com> writes: > > Fair enough. My twisted mind was trying to find ways that size_t > > might be something other than long or long long, but that seems > > unlikely... > > There has been a report where sizeof(size_t) < sizeof(long).
> It breaks things in the dict implementation: > http://bugs.python.org/issue1646068 > > < On the system I'm porting to, ints and pointers (and > ssize_t) are 32-bit, but longs and long longs are 64-bit. > I wonder what system that is; sizeof(size_t) & sizeof(void *) < sizeof(long|long long) is quite peculiar, no? Trent. From ncoghlan at gmail.com Tue Apr 15 16:14:23 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 16 Apr 2008 00:14:23 +1000 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> Message-ID: <4804B83F.5010504@gmail.com> Trent Nelson wrote: >> Mark Dickinson gmail.com> writes: >>> Fair enough. My twisted mind was trying to find ways that size_t >>> might be something other than long or long long, but that seems >>> unlikely... >> There has been a report where sizeof(size_t) < sizeof(long). >> It breaks things in the dict implementation: >> http://bugs.python.org/issue1646068 >> >> < On the system I'm porting to, ints and pointers (and >> ssize_t) are 32-bit, but longs and long longs are 64-bit. > > > I wonder what system that is; sizeof(size_t) & sizeof(void *) < sizeof(long|long long) is quite peculiar, no? I've worked on a DSP where TI were forced to define a 'byte' as 16 bits long because the smallest chunk of memory you could address was 16 bits and the C standard says that sizeof(char) == 1 byte. Fortunately most documentation for that chip talked about chars or MAUs (Minimum Addressable Units) instead of confusing everyone by actually calling the 16-bit chunks bytes. 
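A minimal sketch for checking how a given platform sizes the types discussed in this thread. It assumes a CPython build with the standard ctypes module available; the dictionary and variable names are illustrative only, and the sizes are reported in units of C chars, which is why sizeof(char) always comes out as 1 regardless of how many bits a char holds.

```python
import ctypes

# Sizes are measured in C chars, so "char" is 1 by definition.
sizes = {
    "char": ctypes.sizeof(ctypes.c_char),
    "size_t": ctypes.sizeof(ctypes.c_size_t),
    "void *": ctypes.sizeof(ctypes.c_void_p),
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
}

for name, size in sizes.items():
    print(f"sizeof({name}) = {size}")

# True only on the unusual kind of platform described in issue1646068.
size_t_narrower_than_long = sizes["size_t"] < sizes["long"]
print("sizeof(size_t) < sizeof(long):", size_t_narrower_than_long)
```

On common desktop platforms the last line is expected to print False, but the sketch deliberately does not assume that.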
Anyway, once you get into chips with separate code and data buses, I can quite easily see configurations where the program bus is only 32 bits while the data bus is 64 bits. However, so long as whatever solution you come up with can be tweaked in pyconfig.h (is that the right file?) to do the right thing on such odd platforms, it should be fine. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From guido at python.org Tue Apr 15 16:18:22 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 15 Apr 2008 07:18:22 -0700 Subject: [Python-3000] Should int() and float() accept bytes? In-Reply-To: <48047869.1060101@gmail.com> References: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> <48047869.1060101@gmail.com> Message-ID: Yeah, practicality beat purity on that one. I'd say let it beat purity on int() and float() as well. On Tue, Apr 15, 2008 at 2:42 AM, Nick Coghlan wrote: > Mark Dickinson wrote: > > On the other hand, there's at least some sense in which bytes already > > acts as a sort of poor-man's string: witness bytes.lower and friends. > > Maybe practicality beats purity here? > > From PEP 358 (describing what is now bytearray): > > """Note the conspicuous absence of .isupper(), .upper(), and friends. > (But see "Open Issues" below.) There is no .__hash__() because > the object is mutable.""" > > And the open issue: > > """A case could even be made for supporting .islower(), .isupper(), > .isspace(), .isalpha(), .isalnum(), .isdigit() and the > corresponding conversions (.lower() etc.), using the ASCII > definitions for letters, digits and whitespace.
If this is > accepted, the cases for .ljust(), .rjust(), .center() and > .split() become much stronger, and they should have default > arguments as well, using an ASCII space or all ASCII whitespace > (for .split()).""" > > PEP 3157 resolved that open issue as follows: > > """This is exactly the set of methods present on the str type in Python > 2.x, with the exclusion of .encode(). The signatures and semantics are > the same too. However, whenever character classes like letter, > whitespace, lower case are used, the ASCII definitions of these classes > are used.""" > > > That seems fairly explicit to me in saying that a bytes or bytearray > object should be considered to be ASCII encoded when treated as a string. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > http://www.boredomandlaziness.org > > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From stephen at xemacs.org Tue Apr 15 19:29:56 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 16 Apr 2008 02:29:56 +0900 Subject: [Python-3000] Recursive str In-Reply-To: References: <47FBF7F9.4020608@canterbury.ac.nz> <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> Message-ID: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> Guido van Rossum writes: > A complaint about this would carry more weight when it came from > someone who actually has to deal with the issue than coming from a > purely theoretical perspective (unless I'm wrong and you actually read > Japanese). This *is* a problem. 
In my experience a lot of string bugs are "off by one" bugs (inserting UTF signatures that shouldn't be there in the middle of string, fencepost errors, etc), which stick out like a sore thumb when printed readably. But they're very hard to diagnose when what I'm seeing looks like output from "cat /dev/random". I don't suffer from it particularly because most of my test data is ASCII, and even when I do use Japanese, Emacs has commands to "wash" a portion of the buffer as needed. On the other hand Japanese is my second language. I suppose a native might be really bothered that the strings are not readable without extra effort. > Another issue is that repr() is supposed to return an 8-bit string. I > don't think we should put non-ASCII characters in the output in some > encoding. No, we should not put non-ASCII characters in the output of repr() for 2.x. It's not worth the effort to expand it to allow ISO 8859/1. And anything locale-specific is right out, you'll have buildbots going red across the globe, no doubt. Not just once, either. Locale-specific stuff is very hard to enforce consistency on. > In Py3k we may be able to do something else though -- instead of > insisting on ASCII we could allow a much larger set of characters to > be unescaped. Yes. The implications of the PEP 3131 discussions about Unicode identifiers should be considered carefully. Eg, consider the potential of confusing ASCII 'A' with Cyrillic 'A'. I'm very unhappy with the idea of having Cyrillic 'A' \u-escaped when calling repr() on objects in a Russian's program, but I don't like the alternative of having "print repr(bogus)" being no more informative than "print bogus" in this situation any better. 
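The confusable-glyph concern raised above can be shown with a minimal sketch. It assumes a Python 3 interpreter; the variable names are illustrative, and the backslashreplace error handler stands in for an ASCII-only repr() rather than for any behaviour decided in this thread.

```python
latin_a = "A"          # U+0041 LATIN CAPITAL LETTER A
cyrillic_a = "\u0410"  # U+0410 CYRILLIC CAPITAL LETTER A

# The two strings look alike in most fonts but compare unequal.
assert latin_a != cyrillic_a

# An ASCII-only, backslash-escaped form tells them apart on any console.
escaped_latin = latin_a.encode("ascii", "backslashreplace").decode("ascii")
escaped_cyrillic = cyrillic_a.encode("ascii", "backslashreplace").decode("ascii")

print(escaped_latin)     # A
print(escaped_cyrillic)  # \u0410
```

The trade-off is the one described in the message: the escaped form is unambiguous but unreadable to someone who actually works in Cyrillic.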
From divinekid at gmail.com Tue Apr 15 20:01:08 2008 From: divinekid at gmail.com (Haoyu Bai) Date: Wed, 16 Apr 2008 02:01:08 +0800 Subject: [Python-3000] [Python-Dev] Need help for SWIG's Python 3.0 backend In-Reply-To: References: <47FF9363.7060005@gmail.com> <52dc1c820804112115sc2a270cueb6c7159913b916b@mail.gmail.com> <48020F43.1050000@gmail.com> Message-ID: <4804ED64.1030008@gmail.com> Neal Norwitz wrote: > > Yeah, I'm sure we should have more documentation. You could really > help the situation by documenting all the things you learn. That way > everyone else will be able to learn from you. > >> So I would like to make sure that some people is willing to help me >> whenever I encountered a really hard problem relating to Python internal. > > Sure. It would be best to start by asking your mentor(s). Sometimes > a question will be appropriate to ask on comp.lang.python. The > mentors should be able to help you decide where to post. If you send > a message here that others feel is not appropriate, you will be > pointed at comp.lang.python. Thank you! I'll try my best to dig into Python's source code and share things I learned. Also, I'll carefully choose where to post. :) Best regards, Haoyu Bai 4/16/2008 From guido at python.org Tue Apr 15 20:21:35 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 15 Apr 2008 11:21:35 -0700 Subject: [Python-3000] Recursive str In-Reply-To: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Apr 15, 2008 at 10:29 AM, Stephen J. Turnbull wrote: > Guido van Rossum writes: > > In Py3k we may be able to do something else though -- instead of > > insisting on ASCII we could allow a much larger set of characters to > > be unescaped. > > Yes. 
The implications of the PEP 3131 discussions about Unicode > identifiers should be considered carefully. Eg, consider the > potential of confusing ASCII 'A' with Cyrillic 'A'. I'm very unhappy > with the idea of having Cyrillic 'A' \u-escaped when calling repr() on > objects in a Russian's program, but I don't like the alternative of > having "print repr(bogus)" being no more informative than "print > bogus" in this situation any better. So it sounds like we're doomed if we do, and damned if we don't. Or do I misunderstand you? Do you have a practical suggestion? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at krypto.org Tue Apr 15 20:37:04 2008 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 15 Apr 2008 11:37:04 -0700 Subject: [Python-3000] Should int() and float() accept bytes? In-Reply-To: References: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> <48047869.1060101@gmail.com> Message-ID: <52dc1c820804151137y5d442b99q3ed171ce48a0f337@mail.gmail.com> Agreed. Otherwise the common ascii based network protocol task of reading some bytes in and converting them to the integer that they represent in ascii would require an additional unicode decoding step. On Tue, Apr 15, 2008 at 7:18 AM, Guido van Rossum wrote: > Yeah, practicalibty beat purity on that one. I'd say let it beat > purity on int() and float() as well. > > On Tue, Apr 15, 2008 at 2:42 AM, Nick Coghlan wrote: > > Mark Dickinson wrote: > > > On the other hand, there's at least some sense in which bytes already > > > acts as a sort of poor-man's string: witness bytes.lower and friends. > > > Maybe practicality beats purity here? > > > > From PEP 358 (describing what is now bytearray): > > > > """Note the conspicuous absence of .isupper(), .upper(), and friends. > > (But see "Open Issues" below.) 
There is no .__hash__() because > > the object is mutable.""" > > > > And the open issue: > > > > """A case could even be made for supporting .islower(), .isupper(), > > .isspace(), .isalpha(), .isalnum(), .isdigit() and the > > corresponding conversions (.lower() etc.), using the ASCII > > definitions for letters, digits and whitespace. If this is > > accepted, the cases for .ljust(), .rjust(), .center() and > > .split() become much stronger, and they should have default > > arguments as well, using an ASCII space or all ASCII whitespace > > (for .split()).""" > > > > PEP 3157 resolved that open issue as follows: > > > > """This is exactly the set of methods present on the str type in Python > > 2.x, with the exclusion of .encode(). The signatures and semantics are > > the same too. However, whenever character classes like letter, > > whitespace, lower case are used, the ASCII definitions of these classes > > are used.""" > > > > > > That seems fairly explicit to me in saying that a bytes or bytearray > > object should be considered to be ASCII encoded when treated as a > string. > > > > Cheers, > > Nick. > > > > -- > > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > --------------------------------------------------------------- > > http://www.boredomandlaziness.org > > > > > > _______________________________________________ > > Python-3000 mailing list > > Python-3000 at python.org > > http://mail.python.org/mailman/listinfo/python-3000 > > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/guido%40python.org > > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-3000/attachments/20080415/9540e0ba/attachment-0001.htm From dickinsm at gmail.com Tue Apr 15 23:56:51 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 15 Apr 2008 17:56:51 -0400 Subject: [Python-3000] Should int() and float() accept bytes? In-Reply-To: <52dc1c820804151137y5d442b99q3ed171ce48a0f337@mail.gmail.com> References: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> <48047869.1060101@gmail.com> <52dc1c820804151137y5d442b99q3ed171ce48a0f337@mail.gmail.com> Message-ID: <5c6f2a5d0804151456i7fda67bs500fd33e428b63ff@mail.gmail.com> On Tue, Apr 15, 2008 at 2:37 PM, Gregory P. Smith wrote: > Agreed. Otherwise the common ascii based network protocol task of reading > some bytes in and converting them to the integer that they represent in > ascii would require an additional unicode decoding step. > This use-case doesn't seem particularly convincing when the reverse step of converting an integer to an (ascii) bytes instance still has to go through unicode. Maybe there should be an int.to_ascii method? Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-3000/attachments/20080415/82bd75b6/attachment.htm From guido at python.org Wed Apr 16 00:06:18 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 15 Apr 2008 15:06:18 -0700 Subject: [Python-3000] Should int() and float() accept bytes? In-Reply-To: <5c6f2a5d0804151456i7fda67bs500fd33e428b63ff@mail.gmail.com> References: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> <48047869.1060101@gmail.com> <52dc1c820804151137y5d442b99q3ed171ce48a0f337@mail.gmail.com> <5c6f2a5d0804151456i7fda67bs500fd33e428b63ff@mail.gmail.com> Message-ID: On Tue, Apr 15, 2008 at 2:56 PM, Mark Dickinson wrote: > > On Tue, Apr 15, 2008 at 2:37 PM, Gregory P. Smith wrote: > > Agreed. 
Otherwise the common ascii based network protocol task of reading > some bytes in and converting them to the integer that they represent in > ascii would require an additional unicode decoding step. > > > > This use-case doesn't seem particularly convincing when the reverse step of > converting an integer to an (ascii) bytes instance still has to go through > unicode. > Maybe there should be an int.to_ascii method? Input and output are often wildly asymmetric anyway. It's easy to make int() and float() accept more input types. But making them return a different output type is different. I find the existing work-arounds good enough not to propose a whole new API. If we end up deciding to add one anyway, I don't think that to_ascii is a good name; it doesn't imply the type of the result, since ASCII text can also be (and usually is) represented as a (Unicode) str instance. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Wed Apr 16 00:07:46 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 15 Apr 2008 18:07:46 -0400 Subject: [Python-3000] Recursive str References: <20080409163038.GC12902@phd.pp.ru><9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com><20080411125521.GE25461@phd.pp.ru><48015133.4020105@canterbury.ac.nz><87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: "Guido van Rossum" wrote in message news:ca471dc20804151121u142a37dcr9fb6b47fac1af0f2 at mail.gmail.com... | On Tue, Apr 15, 2008 at 10:29 AM, Stephen J. Turnbull | wrote: | > Guido van Rossum writes: | > > In Py3k we may be able to do something else though -- instead of | > > insisting on ASCII we could allow a much larger set of characters to | > > be unescaped. | > | > Yes. The implications of the PEP 3131 discussions about Unicode | > identifiers should be considered carefully. Eg, consider the | > potential of confusing ASCII 'A' with Cyrillic 'A'. 
I'm very unhappy | > with the idea of having Cyrillic 'A' \u-escaped when calling repr() on | > objects in a Russian's program, but I don't like the alternative of | > having "print repr(bogus)" being no more informative than "print | > bogus" in this situation any better. | | So it sounds like we're doomed if we do, and damned if we don't. Or do | I misunderstand you? Do you have a practical suggestion? You understood the same as me. I suspect the real solution has to be language-community (and even programmer) specific, since I expect most people would like the chars they know and expect to be unescaped and others left escaped. So, perhaps there should be a unirep module, stdlib or not, used like: import unirep print(*map(unirep.russian, objects)) or even from unirep import rus_print rus_print(objects) # does same as above, with **kwds passed on tjr From dickinsm at gmail.com Wed Apr 16 00:30:11 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 15 Apr 2008 18:30:11 -0400 Subject: [Python-3000] Should int() and float() accept bytes? In-Reply-To: References: <5c6f2a5d0804141858y5794a7ap3429c3e89ae8ad2c@mail.gmail.com> <48047869.1060101@gmail.com> <52dc1c820804151137y5d442b99q3ed171ce48a0f337@mail.gmail.com> <5c6f2a5d0804151456i7fda67bs500fd33e428b63ff@mail.gmail.com> Message-ID: <5c6f2a5d0804151530n2fea5c20q14cbf2efef29940d@mail.gmail.com> On Tue, Apr 15, 2008 at 6:06 PM, Guido van Rossum wrote: > Input and output are often wildly asymmetric anyway. It's easy to make > int() and float() accept more input types. But making them return a > different output type is different. I find the existing work-arounds > good enough not to propose a whole new API. If we end up deciding to > add one anyway, I don't think that to_ascii is a good name; it doesn't > imply the type of the result, since ASCII text can also be (and > usually is) represented as a (Unicode) str instance. > Okay. Thanks for all the responses! I'll close issue 2483.
Mark From p.f.moore at gmail.com Wed Apr 16 00:53:14 2008 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 15 Apr 2008 23:53:14 +0100 Subject: [Python-3000] Recursive str In-Reply-To: References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <79990c6b0804151553w7c92f635u8138be2e723d6e07@mail.gmail.com> On 15/04/2008, Terry Reedy wrote: > "Guido van Rossum" wrote in message > | So it sounds like we're doomed if we do, and damned if we don't. Or do > | I misunderstand you? Do you have a practical suggestion? > > You understood the same as me. That's how it sounded to me, as well. > I suspect the real solution has to be language-community (and even > programmer) specific, since I expect most people would like the chars they > know and expect to be unescaped and others left escaped. So, perhaps there > should be a unirep module, stdlib or not, used like: > > import unirep > print(*map(unirep.russian, objects)) > > or even > > from unirep import rus_print > > rus_print(objects) # does same as above, with **kwds passed on To put this another way, repr() has a single advantage, that it's as near as possible unambiguous while using only ASCII (the "only ASCII" bit is to avoid ambiguity between identical-looking glyphs). Readability is explicitly *not* a goal in this context. If you want a readable version, write it yourself (or import someone else's - possibly the stdlib's if anyone writes one). If you don't want to change your code, write from my_repr import my_repr as repr or fiddle with builtins if you want to do this across all modules.
So repr() should stay as it is, and it should be remembered that it is *not* intended for reading, but for debugging. Stephen's point that some errors are more easily debuggable with a readable version is, in my English-only view (:-)), a corner case, although I'm happy to concede, a valid one. Paul. From ishimoto at gembook.org Wed Apr 16 02:06:27 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Wed, 16 Apr 2008 09:06:27 +0900 Subject: [Python-3000] Recursive str In-Reply-To: References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> 2008/4/16, Guido van Rossum : > > So it sounds like we're doomed if we do, and damned if we don't. Or do > I misunderstand you? Do you have a practical suggestion? > For debugging, I think patch http://bugs.python.org/issue2630 is practical enough if the error handler of sys.stdout is 'backslashreplace'. If you are Russian and you want to print a list of Cyrillic strings, you can print repr(listOfRussian). If you want to see more detailed information about a specific string, you can print repr(russianStr).encode("ascii", "backslashreplace"). The latter gives you the same result as Python 2's repr(russianStr). If you are not Russian and are working on an ASCII console, print(repr(listOfRussian)) gives you the same result as Python 2.
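The two forms described in the message above can be compared with a minimal sketch. It assumes a Python 3 interpreter in which repr() leaves printable non-ASCII characters unescaped, as the patch proposes; list_of_russian is an illustrative name, and the strings are written with escapes only so the example itself stays ASCII.

```python
import ast

# Two Cyrillic strings, written with escapes so the source stays ASCII.
list_of_russian = ["\u043f\u0440\u0438\u0432\u0435\u0442", "\u043c\u0438\u0440"]

# Readable form: the Cyrillic letters appear as themselves.
readable = repr(list_of_russian)

# ASCII-only form, decoded back to str so it can be printed anywhere.
ascii_only = readable.encode("ascii", "backslashreplace").decode("ascii")
print(ascii_only)

# Both forms evaluate back to the original list.
assert ast.literal_eval(readable) == list_of_russian
assert ast.literal_eval(ascii_only) == list_of_russian
```

Because the escaping is applied to the finished repr() text, it works for lists and other containers without any per-type code.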
From ishimoto at gembook.org Wed Apr 16 02:26:40 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Wed, 16 Apr 2008 09:26:40 +0900 Subject: [Python-3000] Recursive str In-Reply-To: References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <797440730804151726t281ca554p9cd649efa09abf18@mail.gmail.com> 2008/4/16, Terry Reedy : > I suspect the real solution has to be language-community (and even > programmer) specific, since I expect most people would like the chars they > know and expect to be unescaped and others left escaped. So, perhaps there > should be a unirep module, stdlib or not, used like: > > import unirep > print(*map(unirep.russian, objects)) But how to implement unirep.russian()? Printing string is not a problem, but what annoy me is printing list, tuple or other instances. To implement such language-specific repr(), it should know how to build a repr()ed string of all types. Another idea is supplying a language parameter to PyObject_Repr() and let each type call language-specific string convert function, but I think it's excess. 
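The difficulty described above, that a language-specific repr() must know how to build the representation of every container type, can be seen in a small sketch. Everything here is hypothetical: selective_repr and is_cyrillic are illustrative names, not part of any proposed unirep module, and the Cyrillic test covers only the basic U+0400 to U+04FF block.

```python
def is_cyrillic(ch):
    """Rough check covering only the basic Cyrillic block (an assumption)."""
    return "\u0400" <= ch <= "\u04ff"


def selective_repr(obj, keep=is_cyrillic):
    """Return repr()-like text, escaping non-ASCII unless keep(ch) is true."""
    if isinstance(obj, str):
        return "".join(
            ch if ord(ch) < 128 or keep(ch)
            else ch.encode("ascii", "backslashreplace").decode("ascii")
            for ch in repr(obj)
        )
    if isinstance(obj, list):
        return "[" + ", ".join(selective_repr(x, keep) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(selective_repr(x, keep) for x in obj)
        return "(" + inner + ("," if len(obj) == 1 else "") + ")"
    # Every other type would need its own branch; fall back to ASCII-only.
    return ascii(obj)


# A Cyrillic word is kept readable, a CJK character is escaped.
mixed = selective_repr(["\u043c\u0438\u0440", "\u65e5"])
print(mixed.encode("ascii", "backslashreplace").decode("ascii"))
```

Only str, list and tuple are handled; dicts, sets and user-defined classes fall through to the ASCII-only fallback, which is exactly the maintenance burden the message points out.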
From eric+python-dev at trueblade.com Wed Apr 16 02:34:17 2008 From: eric+python-dev at trueblade.com (Eric Smith) Date: Tue, 15 Apr 2008 20:34:17 -0400 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: References: <47FD5737.4060001@trueblade.com> <18429.31090.933499.512334@montanaro-dyndns-org.local> <1afaf6160804091923w3b7cc33dv802650ebf0786b48@mail.gmail.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> <1afaf6160804121230s62eef58dkf5a1b31f0204ba01@mail.gmail.com> <4801A29C.3020806@gmail.com> Message-ID: <48054989.5070700@trueblade.com> Guido van Rossum wrote: > I thought I had a reasonable proposal: deprecate in 3.1, remove in > 3.3. Adding a PendingDeprecationWarning in 3.0 would be fine. Doing > anything in 2.6 would not be fine, except perhaps making it a > PendingDeprecationWarning whan -3 is given. How do you feel if I close http://bugs.python.org/issue2416 as "wont fix"? I'll then update PEP 3127 (Integer Literal Support and Syntax) to remove mention of adding %b formatting. Eric. From guido at python.org Wed Apr 16 02:41:59 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 15 Apr 2008 17:41:59 -0700 Subject: [Python-3000] Implementing % formatting in terms of str.format() In-Reply-To: <48054989.5070700@trueblade.com> References: <47FD5737.4060001@trueblade.com> <18429.33042.990861.658382@montanaro-dyndns-org.local> <4800BDF7.1000205@gmail.com> <18432.50840.556518.954678@montanaro-dyndns-org.local> <1afaf6160804121230s62eef58dkf5a1b31f0204ba01@mail.gmail.com> <4801A29C.3020806@gmail.com> <48054989.5070700@trueblade.com> Message-ID: On Tue, Apr 15, 2008 at 5:34 PM, Eric Smith wrote: > Guido van Rossum wrote: > > > I thought I had a reasonable proposal: deprecate in 3.1, remove in > > 3.3. Adding a PendingDeprecationWarning in 3.0 would be fine. 
Doing > > anything in 2.6 would not be fine, except perhaps making it a > > PendingDeprecationWarning whan -3 is given. > > > > How do you feel if I close http://bugs.python.org/issue2416 as "wont fix"? > I'll then update PEP 3127 (Integer Literal Support and Syntax) to remove > mention of adding %b formatting. +1 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From murman at gmail.com Wed Apr 16 03:10:07 2008 From: murman at gmail.com (Michael Urman) Date: Tue, 15 Apr 2008 20:10:07 -0500 Subject: [Python-3000] Recursive str In-Reply-To: <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> Message-ID: On Tue, Apr 15, 2008 at 7:06 PM, atsuo ishimoto wrote: > For debugging, I think patch http://bugs.python.org/issue2630 is > practical enough if error handler of sys.stdout is 'backslashescape'. > > If you are Russian and you want to print list of Cyrillic string, you > can print repr(listOfRussian). If you want to see more detailed > information of specific string, you can print > repr(russianStr).encode("ascii", "backslashreplace"). Latter gives > you a same result as Python2's repr(russianStr). > If you are not Russian and working on ASCII console, > print(repr(listOfRussian)) give you a same result as Python2. I agree with that this is enoguh. I see two main uses for repr when it comes to strings: to put quotes around the contents, and to replace control characters with safe representations the interpreter understands. The third use, to represent strings unambiguously, is not a major point, and is clearly not serviced as you cannot tell via repr if string1 *is* string2; only that they are equal. 
The first (quotes) disambiguates values in lists containing strings with commas, and the second (backslash replaced control characters) avoids using characters with special meanings. The latter also historically disambiguates everything beyond ascii, but in practice just as it's more useful to have 'mystring' than . Similarly it's more useful to have '???' than to have '\u1234\u5678\u9abc'. While there are cases this can become visually ambiguous, it will still pass the ideal case s == eval(repr(s)). Finally, Atsuo Ishimoto's .encode("ascii", "backslashreplace") is much more explicit about expectations, and handles identifying whether you have a combined character, or a base and combining diacritic, etc. What should the string_escape codec do when repr has been changed (assuming it's not internally linked directly to repr)? I can see benefits to matching repr and benefits to being more like ASCII+backslashreplace, and don't have a strong preference like I do for repr. [Apologies for hitting reply on the unicodedata suggestion yesterday.] Michael -- Michael Urman From greg.ewing at canterbury.ac.nz Wed Apr 16 05:15:22 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 16 Apr 2008 15:15:22 +1200 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4804B83F.5010504@gmail.com> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> Message-ID: <48056F4A.4030102@canterbury.ac.nz> Nick Coghlan wrote: > and the C standard says that sizeof(char) == 1 byte. Does it actually use the word byte, or does it just say the "smallest addressable unit of memory" or something? 
Seems to me it can't have it both ways, without also trying to define the meaning of "byte", which I don't think it has any business doing. -- Greg From greg.ewing at canterbury.ac.nz Wed Apr 16 05:30:40 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 16 Apr 2008 15:30:40 +1200 Subject: [Python-3000] Recursive str In-Reply-To: References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <480572E0.1070204@canterbury.ac.nz> Terry Reedy wrote: > import unirep > print(*map(unirep.russian, objects)) That's okay if the objects are strings, but what about non-string objects that contain strings? We'd need another protocol, such as __unirep__. -- Greg From greg.ewing at canterbury.ac.nz Wed Apr 16 05:44:53 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 16 Apr 2008 15:44:53 +1200 Subject: [Python-3000] Recursive str In-Reply-To: <79990c6b0804151553w7c92f635u8138be2e723d6e07@mail.gmail.com> References: <20080409163038.GC12902@phd.pp.ru> <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <79990c6b0804151553w7c92f635u8138be2e723d6e07@mail.gmail.com> Message-ID: <48057635.3090008@canterbury.ac.nz> Paul Moore wrote: > If you don't > want to change your code, write > > from my_repr import my_repr as repr But repr() itself doesn't do anything -- it just invokes the __repr__ method of its argument. So you can't actually accomplish anything by replacing it, unless your replacement does a lot of un-duckish type testing. What you actually need to replace is the __repr__ method of the builtin string object, and I'm not sure if you can do that from Python. 
-- Greg From guido at python.org Wed Apr 16 06:55:09 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 15 Apr 2008 21:55:09 -0700 Subject: [Python-3000] Recursive str In-Reply-To: References: <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> Message-ID: On Tue, Apr 15, 2008 at 6:10 PM, Michael Urman wrote: > I agree with that this is enoguh. I see two main uses for repr when it > comes to strings: to put quotes around the contents, and to replace > control characters with safe representations the interpreter > understands. The third use, to represent strings unambiguously, is not > a major point, This is very much dependent on who is looking. > and is clearly not serviced as you cannot tell via repr > if string1 *is* string2; only that they are equal. nobody asked for *that*, so this is a red herring; strings don't compare meaningfully by identity but by value. > The first (quotes) disambiguates values in lists containing strings > with commas, and the second (backslash replaced control characters) > avoids using characters with special meanings. The latter also > historically disambiguates everything beyond ascii, but in practice > just as it's more useful to have 'mystring' than 0x12345678>. Similarly it's more useful to have '$BF|K\8l(B' than to have > '\u1234\u5678\u9abc'. Again, that's not universally true. > While there are cases this can become visually > ambiguous, it will still pass the ideal case s == eval(repr(s)). > Finally, Atsuo Ishimoto's .encode("ascii", "backslashreplace") is much > more explicit about expectations, and handles identifying whether you > have a combined character, or a base and combining diacritic, etc. > > What should the string_escape codec do when repr has been changed > (assuming it's not internally linked directly to repr)? 
I can see
> benefits to matching repr and benefits to being more like
> ASCII+backslashreplace, and don't have a strong preference like I do
> for repr.

The more I think about this, the more I believe that repr() should
*not* be changed, and that instead we should give people who like to
see '日本語' instead of '\u1234\u5678\u9abc' other tools to help
themselves.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From murman at gmail.com Wed Apr 16 07:16:36 2008
From: murman at gmail.com (Michael Urman)
Date: Wed, 16 Apr 2008 00:16:36 -0500
Subject: [Python-3000] Recursive str
In-Reply-To:
References: <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com>
Message-ID:

2008/4/15 Guido van Rossum :
> On Tue, Apr 15, 2008 at 6:10 PM, Michael Urman wrote:
> > The third use, to represent strings unambiguously, is not a major point,
>
> This is very much dependent on who is looking.
>
> > Similarly it's more useful to have '日本語' than to have '\u1234\u5678\u9abc'.
>
> Again, that's not universally true.

It does depend on the use case. I base my comments on an expectation
that repr of str will be more commonly used to view contents than to
analyze strings character by character. Those doing the latter who
care about the difference between various 'a' look-alike characters
are doing so for very specific reasons, and can probably afford to "do
it right," whatever that ends up meaning.

> The more I think about this, the more I believe that repr() should
> *not* be changed, and that instead we should give people who like to
> see '日本語' instead of '\u1234\u5678\u9abc' other tools to help
> themselves.

I'll miss this, as I suspect the case of printing a list of unicode
strings will be fairly common. Given Unicode identifiers, even print
locals() could hit this.
But perhaps tools for printing better summaries of the contents of lists and dicts, or shell quoting (repr as is makes a passable hack for quotes and spaces, but not unicode characters), etc., can alleviate the pain well enough. -- Michael Urman From martin at v.loewis.de Wed Apr 16 08:15:02 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 16 Apr 2008 08:15:02 +0200 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <48056F4A.4030102@canterbury.ac.nz> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> Message-ID: <48059966.4010102@v.loewis.de> >> and the C standard says that sizeof(char) == 1 byte. > > Does it actually use the word byte, or does it just say the > "smallest addressable unit of memory" or something? > > Seems to me it can't have it both ways, without also trying > to define the meaning of "byte", which I don't think it has > any business doing. 3.6 byte addressable unit of data storage large enough to hold any member of the basic character set of the execution environment 6.5.3.4 The sizeof operator [#2] The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. [...] [#3] When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1. [...] 
Regards,
Martin

From ishimoto at gembook.org Wed Apr 16 11:49:21 2008
From: ishimoto at gembook.org (atsuo ishimoto)
Date: Wed, 16 Apr 2008 18:49:21 +0900
Subject: [Python-3000] Recursive str
In-Reply-To:
References: <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com>
Message-ID: <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com>

2008/4/16, Michael Urman :
> I'll miss this, as I suspect the case of printing a list of unicode
> strings will be fairly common. Given Unicode identifiers, even print
> locals() could hit this. But perhaps tools for printing better
> summaries of the contents of lists and dicts, or shell quoting (repr
> as is makes a passable hack for quotes and spaces, but not unicode
> characters), etc., can alleviate the pain well enough.
>

Such tools would help, but I'm not sure they are enough.

Using repr() to build an output string is common practice in the Python
world, so repr() is called everywhere in the Python core and in
third-party applications to print objects, emit logs, etc.

For example,

>>> f = open("日本語")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\ww\Python-3.0a4-orig\lib\io.py", line 212, in __new__
    return open(*args, **kwargs)
  File "c:\ww\Python-3.0a4-orig\lib\io.py", line 151, in open
    closefd)
IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'

This is an annoying error message. Or, in Python 2,

>>> f = open(u"日本語", "w")
>>> f

This repr()ed form is difficult to read. When Japanese (or Chinese)
programmers see u'\u65e5\u672c\u8a9e', they'll get the strong
impression that Python is not intended to be used in their country.

From phd at phd.pp.ru Wed Apr 16 12:32:07 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Wed, 16 Apr 2008 14:32:07 +0400
Subject: [Python-3000] Recursive str
In-Reply-To:
References:
Message-ID: <20080416103207.GB6295@phd.pp.ru>

Hello.
Sorry for being a bit late in the discussion - my sysadmin has
problems setting up our DNS server so I could not send mail.

On Tue, Apr 15, 2008 at 06:07:46PM -0400, Terry Reedy wrote:
> import unirep
> print(*map(unirep.russian, objects))
>
> or even
>
> from unirep import rus_print
>
> rus_print(objects) # does same as above, with **kwds passed on

   First, this doesn't help anything because that form of print must be
recursive if "objects" is a container that contains other objects.
   Second, I am satisfied with how repr(objects) works - it calls repr()
recursively and that's ok.

   What I was complaining about in the original post is that str(objects)
calls repr() for items. This is especially problematic when I use repr()
and str() semi-explicitly. For example, compare

logging.debug("objects: %r", objects)

   and

logging.debug("objects: %s", objects)

   In the first call I expect and get repr(objects), fine. But in the
second case I again get repr(), and even

logging.debug("objects: %s", str(objects))

   doesn't help.

   Do I understand it right that str(objects) calls repr() on items to
properly quote strings? (str([1, '1']) must give "[1, '1']" as the result).
Is it the only reason?

PS. atsuo ishimoto has shown that repr() is called in tracebacks. I agree
that's a problem, but that's another problem, not "recursive str".

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.
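The behaviour asked about above can be checked directly. The sketch below shows that str() of a list renders its items with repr(), and suggests why: a hypothetical helper, deep_str (not a standard function), that applies str() to the items instead loses the distinction between 1 and '1' and between one string containing a comma and two separate strings.

```python
def deep_str(obj):
    """Hypothetical helper: like str(), but uses str() for list/tuple items."""
    if isinstance(obj, list):
        return "[" + ", ".join(deep_str(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        return "(" + ", ".join(deep_str(x) for x in obj) + ")"
    return str(obj)


items = [1, '1']

print(str(items))             # items are rendered with repr()
print("%s" % (items,))        # same text as "%r" % (items,)
print(deep_str(items))        # the quotes are gone: 1 and '1' look alike
print(deep_str(['a, b', 'c']))  # indistinguishable from three items
```

Whether that ambiguity is the only reason is the question left open in the message; the sketch only illustrates the quoting argument.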
From ncoghlan at gmail.com Wed Apr 16 14:11:13 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 16 Apr 2008 22:11:13 +1000 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt (was Re: Recursive str) In-Reply-To: <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> References: <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> Message-ID: <4805ECE1.6040501@gmail.com> atsuo ishimoto wrote: > Using repr() to build output string is common practice in Python world, > so repr() is called everywhere in Python-core and third-party applications > to print objects, emitting logs, etc.,. > > For example, > >>>> f = open("???") > Traceback (most recent call last): > File "", line 1, in > File "c:\ww\Python-3.0a4-orig\lib\io.py", line 212, in __new__ > return open(*args, **kwargs) > File "c:\ww\Python-3.0a4-orig\lib\io.py", line 151, in open > closefd) > IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e' > > This is annoying error message. Or, in Python 2, > >>>> f = open(u"???", "w") >>>> f > > > This repr()ed form is difficult to read. When Japanese (or Chinise) > programmers look u'\u65e5\u672c\u8a9e', they'll have strong > impression that Python is not intended to be used in their country. This is starting to seem to me more like something to be addressed through sys.displayhook/excepthook at the interactive interpreter level than it is to be dealt with through changes to any __repr__() implementations. 
Given the following setup code:

   def replace_escapes(escaped_str):
     return escaped_str.encode('latin-1').decode('unicode_escape')

   def displayhook_unicode(expr_result):
     if expr_result is not None:
       __builtins__._ = expr_result
       print(replace_escapes(repr(expr_result)))

   from traceback import format_exception
   def excepthook_unicode(*exc_details):
     msg = ''.join(format_exception(*exc_details))
     print(replace_escapes(msg), end='')

   import sys
   sys.displayhook = displayhook_unicode
   sys.excepthook = excepthook_unicode

I get the following behaviour:

 >>> "\u65e5\u672c\u8a9e"
'日本語'
 >>> print("\u65e5\u672c\u8a9e")
日本語
 >>> '日本語'
'日本語'
 >>> print('日本語')
日本語
 >>> 日本語 = 1
 >>> 日本語
1
 >>> dir()
['__builtins__', '__doc__', '__name__', '__package__',
'displayhook_unicode', 'excepthook_unicode', 'format_exception',
'replace_escapes', 'sys', '日本語']
 >>> b"\u65e5\u672c\u8a9e"
b'\u65e5\u672c\u8a9e'
 >>> print(b"\u65e5\u672c\u8a9e")
b'\\u65e5\\u672c\\u8a9e'
 >>> f = open("\u65e5\u672c\u8a9e")
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/ncoghlan/devel/py3k/Lib/io.py", line 212, in __new__
     return open(*args, **kwargs)
   File "/home/ncoghlan/devel/py3k/Lib/io.py", line 151, in open
     closefd)
IOError: [Errno 2] No such file or directory: '日本語'
 >>> f = open("\u65e5\u672c\u8a9e", 'w')
 >>> f.name
'日本語'

Note that even though the bytes object representation is slightly
different from that for the normal displayhook (which doubles up on the
backslashes, just like the bytes printing example above), the two
different representations are equivalent because \u isn't a valid escape
sequence for bytes literals.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From phd at phd.pp.ru Wed Apr 16 14:45:29 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 16 Apr 2008 16:45:29 +0400 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <4805ECE1.6040501@gmail.com> References: <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> Message-ID: <20080416124529.GC8598@phd.pp.ru> On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote: > atsuo ishimoto wrote: > > IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e' > > This is starting to seem to me more like something to be addressed > through sys.displayhook/excepthook at the interactive interpreter level The problem manifests itself in scripts, too: Traceback (most recent call last): File "./ttt.py", line 4, in open("????") # filename is in koi8-r encoding IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4' Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
From ncoghlan at gmail.com Wed Apr 16 15:21:26 2008
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Apr 2008 23:21:26 +1000
Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt
In-Reply-To: <20080416124529.GC8598@phd.pp.ru>
References: <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru>
Message-ID: <4805FD56.6070902@gmail.com>

Oleg Broytmann wrote:
> On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote:
>> atsuo ishimoto wrote:
>>> IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'
>> This is starting to seem to me more like something to be addressed
>> through sys.displayhook/excepthook at the interactive interpreter level
>
>    The problem manifests itself in scripts, too:
>
> Traceback (most recent call last):
>   File "./ttt.py", line 4, in <module>
>     open("????") # filename is in koi8-r encoding
> IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4'

Hmm, the io module along with sys.stdout/err may be a better way to
attack the problem then. Given:

import sys, io

class ParseUnicodeEscapes(io.TextIOWrapper):
   def write(self, text):
     super().write(text.encode('latin-1').decode('unicode_escape'))

args = (sys.stdout.buffer, sys.stdout.encoding, sys.stdout.errors,
         None, sys.stdout.line_buffering)

sys.stdout = ParseUnicodeEscapes(*args)

args = (sys.stderr.buffer, sys.stderr.encoding, sys.stderr.errors,
         None, sys.stderr.line_buffering)

sys.stderr = ParseUnicodeEscapes(*args)

You get:

 >>> "????"
'????'
 >>> open("????")
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/ncoghlan/devel/py3k/Lib/io.py", line 212, in __new__
     return open(*args, **kwargs)
   File "/home/ncoghlan/devel/py3k/Lib/io.py", line 151, in open
     closefd)
IOError: [Errno 2] No such file or directory: '????'
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From phd at phd.pp.ru Wed Apr 16 15:30:46 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 16 Apr 2008 17:30:46 +0400 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: <4805FD56.6070902@gmail.com> References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> Message-ID: <20080416133046.GB16087@phd.pp.ru> On Wed, Apr 16, 2008 at 11:21:26PM +1000, Nick Coghlan wrote: > Hmm, the io module along with sys.stdout/err may be a better way to > attack the problem then. Given: > > import sys, io > > class ParseUnicodeEscapes(io.TextIOWrapper): > def write(self, text): > super().write(text.encode('latin-1').decode('unicode_escape')) > > args = (sys.stdout.buffer, sys.stdout.encoding, sys.stdout.errors, > None, sys.stdout.line_buffering) > > sys.stdout = ParseUnicodeEscapes(*args) > > args = (sys.stderr.buffer, sys.stderr.encoding, sys.stderr.errors, > None, sys.stderr.line_buffering) > > sys.stderr = ParseUnicodeEscapes(*args) > > You get: > > >>> "????" > '????' > >>> open("????") > Traceback (most recent call last): > File "", line 1, in > File "/home/ncoghlan/devel/py3k/Lib/io.py", line 212, in __new__ > return open(*args, **kwargs) > File "/home/ncoghlan/devel/py3k/Lib/io.py", line 151, in open > closefd) > IOError: [Errno 2] No such file or directory: '????' Very well, then. Thank you! The code should be put in a cookbook or the wiki, if not in the library. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
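One caveat with the wrapper quoted above: the unicode_escape codec decodes every backslash escape, not only \uXXXX. A narrower sketch, using a hypothetical helper named show_u_escapes, rewrites only four-digit \u escapes and leaves \n, \t and escaped backslashes alone. It is an illustration under those assumptions rather than a drop-in fix: it ignores \UXXXXXXXX and \xXX, and it cannot tell an escape produced by repr() from the same characters appearing in ordinary program output.

```python
import re

# A \uXXXX escape preceded by an even number of backslashes (possibly zero).
# An odd number would mean the backslash before 'u' is itself escaped.
_U_ESCAPE = re.compile(r"(?<!\\)((?:\\\\)*)\\u([0-9a-fA-F]{4})")


def _replace(match):
    code_point = int(match.group(2), 16)
    if 0xD800 <= code_point <= 0xDFFF:
        # Leave surrogates escaped; a lone surrogate cannot be encoded
        # when the text is finally written out.
        return match.group(0)
    return match.group(1) + chr(code_point)


def show_u_escapes(text):
    """Hypothetical helper: turn \\uXXXX escapes in text into characters."""
    return _U_ESCAPE.sub(_replace, text)


print(show_u_escapes(r"No such file or directory: '\u65e5\u672c\u8a9e'"))
print(show_u_escapes(r"'\n\t'"))   # other escapes pass through unchanged
```

The function could be called from a write() override in the same way as the codec round trip in the quoted class.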
From guido at python.org Wed Apr 16 16:26:36 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 16 Apr 2008 07:26:36 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <20080416124529.GC8598@phd.pp.ru> References: <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> Message-ID: 2008/4/16 Oleg Broytmann : > The problem manifests itself in scripts, too: > > Traceback (most recent call last): > File "./ttt.py", line 4, in > open("????") # filename is in koi8-r encoding > IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4' Note that this can be a feature too! You might have a filename that *looks* normal but contains a character from a different language -- the \u encoding will show you the problem. $ ls *.py mc.py x.py guido-van-rossums-imac:~ guido$ python Python 2.5.2 (release25-maint:60953, Feb 25 2008, 09:38:08) [GCC 4.0.1 (Apple Inc. build 5465)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
>>> open('m?.py') Traceback (most recent call last): File "", line 1, in IOError: [Errno 2] No such file or directory: 'm\xd1\x81.py' >>> -- --Guido van Rossum (home page: http://www.python.org/~guido/) From phd at phd.pp.ru Wed Apr 16 16:33:22 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 16 Apr 2008 18:33:22 +0400 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> Message-ID: <20080416143322.GF16087@phd.pp.ru> On Wed, Apr 16, 2008 at 07:26:36AM -0700, Guido van Rossum wrote: > 2008/4/16 Oleg Broytmann : > > The problem manifests itself in scripts, too: > > > > Traceback (most recent call last): > > File "./ttt.py", line 4, in > > open("????") # filename is in koi8-r encoding > > IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4' > > Note that this can be a feature too! You might have a filename that > *looks* normal but contains a character from a different language -- > the \u encoding will show you the problem. > > $ ls *.py > mc.py x.py > guido-van-rossums-imac:~ guido$ python > Python 2.5.2 (release25-maint:60953, Feb 25 2008, 09:38:08) > [GCC 4.0.1 (Apple Inc. build 5465)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> open('m?.py') > Traceback (most recent call last): > File "", line 1, in > IOError: [Errno 2] No such file or directory: 'm\xd1\x81.py' This can be a feature only for people who always have all-ascii file names and never expect non-ascii characters in the file names. Those of us who regularly use non-ascii filenames are too accustomed to that brok^H^H^H^H escaped repr's to spot a difference. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
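The look-alike case in the exchange above can be reproduced without a terminal that renders Cyrillic. In this sketch the file name is an assumption reconstructed from the bytes in the traceback ('m\xd1\x81.py' is the UTF-8 encoding of 'm' followed by U+0441); ascii() and the backslashreplace error handler both expose the character that an unescaped display would hide.

```python
import unicodedata

lookalike = "m\u0441.py"   # second character is Cyrillic, not Latin 'c'

print(lookalike == "mc.py")                            # the names differ
print(ascii(lookalike))                                # escaped form shows U+0441
print(lookalike.encode("ascii", "backslashreplace"))   # same idea at encode time
print(lookalike.encode("utf-8"))                       # the bytes seen in the traceback
print(unicodedata.name(lookalike[1]))                  # official character name
```

Either form serves the reader who needs to tell the two names apart, while a reader who writes Cyrillic file names every day would see only noise in it, which is the disagreement in the two messages above.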
From ncoghlan at gmail.com Wed Apr 16 16:53:02 2008
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Apr 2008 00:53:02 +1000
Subject: [Python-3000] Displaying strings containing unicode escapes
In-Reply-To: <20080416133046.GB16087@phd.pp.ru>
References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru>
Message-ID: <480612CE.1010300@gmail.com>

Oleg Broytmann wrote:
> On Wed, Apr 16, 2008 at 11:21:26PM +1000, Nick Coghlan wrote:
>> You get:
>>
>>  >>> "????"
>> '????'
>>  >>> open("????")
>> Traceback (most recent call last):
>>    File "<stdin>", line 1, in <module>
>>    File "/home/ncoghlan/devel/py3k/Lib/io.py", line 212, in __new__
>>      return open(*args, **kwargs)
>>    File "/home/ncoghlan/devel/py3k/Lib/io.py", line 151, in open
>>      closefd)
>> IOError: [Errno 2] No such file or directory: '????'
>
>    Very well, then. Thank you! The code should be put in a cookbook or the
> wiki, if not in the library.
>

Unfortunately, it turns out that the trick also breaks display of
strings containing any other escape codes. For example:

 >>> '\n'
'
'
 >>> '\t'
'	'

The unicode_escape codec is interpreting all of the escape sequences
recognised in Python strings, not just the \u sequences we're interested in.

I can't see an easy way around this at the moment, but I'm still
reasonably convinced that the issue of Unicode escapes for non-ASCII
users is best attacked as a display problem rather than an internal
representation problem.

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From guido at python.org Wed Apr 16 17:43:05 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 16 Apr 2008 08:43:05 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: <480612CE.1010300@gmail.com> References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> Message-ID: I just had a shower, and I think it's cleared my thoughts a bit. :-) Clearly this is an important problem to those in countries where ASCII doesn't cut it. And just like in Python 3000 we're using UTF-8 as the default source encoding and allowing Unicode letters in identifiers, I think we should bite the bullet and allow repr() of a string to pass through all characters that the Unicode standard considers printable. For those of us with less capable IO devices, setting the error flag for stdout and stderr to backslashreplace is probably the best solution (and it solves more problems than just repr()). I will have another look at Atsuo's patch. I do think we should use some kind of Unicode-standard-endorsed definition of "printable" (as long as it excludes all ASCII escapes), since there are plenty of undefined code points that even Japanese people would probably prefer to see rendered as \uxxxx rather than completely invisible. I'm also not sure what people would want to happen for surrogate pairs. (OTOH an unpaired surrogate should be rendered as \uxxxx.) I expect that this will require some more research and agreement. Perhaps someone can produce a draft PEP and attempt to sort out the details of specification and implementation? 
It would also be nice if it could be friendly to Jython, IronPython and PyPy. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jason.orendorff at gmail.com Wed Apr 16 18:05:01 2008 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Wed, 16 Apr 2008 11:05:01 -0500 Subject: [Python-3000] Recursive str In-Reply-To: <480572E0.1070204@canterbury.ac.nz> References: <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <480572E0.1070204@canterbury.ac.nz> Message-ID: On Tue, Apr 15, 2008 at 10:30 PM, Greg Ewing wrote: > Terry Reedy wrote: > > import unirep > > print(*map(unirep.russian, objects)) > > That's okay if the objects are strings, but what about > non-string objects that contain strings? > > We'd need another protocol, such as __unirep__. Or have str.__repr__() respect per-thread settings, the way decimal arithmetic does. Default settings would be in force most of the time; the interactive prompt would apply the user's settings when repr-ing a result. This approach solves the nested-strings problem quite nicely. But it does not catch error/warning/log messages when they are generated, unless the program does *everything* under custom repr settings (dangerous). There really are two use cases here: a human-readable repr for error/warning/log messages; and a machine-readable, always-the-same, ASCII-only repr. Users want to be able to tweak the former. -j From guido at python.org Wed Apr 16 18:52:27 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 16 Apr 2008 09:52:27 -0700 Subject: [Python-3000] Recursive str In-Reply-To: References: <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <480572E0.1070204@canterbury.ac.nz> Message-ID: [Jason Orendorff] > Or have str.__repr__() respect per-thread settings, the way decimal > arithmetic does. 
I don't think that's a very compelling example. I have serious issues with having global or per-thread state that can change the outcome of repr(); it would make it impossible to write correct code involving repr() because you can never know what it will do the next time. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From murman at gmail.com Wed Apr 16 19:57:29 2008 From: murman at gmail.com (Michael Urman) Date: Wed, 16 Apr 2008 12:57:29 -0500 Subject: [Python-3000] Recursive str In-Reply-To: References: <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <480572E0.1070204@canterbury.ac.nz> Message-ID: On Wed, Apr 16, 2008 at 11:05 AM, Jason Orendorff wrote: > There really are two use cases here: a human-readable repr for > error/warning/log messages; and a machine-readable, always-the-same, > ASCII-only repr. Users want to be able to tweak the former. Does machine-readable require ASCII-only, and does repr() guarantee this? It sounded like the worries about not escaping Unicode characters were related to it not visually distinguishing between different encodings for the same visual results (as their machine-readable Unicode strings, or encoded UTF-8 bytestreams, would already differ). 
-- Michael Urman From ishimoto at gembook.org Thu Apr 17 01:09:25 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Thu, 17 Apr 2008 08:09:25 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <4805FD56.6070902@gmail.com> References: <48015133.4020105@canterbury.ac.nz> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> Message-ID: <797440730804161609k6deb9154m3b5ed831712135c9@mail.gmail.com> 2008/4/16, Nick Coghlan : > Oleg Broytmann wrote: > > On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote: > >> atsuo ishimoto wrote: > >>> IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e' > >> This is starting to seem to me more like something to be addressed > >> through sys.displayhook/excepthook at the interactive interpreter level > > > > The problem manifests itself in scripts, too: > > > > Traceback (most recent call last): > > File "./ttt.py", line 4, in > > open("????") # filename is in koi8-r encoding > > IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4' > > > Hmm, the io module along with sys.stdout/err may be a better way to > attack the problem then. Given: > > import sys, io > > class ParseUnicodeEscapes(io.TextIOWrapper): > def write(self, text): > super().write(text.encode('latin-1').decode('unicode_escape')) > > args = (sys.stdout.buffer, sys.stdout.encoding, sys.stdout.errors, > None, sys.stdout.line_buffering) > > sys.stdout = ParseUnicodeEscapes(*args) > > args = (sys.stderr.buffer, sys.stderr.encoding, sys.stderr.errors, > None, sys.stderr.line_buffering) > > sys.stderr = ParseUnicodeEscapes(*args) > > You get: > > >>> "????" > '????' 
> >>> open("????") > I got: >>> print("?") Traceback (most recent call last): File "", line 1, in File "", line 3, in write UnicodeEncodeError: 'latin-1' codec can't encode character '?' in position 0: ordinal not in range(256) >>> print('\\'+'u0041') A Your hack doesn't work. Displayhook hack doesn't work, too. Question: Are you happy if you are forced to live with these hacks forever? If not, why do you think I'll accept your suggestion? From ishimoto at gembook.org Thu Apr 17 01:09:30 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Thu, 17 Apr 2008 08:09:30 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: References: <48015133.4020105@canterbury.ac.nz> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> Message-ID: <797440730804161609s22a3e1f6ncc86da3da4b144b8@mail.gmail.com> 2008/4/16, Guido van Rossum : > Note that this can be a feature too! You might have a filename that > *looks* normal but contains a character from a different language -- > the \u encoding will show you the problem. You won't call it a feature, if your *normal* encoding was koi8-r. From guido at python.org Thu Apr 17 01:13:09 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 16 Apr 2008 16:13:09 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <797440730804161609s22a3e1f6ncc86da3da4b144b8@mail.gmail.com> References: <48015133.4020105@canterbury.ac.nz> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <797440730804161609s22a3e1f6ncc86da3da4b144b8@mail.gmail.com> Message-ID: I changed my mind already. :-) See my post of this morning in another thread. 
On Wed, Apr 16, 2008 at 4:09 PM, atsuo ishimoto wrote: > 2008/4/16, Guido van Rossum : > > > Note that this can be a feature too! You might have a filename that > > *looks* normal but contains a character from a different language -- > > the \u encoding will show you the problem. > > You won't call it a feature, if your *normal* encoding was koi8-r. > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ishimoto at gembook.org Thu Apr 17 01:20:57 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Thu, 17 Apr 2008 08:20:57 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: References: <48015133.4020105@canterbury.ac.nz> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <797440730804161609s22a3e1f6ncc86da3da4b144b8@mail.gmail.com> Message-ID: <797440730804161620m6b0fd92uc449392504419e68@mail.gmail.com> 2008/4/17, Guido van Rossum : > I changed my mind already. :-) See my post of this morning in another thread. Ah, I missed the mail! Thank you. From stephen at xemacs.org Thu Apr 17 02:20:52 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 17 Apr 2008 09:20:52 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> Message-ID: <87wsmxpc0r.fsf@uwakimon.sk.tsukuba.ac.jp> I've reordered Guido's words. Guido van Rossum writes: > For those of us with less capable IO devices, setting the error flag > for stdout and stderr to backslashreplace is probably the best > solution (and it solves more problems than just repr()). True. But it doesn't solve the ambiguity problem on capable displays. 
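A minimal sketch of the two points above, assuming the file name quoted earlier in the thread ('\u65e5\u672c\u8a9e') and a plain ASCII target encoding. Here `str.encode` only stands in for what an ASCII-only stdout/stderr would do with the backslashreplace error handler; the look-alike pair is an illustrative choice, not taken from the thread.

```python
# Encode non-ASCII text for an ASCII-only device using backslashreplace.
text = "\u65e5\u672c\u8a9e"
encoded = text.encode("ascii", errors="backslashreplace")
print(encoded)

# Look-alike letters: LATIN CAPITAL LETTER A vs CYRILLIC CAPITAL LETTER A.
# Passed through unescaped they render the same on a capable display,
# but they are different code points.
latin_a = "A"
cyrillic_a = "\u0410"
same = latin_a == cyrillic_a
escaped = ascii(cyrillic_a)
print(same, escaped)
```

On a display that can render both letters, only the escaped form makes the difference visible, which is the ambiguity being discussed.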
> And just like in Python 3000 we're using UTF-8 as the default > source encoding and allowing Unicode letters in identifiers, I > think we should bite the bullet and allow repr() of a string to > pass through all characters that the Unicode standard considers > printable. The problem is that this doesn't display the representation of strings and identifier names in an unambiguous way. "AKMOT" could be all-ASCII, it could be all-Cyrillic, or it could be a mixture of ASCII, Cyrillic, and Greek. Odds are quite good that there are other scripts that could be mixed in, too. This kind of mixing happens all the time in Japanese, where people mix half-width and full-width ASCII with abandon (especially when altering digits in dates). I could easily see a Russian using Cyrillic 'A' to uppercase an ASCII 'a' in the same way. How about choosing a standard Python repertoire (based on the Unicode standard, of course) of which characters get a graphic repr and which ones get \u-escaped, and have a post-hook for repr which gets passed the string repr proposes to print out? This hook would always be identity in Python-distributed stuff, of course, but on the consenting adults principle applications and modules outside of the stdlib could use it. Would that be acceptable? The standard repertoire would grandfather ASCII, I suppose, because for the foreseeable future most identifiers are going to be ASCII, and all Python implementations will contain a lot of ASCII identifiers and strings indefinitely. From ntung at ntung.com Thu Apr 17 02:52:11 2008 From: ntung at ntung.com (Nicholas T) Date: Wed, 16 Apr 2008 17:52:11 -0700 Subject: [Python-3000] end scope of iteration variables after loop Message-ID: hello all, A few times in practice I have been tripped up by how Python keeps variables in scope after a loop--and it wasn't immediately obvious what the problem was. 
I think it is one of the ugliest and non-intuitive features, and hope some others agree that it should be changed in py3k. >>> for a in range(11): pass ... >>> print(a) 10 Thanks, Nicholas From greg.ewing at canterbury.ac.nz Thu Apr 17 03:00:56 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 13:00:56 +1200 Subject: [Python-3000] Recursive str In-Reply-To: References: <9e804ac0804110544k1a41a93cm2978b9ed170d6eea@mail.gmail.com> <20080411125521.GE25461@phd.pp.ru> <48015133.4020105@canterbury.ac.nz> <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> Message-ID: <4806A148.8020505@canterbury.ac.nz> Guido van Rossum wrote: > The more I think about this, the more I believe that repr() should > *not* be changed, and that instead we should give people who like to > see '???' instead of '\u1234\u5678\u9abc' other tools to help > themselves. This seems to be a rather ASCII-centric way of thinking about things, though, which I thought py3k was trying to get away from, with unicode being the one and only string type. Maybe it really is the only practical option, but I can understand non-ASCII speakers feeling disappointed.
-- Greg From greg.ewing at canterbury.ac.nz Thu Apr 17 03:11:58 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 13:11:58 +1200 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <48059966.4010102@v.loewis.de> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> Message-ID: <4806A3DE.6030807@canterbury.ac.nz> Martin v. Löwis wrote: > 3.6 byte > addressable unit of data storage large enough to hold any > member of the basic character set of the execution > environment Blarg. Well, I think the wording of that part of the standard is braindamaged. The word "byte" already has a pre-existing meaning outside of C, and the C standard shouldn't be redefining it for its own purposes. This is like a financial document that defines "dollar" as "the unit of currency in use in the country concerned". Thoroughly confusing and unnecessary. Particularly since they seem to just be defining "byte" to mean the same thing as "char". Why not just use the term "char" in the first place? -- Greg From gatoatigrado at gmail.com Thu Apr 17 03:25:26 2008 From: gatoatigrado at gmail.com (Nicholas T) Date: Wed, 16 Apr 2008 18:25:26 -0700 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: References: Message-ID: previous discussion at http://mail.python.org/pipermail/python-dev/2005-September/056677.html I don't agree with the author that >>> i = 3 >>> for i in range(11): pass ... >>> i 10 is much less confusing than i returning 3. furthermore, his C example makes it obvious that "i" will be available in the scope after the loop.
There's no way to know now, but I think mistakes would be less frequent. Additionally, what are others' opinions about this "pseudo-namespace" (i.e. scoping) being slow? Admittedly, I don't know much about the current parser's implementation, but it doesn't seem like scoping necessitates slow parsing -- considering it's done in other languages, and python functions have reasonable scope. >>> def do_nothing(i): i = 3 ... >>> do_nothing(1) >>> i 10 Nicholas On Wed, Apr 16, 2008 at 5:52 PM, Nicholas T wrote: > hello all, > > A few times in practice I have been tripped up by how Python keeps > variables in scope after a loop--and it wasn't immediately obvious what the > problem was. I think it is one of the ugliest and non-intuitive features, > and hope some others agree that it should be changed in py3k. > > >>> for a in range(11): pass > ... > >>> print(a) > 10 > > Thanks, > Nicholas From greg.ewing at canterbury.ac.nz Thu Apr 17 03:36:54 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 13:36:54 +1200 Subject: [Python-3000] Recursive str In-Reply-To: <20080416103207.GB6295@phd.pp.ru> References: <20080416103207.GB6295@phd.pp.ru> Message-ID: <4806A9B6.5000907@canterbury.ac.nz> Oleg Broytmann wrote: > Do I understand it right that str(objects) calls repr() on items to > properly quote strings? (str([1, '1']) must give "[1, '1']" as the result). > Is it the only reason? In the case of strings, yes. More generally, there can be any kind of object in the list, and repr(x) is more likely to give an unambiguous idea of what x is than str(x) when it's embedded in a comma-separated list. Python has no way of guessing the most appropriate way to display your list of objects when you use str(), so it doesn't try. You have to tell it by writing code to do what you want.
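A small sketch of the behaviour described above, using the list from Oleg's question. The explicit join is only one hypothetical way of "writing code to do what you want", not a recommendation from the thread.

```python
items = [1, '1']

# str() on a list uses repr() for each item, so the string element stays quoted.
as_str = str(items)
print(as_str)

# To display the items with str() instead, spell the formatting out explicitly.
joined = ', '.join(str(x) for x in items)
print(joined)
```

With the explicit join the integer 1 and the string '1' become indistinguishable, which is the ambiguity that repr() on the items avoids.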
-- Greg From greg.ewing at canterbury.ac.nz Thu Apr 17 03:53:37 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 13:53:37 +1200 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt Message-ID: <4806ADA1.3010001@canterbury.ac.nz> Oleg Broytmann wrote: > Traceback (most recent call last): > File "./ttt.py", line 4, in > open("????") # filename is in koi8-r encoding > IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4' In that particular case, I'd say the IOError constructor is doing the wrong thing -- it should be using something like "No such file or directory: '%s'" % filename\ instead of "No such file or directory: %r" % filename i.e. %r shouldn't be used as a quick and dirty way to get a string quoted. -- Greg From greg.ewing at canterbury.ac.nz Thu Apr 17 04:07:56 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 14:07:56 +1200 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: <480612CE.1010300@gmail.com> References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> Message-ID: <4806B0FC.2080005@canterbury.ac.nz> Nick Coghlan wrote: > Unfortunately, it turns out that the trick also breaks display of > strings containing any other escape codes. There's also the worry that it could trigger falsely on something that happened to look like \uxxxx but didn't originate from the repr of a unicode char. 
> I'm still > reasonably convinced that the issue of Unicode escapes for non-ASCII > users is best attacked as a display problem It can only ever be a heuristic, though, not an exact solution, since there isn't enough information left by the time it's a string to undo the escaping correctly in all cases. I'm currently thinking there are too many use cases overloaded onto repr() at the moment. -- Greg From greg.ewing at canterbury.ac.nz Thu Apr 17 05:00:24 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 15:00:24 +1200 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806B922.9070202@ar.media.kyoto-u.ac.jp> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> Message-ID: <4806BD48.6030908@canterbury.ac.nz> David Cournapeau wrote: > They are totally different concepts: byte is not a (C) type, but a unit, > the one returned by the sizeof operator. If a word is needed for this concept, then invent a new one, e.g. "size unit", rather than reusing "byte", which everyone already understands as meaning 8 bits. > C impose that sizeof(unsigned type) == sizeof(signed type) for any type, > so if one byte is one char, unsigned char would be a byte too, and so > unsigned char and char would be the same, which is obviously wrong. No, "char" and "unsigned char" can still be different types. You just need to say that sizeof(char) == sizeof(unsigned char) == 1, and leave bytes out of the discussion altogether. 
-- Greg From david at ar.media.kyoto-u.ac.jp Thu Apr 17 04:42:42 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 17 Apr 2008 11:42:42 +0900 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806A3DE.6030807@canterbury.ac.nz> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> Message-ID: <4806B922.9070202@ar.media.kyoto-u.ac.jp> Greg Ewing wrote: > > Blarg. Well, I think the wording of that part of the > standard is braindamaged. The word "byte" already has > a pre-existing meaning outside of C, and the C standard > shouldn't be redefining it for its own purposes. > > This is like a financial document that defines "dollar" > as "the unit of currency in use in the country concerned". > Thoroughly confusing and unnecessary. > > Particularly since they seem to just be defining "byte" > to mean the same thing as "char". Why not just use the > term "char" in the first place? > They are totally different concepts: byte is not a (C) type, but a unit, the one returned by the sizeof operator. One char occupies one byte of memory, and in memory, they are the same, but conceptually, they are totally different, from the C point of view at least. For example, C impose that sizeof(unsigned type) == sizeof(signed type) for any type, so if one byte is one char, unsigned char would be a byte too, and so unsigned char and char would be the same, which is obviously wrong. 
cheers, David From david at ar.media.kyoto-u.ac.jp Thu Apr 17 05:15:17 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 17 Apr 2008 12:15:17 +0900 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806BD48.6030908@canterbury.ac.nz> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> Message-ID: <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> Greg Ewing wrote: > > If a word is needed for this concept, then invent a new > one, e.g. "size unit", rather than reusing "byte", which > everyone already understands as meaning 8 bits. > Maybe everyone understands it as 8 bits, but it has always been wrong. Byte is a unit of storage, which often contains 8 bits, but not always. This definition of a byte as a unit of storage certainly precludes the convention that one byte = 8 bits; even if it always contained 8 bits, it would still be wrong to say that one byte is 8 bits. BTW: the byte notion (unit of storage) and its actual size are totally different concepts. > > No, "char" and "unsigned char" can still be different types. > You just need to say that sizeof(char) == sizeof(unsigned char) == 1, > and leave bytes out of the discussion altogether. > I was merely answering the question "why not use char in the first place": because they are totally different concepts.
If you assume char and byte are the same thing because sizeof(char) == 1 byte, then you should assume that unsigned char is the same as a byte, and thus that unsigned char and char are the same. This was a proof by contradiction :) cheers, David From foom at fuhm.net Thu Apr 17 05:59:40 2008 From: foom at fuhm.net (James Y Knight) Date: Wed, 16 Apr 2008 23:59:40 -0400 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806BD48.6030908@canterbury.ac.nz> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> Message-ID: On Apr 16, 2008, at 11:00 PM, Greg Ewing wrote: > If a word is needed for this concept, then invent a new > one, e.g. "size unit", rather than reusing "byte", which > everyone already understands as meaning 8 bits. Nope. Everyone understands "octet" to be 8 bits. Bytes being exactly 8 bits is itself the redefinition! In the not-too- distant-past, some hardware had 9-bit bytes. Common Lisp also uses the term "byte" to mean an arbitrary (specified) number of bits. E.g. 
http://www.lisp.org/HyperSpec/Body/typ_unsigned-byte.html See also http://dictionary.die.net/byte James From greg.ewing at canterbury.ac.nz Thu Apr 17 06:32:41 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 16:32:41 +1200 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> Message-ID: <4806D2E9.3080905@canterbury.ac.nz> David Cournapeau wrote: > Maybe everyone understands it as 8 bits, but it has always been wrong. It may not be officially written down anywhere, but almost everyone in the world understands a byte to mean 8 bits. When you go into a computer store and ask for 256MB of RAM, you don't expect to be asked "What size bytes would that be, then, sir?" So it's a de facto standard, and one that works perfectly well. Going against it is both futile and unnecessary, as far as I can see. -- Greg From aleaxit at gmail.com Thu Apr 17 06:59:08 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Wed, 16 Apr 2008 21:59:08 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <4806ADA1.3010001@canterbury.ac.nz> References: <4806ADA1.3010001@canterbury.ac.nz> Message-ID: On Wed, Apr 16, 2008 at 6:53 PM, Greg Ewing wrote: ... 
> > open("????") # filename is in koi8-r encoding > > IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4' > > In that particular case, I'd say the IOError constructor > is doing the wrong thing -- it should be using something > like > > "No such file or directory: '%s'" % filename\ > > instead of > > "No such file or directory: %r" % filename > > i.e. %r shouldn't be used as a quick and dirty way to > get a string quoted. I disagree: I always recommend using %r to display (in an error message, log entry, etc), a string that may be in error, NOT '%s', because the cause of the error can often be that the string mistakenly contains otherwise-invisible characters -- %r will show them clearly (as escape sequences), while %s could hide them and lead anybody but the most experienced developer to a long and frustrating debugging session. Alex From greg.ewing at canterbury.ac.nz Thu Apr 17 07:20:32 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 17 Apr 2008 17:20:32 +1200 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: References: <4806ADA1.3010001@canterbury.ac.nz> Message-ID: <4806DE20.8030903@canterbury.ac.nz> Alex Martelli wrote: > I disagree: I always recommend using %r to display (in an error > message, log entry, etc), a string that may be in error, For debugging messages, yes, but not output produced in the normal course of operation. And "File Not Found" I consider to be in the latter category -- the user typed in the wrong file name, but it's still a string, and should be displayed to him as such. If it's not a string, the program will most likely fall over with a TypeError trying to open the file before it gets as far as constructing an IOError. 
-- Greg From guido at python.org Thu Apr 17 07:38:15 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 16 Apr 2008 22:38:15 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <4806DE20.8030903@canterbury.ac.nz> References: <4806ADA1.3010001@canterbury.ac.nz> <4806DE20.8030903@canterbury.ac.nz> Message-ID: On Wed, Apr 16, 2008 at 10:20 PM, Greg Ewing wrote: > Alex Martelli wrote: > > I disagree: I always recommend using %r to display (in an error > > message, log entry, etc), a string that may be in error, > > For debugging messages, yes, but not output produced > in the normal course of operation. And "File Not Found" > I consider to be in the latter category -- the user > typed in the wrong file name, but it's still a string, > and should be displayed to him as such. I respectfully disagree. Control characters and such in the string should *definitely* be escaped. Regarding printable characters outside the ASCII range, see my post in another thread (which somehow nearly everybody appears to have missed); in Py3k I propose to pass printable Unicode characters unchanged through repr(). stdout/stderr will set their error attribute to backslashreplace so that if their encoding is ASCII or some such, out-of-range characters will be printed as \uxxxx rather than raising an exception during printing. But as I said, please follow up to my other post. Another reason to use %r is that if someone manages to include \n in a filename, with %s the log message might be spread across two lines, possibly confusing log parsers and even providing ways to hide illegal activities from log scanners. 
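A minimal sketch of the log-splitting concern raised above, assuming a hypothetical file name with an embedded newline; the message text mirrors the IOError wording quoted earlier in the thread.

```python
# Hypothetical file name containing an embedded newline.
filename = "data.txt\nIOError: forged entry"

with_s = "No such file or directory: '%s'" % filename
with_r = "No such file or directory: %r" % filename

# %s lets the newline through, so the message spans two log lines;
# %r escapes it and keeps the message on a single line.
lines_with_s = len(with_s.splitlines())
lines_with_r = len(with_r.splitlines())
print(lines_with_s, lines_with_r)
print(with_r)
```

A line-oriented log parser would see the %s variant as two separate entries, the second of which is entirely under the control of whoever chose the file name.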
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Thu Apr 17 07:47:37 2008 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 16 Apr 2008 23:47:37 -0600 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806D2E9.3080905@canterbury.ac.nz> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> <4806D2E9.3080905@canterbury.ac.nz> Message-ID: On Wed, Apr 16, 2008 at 10:32 PM, Greg Ewing wrote: > David Cournapeau wrote: > > > Maybe everyone understands it as 8 bits, but it has always been wrong. > > It may not be officially written down anywhere, but > almost everyone in the world understands a byte to mean > 8 bits. When you go into a computer store and ask for > 256MB of RAM, you don't expect to be asked "What size > bytes would that be, then, sir?" > > So it's a de facto standard, and one that works perfectly > well. Going against it is both futile and unnecessary, > as far as I can see. Sure, *now*, but C inherited their definition from a day when it wasn't so clear cut. It may be obsolete today, but good luck getting them to change the standard. 
-- Adam Olsen, aka Rhamphoryncus From martin at v.loewis.de Thu Apr 17 09:07:37 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Thu, 17 Apr 2008 09:07:37 +0200 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806D2E9.3080905@canterbury.ac.nz> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> <4806D2E9.3080905@canterbury.ac.nz> Message-ID: <4806F739.5080308@v.loewis.de> > So it's a de facto standard, and one that works perfectly > well. Going against it is both futile and unnecessary, > as far as I can see. Is python-3000 really the right place to debate the wording of the C standard? Now that you know what it says, you should accept that it does say that. If you want to change that, join your national standards body. Regards, Martin From amauryfa at gmail.com Thu Apr 17 10:48:18 2008 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Thu, 17 Apr 2008 10:48:18 +0200 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: References: Message-ID: Nicholas T wrote: > hello all, > > A few times in practice I have been tripped up by how Python keeps > variables in scope after a loop--and it wasn't immediately obvious what the > problem was. I think it is one of the ugliest and non-intuitive features, > and hope some others agree that it should be changed in py3k. > > >>> for a in range(11): pass > ...
> >>> print(a) > 10 There are use cases when the last value of the loop variable is needed after a "break" statement.

    for myObject in someList:
        if myObject.fits():
            break
    else:
        myObject = someDefaultValue
    # continue with myObject

See for example in csv.py, function Sniffer.has_header() [*]: The loop tries to find a suitable value for the "thisType" variable, then uses it. I like to use this pattern: it avoids an additional variable with the same meaning, and still separates the search from the other processing. [*] BTW, in the py3k version there are two obvious simplifications, due to the long->int massive replace. -- Amaury Forgeot d'Arc From phd at phd.pp.ru Thu Apr 17 11:14:10 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 17 Apr 2008 13:14:10 +0400 Subject: [Python-3000] Recursive str In-Reply-To: <4806A9B6.5000907@canterbury.ac.nz> References: <20080416103207.GB6295@phd.pp.ru> <4806A9B6.5000907@canterbury.ac.nz> Message-ID: <20080417091410.GA23016@phd.pp.ru> On Thu, Apr 17, 2008 at 01:36:54PM +1200, Greg Ewing wrote: > Oleg Broytmann wrote: > > Do I understand it right that str(objects) calls repr() on items to > > properly quote strings? (str([1, '1']) must give "[1, '1']" as the result). > > Is it the only reason? > > In the case of strings, yes. More generally, there > can be any kind of object in the list, and repr(x) > is more likely to give an unambiguous idea of what > x is than str(x) when it's embedded in a comma- > separated list. When I use str(container) instead of repr(container) does Python need to guess if I want an unambiguous representation or a printable representation of items? I don't think there is room for guessing - I explicitly said str(). > Python has no way of guessing the most appropriate > way to display your list of objects when you use > str(), so it doesn't try. It doesn't need to guess - all objects *except strings* have __str__, so it should just call it. > You have to tell it by > writing code to do what you want.
Well, I found the root of the problem. Python's builtin containers (list, tuple, dict, set) implement __repr__ but not __str__ and of course __repr__ calls repr() on items (which is the correct behaviour). The current implementation goes like this:

class object:
    def __str__(self):
        # In case the derived class doesn't implement it
        return repr(self)

class list(object):
    def __repr__(self):
        pieces = []
        for item in self:
            pieces.append(repr(item))
        return '[%s]' % (', '.join(pieces))

I'd like to see __str__ implemented the following way:

    def __str__(self):
        pieces = []
        for item in self:
            if isinstance(item, str):
                pieces.append("'%s'" % item)
            else:
                pieces.append(str(item))
        return '[%s]' % (', '.join(pieces))

Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
From solipsis at pitrou.net Thu Apr 17 11:43:00 2008 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 17 Apr 2008 09:43:00 +0000 (UTC) Subject: [Python-3000] [OT] sizeof(size_t) < sizeof(long) References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> Message-ID: James Y Knight fuhm.net> writes: > On Apr 16, 2008, at 11:00 PM, Greg Ewing wrote: > > > If a word is needed for this concept, then invent a new > > one, e.g. "size unit", rather than reusing "byte", which > > everyone already understands as meaning 8 bits. > > Nope. Everyone understands "octet" to be 8 bits. And in French, the only word for "byte" is... "octet" ;-) Antoine.
From ncoghlan at gmail.com Thu Apr 17 13:06:36 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 17 Apr 2008 21:06:36 +1000 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <797440730804161609k6deb9154m3b5ed831712135c9@mail.gmail.com> References: <48015133.4020105@canterbury.ac.nz> <797440730804151706u5a87b978of0641b46e25a87f1@mail.gmail.com> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <797440730804161609k6deb9154m3b5ed831712135c9@mail.gmail.com> Message-ID: <48072F3C.4060807@gmail.com> atsuo ishimoto wrote: > Question: Are you happy if you are forced to live with these hacks forever? > If not, why do you think I'll accept your suggestion? If they worked, I'd be happy to use them wherever they made my life easier. They don't work though, so the point is rather moot. I think attempting it makes a reasonable case that the problem you raise *can't* be adequately addressed purely as a display issue though, indicating it may be time to reconsider how repr() works for strings as you originally suggested. This is important information for a PEP writer to include to counter the arguments of anyone that initially has a similar attitude to the problem as I did. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Thu Apr 17 14:23:34 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 17 Apr 2008 22:23:34 +1000 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: References: <4806ADA1.3010001@canterbury.ac.nz> <4806DE20.8030903@canterbury.ac.nz> Message-ID: <48074146.4080807@gmail.com> Guido van Rossum wrote: > On Wed, Apr 16, 2008 at 10:20 PM, Greg Ewing > wrote: >> Alex Martelli wrote: >> > I disagree: I always recommend using %r to display (in an error >> > message, log entry, etc), a string that may be in error, >> >> For debugging messages, yes, but not output produced >> in the normal course of operation. And "File Not Found" >> I consider to be in the latter category -- the user >> typed in the wrong file name, but it's still a string, >> and should be displayed to him as such. > > I respectfully disagree. Control characters and such in the string > should *definitely* be escaped. Regarding printable characters outside > the ASCII range, see my post in another thread (which somehow nearly > everybody appears to have missed); Sorry, it got a "Usenet nod" from me after my efforts at working around the problem on the display side proved futile (anyone know how to delete an ASPN cookbook recipe that you've realised is fundamentally broken and is never going to work?). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From jimjjewett at gmail.com Thu Apr 17 16:13:14 2008 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 17 Apr 2008 10:13:14 -0400 Subject: [Python-3000] Recursive str In-Reply-To: <20080417091410.GA23016@phd.pp.ru> References: <20080416103207.GB6295@phd.pp.ru> <4806A9B6.5000907@canterbury.ac.nz> <20080417091410.GA23016@phd.pp.ru> Message-ID: I think asking every container type to implement str just to ensure its subobjects are printed correctly is a losing proposition. It might be possible for repr to take an extra keyword argument indicating that it is being used in place of string. Then, when it recurses on subobjects, it should call str instead of repr when this argument is passed. (And the top-level delegation should of course pass this argument when changing str to repr.) -jJ From guido at python.org Thu Apr 17 16:39:24 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 17 Apr 2008 07:39:24 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes at the interactive prompt In-Reply-To: <48074146.4080807@gmail.com> References: <4806ADA1.3010001@canterbury.ac.nz> <4806DE20.8030903@canterbury.ac.nz> <48074146.4080807@gmail.com> Message-ID: On Thu, Apr 17, 2008 at 5:23 AM, Nick Coghlan wrote: > Guido van Rossum wrote: > > Regarding printable characters outside > > the ASCII range, see my post in another thread (which somehow nearly > > everybody appears to have missed); > Sorry, it got a "Usenet nod" from me after my efforts at working around the > problem on the display side proved futile (anyone know how to delete an ASPN > cookbook recipe that you've realised is fundamentally broken and is never > going to work?). It's frustrating not to see it acknowledged because the bickering seems to be going on unfettered in several other threads. 
Also, work needs to be done, a proposal needs to be written up and reviewed. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Apr 17 17:00:35 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 17 Apr 2008 08:00:35 -0700 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: References: Message-ID: The substance of this discussion has already been answered by Amaury. I'd also like to remind everyone that at this point we're trying to get 3.0 (*and* 2.6!) stable enough to release by September 3rd. That's only about 4.5 months away! We should not be considering major language changes at this point. If you have an idea for a ground-breaking change, write to python-ideas and we'll consider it for 3.1 or 4.0. --Guido On Wed, Apr 16, 2008 at 5:52 PM, Nicholas T wrote: > hello all, > > A few times in practice I have been tripped up by how Python keeps > variables in scope after a loop--and it wasn't immediately obvious what the > problem was. I think it is one of the ugliest and non-intuitive features, > and hope some others agree that it should be changed in py3k. > > >>> for a in range(11): pass > ...
> >>> print(a) > 10 > > Thanks, > Nicholas > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: > http://mail.python.org/mailman/options/python-3000/guido%40python.org > > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ishimoto at gembook.org Thu Apr 17 17:39:50 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Fri, 18 Apr 2008 00:39:50 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> Message-ID: <797440730804170839t7a78b0e1j8ef1301fd0c7db36@mail.gmail.com> > I expect that this will require some more research and agreement. > Perhaps someone can produce a draft PEP and attempt to sort out the > details of specification and implementation? It would also be nice if > it could be friendly to Jython, IronPython and PyPy. I'll write a draft PEP, if people can stand my awful English. For me, writing a long document in English is harder and more time-consuming job than you might expect. So please be patient. I'll write a PEP as fast as I can. 
From janssen at parc.com Thu Apr 17 19:00:01 2008 From: janssen at parc.com (Bill Janssen) Date: Thu, 17 Apr 2008 10:00:01 PDT Subject: [Python-3000] =?utf-8?b?W09UXSBzaXplb2Yoc2l6ZV90KSA8IHNpemVvZihs?= =?utf-8?q?ong=29?= In-Reply-To: References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> Message-ID: <08Apr17.100002pdt."58696"@synergy1.parc.xerox.com> > And in French, the only word for "byte" is... "octet" ;-) Well, you can always use "byte". We won't mind :-). Bill From stephen at xemacs.org Thu Apr 17 19:56:27 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 18 Apr 2008 02:56:27 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: <797440730804170839t7a78b0e1j8ef1301fd0c7db36@mail.gmail.com> References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> <797440730804170839t7a78b0e1j8ef1301fd0c7db36@mail.gmail.com> Message-ID: <87ve2gnz5g.fsf@uwakimon.sk.tsukuba.ac.jp> atsuo ishimoto writes: > I'll write a draft PEP, if people can stand my awful English. For me, > writing a long document in English is harder and more time-consuming > job than you might expect. So please be patient. I'll write a PEP as > fast as I can. I'd be happy to help. 
I don't have time to write a PEP, but I can understand, speak and write Japanese, and help with wording. Contact me offlist if that's attractive to you. From greg.ewing at canterbury.ac.nz Thu Apr 17 23:24:34 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 18 Apr 2008 09:24:34 +1200 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> <4806D2E9.3080905@canterbury.ac.nz> Message-ID: <4807C012.8060800@canterbury.ac.nz> Adam Olsen wrote: > Sure, *now*, but C inherited their definition from a day when it > wasn't so clear cut. It may be obsolete today, but good luck getting > them to change the standard. I'm not really expecting the standard to be changed. But I do think it's silly for a modern C implementation for a modern CPU to take the letter of the C standard as implying that they have to use the word "byte" as though it meant something other than 8 bits. In the present day, that can only lead to confusion. 
-- Greg From martin at v.loewis.de Thu Apr 17 23:40:17 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 17 Apr 2008 23:40:17 +0200 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> Message-ID: <4807C3C1.6010602@v.loewis.de> > I do think we should use some kind of Unicode-standard-endorsed > definition of "printable" (as long as it excludes all ASCII escapes), I think unicodedata.category(c)[0] != "C" is fairly close. That excludes control characters (Cc), format characters (Cf), surrogates (Cs), private-use (Co) and unassigned characters (Cn). We should then also escape \, ' and ", following the traditional algorithm. Printable then would be all letters, numbers, punctuation, symbols, but also marks (e.g. TILDE, COMBINING RIGHT HARPOON ABOVE) and separators (SPACE, NO-BREAK SPACE, THREE-PER-EM SPACE, LINE SEPARATOR, PARAGRAPH SEPARATOR). It might be reasonable to also exclude line separators (Zl) and paragraph separators (Zp), each category having only one character in them. Regards, Martin From greg.ewing at canterbury.ac.nz Fri Apr 18 00:25:51 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 18 Apr 2008 10:25:51 +1200 Subject: [Python-3000] Recursive str In-Reply-To: <20080417091410.GA23016@phd.pp.ru> References: <20080416103207.GB6295@phd.pp.ru> <4806A9B6.5000907@canterbury.ac.nz> <20080417091410.GA23016@phd.pp.ru> Message-ID: <4807CE6F.1020100@canterbury.ac.nz> Oleg Broytmann wrote: > When I use str(container) instead of repr(comtainer) does Python need to > guess if I want an unambiguous representation or a printable representation > of items? 
I don't think there is a room for guessing - I explicitly said > str(). But there's no single, obvious way of doing str() on a list that will suit all situations. So Python doesn't define str() for a list *at all*. You're getting the fallback, which is repr(). -- Greg From ntung at ntung.com Fri Apr 18 10:10:47 2008 From: ntung at ntung.com (Nicholas T) Date: Fri, 18 Apr 2008 01:10:47 -0700 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: References: Message-ID: Amaury - I think it's generally cleaner code to write for myObject in someList: if myObject.fits(): process(myObject) break than for myObject in someList: if myObject.fits(): break process(myObject) I see from csv.py how it could simplify things (e.g. if the else case was less trivial); however, for csv.py specifically, lines 372 to 392 could prob. be rewritten as # default to length of string thisType = len(row[col]) for typeFunc in [int, float, complex]: try: typeFunc(row[col]) thisType = typeFunc break except (ValueError, OverflowError): pass if columnTypes[col] is None: # add new column type columnTypes[col] = thisType elif thisType != columnTypes[col]: # type is inconsistent, remove column from consideration del columnTypes[col] I'd be interested in seeing how often it is actually used. I suppose carrying loop variable after the loop makes some sense in the context of having only local and global scopes: clearly one wouldn't want to make code inside the loop use "global" to access variables outside of the loop. Creating a special scope for loop iteration variables would probably also be a bad thing, though py3k currently prints a warning about concurrent modification; perhaps this is not so different. Guido - sorry I didn't know. Given how scopes work in python, I don't think this is going to go anywhere, so at the moment I'm not going to repost / revive arguments. Thanks, Nicholas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-3000/attachments/20080418/826f6208/attachment.htm From facundobatista at gmail.com Fri Apr 18 15:15:18 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Fri, 18 Apr 2008 10:15:18 -0300 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: References: Message-ID: 2008/4/18, Nicholas T : > Amaury - I think it's generally cleaner code to write > for myObject in someList: > if myObject.fits(): > process(myObject) > break > than > for myObject in someList: > if myObject.fits(): > break > process(myObject) See, I do this a lot: for a, b, c in someList: if : break else: c = foobar c.something() Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From R.W.Thomas.02 at cantab.net Fri Apr 18 17:29:03 2008 From: R.W.Thomas.02 at cantab.net (Richard Thomas) Date: Fri, 18 Apr 2008 16:29:03 +0100 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: References: Message-ID: <1b54228d0804180829v312f717difbbfe3100ac911c1@mail.gmail.com> I like that loop variables end up still in scope, as demonstrated so far on this list it is quite useful, but only when there is a break somewhere. The one that confuses me, therefore, is the dummy variables in a generator expression leaking into the scope defining that expression. Hence: x = 0 L = [f(x) for x in range(2)] assert x == 1 This is not particularly intuitive as the for loop in a generator expression can never break; generator expressions feel more "closed". Richard. 
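The search idiom discussed in this thread (break out of the loop on a match, fall back to a default in the else clause, then keep using the loop variable) can be sketched roughly as follows. The names find_first, fits and default are hypothetical and chosen only for illustration; the last lines show the Python 3 behaviour for the list comprehension case Richard raises.

```python
def find_first(items, fits, default):
    """Return the first item for which fits(item) is true, else default."""
    for item in items:
        if fits(item):
            break
    else:
        # Runs only when the loop finished without hitting break.
        item = default
    # The loop variable is still bound here, which is what the idiom relies on.
    return item


def is_even(n):
    return n % 2 == 0


print(find_first([1, 3, 4, 5], is_even, None))  # first even number
print(find_first([1, 3, 5], is_even, None))     # no match, so the default

# In Python 3 a list comprehension has its own scope, so the outer x is untouched.
x = 0
squares = [x * x for x in range(3)]
print(x, squares)
```

The else clause is what keeps the fallback next to the search, so no extra "found" flag is needed.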
On Fri, Apr 18, 2008 at 2:15 PM, Facundo Batista wrote: > 2008/4/18, Nicholas T : > > > > Amaury - I think it's generally cleaner code to write > > for myObject in someList: > > if myObject.fits(): > > process(myObject) > > break > > than > > for myObject in someList: > > if myObject.fits(): > > break > > process(myObject) > > See, I do this a lot: > > for a, b, c in someList: > if : > break > else: > c = foobar > c.something() > > Regards, > > -- > . Facundo > > Blog: http://www.taniquetil.com.ar/plog/ > PyAr: http://www.python.org/ar/ > > > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/r.w.thomas.02%40cantab.net > From aleaxit at gmail.com Fri Apr 18 17:57:35 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Fri, 18 Apr 2008 08:57:35 -0700 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: <1b54228d0804180829v312f717difbbfe3100ac911c1@mail.gmail.com> References: <1b54228d0804180829v312f717difbbfe3100ac911c1@mail.gmail.com> Message-ID: On Fri, Apr 18, 2008 at 8:29 AM, Richard Thomas wrote: > I like that loop variables end up still in scope, as demonstrated so > far on this list it is quite useful, but only when there is a break > somewhere. The one that confuses me, therefore, is the dummy variables > in a generator expression leaking into the scope defining that > expression. Hence: > > x = 0 > L = [f(x) for x in range(2)] > assert x == 1 There is no genexp here -- this is a list comprehension. Generator expressions do NOT leak their control variable; LCs were originally designed to leak (to mimic a for loop's semantics exactly) and thus had to remain that way throughout 2.* -- BUT that's changed in 3.0 (download and try the alpha!), where LCs don't "leak" any more. 
> This is not particularly intuitive as the for loop in a generator > expression can never break; generator expressions feel more "closed". They are and always have been; list comprehensions also become that way in 3.* (can't change in 2.* for obvious reasons of backwards compatibility). Alex From robin at nibor.org Fri Apr 18 18:03:10 2008 From: robin at nibor.org (Robin Stocker) Date: Fri, 18 Apr 2008 18:03:10 +0200 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: <1b54228d0804180829v312f717difbbfe3100ac911c1@mail.gmail.com> References: <1b54228d0804180829v312f717difbbfe3100ac911c1@mail.gmail.com> Message-ID: <4808C63E.70402@nibor.org> Richard Thomas schrieb: > I like that loop variables end up still in scope, as demonstrated so > far on this list it is quite useful, but only when there is a break > somewhere. The one that confuses me, therefore, is the dummy variables > in a generator expression leaking into the scope defining that > expression. Hence: > > x = 0 > L = [f(x) for x in range(2)] > assert x == 1 > > This is not particularly intuitive as the for loop in a generator > expression can never break; generator expressions feel more "closed". It's fixed in Python 3: Python 3.0a4+ (py3k:62372, Apr 18 2008, 17:45:09) [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> x = 0 >>> l = [str(x) for x in range(2)] >>> x 0 Robin From greg.ewing at canterbury.ac.nz Sat Apr 19 02:30:29 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 19 Apr 2008 12:30:29 +1200 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: References: <1b54228d0804180829v312f717difbbfe3100ac911c1@mail.gmail.com> Message-ID: <48093D25.7050401@canterbury.ac.nz> Alex Martelli wrote: > LCs were originally designed to leak Well, they weren't really *designed* to leak, it was just a side effect of the implementation. 
It seems to have been decided that it was better to keep it that way than risk breaking things that might depend on it. Personally I would rather have had it documented as "undefined". -- Greg From ishimoto at gembook.org Sat Apr 19 04:35:19 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Sat, 19 Apr 2008 11:35:19 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> Message-ID: <797440730804181935p1f618e90ob1b8b9efb48932c3@mail.gmail.com> 2008/4/17, Guido van Rossum : > For those of us with less capable IO devices, setting the error flag > for stdout and stderr to backslashreplace is probably the best > solution (and it solves more problems than just repr()). > Some thoughts on points I found while investigating further.
- A lot of people use utf-8 for their encoding, such as de_DE.utf8. In such a case, the backslashreplace trick doesn't work. I need to find a way to select an appropriate codec to render unwanted characters as \uXXXX.
- io.TextIOWrapper doesn't provide an interface to change the encoding and error handler after it has been created. This feature is specified in PEP 3116, but isn't implemented at this time. Will it be implemented?
It would be nice if we had optional encoding and errors args for print() and TextIOWrapper.write(), so people could write print(repr(obj), 'koi8-r', 'backslashreplace').
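Two points from this thread can be sketched in a few lines: Martin's rough "printable" test based on Unicode general categories, and the effect of the backslashreplace error handler on a text stream. This is only an illustration written against the io and unicodedata modules as they exist in current Python 3 releases, not the 3.0 alpha under discussion; is_printable and render are hypothetical helper names, not proposed APIs.

```python
import io
import unicodedata


def is_printable(ch):
    """Rough test from the thread: anything outside the Unicode 'C*' categories."""
    return unicodedata.category(ch)[0] != "C"


def render(text, encoding="ascii", errors="backslashreplace"):
    """Return the bytes a text stream with this encoding/errors pair would emit."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Letter, control character, NO-BREAK SPACE (a separator, so it counts as printable).
print(is_printable("a"), is_printable("\n"), is_printable("\u00a0"))

# On an ASCII stream the e-acute cannot be encoded, so it is escaped.
print(render("caf\u00e9"))

# On a UTF-8 stream every character is encodable, so the handler never fires.
print(render("caf\u00e9", "utf-8"))
```

The UTF-8 case is the situation described in the first bullet above: the error handler only runs for characters the codec cannot encode, so it cannot be used to escape characters under UTF-8.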
From ncoghlan at gmail.com Sat Apr 19 14:01:44 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 19 Apr 2008 22:01:44 +1000 Subject: [Python-3000] end scope of iteration variables after loop In-Reply-To: <48093D25.7050401@canterbury.ac.nz> References: <1b54228d0804180829v312f717difbbfe3100ac911c1@mail.gmail.com> <48093D25.7050401@canterbury.ac.nz> Message-ID: <4809DF28.5090207@gmail.com> Greg Ewing wrote: > Alex Martelli wrote: >> LCs were originally designed to leak > > Well, they weren't really *designed* to leak, it was > just a side effect of the implementation. > > It seems to have been decided that it was better to > keep it that way than risk breaking things that might > depend on it. Personally I would rather have had it > documented as "undefined". Genexps don't leak their iteration variables, and neither do Py3k list comps. So it has only been kept that way in 2.x because nobody came up with a graceful way of deprecating it in the past and "list(genexp)" is now available as a trivial workaround. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From theaney at gmail.com Sun Apr 13 01:33:36 2008 From: theaney at gmail.com (Tim Heaney) Date: Sat, 12 Apr 2008 19:33:36 -0400 Subject: [Python-3000] os.popen versus subprocess.Popen Message-ID: In Python 3.0, it seems that os.popen yields a string, whereas subprocess.Popen yields bytes $ ./python Python 3.0a4 (r30a4:62119, Apr 12 2008, 18:15:16) [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import os, subprocess >>> os.popen('date').readline() 'Sat Apr 12 19:08:05 EDT 2008\n' >>> subprocess.Popen(['date'], stdout=subprocess.PIPE).communicate()[0] b'Sat Apr 12 19:08:13 EDT 2008\n' Is this intentional? If so, why should I expect this? Thanks! 
Tim From mwm at mired.org Thu Apr 17 07:19:24 2008 From: mwm at mired.org (Mike Meyer) Date: Thu, 17 Apr 2008 01:19:24 -0400 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <4806D2E9.3080905@canterbury.ac.nz> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> <4806D2E9.3080905@canterbury.ac.nz> Message-ID: <20080417011924.6f1b62d2@bhuda.mired.org> On Thu, 17 Apr 2008 16:32:41 +1200 Greg Ewing wrote: > David Cournapeau wrote: > > > Maybe everyone understands it as 8 bits, but it has always been wrong. > > It may not be officially written down anywhere, but > almost everyone in the world understands a byte to mean > 8 bits. When you go into a computer store and ask for > 256MB of RAM, you don't expect to be asked "What size > bytes would that be, then, sir?" The key word is *almost*. And actually, the reason it's almost is because it the context is *almost* always hardware with 8 bit bytes. If the computer store in question exclusively sold hardware that used 9-bit bytes, then they wouldn't ask what size the bytes should be - they'd just give you 9-bit bytes. If they sold heterogeneous hardware, they might well ask. > So it's a de facto standard, and one that works perfectly > well. Going against it is both futile and unnecessary, > as far as I can see. Yup, it's probably futile - most people don't care about portability or precision, and will use "byte" to mean "8-bit byte". 
On the other hand, trying to redefine "byte" to mean "8-bit byte" is also futile, because any company that builds hardware (or software for bit-slice processors, or ...) that manipulates subword units that hold single characters is going to call those things bytes, no matter what length they are. Standards can't get away with the sloppy usage that's common practice. So they wind up providing definitions for words that may seem to contradict or repeat common usage, or using uncommon words with a precise meaning in place of a common word that usually, but not always, has that meaning. You could make pretty much the same case that "computer" means "machine running Windows". That is what almost everyone in the world understands "computer" to mean. If I go into a computer store and ask for a "computer", I expect them to offer me a machine running Windows without asking "What operating system would that have, then, sir?" http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information.
From jjb5 at cornell.edu Mon Apr 21 16:05:14 2008 From: jjb5 at cornell.edu (Joel Bender) Date: Mon, 21 Apr 2008 10:05:14 -0400 Subject: [Python-3000] sizeof(size_t) < sizeof(long) In-Reply-To: <20080417011924.6f1b62d2@bhuda.mired.org> References: <87D3F9C72FBF214DB39FA4E3FE618CDC6E22D20148@EXMBX04.exchhosting.com> <5c6f2a5d0804141538q2e5a777dp60807879a56105fe@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74C2C@EXMBX04.exchhosting.com> <5c6f2a5d0804141729t63dec56an4c53176249cee9b5@mail.gmail.com> <87D3F9C72FBF214DB39FA4E3FE618CDC6E22E74CB6@EXMBX04.exchhosting.com> <4804B83F.5010504@gmail.com> <48056F4A.4030102@canterbury.ac.nz> <48059966.4010102@v.loewis.de> <4806A3DE.6030807@canterbury.ac.nz> <4806B922.9070202@ar.media.kyoto-u.ac.jp> <4806BD48.6030908@canterbury.ac.nz> <4806C0C5.3090703@ar.media.kyoto-u.ac.jp> <4806D2E9.3080905@canterbury.ac.nz> <20080417011924.6f1b62d2@bhuda.mired.org> Message-ID: <480C9F1A.1020505@cornell.edu> Mike Meyer wrote: > Yup, it's probably futile - most people don't care about portability > or precision, and will use "byte" to mean "8-bit byte". Nor will this be an issue in Python. Maybe an inset paragraph on some footnote of a bit of documentation on a wiki page. > Standards can't get away with the sloppy usage that's common > practice. So they wind up providing definitions for words that may > seem to contradict or repeat common usage, or using uncommon words > with a precise meaning in place of a common word that usually, but not > always, has that meaning. As Guido succinctly wrote to me: > ...octet is not, and never will be a technical term for > Python. It is a silly standards body compromise. While I think "silly" might have been an overstatement, I think the point is clear enough. In the context of Python, bytes will be 8 bits, and arguments about the appropriateness of that definition are silly. 
Joel From guido at python.org Mon Apr 21 19:30:02 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 21 Apr 2008 10:30:02 -0700 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: Message-ID: IMO os.popen() is wrong here. On Sat, Apr 12, 2008 at 4:33 PM, Tim Heaney wrote: > In Python 3.0, it seems that os.popen yields a string, whereas > subprocess.Popen yields bytes > > $ ./python > Python 3.0a4 (r30a4:62119, Apr 12 2008, 18:15:16) > [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import os, subprocess > >>> os.popen('date').readline() > 'Sat Apr 12 19:08:05 EDT 2008\n' > >>> subprocess.Popen(['date'], stdout=subprocess.PIPE).communicate()[0] > b'Sat Apr 12 19:08:13 EDT 2008\n' > > Is this intentional? If so, why should I expect this? Thanks! > > Tim > _______________________________________________ > Python-3000 mailing list > Python-3000 at python.org > http://mail.python.org/mailman/listinfo/python-3000 > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Mon Apr 21 23:44:38 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 21 Apr 2008 23:44:38 +0200 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: Message-ID: <480D0AC6.3080500@v.loewis.de> > IMO os.popen() is wrong here. Should os.popen go away entirely? Apparently, it does two things: a) redefine close to block until the child process terminated, and b) wrap stdout/stdout with a TextIOWrapper If there is an actual need to specify an encoding when communicating with the subprocess, I'd rather make that parameter to Popen itself. 
Regards, Martin From guido at python.org Mon Apr 21 23:49:09 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 21 Apr 2008 14:49:09 -0700 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: <480D0AC6.3080500@v.loewis.de> References: <480D0AC6.3080500@v.loewis.de> Message-ID: I think the original plan was to reimplement os.popen() on top of subprocess.py as a convenience (the API is an order of magnitude simpler). That still sounds good to me. On Mon, Apr 21, 2008 at 2:44 PM, "Martin v. L?wis" wrote: > > IMO os.popen() is wrong here. > > Should os.popen go away entirely? > > Apparently, it does two things: > a) redefine close to block until the child process terminated, > and > b) wrap stdout/stdout with a TextIOWrapper > > If there is an actual need to specify an encoding when communicating > with the subprocess, I'd rather make that parameter to Popen itself. > > Regards, > Martin > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Apr 22 17:39:47 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 22 Apr 2008 08:39:47 -0700 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: Message-ID: I need to retract this. os.popen() has a 'mode' flag that indicates reading or writing but also specifies text vs. binary. So os.popen(..., 'r') should return a text stream, while os.popen(..., 'rb') should return a binary stream. The subprocess module has similar options, though the default is geared more towards binary. I still think os.popen() should be reimplemented on top of subprocess, and add the same optional flags as the open() function has grown to indicate encoding and buffering. I think the more complex variants (popen2, popen3, popen4, ...?) should probably go away, since it's easy enough to do what they do using the subprocess module, and there were some serious API design mistakes there (confusing reversal of input and output in some cases). 
--Guido On Mon, Apr 21, 2008 at 10:30 AM, Guido van Rossum wrote: > IMO os.popen() is wrong here. > > > > On Sat, Apr 12, 2008 at 4:33 PM, Tim Heaney wrote: > > In Python 3.0, it seems that os.popen yields a string, whereas > > subprocess.Popen yields bytes > > > > $ ./python > > Python 3.0a4 (r30a4:62119, Apr 12 2008, 18:15:16) > > [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. > > >>> import os, subprocess > > >>> os.popen('date').readline() > > 'Sat Apr 12 19:08:05 EDT 2008\n' > > >>> subprocess.Popen(['date'], stdout=subprocess.PIPE).communicate()[0] > > b'Sat Apr 12 19:08:13 EDT 2008\n' > > > > Is this intentional? If so, why should I expect this? Thanks! > > > > Tim > > _______________________________________________ > > Python-3000 mailing list > > Python-3000 at python.org > > http://mail.python.org/mailman/listinfo/python-3000 > > Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org > > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From facundobatista at gmail.com Tue Apr 22 17:57:56 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 22 Apr 2008 12:57:56 -0300 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: Message-ID: 2008/4/22, Guido van Rossum : > I still think os.popen() should be reimplemented on top of subprocess, > and add the same optional flags as the open() function has grown to > indicate encoding and buffering. os.popen() is deprecated in 2.6, with the recommendation of using the subprocess module. In view of this, I'm always recommending to use os.system() for easy execution (no control), and the subprocess.Popen() for full control (not so easy to use). 
I agree that it should appear a more easy to use function, and also agree that it should be constructed over subprocess.Popen(). But, as all os.popen* functions are deprecated, and it will be constructed over subprocess.Popen(), I think that this easy-to-use function should be in the subprocess module. BTW, the top two complains about subprocess.Popen "complicated API" in the Python Argentina mail list and the courses I give, are: - Why can't I write "ls -l", instead of ["ls", "-l"] (people ends writing "ls -l".split()) - The parameter stdout should default to subprocess.PIPE Maybe we could use this feedback for the ease-to-use function. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From amcnabb at mcnabbs.org Tue Apr 22 19:42:48 2008 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Tue, 22 Apr 2008 11:42:48 -0600 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: Message-ID: <20080422174248.GA3422@mcnabbs.org> On Tue, Apr 22, 2008 at 12:57:56PM -0300, Facundo Batista wrote: > > - Why can't I write "ls -l", instead of ["ls", "-l"] (people ends > writing "ls -l".split()) That's the best thing about subprocess. Whenever I've used APIs that accept a single string instead of list of arguments, I've quickly descended into quoting hell. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From facundobatista at gmail.com Tue Apr 22 20:33:30 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 22 Apr 2008 15:33:30 -0300 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: <20080422174248.GA3422@mcnabbs.org> References: <20080422174248.GA3422@mcnabbs.org> Message-ID: 2008/4/22, Andrew McNabb : > That's the best thing about subprocess. 
Whenever I've used APIs that > accept a single string instead of list of arguments, I've quickly > descended into quoting hell. I don't understand why, could you please provide me one example or two? Thank you! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From amcnabb at mcnabbs.org Tue Apr 22 21:39:11 2008 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Tue, 22 Apr 2008 13:39:11 -0600 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: <20080422174248.GA3422@mcnabbs.org> Message-ID: <20080422193911.GB3422@mcnabbs.org> On Tue, Apr 22, 2008 at 03:33:30PM -0300, Facundo Batista wrote: > 2008/4/22, Andrew McNabb : > > > That's the best thing about subprocess. Whenever I've used APIs that > > accept a single string instead of list of arguments, I've quickly > > descended into quoting hell. > > I don't understand why, could you please provide me one example or two? Here's a really simple example: ("bash", "-c", 'FILE="/tmp/a b c"; cat "$FILE"') That's pretty simple as a list of arguments. But if you do it as a single string, you get: 'bash -c \'FILE="/tmp/a b c"; cat "$FILE"\'' It can get much worse than this, especially if you need to use backslashes. Here's another argument that you might find even more convincing. What if you got a filename from a user, and had to pass that filename as an argument to a command. If your argument was a string, like: "cat %s" % filename then your program would break if filename contained spaces. However, if your arguments are ("cat", filename) then everything does exactly what you expect. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From gzlist at googlemail.com Tue Apr 22 21:42:33 2008 From: gzlist at googlemail.com (Martin (gzlist)) Date: Tue, 22 Apr 2008 20:42:33 +0100 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: <20080422174248.GA3422@mcnabbs.org> Message-ID: On 22/04/2008, Facundo Batista wrote: > 2008/4/22, Andrew McNabb : > > > > That's the best thing about subprocess. Whenever I've used APIs that > > accept a single string instead of list of arguments, I've quickly > > descended into quoting hell. > > > I don't understand why, could you please provide me one example or two? > > Thank you! > > > -- > . Facundo Well, here's a real method I ran into a while back - with .replace(origname, 'example'). Written by a perfectly able python programmer in publicly released code: def example_command(self, command, args): args = ' '.join("'%s'" % arg for arg in args) return 'example --example-dir=%s %s %s' % (self.example_dir, command, args) What's wrong with it? Well, if you use unix, with no spaces in directories and legal 'quote', you'll probably say "nothing". String interpolating command calls makes for unnecessarily non-portable code. Now, if subprocess could only start (optionally) taking unicode arguments so I could actually get at my whole filesystem through it... Martin From phd at phd.pp.ru Tue Apr 22 21:47:44 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Tue, 22 Apr 2008 23:47:44 +0400 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: <20080422193911.GB3422@mcnabbs.org> References: <20080422174248.GA3422@mcnabbs.org> <20080422193911.GB3422@mcnabbs.org> Message-ID: <20080422194744.GA26008@phd.pp.ru> On Tue, Apr 22, 2008 at 01:39:11PM -0600, Andrew McNabb wrote: > "cat %s" % filename > > then your program would break if filename contained spaces. 
It'd break even worse if the filename contains ';' or any other command-separated character (&&, ||, etc.) Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From guido at python.org Tue Apr 22 21:52:04 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 22 Apr 2008 12:52:04 -0700 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: Message-ID: On Tue, Apr 22, 2008 at 8:57 AM, Facundo Batista wrote: > 2008/4/22, Guido van Rossum : > > I still think os.popen() should be reimplemented on top of subprocess, > > and add the same optional flags as the open() function has grown to > > indicate encoding and buffering. > > os.popen() is deprecated in 2.6, with the recommendation of using the > subprocess module. I forgot about that. Well, I propose to undeprecate it or at lest replicate it as subprocess.popen(). > In view of this, I'm always recommending to use os.system() for easy > execution (no control), and the subprocess.Popen() for full control > (not so easy to use). > > I agree that it should appear a more easy to use function, and also > agree that it should be constructed over subprocess.Popen(). But, as > all os.popen* functions are deprecated, and it will be constructed > over subprocess.Popen(), I think that this easy-to-use function should > be in the subprocess module. Good plan. > BTW, the top two complains about subprocess.Popen "complicated API" in > the Python Argentina mail list and the courses I give, are: > > - Why can't I write "ls -l", instead of ["ls", "-l"] (people ends > writing "ls -l".split()) There's a flag that will make it do that for you (I think it's called shell or some such). > - The parameter stdout should default to subprocess.PIPE That depends on what you want to do with the output. I think it's fine as it is. > Maybe we could use this feedback for the ease-to-use function. 
IMO the easy-to-use function should replicate os.popen() closely. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From facundobatista at gmail.com Tue Apr 22 21:52:42 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 22 Apr 2008 16:52:42 -0300 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: <20080422193911.GB3422@mcnabbs.org> References: <20080422174248.GA3422@mcnabbs.org> <20080422193911.GB3422@mcnabbs.org> Message-ID: 2008/4/22, Andrew McNabb : > Here's a really simple example: > > ("bash", "-c", 'FILE="/tmp/a b c"; cat "$FILE"') > > That's pretty simple as a list of arguments. But if you do it as a > single string, you get: > > 'bash -c \'FILE="/tmp/a b c"; cat "$FILE"\'' > > It can get much worse than this, especially if you need to use > backslashes. I think that force me to write a tuple or a list just in case I'd need to write a string that uses simple and double quotes, or backslashes, because it's "ugly", don't worth it. What about growing the possibility of write a tuple/list *or* a string, and if I have a string, just use it? You could say that writing a plain string I incur in the risk of not enclosing the parameters correctly at bash level, but note that you're still doing that quote enclosing even in the tuple/list, and that Python normally treats the programmer as an adult. Regards, -- . 
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From amcnabb at mcnabbs.org Tue Apr 22 22:07:14 2008 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Tue, 22 Apr 2008 14:07:14 -0600 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: <20080422174248.GA3422@mcnabbs.org> <20080422193911.GB3422@mcnabbs.org> Message-ID: <20080422200714.GE3422@mcnabbs.org> On Tue, Apr 22, 2008 at 04:52:42PM -0300, Facundo Batista wrote: > > I think that force me to write a tuple or a list just in case I'd need > to write a string that uses simple and double quotes, or backslashes, > because it's "ugly", don't worth it. Or spaces, or user input, or any special shell characters. Basically, if you give a list or tuple of arguments, you can fork and exec. It's really simple, and it does what you expect. If you specify a string, then either Bash or something else has to parse the input and separate it into arguments. If any user input is involved, there will almost certainly be security problems. If not, it will frequently break anyway. As Guido pointed out, you can specify shell=True to get this latter behavior. But if you do this, you often sacrifice correctness and/or security. It's not a good habit. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From mwm at mired.org Tue Apr 22 22:34:49 2008 From: mwm at mired.org (Mike Meyer) Date: Tue, 22 Apr 2008 16:34:49 -0400 Subject: [Python-3000] os.popen versus subprocess.Popen In-Reply-To: References: <20080422174248.GA3422@mcnabbs.org> <20080422193911.GB3422@mcnabbs.org> Message-ID: <20080422163449.07cd9586@mbook-fbsd> On Tue, 22 Apr 2008 16:52:42 -0300 "Facundo Batista" wrote: > 2008/4/22, Andrew McNabb : > > > Here's a really simple example: > > > > ("bash", "-c", 'FILE="/tmp/a b c"; cat "$FILE"') > > > > That's pretty simple as a list of arguments. But if you do it as a > > single string, you get: > > > > 'bash -c \'FILE="/tmp/a b c"; cat "$FILE"\'' > > > > It can get much worse than this, especially if you need to use > > backslashes. > > I think that force me to write a tuple or a list just in case I'd need > to write a string that uses simple and double quotes, or backslashes, > because it's "ugly", don't worth it. But it's *not* because it's "ugly". It's because it's *safe*. > What about growing the possibility of write a tuple/list *or* a > string, and if I have a string, just use it? You could say that > writing a plain string I incur in the risk of not enclosing the > parameters correctly at bash level, but note that you're still doing > that quote enclosing even in the tuple/list, and that Python normally > treats the programmer as an adult. No, the two cases really are different. If you pass in a string, the shell will parse the string into a list of strings to hand to exec, allowing a hostile user to use data injection attacks to do all kinds of nasty things unless you're very, very careful. Getting this right is hard. If you pass in a list or tuple, the strings involved are passed to exec as is. 
You don't have to figure out how to get the shell to parse the string into the list you actually want; nor do you have to worry about the shell treating something you thought was data as executable code (at least unless you are exec'ing the shell yourself), pretty much killing data injection attacks. Basically, unless you're using a relatively simple, fixed string, the right way to write this is a list or tuple of strings. While making Popen act as if you set shell=True if you handed it a single string might be desirable, you can't tell if said string is a constant string, or the buggy "cat %s" % filename (whereas ("cat", filename) isn't buggy). Further, the nature of the bug - it allows hostile users to get your shell to execute arbitrary code - is enough to justify not making this case any easier than it already is. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. From musiccomposition at gmail.com Wed Apr 23 16:32:19 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 23 Apr 2008 09:32:19 -0500 Subject: [Python-3000] gettext Message-ID: <1afaf6160804230732t266c8285la97a1fac62f96a8d@mail.gmail.com> [I'm not a gettext expert, so sorry if the following is totally wrong. :)] Are we going to want to keep the "u" variants of the gettext APIs around in 3.0? Also, the unicode parameters (for .install methods) don't make much sense in 3.0. I don't see how we could remove them in 3.0, but perhaps rename then to their non-"u" variants and deprecate? -- Cheers, Benjamin Peterson From ntung at ntung.com Wed Apr 23 22:56:57 2008 From: ntung at ntung.com (Nicholas T) Date: Wed, 23 Apr 2008 13:56:57 -0700 Subject: [Python-3000] what do I use in place of reduce? Message-ID: Hi all, It's obvious how to use LC's to replace map and filter, but what about reduce? It is one of my favorite functions. 
>>> time=1901248 >>> reduce(lambda a, b: a[:-1] + [a[-1]%b, math.floor(a[-1]/b)], [[time], 60, 60, 24]) [28, 7.0, 0.0, 22.0] # secs, mins, hrs, days Nicholas -- http://ntung.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From musiccomposition at gmail.com Wed Apr 23 23:00:41 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Wed, 23 Apr 2008 16:00:41 -0500 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: References: Message-ID: <1afaf6160804231400vfa5835ckbb40d66e3cb124db@mail.gmail.com> On Wed, Apr 23, 2008 at 3:56 PM, Nicholas T wrote: > Hi all, > > It's obvious how to use LC's to replace map and filter, but what about > reduce? It is one of my favorite functions. It's still there. Just in the functools module! In the future, please ask comp.lang.python or some similar group. This mailing list is for the core development of Python 3.x. Thanks! -- Cheers, Benjamin Peterson From greg.ewing at canterbury.ac.nz Thu Apr 24 02:05:56 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 24 Apr 2008 12:05:56 +1200 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: References: Message-ID: <480FCEE4.7050307@canterbury.ac.nz> Nicholas T wrote: > It's obvious how to use LC's to replace map and filter, but what > about reduce? LCs were never intended to be a replacement for reduce. If you like reduce, why not continue to use it? I don't think it's going away, just being moved into a different module. -- Greg From guido at python.org Thu Apr 24 03:47:20 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 23 Apr 2008 18:47:20 -0700 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: References: Message-ID: On Wed, Apr 23, 2008 at 1:56 PM, Nicholas T wrote: > It's obvious how to use LC's to replace map and filter, but what about > reduce? It is one of my favorite functions. 
> > >>> time=1901248 > >>> reduce(lambda a, b: a[:-1] + [a[-1]%b, math.floor(a[-1]/b)], [[time], > 60, 60, 24]) > [28, 7.0, 0.0, 22.0] # secs, mins, hrs, days I recommend learning how to use a good old for-loop. That example is as cryptic as can be. It's also inefficient due to calling a function for each iteration. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at gmail.com Thu Apr 24 06:08:11 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Wed, 23 Apr 2008 21:08:11 -0700 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: References: Message-ID: On Wed, Apr 23, 2008 at 6:47 PM, Guido van Rossum wrote: > On Wed, Apr 23, 2008 at 1:56 PM, Nicholas T wrote: > > > It's obvious how to use LC's to replace map and filter, but what about > > reduce? It is one of my favorite functions. > > > > >>> time=1901248 > > >>> reduce(lambda a, b: a[:-1] + [a[-1]%b, math.floor(a[-1]/b)], [[time], > > 60, 60, 24]) > > [28, 7.0, 0.0, 22.0] # secs, mins, hrs, days > > I recommend learning how to use a good old for-loop. That example is > as cryptic as can be. It's also inefficient due to calling a function > for each iteration. I normally frown on "me too" posts, but this time I won't refrain from a loud "hear, hear!". "Clever" code is NOT a culturally positive trait in the Python community (differently from most language communities... and this is in fact one reason I love Python). Alex From humberto at digi.com.br Thu Apr 24 05:45:45 2008 From: humberto at digi.com.br (Humberto Diogenes) Date: Thu, 24 Apr 2008 00:45:45 -0300 Subject: [Python-3000] help() broken? Message-ID: <7415E467-6B88-4ECD-BC49-4506B8F27373@digi.com.br> Hi, It seems that help() doesn't work on instances in py3k. Is this what this ticket is about? http://bugs.python.org/issue1883 Python 3.0a4+ (py3k:62469M, Apr 23 2008, 20:46:05) [GCC 4.0.1 (Apple Inc. build 5465)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
>>> class C: ... """Bla""" ... >>> help(C) Help on class C in module __main__: class C(builtins.object) | Bla | | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) >>> c = C() >>> help(c) Help on C in module __main__: <__main__.C object at 0x4a8ab8> help(instance) should give the same answer as help(Class), right? -- Humberto Di?genes http://humberto.digi.com.br From ntung at ntung.com Thu Apr 24 07:06:47 2008 From: ntung at ntung.com (Nicholas T) Date: Wed, 23 Apr 2008 22:06:47 -0700 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: References: Message-ID: On Wed, Apr 23, 2008 at 9:08 PM, Alex Martelli wrote: > On Wed, Apr 23, 2008 at 6:47 PM, Guido van Rossum > wrote: > > On Wed, Apr 23, 2008 at 1:56 PM, Nicholas T wrote: > > > > > It's obvious how to use LC's to replace map and filter, but what > about > > > reduce? It is one of my favorite functions. > > > > > > >>> time=1901248 > > > >>> reduce(lambda a, b: a[:-1] + [a[-1]%b, math.floor(a[-1]/b)], > [[time], > > > 60, 60, 24]) > > > [28, 7.0, 0.0, 22.0] # secs, mins, hrs, days > > > > I recommend learning how to use a good old for-loop. That example is > > as cryptic as can be. It's also inefficient due to calling a function > > for each iteration. > > I normally frown on "me too" posts, but this time I won't refrain from > a loud "hear, hear!". "Clever" code is NOT a culturally positive trait > in the Python community (differently from most language communities... > and this is in fact one reason I love Python). > > Alex > It wasn't only posted to be cryptic, it's one thing that's difficult to write with a for loop without a lot of verbosity (at least I couldn't figure out how to do it...). Nicholas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rhamph at gmail.com Thu Apr 24 07:24:41 2008 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 23 Apr 2008 23:24:41 -0600 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: References: Message-ID: On Wed, Apr 23, 2008 at 11:06 PM, Nicholas T wrote: > On Wed, Apr 23, 2008 at 9:08 PM, Alex Martelli wrote: > > > > > On Wed, Apr 23, 2008 at 6:47 PM, Guido van Rossum > wrote: > > > On Wed, Apr 23, 2008 at 1:56 PM, Nicholas T wrote: > > > > > > > It's obvious how to use LC's to replace map and filter, but what > about > > > > reduce? It is one of my favorite functions. > > > > > > > > >>> time=1901248 > > > > >>> reduce(lambda a, b: a[:-1] + [a[-1]%b, math.floor(a[-1]/b)], > [[time], > > > > 60, 60, 24]) > > > > [28, 7.0, 0.0, 22.0] # secs, mins, hrs, days > > > > > > I recommend learning how to use a good old for-loop. That example is > > > as cryptic as can be. It's also inefficient due to calling a function > > > for each iteration. > > > > I normally frown on "me too" posts, but this time I won't refrain from > > a loud "hear, hear!". "Clever" code is NOT a culturally positive trait > > in the Python community (differently from most language communities... > > and this is in fact one reason I love Python). > > > > Alex > > > > It wasn't only posted to be cryptic, it's one thing that's difficult to > write with a for loop without a lot of verbosity (at least I couldn't figure > out how to do it...). >>> time = 1901248 >>> seconds = time % 60 >>> minutes = time // 60 % 60 >>> hours = time // 60 // 60 % 24 >>> days = time // 60 // 60 // 24 >>> seconds, minutes, hours, days (28, 7, 0, 22) Doesn't even need a loop. Just don't try to be clever. If you think it's too verbose then put it in its own function (even if you only call it in one place!) The function itself will be quite readable and so will the caller (so long as you pick a good name.) 
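As a sketch of the "put it in its own function" advice above: the function name split_duration is hypothetical, chosen only for illustration, and the body uses the same % and // arithmetic as the session.

```python
def split_duration(total_seconds):
    """Split a count of seconds into (seconds, minutes, hours, days)."""
    seconds = total_seconds % 60
    minutes = total_seconds // 60 % 60
    hours = total_seconds // 60 // 60 % 24
    days = total_seconds // 60 // 60 // 24
    return seconds, minutes, hours, days


# Same input as in the thread.
print(split_duration(1901248))
```

All four results stay integers because only % and // are used.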
--
Adam Olsen, aka Rhamphoryncus

From aleaxit at gmail.com Thu Apr 24 07:35:06 2008
From: aleaxit at gmail.com (Alex Martelli)
Date: Wed, 23 Apr 2008 22:35:06 -0700
Subject: [Python-3000] what do I use in place of reduce?
In-Reply-To:
References:
Message-ID:

On Wed, Apr 23, 2008 at 10:06 PM, Nicholas T wrote:
...
> > > > >>> time=1901248
> > > > >>> reduce(lambda a, b: a[:-1] + [a[-1]%b, math.floor(a[-1]/b)],
> [[time],
> > > > 60, 60, 24])
> > > > [28, 7.0, 0.0, 22.0] # secs, mins, hrs, days
...
> It wasn't only posted to be cryptic, it's one thing that's difficult to
> write with a for loop without a lot of verbosity (at least I couldn't figure
> out how to do it...).

def dhms1(t):
    r = []
    for d in 60, 60, 24:
        r.append(t % d)
        t //= d
    r.append(t)
    return r

>>> dhms1(1901248)
[28, 7, 0, 22]

vs

import math
def dhms2(t):
    return reduce(lambda a, b: a[:-1] + [a[-1]%b, math.floor(a[-1]/b)],
                  [[t], 60, 60, 24])

>>> dhms2(1901248)
[28, 7.0, 0.0, 22.0]

Not sure why you consider it advantageous to get a mixed list with the
first item guaranteed to be an int vs the other three guaranteed to be
floats with a 0 fractional part; the simple approach gives a list of
four ints, which seems a much saner idea to me. Apart from that,
putting each snippet into a string (resp. x for dhms1, y for dhms2) I
see:

>>> len(x)
119
>>> len(y)
117

Is 119 vs 117 characters "a LOT of verbosity"...?! OK, then what about...:

>>> import re
>>> len(re.sub(r'\s','',x))
67
>>> len(re.sub(r'\s','',y))
97

The conceptually simple approach is WAY SHORTER if you only count
non-space characters (is whitespace "verbosity"...?!).

If you're really SO keen to minimize the count of characters, I
suspect Perl is much closer to your tastes than Python is (or, God
willing, will EVER be).

Alex

From martin at v.loewis.de Thu Apr 24 07:40:24 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 24 Apr 2008 07:40:24 +0200
Subject: [Python-3000] what do I use in place of reduce?
In-Reply-To:
References:
Message-ID: <48101D48.1010102@v.loewis.de>

> > > >>> time=1901248
> > > >>> reduce(lambda a, b: a[:-1] + [a[-1]%b,
> math.floor(a[-1]/b)], [[time],
> > > 60, 60, 24])
> > > [28, 7.0, 0.0, 22.0] # secs, mins, hrs, days
> >
> > I recommend learning how to use a good old for-loop. That example is
> > as cryptic as can be. It's also inefficient due to calling a function
> > for each iteration.
>
> I normally frown on "me too" posts, but this time I won't refrain from
> a loud "hear, hear!". "Clever" code is NOT a culturally positive trait
> in the Python community (differently from most language communities...
> and this is in fact one reason I love Python).
>
> Alex
>
>
> It wasn't only posted to be cryptic, it's one thing that's difficult to
> write with a for loop without a lot of verbosity (at least I couldn't
> figure out how to do it...).

In this case, I wouldn't use a loop at all:

py> time=1901248
py> minutes,seconds = divmod(time, 60)
py> hours,minutes = divmod(minutes, 60)
py> days,hours = divmod(hours,24)
py> seconds,minutes,hours,days
(28, 7, 0, 22)

If you absolutely want to use a loop (because you have a variable
list of divisors), write

py> time=1901248
py> div = time
py> res = []
py> for divisor in (60,60,24):
...     div, mod = divmod(div, divisor)
...     res.append(mod)
... else:
...     res.append(div)
...
py> res
[28, 7, 0, 22]

If you think this shows a lot of verbosity, please reconsider. It's
not verbose.

If you absolutely want it in a single expression, I'd write

py> time % 60, time//60%60, time//3600%24, time//(3600*24)
(28, 7, 0, 22)

Regards,
Martin

P.S. I'm not sure why you had been using floating point operations.

From qrczak at knm.org.pl Thu Apr 24 10:27:09 2008
From: qrczak at knm.org.pl (Marcin ‘Qrczak’ Kowalczyk)
Date: Thu, 24 Apr 2008 10:27:09 +0200
Subject: [Python-3000] what do I use in place of reduce?
In-Reply-To: <48101D48.1010102@v.loewis.de>
References: <48101D48.1010102@v.loewis.de>
Message-ID: <1209025629.6891.8.camel@qrnik>

On Thu, 24-04-2008 at 07:40 +0200, "Martin v. Löwis" wrote:

> In this case, I wouldn't use a loop at all:
>
> py> time=1901248
> py> minutes,seconds = divmod(time, 60)
> py> hours,minutes = divmod(minutes, 60)
> py> days,hours = divmod(hours,24)
> py> seconds,minutes,hours,days
> (28, 7, 0, 22)

divmod could be extended to more arguments:

days, hours, minutes, seconds = divmod(time, 24, 60, 60)

def divmod(x, d, *ds):
    if ds:
        q, *rs = divmod(x, *ds)
        q1, r1 = divmod2(q, d)
        return q1, r1, *rs
    else:
        return divmod2(x, d)

--
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

From amauryfa at gmail.com Thu Apr 24 10:57:46 2008
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 24 Apr 2008 10:57:46 +0200
Subject: [Python-3000] help() broken?
In-Reply-To: <7415E467-6B88-4ECD-BC49-4506B8F27373@digi.com.br>
References: <7415E467-6B88-4ECD-BC49-4506B8F27373@digi.com.br>
Message-ID:

Humberto Diogenes wrote:
> Hi,
>
> It seems that help() doesn't work on instances in py3k.
>
> Is this what this ticket is about?
> http://bugs.python.org/issue1883
>
>
> Python 3.0a4+ (py3k:62469M, Apr 23 2008, 20:46:05)
> [GCC 4.0.1 (Apple Inc. build 5465)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> class C:
> ...     """Bla"""
> ...
> >>> help(C)
> Help on class C in module __main__:
>
> class C(builtins.object)
>  |  Bla
>  |
>  |  Data descriptors defined here:
>  |
>  |  __dict__
>  |      dictionary for instance variables (if defined)
>  |
>  |  __weakref__
>  |      list of weak references to the object (if defined)
>
> >>> c = C()
> >>> help(c)
> Help on C in module __main__:
>
> <__main__.C object at 0x4a8ab8>
>
>
> help(instance) should give the same answer as help(Class), right?

Yes.
A bug was introduced in the pydoc.render_doc() function, during the removal of classic classes: The "not (inspect.ismodule...)" test should be in a "if" statement, not a "elif". If nobody does it before, I will take care of this tonight. -- Amaury Forgeot d'Arc From charles.merriam at gmail.com Thu Apr 24 11:27:17 2008 From: charles.merriam at gmail.com (Charles Merriam) Date: Thu, 24 Apr 2008 02:27:17 -0700 Subject: [Python-3000] Assert syntax change... Message-ID: Hello All, I expect it is far to late for this, and I still wanted to make the issue known. The assert statement is one of the few remaining Python statements where it (1) does not use parenthesis and (2) takes multiple arguments. This leads to the common, hard to detect, programming error: assert(rarelyHappens > 0 , "Hyperdrive component needs replacement!") which is equivalent to: assert True because the programmer forgot to remove parenthesis for assert statements. It would be great to change assert from: assert_stmt ::= "assert" expression ["," expression] To: assert_stmt ::= "assert" expression ["as" expression] Which would make only one way to make an assert statement. + One less exception to remember + Fewer "fail silently" errors + One less place where commas outside parenthesis are used. (try/except and print just got fixed). - Way too late: I wish I had noticed months ago. - Need to patch 2to3, documentation, etc. There may be a problem in tuples evaluating as expressions with this naive fix; it's unclear to my little brain if a bare expression_list can be reached from expression. That is, would assert "tuple","implied" as "Message" or assert neverHappen > 0, "Programmer used 2.5 syntax here" be valid statements? In summary, I know that the current assert syntax is wrong. I know what it should look like. I do not know if the implementation details of parenthesized tuples make this difficult. I should have noticed it a year ago. 
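A short sketch of the pitfall described above, assuming the interpreter is not run with -O (which strips assert statements). The names pair, check and message are illustrative only. A parenthesized condition/message pair is a tuple, and any non-empty tuple is truthy, so asserting on it can never fail.

```python
# The parenthesized form builds a two-element tuple; its truth value
# ignores the first element entirely.
pair = (False, "Hyperdrive component needs replacement!")
print(bool(pair))  # a non-empty tuple is always truthy


def check(rarely_happens):
    # Intended form: no parentheses around the condition/message pair.
    assert rarely_happens > 0, "Hyperdrive component needs replacement!"


message = None
try:
    check(0)
except AssertionError as exc:
    message = str(exc)

print(message)
```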
Charles Merriam From facundobatista at gmail.com Thu Apr 24 16:40:33 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 24 Apr 2008 11:40:33 -0300 Subject: [Python-3000] Using range() Message-ID: Hi all! Used to be able to do this... >>> l = (x for x in range(10)) >>> l.__next__() 0 >>> l.__next__() 1 ...I tried the following: >>> r = range(5) >>> r range(0, 5) >>> r.__next__ Traceback (most recent call last): ... AttributeError: 'range' object has no attribute '__next__' Which is the normal way to "consume" a range object, item by item? Furthermore, I took a look inside, and found a __getitem__, so I tried >>> r[4] 4 which apparently works, but see: >>> r = range(10000000000000000000) >>> r[0] Traceback (most recent call last): File "", line 1, in OverflowError: Python int too large to convert to C ssize_t >>> This is a bug, right? Thank you very much!! Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From martin at v.loewis.de Thu Apr 24 16:58:40 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 24 Apr 2008 16:58:40 +0200 Subject: [Python-3000] Using range() In-Reply-To: References: Message-ID: <4810A020.6030008@v.loewis.de> > Which is the normal way to "consume" a range object, item by item? The normal way is a for loop. The advanced way of invoking some method on the object (i.e. emulating the for loop) is to first create an iterator from the range object. You can't consume the range itself: it will always contain the same numbers - just like you can't consume a list. >>>> r = range(10000000000000000000) >>>> r[0] > Traceback (most recent call last): > File "", line 1, in > OverflowError: Python int too large to convert to C ssize_t > > This is a bug, right? I'd call it an implementation limitation. 
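A minimal sketch of what "emulating the for loop" amounts to, assuming nothing beyond the built-in iter() and next(); the helper name consume is hypothetical and only serves the illustration.

```python
def consume(iterable):
    """Emulate a for loop by hand (hypothetical helper for illustration)."""
    items = []
    iterator = iter(iterable)      # ask the object for a fresh iterator
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:      # the iterator is exhausted
            break
    return items


r = range(5)
first = consume(r)
second = consume(r)                # the range itself is not used up
has_next = hasattr(r, "__next__")  # the range object is not an iterator
```

Both passes produce the same numbers because each call builds a new iterator from the same unchanged range object.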
Regards, Martin From facundobatista at gmail.com Thu Apr 24 17:08:13 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 24 Apr 2008 12:08:13 -0300 Subject: [Python-3000] Using range() In-Reply-To: <4810A020.6030008@v.loewis.de> References: <4810A020.6030008@v.loewis.de> Message-ID: 2008/4/24, "Martin v. L?wis" : > The advanced way of invoking some method on the object (i.e. emulating > the for loop) is to first create an iterator from the range object. > You can't consume the range itself: it will always contain the same > numbers - just like you can't consume a list. Great! Thanks! >>> r = range(10000000000000000000) >>> it = iter(r) >>> it.__next__() 0 >>> it.__next__() 1 > >>>> r = range(10000000000000000000) > >>>> r[0] > > Traceback (most recent call last): > > File "", line 1, in > > OverflowError: Python int too large to convert to C ssize_t > > > > This is a bug, right? > > I'd call it an implementation limitation. This is because I'm in a 32 bit machine? >>> n = 10000000000000000000 >>> 2**32 > n False >>> 2**64 > n True Should it work in a 64 bit hardware? Thanks again! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From dickinsm at gmail.com Thu Apr 24 17:15:10 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 24 Apr 2008 11:15:10 -0400 Subject: [Python-3000] Using range() In-Reply-To: <4810A020.6030008@v.loewis.de> References: <4810A020.6030008@v.loewis.de> Message-ID: <5c6f2a5d0804240815h4930316bma8931d1da64f2d99@mail.gmail.com> On Thu, Apr 24, 2008 at 10:58 AM, "Martin v. L?wis" wrote: > >>>> r = range(10000000000000000000) > >>>> r[0] > > Traceback (most recent call last): > > File "", line 1, in > > OverflowError: Python int too large to convert to C ssize_t > > > > This is a bug, right? > > I'd call it an implementation limitation. 
> It is a bit surprising, especially given that the following works: >>> r = range(10**19-100, 10**19) >>> r[0] 9999999999999999900 Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Thu Apr 24 17:32:44 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 24 Apr 2008 17:32:44 +0200 Subject: [Python-3000] Using range() In-Reply-To: References: <4810A020.6030008@v.loewis.de> Message-ID: <4810A81C.4030501@v.loewis.de> >> > This is a bug, right? >> >> I'd call it an implementation limitation. > > This is because I'm in a 32 bit machine? Right. The assumption is that you typically use the range elements to index into some collections, and you can't have collections with more than 2**32 elements (actually, address space is exhausted at 2**29 elements already, except for str and unicode). It would be possible to make it support larger ranges, but then the common case would get slower, and the code would be more convoluted. >>>> n = 10000000000000000000 >>>> 2**32 > n > False >>>> 2**64 > n > True > > Should it work in a 64 bit hardware? No: py> 2**63 > n False The largest possible value value of Py_ssize_t is 2**63-1. See sys.maxsize. Regards, Martin From ncoghlan at gmail.com Thu Apr 24 17:37:26 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Apr 2008 01:37:26 +1000 Subject: [Python-3000] Using range() In-Reply-To: <4810A020.6030008@v.loewis.de> References: <4810A020.6030008@v.loewis.de> Message-ID: <4810A936.1050602@gmail.com> Martin v. L?wis wrote: >> Which is the normal way to "consume" a range object, item by item? > > The normal way is a for loop. > > The advanced way of invoking some method on the object (i.e. emulating > the for loop) is to first create an iterator from the range object. > You can't consume the range itself: it will always contain the same > numbers - just like you can't consume a list. 
> >>>>> r = range(10000000000000000000) >>>>> r[0] >> Traceback (most recent call last): >> File "", line 1, in >> OverflowError: Python int too large to convert to C ssize_t >> >> This is a bug, right? > > I'd call it an implementation limitation. It looks a bit suspicious to me, and definitely worth raising a tracker issue for. While I could understand a 'must fit in ssize_t' limitation on the index passed to the range object (or conceivably even on the value returned, although that would be a little odd), that isn't happening in the example - the index being passed in is zero, and the value that should be getting returned is zero. Where is that OverflowError coming from? Is there a missing PyErr_Clear() call in the range code somewhere? Should the range code be invoking a different PyNumber_ call when it does the conversion? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From dickinsm at gmail.com Thu Apr 24 17:42:56 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 24 Apr 2008 11:42:56 -0400 Subject: [Python-3000] Using range() In-Reply-To: <4810A936.1050602@gmail.com> References: <4810A020.6030008@v.loewis.de> <4810A936.1050602@gmail.com> Message-ID: <5c6f2a5d0804240842h2f1d8585g3dc14b03b8a008fb@mail.gmail.com> On Thu, Apr 24, 2008 at 11:37 AM, Nick Coghlan wrote: > While I could understand a 'must fit in ssize_t' limitation on the index > passed to the range object (or conceivably even on the value returned, > although that would be a little odd), that isn't happening in the example - > the index being passed in is zero, and the value that should be getting > returned is zero. Where is that OverflowError coming from? Is there a > missing PyErr_Clear() call in the range code somewhere? Should the range > code be invoking a different PyNumber_ call when it does the conversion? 
> Looks like it's coming from range_length, which gets called from range_item, which implements the sq_item slot. range_item uses the length to check the validity of the given index. I don't think it would be difficult to fix this so that indices up to the max value of a Py_ssize_t are valid. I agree it's worth opening a tracker issue for. Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Thu Apr 24 17:47:57 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 24 Apr 2008 17:47:57 +0200 Subject: [Python-3000] Using range() In-Reply-To: <5c6f2a5d0804240815h4930316bma8931d1da64f2d99@mail.gmail.com> References: <4810A020.6030008@v.loewis.de> <5c6f2a5d0804240815h4930316bma8931d1da64f2d99@mail.gmail.com> Message-ID: <4810ABAD.4050304@v.loewis.de> > It is a bit surprising, especially given > that the following works: > >>>> r = range(10**19-100, 10**19) >>>> r[0] > 9999999999999999900 The original example fails because range_length() overflows in range_item. Given that range_item is "almost" there, it's probably not as difficult to fix this as I first thought: range_item should use range_length_obj, and compare to rem. Contributions are welcome. Regards, Martin From ncoghlan at gmail.com Thu Apr 24 17:50:22 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Apr 2008 01:50:22 +1000 Subject: [Python-3000] Using range() In-Reply-To: <4810A81C.4030501@v.loewis.de> References: <4810A020.6030008@v.loewis.de> <4810A81C.4030501@v.loewis.de> Message-ID: <4810AC3E.40203@gmail.com> Martin v. L?wis wrote: >>> > This is a bug, right? >>> >>> I'd call it an implementation limitation. >> This is because I'm in a 32 bit machine? > > Right. 
The assumption is that you typically use > the range elements to index into some collections, > and you can't have collections with more than 2**32 > elements (actually, address space is exhausted at > 2**29 elements already, except for str and unicode). > > It would be possible to make it support larger > ranges, but then the common case would get slower, > and the code would be more convoluted. There's definitely some bugs in this area of the range object code though: >>> x = range(2**33, 2) >>> len(x) 0 >>> x[0] Traceback (most recent call last): File "", line 1, in IndexError: range object index out of range I also believe that the OverflowError from doing len(self) while attempting to index into the range should be intercepted and converted to something more meaningful for the actual operation requested by the programmer (e.g. "ValueError: Cannot index range objects with sys.maxsize or more elements") -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Thu Apr 24 17:52:26 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 24 Apr 2008 17:52:26 +0200 Subject: [Python-3000] Using range() In-Reply-To: <4810A936.1050602@gmail.com> References: <4810A020.6030008@v.loewis.de> <4810A936.1050602@gmail.com> Message-ID: <4810ACBA.9070301@v.loewis.de> > Where is that OverflowError coming from? It computes the length of the range, to find out whether the index is out of range. Computing the length then raises the exception, as it uses range_length, not range_length_obj. Regards, Martin From martin at v.loewis.de Thu Apr 24 20:23:35 2008 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 24 Apr 2008 20:23:35 +0200 Subject: [Python-3000] what do I use in place of reduce? 
In-Reply-To: References: <48101D48.1010102@v.loewis.de> Message-ID: <4810D027.30906@v.loewis.de> > On Wed, Apr 23, 2008 at 10:40 PM, "Martin v. Löwis" > wrote: > > py> time % 60, time//60%60, time//3600%24, time//(3600*24) > (28, 7, 0, 22) > > the 3600 and 3600*24 was what I was trying to avoid. This is getting off-topic, so you don't need to answer; I still ask: Why??? It's still *shorter* than your reduce version, and much much much more legible. Readability counts. > I like the divmod > solution. You can also use it in the reduce :) > reduce(lambda a, b: divmod(a[0], b) + a[1:], [(t,), 60, 60, 24])[::-1] Even after knowing what this does, I still cannot easily understand how it does that. I think having reduce produce a growing value, and passing it an inhomogeneous list, is just deep abuse. In any case, writing multiple lines is good, writing a single line only is bad. Regards, Martin From guido at python.org Thu Apr 24 20:37:58 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 24 Apr 2008 11:37:58 -0700 Subject: [Python-3000] Assert syntax change... In-Reply-To: References: Message-ID: I sympathize with the sentiment, but 'as' is the wrong keyword; there is no assignment to the thing on its right-hand side like there is in all other places where it is used in the syntax (import-as, with-as, except-as). Also, have you actually tried this in 3.0? It prints a nice SyntaxWarning message, which IMO is enough. --Guido On Thu, Apr 24, 2008 at 2:27 AM, Charles Merriam wrote: > Hello All, > > I expect it is far too late for this, and I still wanted to make the issue known. > > The assert statement is one of the few remaining Python statements > where it (1) does not use parentheses and (2) takes multiple > arguments.
This leads to the common, hard-to-detect programming > error: > assert(rarelyHappens > 0 , "Hyperdrive component needs replacement!") > which is equivalent to: > assert True > because the programmer forgot to remove parentheses for assert statements. > > It would be great to change assert from: > assert_stmt ::= "assert" expression ["," expression] > To: > assert_stmt ::= "assert" expression ["as" expression] > > Which would make only one way to make an assert statement. > > + One less exception to remember > + Fewer "fail silently" errors > + One less place where commas outside parentheses are used. > (try/except and print just got fixed). > - Way too late: I wish I had noticed months ago. > - Need to patch 2to3, documentation, etc. > > There may be a problem in tuples evaluating as expressions with this > naive fix; it's unclear to > my little brain if a bare expression_list can be reached from > expression. That is, would > assert "tuple","implied" as "Message" > or > assert neverHappen > 0, "Programmer used 2.5 syntax here" > be valid statements? > > In summary, I know that the current assert syntax is wrong. I know > what it should look like. > I do not know if the implementation details of parenthesized tuples > make this difficult. I > should have noticed it a year ago. > > Charles Merriam > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ntung at ntung.com Thu Apr 24 20:49:03 2008 From: ntung at ntung.com (Nicholas T) Date: Thu, 24 Apr 2008 11:49:03 -0700 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: <4810D027.30906@v.loewis.de> References: <48101D48.1010102@v.loewis.de> <4810D027.30906@v.loewis.de> Message-ID: On Thu, Apr 24, 2008 at 11:23 AM, "Martin v.
L?wis" wrote: > This is getting off-topic, so you don't need to answer; I still ask: > Why??? yes I know, apologies for not mailing the right list. I'll try to do so next time. Dividing the previous result seems more logical: minutes = seconds/60, hours=minutes/60, days=hours/24, not minutes=seconds/60, hours=seconds/3600, days=seconds/86400. Also, if you do something like adding years, it's simple--you just append to the list versus change the last thing to a modulo and add another unit. If you add years, changing days to sidereal days would also be easier with the list. Also, you don't have to check multiplication, etc. [in answer to your next question as well] The example was created to show the ability to express things with reduce, not necessarily to quickly calculate human readable time. > Even after knowing what this does, I still cannot easily understand > how it does that. I think having reduce produce a growing value, and > passing it an inhomogeneous list, is just deep abuse. I guess I am likely to agree: the function is not associative or commutative, and unable to be parallelized--one of the reasons reduce was created. > In any case, writing multiple lines is good, writing a single line only > is bad. I don't agree. If it's code you want to write once and never look at again, having it out of the way can be nice. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ntung at ntung.com Thu Apr 24 19:53:19 2008 From: ntung at ntung.com (Nicholas T) Date: Thu, 24 Apr 2008 10:53:19 -0700 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: <48101D48.1010102@v.loewis.de> References: <48101D48.1010102@v.loewis.de> Message-ID: On Wed, Apr 23, 2008 at 10:35 PM, Alex Martelli wrote: > Is 119 vs 117 characters "a LOT of verbosity"...?! OK, then what about...: the reduce is actually 69 *and only one line* if you don't need it in a function. 
You can't put the dhms2 in a function unless you want to leak a bunch of variables like 'd', 'r', and 't'... reduce(lambda a, b: a[:-1] + [a[-1]%b, a[-1]//b], [[t], 60, 60, 24]) On Wed, Apr 23, 2008 at 10:40 PM, "Martin v. L?wis" wrote: > py> time % 60, time//60%60, time//3600%24, time//(3600*24) > (28, 7, 0, 22) the 3600 and 3600*24 was what I was trying to avoid. I like the divmod solution. You can also use it in the reduce :) reduce(lambda a, b: divmod(a[0], b) + a[1:], [(t,), 60, 60, 24])[::-1] P.S. I'm not sure why you had been using floating point > operations. no, this wasn't necessary, I didn't know about "//". I could have used int as well... Nicholas -------------- next part -------------- An HTML attachment was scrubbed... URL: From dickinsm at gmail.com Thu Apr 24 22:01:13 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Thu, 24 Apr 2008 16:01:13 -0400 Subject: [Python-3000] Using range() In-Reply-To: <4810AC3E.40203@gmail.com> References: <4810A020.6030008@v.loewis.de> <4810A81C.4030501@v.loewis.de> <4810AC3E.40203@gmail.com> Message-ID: <5c6f2a5d0804241301m7dbc3bd8q534cdb6a27f3910a@mail.gmail.com> On Thu, Apr 24, 2008 at 11:50 AM, Nick Coghlan wrote: > There's definitely some bugs in this area of the range object code though: > > >>> x = range(2**33, 2) > >>> len(x) > 0 > >>> x[0] > Traceback (most recent call last): > File "", line 1, in > IndexError: range object index out of range > Hmm. I'm not seeing the bug here. What am I missing? It seems to me that there are two reasonable behaviours for range(a, b) when b is less than a: return an 'empty' range, as in the example above, or raise a ValueError; I can see arguments for both behaviours. 
But one good argument in favour of the current behaviour is that xrange(a, b) in Python 2.x currently returns an empty range when b < a: >>> xrange(3, -2) xrange(3, 3) > I also believe that the OverflowError from doing len(self) while attempting > to index into the range should be intercepted and converted to something > more meaningful for the actual operation requested by the programmer (e.g. > "ValueError: Cannot index range objects with sys.maxsize or more elements") > Agreed. Though if it's easy to fix things so that range(a, b)[n] always 'just works' for any integer a <= n < b, and if the fix doesn't have any significant performance impact, wouldn't that be even better? Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Apr 24 22:21:37 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 24 Apr 2008 13:21:37 -0700 Subject: [Python-3000] Using range() In-Reply-To: <5c6f2a5d0804241301m7dbc3bd8q534cdb6a27f3910a@mail.gmail.com> References: <4810A020.6030008@v.loewis.de> <4810A81C.4030501@v.loewis.de> <4810AC3E.40203@gmail.com> <5c6f2a5d0804241301m7dbc3bd8q534cdb6a27f3910a@mail.gmail.com> Message-ID: On Thu, Apr 24, 2008 at 1:01 PM, Mark Dickinson wrote: > It seems to me that there are two reasonable behaviours > for range(a, b) when b is less than a: return an 'empty' range, > as in the example above, or raise a ValueError; Don't even think about suggesting to change this. It matches the behavior of slices: >>> a = 'abc' >>> a[2:1] '' >>> -- --Guido van Rossum (home page: http://www.python.org/~guido/) From the.dead.shall.rise at gmail.com Thu Apr 24 22:51:37 2008 From: the.dead.shall.rise at gmail.com (Mikhail Glushenkov) Date: Thu, 24 Apr 2008 20:51:37 +0000 (UTC) Subject: [Python-3000] Assert syntax change... 
References: Message-ID: Hello, Guido van Rossum python.org> writes: > > I sympathize with the sentiment, but 'as' is the wrong keyword; Why not make ``assert`` a built-in function then? From guido at python.org Thu Apr 24 23:01:08 2008 From: guido at python.org (Guido van Rossum) Date: Thu, 24 Apr 2008 14:01:08 -0700 Subject: [Python-3000] Assert syntax change... In-Reply-To: References: Message-ID: On Thu, Apr 24, 2008 at 1:51 PM, Mikhail Glushenkov > Why not make ``assert`` a built-in function then? Because then it can't be disabled by the compiler in -O mode. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From charles.merriam at gmail.com Thu Apr 24 23:13:04 2008 From: charles.merriam at gmail.com (Charles Merriam) Date: Thu, 24 Apr 2008 14:13:04 -0700 Subject: [Python-3000] Assert syntax change... In-Reply-To: References: Message-ID: On Thu, Apr 24, 2008 at 2:01 PM, Guido van Rossum wrote: > On Thu, Apr 24, 2008 at 1:51 PM, Mikhail Glushenkov > > Why not make ``assert`` a built-in function then? > Because then it can't be disabled by the compiler in -O mode. A reasonable conclusion, but needs better reasoning. One could certainly do an: assert_stmt ::= "assert" (expression ["," expression]) and implement it, when there isn't a -O, as: __assert__(expression, message=None) # built-in This gives: + more language consistency for developer using assert(). + over-ride assertion failure to log it correctly. + easier to decide not to throw exception during debugging. - might have security concerns. So, better reasoning? or just ISO? From brett at python.org Fri Apr 25 00:58:18 2008 From: brett at python.org (Brett Cannon) Date: Thu, 24 Apr 2008 15:58:18 -0700 Subject: [Python-3000] what do I use in place of reduce? In-Reply-To: References: <48101D48.1010102@v.loewis.de> <4810D027.30906@v.loewis.de> Message-ID: [SNIP] > > > In any case, writing multiple lines is good, writing a single line only > > is bad. > I don't agree. 
If it's code you want to write once and never look at again, > having it out of the way can be nice. But this is Python; explicit is better than implicit. You write one-liners on the weekend as a challenge and a joke, not for any code you will ever actually use. -Brett From greg.ewing at canterbury.ac.nz Fri Apr 25 01:30:56 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 25 Apr 2008 11:30:56 +1200 Subject: [Python-3000] Assert syntax change... In-Reply-To: References: Message-ID: <48111830.9020609@canterbury.ac.nz> Charles Merriam wrote: > It would be great to change assert from: > assert_stmt ::= "assert" expression ["," expression] > To: > assert_stmt ::= "assert" expression ["as" expression] I don't think "as" is the right word to use here... maybe assert else > That is, would > assert "tuple","implied" as "Message" > or > assert neverHappen > 0, "Programmer used 2.5 syntax here" > be valid statements? I think they should be invalid. I can't see a use case for this -- the resulting condition would always be true. -- Greg From ncoghlan at gmail.com Fri Apr 25 06:04:35 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Apr 2008 14:04:35 +1000 Subject: [Python-3000] Using range() In-Reply-To: <5c6f2a5d0804241301m7dbc3bd8q534cdb6a27f3910a@mail.gmail.com> References: <4810A020.6030008@v.loewis.de> <4810A81C.4030501@v.loewis.de> <4810AC3E.40203@gmail.com> <5c6f2a5d0804241301m7dbc3bd8q534cdb6a27f3910a@mail.gmail.com> Message-ID: <48115853.4000307@gmail.com> Mark Dickinson wrote: > On Thu, Apr 24, 2008 at 11:50 AM, Nick Coghlan > wrote: > > There's definitely some bugs in this area of the range object code > though: > > >>> x = range(2**33, 2) > >>> len(x) > > 0 > >>> x[0] > Traceback (most recent call last): > File "", line 1, in > IndexError: range object index out of range > > > Hmm. I'm not seeing the bug here. What am I missing? Eh, brain explosion from typing too late at night. 
The experiment I actually *meant* to try was: >>> x = range(0, 2**33, 2) >>> len(x) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: Python int too large to convert to C ssize_t >>> x[0] Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: Python int too large to convert to C ssize_t The error message in the latter case is thoroughly confusing (although it is now clearer what is causing it). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Fri Apr 25 06:13:18 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Apr 2008 14:13:18 +1000 Subject: [Python-3000] Assert syntax change... In-Reply-To: References: Message-ID: <48115A5E.9030302@gmail.com> Charles Merriam wrote: > On Thu, Apr 24, 2008 at 2:01 PM, Guido van Rossum wrote: >> On Thu, Apr 24, 2008 at 1:51 PM, Mikhail Glushenkov >>> Why not make ``assert`` a built-in function then? >> Because then it can't be disabled by the compiler in -O mode. > > A reasonable conclusion, but needs better reasoning. One could > certainly do an: > assert_stmt ::= "assert" (expression ["," expression]) > and implement it, when there isn't a -O, as: > __assert__(expression, message=None) # built-in Hmm, having an __assert__ builtin might be nice regardless - easier to have assertions in test suites that are executed regardless of -O, instead of every different Python test suite having to include its own function to wrap 'raise AssertionError(message)'. Independently of that, changing assert to allow surrounding parentheses (similar to the name list in a from module import name-list style import statement) would also be convenient for longer expressions or error messages. Cheers, Nick.
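A rough sketch of the kind of per-test-suite wrapper mentioned above, assuming a hypothetical helper named check (it is not an existing builtin, and it is not the proposed __assert__). Because it is an ordinary function, it keeps running under -O, but its message argument is always evaluated, unlike the message expression of an assert statement.

```python
def check(condition, message=None):
    """Hypothetical stand-in for a test-suite assertion helper.

    Unlike the assert statement, this is a plain function call, so it is
    not stripped when Python runs with -O.
    """
    if not condition:
        raise AssertionError(message)


# A passing check returns None and has no other effect.
passed = check(1 + 1 == 2, "arithmetic is broken")

# A failing check raises AssertionError carrying the message.
try:
    check(len([]) > 0, "expected a non-empty list")
    failure = None
except AssertionError as exc:
    failure = str(exc)
```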
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From martin at v.loewis.de Fri Apr 25 20:00:50 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 25 Apr 2008 20:00:50 +0200 Subject: [Python-3000] Assert syntax change... In-Reply-To: <48115A5E.9030302@gmail.com> References: <48115A5E.9030302@gmail.com> Message-ID: <48121C52.2050907@v.loewis.de> > Independently of that, changing assert to allow surrounding parentheses > (similar to the name list in a from module import name-list style import > statement) would also be convenient for longer expressions or error > messages. But that's already supported... py> assert (1+1+1+1+1+1 ... +1+1+1+1+1+1+1+1+1 ... >20),("The sum of many" ... "integers should be larger" ... "than a single small integer") Traceback (most recent call last): File "", line 1, in ? AssertionError: The sum of manyintegers should be largerthan a single small integer Regards, Martin From martin at v.loewis.de Fri Apr 25 20:09:14 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 25 Apr 2008 20:09:14 +0200 Subject: [Python-3000] Assert syntax change... In-Reply-To: References: Message-ID: <48121E4A.4000708@v.loewis.de> > A reasonable conclusion, but needs better reasoning. One could > certainly do an: > assert_stmt ::= "assert" (expression ["," expression]) I don't understand that change. Adding parentheses in the EBNF merely adds grouping in the grammar; it doesn't actually change the syntax. Perhaps you meant assert_stmt ::= "assert" "(" expression ["," expression] ")" > and implement it, when there isn't a -O, as: > __assert__(expression, message=None) # built-in For the issue under discussion, this is unrelated. In any case, this would be another incompatible change. 
Python 2.4.5 (#2, Mar 12 2008, 00:15:51) [GCC 4.2.3 (Debian 4.2.3-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. py> assert 3>0, a py> Here, the "message" expression isn't evaluated unless the assertion fails. With your change, it would be evaluated. Regards, Martin From dickinsm at gmail.com Sat Apr 26 01:24:09 2008 From: dickinsm at gmail.com (Mark Dickinson) Date: Fri, 25 Apr 2008 19:24:09 -0400 Subject: [Python-3000] Using range() In-Reply-To: <48115853.4000307@gmail.com> References: <4810A020.6030008@v.loewis.de> <4810A81C.4030501@v.loewis.de> <4810AC3E.40203@gmail.com> <5c6f2a5d0804241301m7dbc3bd8q534cdb6a27f3910a@mail.gmail.com> <48115853.4000307@gmail.com> Message-ID: <5c6f2a5d0804251624t6dff376fuaeca4d56afbd510a@mail.gmail.com> On Fri, Apr 25, 2008 at 12:04 AM, Nick Coghlan wrote: > > Eh, brain explosion from typing too late at night. The experiment I > actually *meant* to try was: > > >>> x = range(0, 2**33, 2) > >>> len(x) > Traceback (most recent call last): > File "", line 1, in > OverflowError: Python int too large to convert to C ssize_t > >>> x[0] > Traceback (most recent call last): > File "", line 1, in > OverflowError: Python int too large to convert to C ssize_t > > > The error message in the latter case is thoroughly confusing (although it > is now clearer what is causing it). > Agreed. See also the discussion over at http://bugs.python.org/issue2690 Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From musiccomposition at gmail.com Sat Apr 26 05:25:52 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Fri, 25 Apr 2008 22:25:52 -0500 Subject: [Python-3000] range() issues Message-ID: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Recently, various discussions about builtin range have come up in the tracker that need to be brought to the attention of the general developer pool: First of all, should the length of range be completely constricted by Py_ssize_t? (issue 2690) Since indexing already is constrained by this, it would make sense to make the whole object live under that law. However, it appears Amaury has a patch to allow these huge ranges [1] Also, how should range values be normalized in the constructor (if at all) to make ranges over the same set of integers equivalent? (see 2603) For example, given the set of integers [0, 2, 4], which should happen: >>> range(0, 5, 2) range(0, 6, 2) >>> range(0, 6, 2) range(0, 6, 2) or >>> range(0, 5, 2) range(0, 5, 2) >>> range(0, 6, 2) range(0, 6, 2) [ I probably missed something, so feel free to add it. ] [1] http://bugs.python.org/msg65807 -- Cheers, Benjamin Peterson From facundobatista at gmail.com Sat Apr 26 13:50:55 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Sat, 26 Apr 2008 08:50:55 -0300 Subject: [Python-3000] range() issues In-Reply-To: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: 2008/4/26, Benjamin Peterson : > First of all, should the length of range be completely constricted by > Py_ssize_t? (issue 2690) Since indexing already is constrained by > this, it would make sense to make the whole object live under that What is range()? help(range) shows me that range "Returns an iterator that generates the numbers in the range on demand." Ah?! 
So, as ints are unbounded in Python, I could easily do: >>> r = range(1,1000000000000000000000) *If* range() provides me the indexing facility (a nice feature to have, but by no means core to this function), it should allow me to index it completely, or at least, to Py_ssize_t. IOW, r[0] should work, even if r[9999999999999999999] doesn't. That is, to me, the range semantics that we should aim for. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From alexander.belopolsky at gmail.com Sat Apr 26 15:22:28 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 26 Apr 2008 13:22:28 +0000 (UTC) Subject: [Python-3000] range() issues References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: Facundo Batista gmail.com> writes: > > 2008/4/26, Benjamin Peterson gmail.com>: .. > What is range()? > > help(range) shows me that range "Returns an iterator that generates > the numbers in the range on demand." > This is not correct in 3.x: range does not return an iterator. There is an iterator similar to range in itertools: count. I would not mind adding optional step and stop arguments to it. > Ah?! So, as ints are unbounded in Python, I could easily do: > > >>> r = range(1,1000000000000000000000) > The problem with supporting this is that len(r) will raise an OverflowError. It would be nice to get rid of the limitation on len(), but it will be hard and may not be possible to do efficiently. > *If* range() provides me the indexing facility (a nice feature to > have, but by no means core to this function), it should allow me to > index it completely, or at least, to Py_ssize_t. IOW, r[0] should > work, even if r[9999999999999999999] doesn't. > It will be very strange to allow objects for which for x in r is not the same as for i in range(len(r)): x = r[i]. Doing so will lead to hard-to-detect errors.
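The two points in this reply can be checked with a small sketch. It assumes a Python version in which itertools.count accepts a step argument (not the case when this thread was written), and it uses itertools.islice as a stand-in for a stop bound; the variable names are illustrative.

```python
import itertools

r = range(0, 6, 2)

# A range object is not its own iterator: iter() hands back a separate object.
is_own_iterator = iter(r) is r

# For a well-behaved sequence, iterating and indexing visit the same items.
by_iteration = [x for x in r]
by_index = [r[i] for i in range(len(r))]

# itertools.count is an unbounded iterator; islice supplies the "stop".
from_count = list(itertools.islice(itertools.count(0, 2), 3))
```

All three lists hold the same integers, which is the invariant the message argues should not be broken by allowing ranges whose length cannot be computed.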
From ncoghlan at gmail.com Sat Apr 26 16:02:41 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Apr 2008 00:02:41 +1000 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: <48133601.50308@gmail.com> Alexander Belopolsky wrote: > Facundo Batista gmail.com> writes: >> Ah?! So, as ints are unbound in Python, I could easily do: >> >>>>> r = range(1,1000000000000000000000) > > The problem with supporting this is that len(r) will raise overflow error. > It would be nice to get rid of the limitation on len(), but it will be hard > and may not be possible to do efficiently. My personal preference is that we stay within the bounds of what was possible with the 2.x range() that returned a list instead of a customised object: start, stop and step are unbounded, but the overall length of the resulting sequence cannot exceed sys.maxsize. All that needs to be done to make this consistent is to move the length calculation into the range object's constructor (and Alexander has already provided a patch to do this in issue 2690) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From facundobatista at gmail.com Sat Apr 26 20:49:19 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Sat, 26 Apr 2008 15:49:19 -0300 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: 2008/4/26 Alexander Belopolsky : > > What is range()? > > > > help(range) shows me that range "Returns an iterator that generates > > the numbers in the range on demand." > > This is not correct in 3.x: range does not return an iterator. There is an > iterator similar to range in itertools: count. I would not mind adding > optional step and stop arguments to it. 
I took that string doing help(range) in the py3k branch, r62509; is it a bug? Which should the range() definition be, in your words? > > Ah?! So, as ints are unbound in Python, I could easily do: > > > > >>> r = range(1,1000000000000000000000) > > The problem with supporting this is that len(r) will raise overflow error. > It would be nice to get rid of the limitation on len(), but it will be hard > and may not be possible to do efficiently. Maybe len() should be removed? Maybe indexing? I don't know: I don't know what range() is. I mean, I took the previous definition from the actual Py3k, but you say it's wrong. I think that we should first define the range() semantics, what is core to it and what would be a nice thing to have but is not mandatory, and then try to comply. At this moment I stopped writing this mail, and I went to code a Range() class to have the semantics that we're seeking here (it's attached), and I couldn't finish it 100% because of a len() behaviour that I'm including here, because it's related to what we're discussing here: >>> class C: ... def __len__(self): ... return 100000000000000000000000000000 ... >>> c = C() >>> len(c) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: Python int too large to convert to C ssize_t From an external point of view, and knowing that ints are unbound, why should I have an error here? Thanks! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ -------------- next part -------------- A non-text attachment was scrubbed...
Name: myrange.py Type: text/x-python Size: 1482 bytes Desc: not available URL: From musiccomposition at gmail.com Sat Apr 26 20:58:20 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Sat, 26 Apr 2008 13:58:20 -0500 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> On Sat, Apr 26, 2008 at 1:49 PM, Facundo Batista wrote: > Which should the range() definition be, in your words? "A set of integers from start to stop skipping step." [ ... ] > At this moment I stopped writing this mail, and I went to code a > Range() class to have the semantics that we're seeking here (it's > attached), and I couldn't finish it 100% because of a len() behaviour > that I'm including here, because it's related to what we're discussing > here: > > >>> class C: > ... def __len__(self): > ... return 100000000000000000000000000000 > ... > >>> c = C() > >>> len(c) > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > OverflowError: Python int too large to convert to C ssize_t > > From an external point of view, and knowing that ints are unbound, why > should I have an error here? lens are forced to be <= Py_ssize_t because that's the limit put on sequence sizes. -- Cheers, Benjamin Peterson From facundobatista at gmail.com Sat Apr 26 21:06:38 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Sat, 26 Apr 2008 16:06:38 -0300 Subject: [Python-3000] range() issues In-Reply-To: <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> Message-ID: 2008/4/26, Benjamin Peterson : > lens are forced to be <= Py_ssize_t because that's the limit put on > sequence sizes. But this should be a sequence issue... or not? Why am I limiting the general len()/__len__ infrastructure? Thanks!
-- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From musiccomposition at gmail.com Sat Apr 26 21:10:41 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Sat, 26 Apr 2008 14:10:41 -0500 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> Message-ID: <1afaf6160804261210p270c4c9drac6e6aa24df9563c@mail.gmail.com> On Sat, Apr 26, 2008 at 2:06 PM, Facundo Batista wrote: > 2008/4/26, Benjamin Peterson : > > > > lens are forced to be <= Py_ssize_t because that's the limit put on > > sequence sizes. > > But this should be a sequence issue... or not? Why am I limiting the > general len()/__len__ infrastructure? Well, I suppose we could add a length method or attribute, but that would be clunky and violates "there is only one way to do it." -- Cheers, Benjamin Peterson From g.brandl at gmx.net Sat Apr 26 22:34:54 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 26 Apr 2008 22:34:54 +0200 Subject: [Python-3000] range() issues In-Reply-To: <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> Message-ID: Benjamin Peterson wrote: > On Sat, Apr 26, 2008 at 1:49 PM, Facundo Batista > wrote: >> Which should the range() definition be, in your words? > > "A set of integers from start to stop skipping step." > > [ ... ] "Set" is definitely misleading -- it has no ordering. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
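The len()/__len__ limit discussed above can be reproduced with the class from Facundo's message. The following is a sketch that assumes a CPython build where the value returned does not fit in a Py_ssize_t (true of common 32- and 64-bit builds); the names direct and overflowed are illustrative. It shows that the restriction sits in the len() builtin, not in the method itself:

```python
class C:
    def __len__(self):
        return 100000000000000000000000000000


c = C()

# Calling the method directly returns the Python int unchanged.
direct = c.__len__()

# Going through len() converts the result to a C Py_ssize_t first,
# which is where the OverflowError comes from.
try:
    len(c)
    overflowed = False
except OverflowError:
    overflowed = True

print(direct, overflowed)
```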
From alexandre at peadrop.com Sat Apr 26 22:51:16 2008 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Sat, 26 Apr 2008 16:51:16 -0400 Subject: [Python-3000] Consistency of memoryview and bytes object Message-ID: Hi, Would it be a good idea to make memoryview indexing consistent with the behaviour of bytes object? >>> memoryview(b'hello')[0] bytearray(b'h') >>> b'hello'[0] 104 -- Alexandre From guido at python.org Sun Apr 27 00:13:15 2008 From: guido at python.org (Guido van Rossum) Date: Sat, 26 Apr 2008 15:13:15 -0700 Subject: [Python-3000] Consistency of memoryview and bytes object In-Reply-To: References: Message-ID: Hm, yes this seems reasonable. Travis, what do you think of this? On Sat, Apr 26, 2008 at 1:51 PM, Alexandre Vassalotti wrote: > Would it be a good idea to make memoryview indexing consistent with > the behaviour of bytes object? > > >>> memoryview(b'hello')[0] > bytearray(b'h') > >>> b'hello'[0] > 104 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alexandre at peadrop.com Sun Apr 27 00:21:38 2008 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Sat, 26 Apr 2008 18:21:38 -0400 Subject: [Python-3000] Hiding _abcoll from introspection (e.g. help() and cie.) Message-ID: Hi, Since _abcoll shouldn't be used directly, would changing its __name__ module attribute to 'collections' be justified? This would hide the module from appearing in the subclasses listing of help(). -- Alexandre From musiccomposition at gmail.com Sun Apr 27 00:34:36 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Sat, 26 Apr 2008 17:34:36 -0500 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> Message-ID: <1afaf6160804261534t39b8050eo9c1a66f0ab2285bc@mail.gmail.com> On Sat, Apr 26, 2008 at 3:34 PM, Georg Brandl wrote: > "Set" is definitely misleading -- it has no ordering. True. 
I was trying to convey the unrepeated part of the set definition. Is "an ordered set of integers" better? -- Cheers, Benjamin Peterson From guido at python.org Sun Apr 27 00:53:58 2008 From: guido at python.org (Guido van Rossum) Date: Sat, 26 Apr 2008 15:53:58 -0700 Subject: [Python-3000] Hiding _abcoll from introspection (e.g. help() and cie.) In-Reply-To: References: Message-ID: I'm not in favor of lying regarding the origin of objects; it makes it harder to find the source and can confuse other introspection tools. This is an inherent limitation of help(), and not one I'm inclined to lose sleep over. On Sat, Apr 26, 2008 at 3:21 PM, Alexandre Vassalotti wrote: > Since _abcoll shouldn't be used directly, would changing its __name__ > module attribute to 'collections' be justified? This would hide the > module from appearing in the subclasses listing of help(). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alexandre at peadrop.com Sun Apr 27 02:37:02 2008 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Sat, 26 Apr 2008 20:37:02 -0400 Subject: [Python-3000] Hiding _abcoll from introspection (e.g. help() and cie.) In-Reply-To: References: Message-ID: Although I am not totally convinced that it would make harder to find to source or that it could confuse other introspection tools, I don't feel that something worth arguing about. So, I guess that pretty much kill the idea. Thanks, -- Alexandre On Sat, Apr 26, 2008 at 6:53 PM, Guido van Rossum wrote: > I'm not in favor of lying regarding the origin of objects; it makes it > harder to find the source and can confuse other introspection tools. > This is an inherent limitation of help(), and not one I'm inclined to > lose sleep over. > > > > On Sat, Apr 26, 2008 at 3:21 PM, Alexandre Vassalotti > wrote: > > Since _abcoll shouldn't be used directly, would changing its __name__ > > module attribute to 'collections' be justified? 
This would hide the > > module from appearing in the subclasses listing of help(). > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > From oliphant.travis at ieee.org Sun Apr 27 02:49:28 2008 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat, 26 Apr 2008 19:49:28 -0500 Subject: [Python-3000] Consistency of memoryview and bytes object In-Reply-To: References: Message-ID: Guido van Rossum wrote: > Hm, yes this seems reasonable. Travis, what do you think of this? > > On Sat, Apr 26, 2008 at 1:51 PM, Alexandre Vassalotti > wrote: >> Would it be a good idea to make memoryview indexing consistent with >> the behaviour of bytes object? >> >> >>> memoryview(b'hello')[0] >> bytearray(b'h') >> >>> b'hello'[0] >> 104 I'm not sure that we should rush into this. There are reasons for the differences. The idea is that an "element" of a memory-view object be a bytes object (either a bytearray or a bytes object depending on mutability of the original memoryview object --- seems like it should be a bytes object in this case). Remember that an "element" of a memory-view object can have more than one byte depending on the format attribute. So, I'm not sure what is gained by special-casing the 1-byte item except possible confusion later. Perhaps it is useful to special-case this one, but then you lose useful mutability. My feel right now is to not do the special case at all and actually return a memory-view object even for element access (this is especially needed, I think for nested formats which arise in memory-mapping files which provides some very handy io-related functionality). Then, we should leave it to method call to extract a bytes object as desired. 
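As an illustration of the multi-byte element point, here is a sketch against a current CPython 3 release. It relies on memoryview.cast and on integer results from indexing, both of which are assumptions about the reader's interpreter and were not part of the py3k branch at the time of this thread. The names data, mv and words are illustrative:

```python
import struct

# Two native unsigned ints packed into one bytes object.
data = struct.pack("2I", 1, 2)

mv = memoryview(data)          # default format 'B': one byte per element
words = mv.cast("I")           # reinterpret as native unsigned ints

byte_len = len(mv)             # number of single-byte elements
word_len = len(words)          # number of multi-byte elements
item_size = words.itemsize     # bytes per element in the 'I' view

print(byte_len, word_len, item_size, words.tolist())
```

On those releases indexing a one-byte view such as memoryview(b'hello')[0] yields the integer 104, matching bytes indexing.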
-Travis From martin at v.loewis.de Sun Apr 27 02:55:01 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 27 Apr 2008 02:55:01 +0200 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> Message-ID: <4813CEE5.5070306@v.loewis.de> >> lens are forced to be <= Py_ssize_t because that's the limit put on >> sequence sizes. > > But this should be a sequence issue... or not? Why am I limiting the > general len()/__len__ infrastructure? Because a C type is used to represent it, not a Python object. Any C type (whichever you choose) will have a length restriction. More specifically, it's because of this definition from object.h: typedef Py_ssize_t (*lenfunc)(PyObject *); ... lenfunc sq_length; If you were asking whether it is good as it is: yes, practicality beats purity. Being pure here has no real value. Regards, Martin From greg.ewing at canterbury.ac.nz Sun Apr 27 05:15:44 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 27 Apr 2008 15:15:44 +1200 Subject: [Python-3000] Consistency of memoryview and bytes object In-Reply-To: References: Message-ID: <4813EFE0.3050802@canterbury.ac.nz> Travis Oliphant wrote: > My feel right now is to not do the special case at all and > actually return a memory-view object even for element access That could be very tedious in the case where the elements are actually bytes, though. Maybe there should be a separate bytesview() object to use instead of memoryview() when you know the elements are bytes?
-- Greg From ncoghlan at gmail.com Sun Apr 27 05:48:43 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 27 Apr 2008 13:48:43 +1000 Subject: [Python-3000] range() issues In-Reply-To: <1afaf6160804261534t39b8050eo9c1a66f0ab2285bc@mail.gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <1afaf6160804261158g62a16c8fq313a3cbb02613325@mail.gmail.com> <1afaf6160804261534t39b8050eo9c1a66f0ab2285bc@mail.gmail.com> Message-ID: <4813F79B.1040301@gmail.com> Benjamin Peterson wrote: > On Sat, Apr 26, 2008 at 3:34 PM, Georg Brandl wrote: >> "Set" is definitely misleading -- it has no ordering. > > True. I was trying to convey the unrepeated part of the set > definition. Is "an ordered set of integers" better? > > What's wrong with 'sequence'? You can index it, find out it's length, etc - sounds like a sequence to me. The only difference between it and the list returned in the 2.x series is that it should be far more memory efficient because it will just store the start/stop/step values instead of every value in the sequence. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From alexander.belopolsky at gmail.com Sun Apr 27 13:07:31 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 27 Apr 2008 07:07:31 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: On Sat, Apr 26, 2008 at 2:49 PM, Facundo Batista wrote: > Which should the range() definition be, in your words? In terms of ABCs, range(..) is a Sized Iterable in the current implementation. It is not a Sequence because it is not a Container and does not support slicing. The idea to support x in range(..) was discussed last year [1] and appears to have been accepted but not implemented. I understand that slicing support is in the works. 
[2] I believe it would make sense to turn range(..) into a Sequence. Here are my reasons: 1. It will be easy to explain what range(..) is: "a sequence of integers from start to stop, excluding stop, skipping step". 2. There will be fewer 2 to 3 incompatibilities. [1] http://mail.python.org/pipermail/python-3000/2007-July/009028.html [2] http://bugs.python.org/msg65807 From ncoghlan at gmail.com Sun Apr 27 16:29:47 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 28 Apr 2008 00:29:47 +1000 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: <48148DDB.2090509@gmail.com> Alexander Belopolsky wrote: > On Sat, Apr 26, 2008 at 2:49 PM, Facundo Batista > wrote: > >> Which should the range() definition be, in your words? > > In terms of ABCs, range(..) is a Sized Iterable in the current > implementation. It is not a Sequence because it is not a Container > and does not support slicing. The idea to support x in range(..) was > discussed last year [1] and appears to have been accepted but not > implemented. I understand that slicing support is in the works. [2] > > I believe it would make sense to turn range(..) into a Sequence. Here > are my reasons: > > 1. It will be easy to explain what range(..) is: "a sequence of > integers from start to stop, excluding stop, skipping step". > > 2. There will be fewer 2 to 3 incompatibilities. > > [1] http://mail.python.org/pipermail/python-3000/2007-July/009028.html > [2] http://bugs.python.org/msg65807 I like this as a goal - I'll make sure to find the time to help review any patches aimed at achieving it (starting with the one to cache the length of the range during object creation). Cheers, Nick. 
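The ABC classification discussed in this exchange can be inspected directly. The sketch below assumes a current CPython 3 release, where range gained containment tests and slicing after this thread; it says nothing about the py3k branch of the time. The names checks, sliced and has_four are illustrative:

```python
from collections.abc import Container, Iterable, Sequence, Sized

r = range(0, 10, 2)            # 0, 2, 4, 6, 8

# Which of the ABCs named in the thread does a range satisfy?
checks = {
    abc.__name__: isinstance(r, abc)
    for abc in (Sized, Iterable, Container, Sequence)
}

sliced = r[1:4]                # slicing yields another range object
has_four = 4 in r              # containment test

print(checks, list(sliced), has_four)
```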
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From alexander.belopolsky at gmail.com Sun Apr 27 17:01:46 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 27 Apr 2008 11:01:46 -0400 Subject: [Python-3000] range() issues In-Reply-To: <48148DDB.2090509@gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <48148DDB.2090509@gmail.com> Message-ID: On Sun, Apr 27, 2008 at 10:29 AM, Nick Coghlan wrote: > > Alexander Belopolsky wrote: .. > > I believe it would make sense to turn range(..) into a Sequence. .. > I like this as a goal - I'll make sure to find the time to help review any > patches aimed at achieving it (starting with the one to cache the length of > the range during object creation). Thanks, Nick. I have already implemented slicing and am going to implement __contains__ and post a patch with tests and documentation updates. Should we reuse http://bugs.python.org/issue2690 or open a new issue for this? From divinekid at gmail.com Sun Apr 27 18:24:48 2008 From: divinekid at gmail.com (Haoyu Bai) Date: Mon, 28 Apr 2008 00:24:48 +0800 Subject: [Python-3000] Binding builtin function to class Message-ID: <4814A8D0.9090900@gmail.com> Hello, I'm a GSoC student working on SWIG's Python 3 support. When doing experiment on Python 3's new features, the different behavior between binding 'function' and 'builtin_function_or_method' confused me. As we know, unbound method is removed in Python 3. To bind a function to a class, we can directly use this instead: MyClass.myfunc = func But in the case of builtin function, it can't work. 
The below code demonstrates this: class Test: pass def afunc(*args): print(*args) Test.prt = print Test.func = afunc t = Test() t.prt() #nothing t.func() #<__main__.Test object at 0xb79987ec> I know this is not a bug, but however it is an exception in the language, what Python trying to avoid. Since all C function in extension module is treated as builtin function or method, the problem maybe bigger than it looks like. In the SWIG's case, it originally uses new.instancemethod to generate unbound method from the C function in DLL module. The code snippet looks like this: class TestBase(object): """Proxy of C++ TestBase class""" #some unrelated code omitted pass #_test.TestBase_test is the C function in _test DLL module TestBase.test = new.instancemethod(_test.TestBase_test,None,TestBase) Is there a corresponding way to do it in Python 3? A workaround I found is: from types import MethodType class TestBase(object): """Proxy of C++ TestBase class""" def __init__(self, *args): #some initialization code ... self.test = MethodType(_test.TestBase_test, self) But this changed the original code structure so the migration would be more complicated. Is there any better way to get rid of it? Thank you a lot! Best regards, Haoyu Bai 4/27/2008 From tjreedy at udel.edu Sun Apr 27 22:24:53 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 27 Apr 2008 16:24:53 -0400 Subject: [Python-3000] Binding builtin function to class References: <4814A8D0.9090900@gmail.com> Message-ID: "Haoyu Bai" wrote in message news:4814A8D0.9090900 at gmail.com... | Hello, | | I'm a GSoC student working on SWIG's Python 3 support. When doing | experiment on Python 3's new features, the different behavior between | binding 'function' and 'builtin_function_or_method' confused me. | | As we know, unbound method is removed in Python 3. To bind a function to | a class, we can directly use this instead: | | MyClass.myfunc = func That was always possible. 
| But in the case of builtin function, it can't work. What is it that 'cannot work'? My guess is that you are talking about the fact that instances do not get bound as an argument to the first parameter of a builtin. This is also true in 2.5.2 (for instance): >>> class T2(object): l=len ... >>> t2=T2() >>> t2.l() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: len() takes exactly one argument (0 given) Your example with print did not throw an exception only because it allows no args. (And, it cannot work in 2.x where print is a statement keyword.) Or did you mean something else? If *extension* function show a difference, perhaps SWIG needs revision for 3.x. [Builtin callables are also different in respect to parameter naming and binding args by keyword. Perhaps to reduce confusion, they should not be named 'functions' in the manuals.] tjr From richard at tartarus.org Mon Apr 28 10:42:33 2008 From: richard at tartarus.org (Richard Boulton) Date: Mon, 28 Apr 2008 09:42:33 +0100 Subject: [Python-3000] Binding builtin function to class In-Reply-To: References: <4814A8D0.9090900@gmail.com> Message-ID: <48158DF9.8070708@tartarus.org> Terry Reedy wrote: > | But in the case of builtin function, it can't work. > > What is it that 'cannot work'? My guess is that you are talking about the > fact that instances do not get bound as an argument to the first parameter > of a builtin. Yes, this is what Haoyu was talking about - I suspect he meant "doesn't work" rather than "cannot work", and that's the reason it doesn't work (both in 2.x and 3.0). > If *extension* function show a difference, perhaps SWIG needs revision for > 3.x. SWIG does need an update for 3.0: this is precisely what Haoyu is working on! :) I don't think this particular aspect of extension functions has changed in 3.0, as you say, but the problem Haoyu is trying to solve is working out what to replace usage of new.instancemethod with, as described in the code snippets at the end of his email.
SWIG currently generates code for python 2.x which makes heavy use of new.instancemethod, and since "new" is deprecated in 3.0, we need to find a replacement. I'll ask a direct question: what is the recommended replacement for new.instancemethod? In particular, what would be the recommended replacement for the following code snippet? class TestBase(object): """Proxy of C++ TestBase class""" #some unrelated code omitted pass #_test.TestBase_test is the C function in _test DLL module TestBase.test = new.instancemethod(_test.TestBase_test,None,TestBase) A secondary question is whether new.instancemethod was ever the right way for SWIG to be working: the person who originally wrote the python backend for SWIG isn't around any more, as far as I know, so we don't have knowledge of the reason that the code was written this way. > [Builtin callables are also different in respect to parameter naming and > binding args by keyword. Perhaps to reduce confusion, they should not be > named 'functions' in the manuals.] That might be helpful for beginners, yes. -- Richard From divinekid at gmail.com Mon Apr 28 12:41:33 2008 From: divinekid at gmail.com (Haoyu Bai) Date: Mon, 28 Apr 2008 18:41:33 +0800 Subject: [Python-3000] Binding builtin function to class In-Reply-To: References: <4814A8D0.9090900@gmail.com> Message-ID: <4815A9DD.9030708@gmail.com> Terry Reedy wrote: > What is it that 'cannot work'? My guess is that you are talking about the > fact that instances do not get bound as an argument to the first parameter > of a builtin. Yes, this is what I means. Sorry if my words confused you. > > If *extension* function show a difference, perhaps SWIG needs revision for > 3.x. > C extension functions are treated as same as builtins, so they are different from functions written in Python code. Am I right? If I'm right, then how can we avoid the difference? 
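One way to paper over the difference from pure Python is a small descriptor that gives any callable a __get__. This is a sketch only: InstanceMethod is a hypothetical name chosen for illustration, not a stdlib class and not what SWIG does, and the builtin id() stands in for a C extension function such as _test.TestBase_test.

```python
import types


class InstanceMethod:
    """Hypothetical wrapper: lets a builtin bind to instances
    the way a function written in Python does."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self.func                 # looked up on the class: no binding
        return types.MethodType(self.func, obj)


class Test:
    pass


Test.raw = id                         # builtin: has no __get__, never binds
Test.wrapped = InstanceMethod(id)     # wrapped: instance becomes first argument

t = Test()

try:
    t.raw()                           # id() ends up called with no arguments
    raw_failed = False
except TypeError:
    raw_failed = True

print(raw_failed, t.wrapped() == id(t))
```

The class body stays as it was in the 2.x proxy code (one assignment per method after the class statement), at the cost of an extra Python-level call on each attribute lookup, so it is unlikely to match the speed of a C-coded equivalent.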
From divinekid at gmail.com Mon Apr 28 13:14:00 2008 From: divinekid at gmail.com (Haoyu Bai) Date: Mon, 28 Apr 2008 19:14:00 +0800 Subject: [Python-3000] Binding builtin function to class In-Reply-To: <48158DF9.8070708@tartarus.org> References: <4814A8D0.9090900@gmail.com> <48158DF9.8070708@tartarus.org> Message-ID: <4815B178.30901@gmail.com> Richard Boulton wrote: > Yes, this is what Haoyu was talking about - I suspect he meant "doesn't > work" rather than "cannot work", and that's the reason it doesn't work > (both in 2.x and 3.0). Thanks Richard for helping me to explain. > I'll ask a direct question: what is the recommended replacement for > new.instancemethod? In particular, what would be the recommended > replacement for the following code snippet? > > class TestBase(object): > """Proxy of C++ TestBase class""" > #some unrelated code omitted > pass > #_test.TestBase_test is the C function in _test DLL module > TestBase.test = new.instancemethod(_test.TestBase_test,None,TestBase) > > > A secondary question is whether new.instancemethod was ever the right > way for SWIG to be working: the person who originally wrote the python > backend for SWIG isn't around any more, as far as I know, so we don't > have knowledge of the reason that the code was written this way. > Yes, these are the very problems I encountered. I think the using of "new.instancemethod" is for speed, because in SWIG's command line, the "-fastproxy" option enabled it: -fastproxy - Use fast proxy mechanism for member methods So what we expect is to find a way doing this in Python 3, as fast as the "new.instancemethod". 
Best regards, Haoyu Bai 4/28/2008 From humberto at digi.com.br Mon Apr 28 13:30:45 2008 From: humberto at digi.com.br (Humberto Diogenes) Date: Mon, 28 Apr 2008 08:30:45 -0300 Subject: [Python-3000] Adapt pydoc to new doc system Message-ID: Hi, I started working on this ticket but I'm going to need some clarifications. It's called "Adapt pydoc to new doc system" and says only "so that this doesn't get lost": http://bugs.python.org/issue1883 Can someone give more directions on what really needs to be done? Thanks in advance! Humberto Diógenes http://humberto.digi.com.br From g.brandl at gmx.net Mon Apr 28 15:15:43 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 28 Apr 2008 15:15:43 +0200 Subject: [Python-3000] Adapt pydoc to new doc system In-Reply-To: References: Message-ID: Humberto Diogenes wrote: > Hi, > > I started working on this ticket but I'm going to need some > clarifications, it's called "Adapt pydoc to new doc system" and says > only "so that this doesn't get lost": > http://bugs.python.org/issue1883 > > Can someone give more directions on what really needs to be done? Hehe, this was mainly meant as a reminder item for me since the URLs for pydoc to refer to HTML documentation will change. Georg From humberto at digi.com.br Mon Apr 28 15:42:18 2008 From: humberto at digi.com.br (Humberto Diogenes) Date: Mon, 28 Apr 2008 10:42:18 -0300 Subject: [Python-3000] Adapt pydoc to new doc system In-Reply-To: References: Message-ID: On 28/04/2008, at 10:15, Georg Brandl wrote: > Humberto Diogenes wrote: >> http://bugs.python.org/issue1883 >> Can someone give more directions on what really needs to be done? > > Hehe, this was mainly meant as a reminder item for me since the URLs > for pydoc to refer to HTML documentation will change. Thanks for the quick answer; I've just added that comment to the ticket. Anyway, it already served to fix the "help() on instances" issue that I mentioned earlier on the list.
Oh, and there's still one very simple patch pending: http://bugs.python.org/file10103/py3k-pydoc.doc-cleanup.patch -- Humberto Diógenes http://humberto.digi.com.br From aleaxit at gmail.com Mon Apr 28 16:14:10 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Mon, 28 Apr 2008 07:14:10 -0700 Subject: [Python-3000] Binding builtin function to class In-Reply-To: <4815B178.30901@gmail.com> References: <4814A8D0.9090900@gmail.com> <48158DF9.8070708@tartarus.org> <4815B178.30901@gmail.com> Message-ID: On Mon, Apr 28, 2008 at 4:14 AM, Haoyu Bai wrote: ... > Yes, these are the very problems I encountered. I think the using of > "new.instancemethod" is for speed, because in SWIG's command line, the > "-fastproxy" option enabled it: > > -fastproxy - Use fast proxy mechanism for member methods > > So what we expect is to find a way doing this in Python 3, as fast as the > "new.instancemethod". Essentially a descriptor type with a suitable __get__, right? And C-coded if it needs to be that fast. Is this a SWIG-specific issue (so that SWIG can take care of it in the C code it generates or links) or sufficiently general to warrant an addition to the Python core? Instinctively I think the latter, but can't easily think of another use case beyond SWIG (and perhaps similar tools such as Boost or SIP). Alex From lists at cheimes.de Mon Apr 28 19:14:51 2008 From: lists at cheimes.de (Christian Heimes) Date: Mon, 28 Apr 2008 19:14:51 +0200 Subject: [Python-3000] Binding builtin function to class In-Reply-To: <4814A8D0.9090900@gmail.com> References: <4814A8D0.9090900@gmail.com> Message-ID: Haoyu Bai wrote: > I know this is not a bug, but however it is an exception in the > language, what Python trying to avoid. > > Since all C function in extension module is treated as builtin function > or method, the problem maybe bigger than it looks like. In the SWIG's > case, it originally uses new.instancemethod to generate unbound method > from the C function in DLL module.
The code snippet looks like this: > > class TestBase(object): > """Proxy of C++ TestBase class""" > #some unrelated code omitted > pass > #_test.TestBase_test is the C function in _test DLL module > TestBase.test = new.instancemethod(_test.TestBase_test,None,TestBase) > > Is there a corresponding way to do it in Python 3? A workaround I found is: > > from types import MethodType > class TestBase(object): > """Proxy of C++ TestBase class""" > def __init__(self, *args): > #some initialization code > ... > self.test = MethodType(_test.TestBase_test, self) > > But this changed the original code structure so the migration would be > more complicated. Is there any better way to get rid of it? I've implemented a wrapper for your problem a while ago. It's in Object/classobject.c:PyInstanceMethod_Type. The wrapper is currently not available in Python code. But it's very easy to make it public. Christian From guido at python.org Tue Apr 29 01:18:29 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Apr 2008 16:18:29 -0700 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: On Sun, Apr 27, 2008 at 4:07 AM, Alexander Belopolsky wrote: > On Sat, Apr 26, 2008 at 2:49 PM, Facundo Batista > wrote: > > > > Which should the range() definition be, in your words? > > In terms of ABCs, range(..) is a Sized Iterable in the current > implementation. It is not a Sequence because it is not a Container > and does not support slicing. The idea to support x in range(..) was > discussed last year [1] and appears to have been accepted but not > implemented. I understand that slicing support is in the works. [2] > > I believe it would make sense to turn range(..) into a Sequence. Here > are my reasons: > > 1. It will be easy to explain what range(..) is: "a sequence of > integers from start to stop, excluding stop, skipping step". > > 2. There will be fewer 2 to 3 incompatibilities. 
> > [1] http://mail.python.org/pipermail/python-3000/2007-July/009028.html > [2] http://bugs.python.org/msg65807 I'm -0 on this (and on other recent enhancements like indexing and the proposed repr() enhancement). The reason that I'm so lukewarm is that I don't expect there to be much use for all this extra functionality. Teachers who want to show their students what range(x, y, z) is can just cast it to a list. The cost of the extra functionality: writing it, reviewing it, adding unittests, documenting it, maintaining it, making sure it works on 64-bit machines, having Python book authors discuss it; and in addition some extra baggage in the executable that is never needed (but I think the other reasons are more compelling). There's a reason the xrange() object didn't have all this extra baggage. Remember, one of the goals of Py3k is to *shrink* the language so that it will fit in your brain again. This thread seems to be going in the opposite direction. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Apr 29 01:21:14 2008 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Apr 2008 16:21:14 -0700 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: BTW, if you're looking for a term describing range() that's better than set or sequence, how about "series"? It's a mathematical word that matches pretty exactly. (More accurately, I believe it's an algebraic series.) On Mon, Apr 28, 2008 at 4:18 PM, Guido van Rossum wrote: > On Sun, Apr 27, 2008 at 4:07 AM, Alexander Belopolsky > wrote: > > On Sat, Apr 26, 2008 at 2:49 PM, Facundo Batista > > wrote: > > > > > > > Which should the range() definition be, in your words? > > > > In terms of ABCs, range(..) is a Sized Iterable in the current > > implementation. It is not a Sequence because it is not a Container > > and does not support slicing. The idea to support x in range(..) 
was > > discussed last year [1] and appears to have been accepted but not > > implemented. I understand that slicing support is in the works. [2] > > > > I believe it would make sense to turn range(..) into a Sequence. Here > > are my reasons: > > > > 1. It will be easy to explain what range(..) is: "a sequence of > > integers from start to stop, excluding stop, skipping step". > > > > 2. There will be fewer 2 to 3 incompatibilities. > > > > [1] http://mail.python.org/pipermail/python-3000/2007-July/009028.html > > [2] http://bugs.python.org/msg65807 > > I'm -0 on this (and on other recent enhancements like indexing and the > proposed repr() enhancement). > > The reason that I'm so lukewarm is that I don't expect there to be > much use for all this extra functionality. Teachers who want to show > their students what range(x, y, z) is can just cast it to a list. > > The cost of the extra functionality: writing it, reviewing it, adding > unittests, documenting it, maintaining it, making sure it works on > 64-bit machines, having Python book authors discuss it; and in > addition some extra baggage in the executable that is never needed > (but I think the other reasons are more compelling). There's a reason > the xrange() object didn't have all this extra baggage. > > Remember, one of the goals of Py3k is to *shrink* the language so that > it will fit in your brain again. This thread seems to be going in the > opposite direction. 
> > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Apr 29 01:56:15 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 29 Apr 2008 11:56:15 +1200 Subject: [Python-3000] Binding builtin function to class In-Reply-To: References: <4814A8D0.9090900@gmail.com> <48158DF9.8070708@tartarus.org> <4815B178.30901@gmail.com> Message-ID: <4816641F.5030605@canterbury.ac.nz> Alex Martelli wrote: > Is this a SWIG-specific issue (so > that SWIG can take care of it in the C code it generates or links) or > sufficiently general to warrant an addition to the Python core? I haven't been following this closely, but if the issue is what I think it is, Pyrex is going to have the same problem. An alternative solution for Pyrex would be to provide a flag on the C method object giving it instance-binding behaviour, or perhaps even make it standard for all C-implemented functions. -- Greg From musiccomposition at gmail.com Tue Apr 29 00:21:19 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Mon, 28 Apr 2008 17:21:19 -0500 Subject: [Python-3000] Removal of os.path.walk Message-ID: <1afaf6160804281521t12d07c73hf64be096882f2b96@mail.gmail.com> It seems that os.walk has more options and a cleaner interface to walking trees than os.path.walk does. Is there support for the removal this in Py3k? -- Cheers, Benjamin Peterson From brett at python.org Tue Apr 29 04:30:48 2008 From: brett at python.org (Brett Cannon) Date: Mon, 28 Apr 2008 19:30:48 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup Message-ID: [bcc to stdlib-sig] After two false starts over the YEARS of trying to cleanup and reorganize the stdlib, creating a SIG to get this going, having Guido give the PEP the once-over over the past several days, and creating two new bugs reports (issues 2715 and 2716), PEP 3108 is finally ready for public vetting! 
While reading this PEP, do remember this is only about either removing modules, renaming them, or moving them into a package. Additions are not covered by this PEP! Also realize all of the right people have been consulted on this stuff (e.g., the web SIG about the urllib package). So please do not think that something that seems drastic (e.g., the removal of all Mac-specific modules) was taken lightly when in fact the proper people were asked and they were okay with what is going on. Lastly, I do not want this to turn into a drawn-out thread about how people think some module should stay because they happen to use it or suggest some other module to remove. Please think before you propose a change. I have been through this proposal process for this reorg before and every time it has gotten way out of control. I do not want it happen this time. OK, with all of that out of the way, here is the PEP: ----------------------------------------------- PEP: 3108 Title: Standard Library Reorganization Version: $Revision: 62573 $ Last-Modified: $Date: 2008-04-28 17:56:36 -0700 (Mon, 28 Apr 2008) $ Author: Brett Cannon Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 01-Jan-2007 Python-Version: 3.0 Post-History: Abstract ======== Just like the language itself, Python's standard library (stdlib) has grown over the years to be very rich. But over time some modules have lost their need to be included with Python. There has also been an introduction of a naming convention for modules since Python's inception that not all modules follow. Python 3.0 has presented a chance to remove modules that do not have long term usefulness. This chance also allows for the renaming of modules so that they follow the Python style guide [#pep-0008]_. This PEP lists modules that should not be included in Python 3.0 and what modules need to be renamed. 
Modules to Remove ================= Guido pronounced that "silly old stuff" is to be deleted from the stdlib for Py3K [#silly-old-stuff]_. This is open-ended on purpose. Each module to be removed needs to have a justification as to why it should no longer be distributed with Python. This can range from the module being deprecated in Python 2.x to being for a platform that is no longer widely used. This section of the PEP lists the various modules to be removed. Each subsection represents a different reason for modules to be removed. Each module must have a specific justification on top of being listed in a specific subsection so as to make sure only modules that truly deserve to be removed are in fact removed. When a reason mentions how long it has been since a module has been "uniquely edited", it is in reference to how long it has been since a checkin was done specifically for the module and not for a change that applied universally across the entire stdlib. If an edit time is not denoted as "unique" then it is the last time the file was edited, period. The procedure to thoroughly remove a module is: #. Remove the module. #. Remove the tests. #. Edit ``Modules/Setup.dist`` and ``setup.py`` if needed. #. Remove the docs (if applicable). #. Run the regression test suite (using ``-uall``); watch out for tests that are skipped because an import failed for the removed module. If a deprecation warning is added to 2.6, it would be better to make all the changes to 2.6, merge the changes into the 3k branch, then perform the procedure above. This will avoid some merge conflicts. Previously deprecated --------------------- PEP 4 lists all modules that have been deprecated in the stdlib [#pep-0004]_. The specified motivations mirror those listed in PEP 4. All modules listed in the PEP at the time of the first alpha release of Python 3.0 will be removed. The entire contents of lib-old will also be removed. 
These modules can no longer be imported but are still kept in the
Python distribution for users who rely upon the code.

* buildtools

  + Documented as deprecated since Python 2.3 without an explicit
    reason.

* cfmfile

  + Documented as deprecated since Python 2.4 without an explicit
    reason.

* cl

  + Documented as obsolete since Python 2.0 or earlier.
  + Interface to SGI hardware.

* md5

  + Supplanted by the ``hashlib`` module.

* mimetools

  + Documented as obsolete without an explicit reason.

* MimeWriter

  + Supplanted by the ``email`` package.

* mimify

  + Supplanted by the ``email`` package.

* multifile

  + Supplanted by the ``email`` package.

* posixfile

  + Locking is better done by ``fcntl.lockf()``.

* rfc822

  + Supplanted by the ``email`` package.

* sha

  + Supplanted by the ``hashlib`` module.

* sv

  + Documented as obsolete since Python 2.0 or earlier.
  + Interface to obsolete SGI Indigo hardware.

* timing

  + Documented as obsolete since Python 2.0 or earlier.
  + ``time.clock()`` gives better time resolution.


Platform-specific with minimal use
----------------------------------

Python supports many platforms, some of which are not widely used.
And on some of these platforms there are modules that have limited use
to people on those platforms. Because of their limited usefulness it
would be better to no longer burden the Python development team with
their maintenance.

The modules mentioned below are documented. All undocumented modules
for the specified platforms will also be removed.

IRIX
/////

The IRIX operating system is no longer produced [#irix-retirement]_.
Removing all modules from the plat-irix[56] directory has been deemed
reasonable because of this fact.

+ AL/al [done: 3.0]

  - Provides sound support on Indy and Indigo workstations.
  - Both workstations are no longer available.
  - Code has not been uniquely edited in three years.

+ cd [done: 3.0]

  - CD drive control for SGI systems.
  - SGI no longer sells machines with IRIX on them.
  - Code has not been uniquely edited in 14 years.

+ cddb [done: 3.0]

  - Undocumented.

+ cdplayer [done: 3.0]

  - Undocumented.

+ cl/CL/CL_old [done: 3.0]

  - Compression library for SGI systems.
  - SGI no longer sells machines with IRIX on them.
  - Code has not been uniquely edited in 14 years.

+ DEVICE/GL/gl/cgen/cgensupport [done: 3.0]

  - GL access, which is the predecessor to OpenGL.
  - Has not been edited in at least eight years.
  - Third-party libraries provide better support (PyOpenGL
    [#pyopengl]_).

+ ERRNO [done: 3.0]

  - Undocumented.

+ FILE [done: 3.0]

  - Undocumented.

+ FL/fl/flp [done: 3.0]

  - Wrapper for the FORMS library [#irix-forms]_
  - FORMS has not been edited in 12 years.
  - Library is not widely used.
  - First eight hits on Google are for Python docs for fl.

+ fm [done: 3.0]

  - Wrapper to the IRIS Font Manager library.
  - Only available on SGI machines which no longer come with IRIX.

+ GET [done: 3.0]

  - Undocumented.

+ GLWS [done: 3.0]

  - Undocumented.

+ imgfile [done: 3.0]

  - Wrapper for SGI libimage library for imglib image files
    (``.rgb`` files).
  - Python Imaging Library provides read-only support [#pil]_.
  - Not uniquely edited in 13 years.

+ IN [done: 3.0]

  - Undocumented.

+ IOCTL [done: 3.0]

  - Undocumented.

+ jpeg [done: 3.0]

  - Wrapper for JPEG (de)compressor.
  - Code not uniquely edited in nine years.
  - Third-party libraries provide better support
    (Python Imaging Library [#pil]_).

+ panel [done: 3.0]

  - Undocumented.

+ panelparser [done: 3.0]

  - Undocumented.

+ readcd [done: 3.0]

  - Undocumented.

+ SV [done: 3.0]

  - Undocumented.

+ torgb [done: 3.0]

  - Undocumented.

+ WAIT [done: 3.0]

  - Undocumented.


Mac-specific modules
////////////////////

The Mac-specific modules are mostly unmaintained (e.g., the bgen tool
used to auto-generate many of the modules has never been updated to
support UCS-4). It is also not Python's place to maintain such a large
number of OS-specific modules. Thus all modules under plat-mac are to
be removed.
A stub module for proxy access will be provided for use by urllib. * _builtinSuites - Undocumented. - Package under lib-scriptpackages. * Audio_mac - Undocumented. * aepack - OSA support is better through third-party modules. * Appscript [#appscript]_. - Hard-coded endianness which breaks on Intel Macs. - Might need to rename if Carbon package dependent. * aetools - See aepack. * aetypes - See aepack. * applesingle - Undocumented. - AppleSingle is a binary file format for A/UX. - A/UX no longer distributed. * appletrawmain - Undocumented. * appletrunner - Undocumented. * argvemulator - Undocumented. * autoGIL - Very bad model for using Python with the CFRunLoop. * bgenlocations - Undocumented. * bundlebuilder - Undocumented. * Carbon - Carbon development has stopped. - Does not support 64-bit systems completely. - Dependent on bgen which has never been updated to support UCS-4 Unicode builds of Python. * CodeWarrior - Undocumented. - Package under lib-scriptpackages. * ColorPicker - Better to use Cocoa for GUIs. * EasyDialogs - Better to use Cocoa for GUIs. * Explorer - Undocumented. - Package under lib-scriptpackages. * Finder - Undocumented. - Package under lib-scriptpackages. * findertools - No longer useful. * FrameWork - Poorly documented. - Not updated to support Carbon Events. * gensuitemodule - See aepack. * ic * icopen - Not needed on OS X. - Meant to replace 'open' which is usually a bad thing to do. * macerrors - Undocumented. * MacOS - Would also mean the removal of binhex. * macostools * macresource - Undocumented. * MiniAEFrame - See aepack. * Nav - Undocumented. * Netscape - Undocumented. - Package under lib-scriptpackages. * pimp - Undocumented. * PixMapWrapper - Undocumented. * StdSuites - Undocumented. - Package under lib-scriptpackages. * SystemEvents - Undocumented. - Package under lib-scriptpackages. * Terminal - Undocumented. - Package under lib-scriptpackages. * terminalcommand - Undocumented. * videoreader - No longer used. 
* W - No longer distributed with Python. .. _PyObjC: http://pyobjc.sourceforge.net/ Solaris /////// + SUNAUDIODEV/sunaudiodev [done: 3.0] - Access to the sound card on Sun machines. - Code not uniquely edited in over eight years. Hardly used ------------ Some modules that are platform-independent are hardly used. This can be from how easy it is to implement the functionality from scratch or because the audience for the code is very small. * audiodev [done: 3.0] + Undocumented. + Not edited in five years. + If removed sunaudio should go as well (also undocumented; not edited in over seven years). * imputil + Undocumented. + Never updated to support absolute imports. * mutex + Easy to implement using a semaphore and a queue. + Cannot block on a lock attempt. + Not uniquely edited since its addition 15 years ago. + Only useful with the 'sched' module. + Not thread-safe. * stringold [done: 3.0] + Function versions of the methods on string objects. + Obsolete since Python 1.6. + Any functionality not in the string object or module will be moved to the string module (mostly constants). * symtable/_symtable + Undocumented. * toaiff [done: 3.0, moved to Demo] + Undocumented. + Requires ``sox`` library to be installed on the system. * user + Easily handled by allowing the application specify its own module name, check for existence, and import if found. * new [done: 3.0] + Just a rebinding of names from the 'types' module. + Can also call ``type`` built-in to get most types easily. + Docstring states the module is no longer useful as of revision 27241 (2002-06-15). * pure [done: 3.0] + Written before Pure Atria was bought by Rational which was then bought by IBM (in other words, very old). * test.testall [done: 3.0] + From the days before regrtest. Obsolete -------- Becoming obsolete signifies that either another module in the stdlib or a widely distributed third-party library provides a better solution for what the module is meant for. 
* Bastion/rexec [done: 3.0]

  + Restricted execution / security.
  + Turned off in Python 2.3.
  + Modules deemed unsafe.

* bsddb185 [done: 3.0]

  + Superseded by bsddb3
  + Not built by default.
  + Documentation specifies that the "module should never be used
    directly in new code".

* commands

  + subprocess module replaces it [#pep-0324]_.
  + Remove getstatus(), move rest to subprocess.

* compiler (need to add AST -> bytecode mechanism) [done: 3.0]

  + Having to maintain both the built-in compiler and the stdlib
    package is redundant [#ast-removal]_.
  + The AST created by the compiler is available [#ast]_.
  + Mechanism to compile from an AST needs to be added.

* dircache

  + Negligible use.
  + Easily replicated.

* dl [done: 3.0]

  + ctypes provides better support for same functionality.

* fpformat

  + All functionality is supported by string interpolation.

* htmllib

  + Superseded by HTMLParser.

* ihooks

  + Undocumented.
  + For use with rexec which has been turned off since Python 2.3.

* imageop [done: 3.0]

  + Better support by third-party libraries
    (Python Imaging Library [#pil]_).
  + Unit tests relied on rgbimg and imgfile.

    - rgbimg was removed in Python 2.6.
    - imgfile slated for removal in this PEP. [done: 3.0]

* linuxaudiodev [done: 3.0]

  + Replaced by ossaudiodev.

* mhlib

  + Obsolete mailbox format.

* popen2 [done: 3.0]

  + subprocess module replaces them [#pep-0324]_.

* sched

  + Replaced by threading.Timer.

* sgmllib

  + Does not fully parse SGML.
  + In the stdlib to support htmllib, which is slated for removal.

* stat

  + ``os.stat`` now returns a tuple with attributes.
  + Functions in the module should be made into methods for the
    object returned by os.stat.

* statvfs

  + ``os.statvfs`` now returns a tuple with attributes.

* thread

  + People should use 'threading' instead.

    - Rename 'thread' to _thread.
    - Deprecate dummy_thread and rename it to _dummy_thread.
    - Move thread.get_ident over to threading.

  + Guido has previously supported the deprecation
    [#thread-deprecation]_.
* urllib

  + Superseded by urllib2.
  + Functionality unique to urllib will be kept in the
    `urllib package`_.

* UserDict [done: 3.0]

  + Not as useful since types can be a superclass.
  + Useful bits moved to the 'collections' module.

* UserList/UserString [done: 3.0]

  + Not useful since types can be a superclass.


Modules to Rename
=================

Along with the stdlib gaining some modules that are no longer
relevant, there is also the issue of naming. Many modules existed in
the stdlib before PEP 8 came into existence [#pep-0008]_. This has
led to some naming inconsistencies and namespace bloat that should be
addressed.


PEP 8 violations
----------------

PEP 8 specifies that modules "should have short, all-lowercase names"
where "underscores can be used ... if it improves readability"
[#pep-0008]_. The use of underscores is discouraged in package names.
The following modules violate PEP 8 and are not otherwise being
renamed by being moved into a package.

==================  ==================================================
Current Name        Replacement Name
==================  ==================================================
_winreg             winreg (rename also because module has a public
                    interface and thus should not have a leading
                    underscore)
ConfigParser        configparser
copy_reg            copyreg
PixMapWrapper       pixmapwrapper
Queue               queue
SocketServer        socketserver
==================  ==================================================


Merging C and Python implementations of the same interface
----------------------------------------------------------

Several interfaces have both a Python and C implementation. While it
is great to have a C implementation for speed with a Python
implementation as fallback, there is no need to expose the two
implementations independently in the stdlib. For Python 3.0 all
interfaces with two implementations will be merged into a single
public interface.
The C module is to be given a leading underscore to delineate the fact that it is not the reference implementation (the Python implementation is). This means that any semantic difference between the C and Python versions must be dealt with before Python 3.0 or else the C implementation will be removed until it can be fixed. One interface that is not listed below is xml.etree.ElementTree. This is an externally maintained module and thus is not under the direct control of the Python development team for renaming. See `Open Issues`_ for a discussion on this. * pickle/cPickle + Rename cPickle to _pickle. + Semantic completeness of C implementation *not* verified. * profile/cProfile + Rename cProfile to _profile. + Semantic completeness of C implementation *not* verified. * StringIO/cStringIO [done: 3.0] + Add the class to the 'io' module. No public, documented interface ------------------------------- There are several modules in the stdlib that have no defined public interface. These modules exist as support code for other modules that are exposed. Because they are not meant to be used directly they should be renamed to reflect this fact. ============ =============================== Current Name Replacement Name ============ =============================== markupbase _markupbase [done: 3.0] dummy_thread _dummy_thread [#]_ ============ =============================== .. [#] Assumes ``thread`` is renamed to ``_thread``. Poorly chosen names ------------------- A few modules have names that were poorly chosen in hindsight. They should be renamed so as to prevent their bad name from perpetuating beyond the 2.x series. 
=================  ===============================
Current Name       Replacement Name
=================  ===============================
repr               reprlib
test.test_support  test.support
=================  ===============================


Grouping of modules
-------------------

As the stdlib has grown, several areas within it have expanded to
include multiple modules (e.g., dbm support). Thus some new packages
make sense where the renaming makes a module's name easier to work
with.


dbm package
///////////

=================  ===============================
Current Name       Replacement Name
=================  ===============================
anydbm             dbm.tools [1]_
dbhash             dbm.bsd
dbm                dbm.ndbm
dumbdbm            dbm.dumb
gdbm               dbm.gnu
whichdb            dbm.tools [1]_
=================  ===============================

.. [1] ``dbm.tools`` can combine ``anydbm`` and ``whichdb`` since the
       public API for both modules has no name conflict and the two
       modules have closely related usage.


html package
////////////

==================  ===============================
Current Name        Replacement Name
==================  ===============================
HTMLParser          html.parser
htmlentitydefs      html.entities
==================  ===============================


http package
////////////

=================  ===============================
Current Name       Replacement Name
=================  ===============================
httplib            http.client
BaseHTTPServer     http.server [2]_
CGIHTTPServer      http.server [2]_
SimpleHTTPServer   http.server [2]_
Cookie             http.cookies
cookielib          http.cookiejar
=================  ===============================

.. [2] The ``http.server`` module can combine the specified modules
       safely as they have no naming conflicts.
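A minimal sketch of how the http renames in the table above might be looked up programmatically, assuming a Python 3 interpreter in which the new ``http`` package exists. The names ``HTTP_RENAMES`` and ``import_renamed`` are hypothetical and appear nowhere in the PEP:

```python
import importlib

# Old (2.x) module name -> new (3.0) module name, copied from the
# http package table. The dictionary itself is illustrative only.
HTTP_RENAMES = {
    "httplib": "http.client",
    "BaseHTTPServer": "http.server",
    "CGIHTTPServer": "http.server",
    "SimpleHTTPServer": "http.server",
    "Cookie": "http.cookies",
    "cookielib": "http.cookiejar",
}


def import_renamed(old_name):
    """Import the module that replaces old_name (hypothetical helper)."""
    return importlib.import_module(HTTP_RENAMES[old_name])


client = import_renamed("httplib")
server = import_renamed("SimpleHTTPServer")
print(client.__name__, server.__name__)
```

Because three of the old server modules map to the same new module, the lookup returns the same module object for each of them, which is the merge that footnote [2] describes.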
tkinter package
///////////////

==================  ===============================
Current Name        Replacement Name
==================  ===============================
Canvas              tkinter.canvas
Dialog              tkinter.dialog
FileDialog          tkinter.filedialog [4]_
FixTk               tkinter._fix
ScrolledText        tkinter.scrolledtext
SimpleDialog        tkinter.simpledialog [5]_
Tix                 tkinter.tix
Tkconstants         tkinter.constants
Tkdnd               tkinter.dnd
Tkinter             tkinter.__init__
tkColorChooser      tkinter.colorchooser
tkCommonDialog      tkinter.commondialog
tkFileDialog        tkinter.filedialog [4]_
tkFont              tkinter.font
tkMessageBox        tkinter.messagebox
tkSimpleDialog      tkinter.simpledialog [5]_
turtle              tkinter.turtle
==================  ===============================

.. [4] ``tkinter.filedialog`` can safely combine ``FileDialog`` and
       ``tkFileDialog`` as there are no naming conflicts.

.. [5] ``tkinter.simpledialog`` can safely combine ``SimpleDialog``
       and ``tkSimpleDialog`` as there are no naming conflicts.


urllib package
//////////////

Originally this new package was to be named ``url``, but because of
the common use of the name as a variable, it has been deemed better
to keep the name ``urllib`` and instead shift existing modules around
into a new package.

==================  ===============================
Current Name        Replacement Name
==================  ===============================
urllib2             urllib.request
urlparse            urllib.parse
urllib              urllib.parse, urllib.request [6]_
==================  ===============================

.. [6] The quoting-related functions from ``urllib`` will be added
       to ``urllib.parse``. ``urllib.URLopener`` and
       ``urllib.FancyURLopener`` will be added to ``urllib.request``
       as long as the documentation for both modules is updated.
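As a hedged illustration of the urllib table above, code that has to run under both layouts could try the new names first and fall back to the old ones. This is only a sketch: it assumes the split lands as described in the table and footnote [6] (quoting helpers in ``urllib.parse``, the ``urllib2`` machinery in ``urllib.request``), and the URL is just sample data:

```python
try:
    # New layout, per the urllib package table.
    from urllib.parse import urlparse, quote
    from urllib.request import Request
except ImportError:
    # Old 2.x layout.
    from urlparse import urlparse
    from urllib import quote
    from urllib2 import Request

parts = urlparse("http://www.python.org/dev/peps/pep-3108/")
print(parts.netloc)           # host portion of the URL
print(quote("stdlib reorg"))  # space is percent-encoded as %20

# Request objects come from what used to be urllib2.
req = Request("http://www.python.org/")
print(req.get_method())
```

Only the import block differs between the two layouts; the rest of the code is untouched.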
xmlrpc package ////////////// ================== =============================== Current Name Replacement Name ================== =============================== xmlrpclib xmlrpc.client SimpleXMLRPCServer xmlrpc.server [3]_ CGIXMLRPCServer xmlrpc.server [3]_ ================== =============================== .. [3] The modules being combined into ``xmlrpc.server`` have no naming conflicts and thus can safely be merged. Transition Plan =============== For modules to be removed ------------------------- For the removal of modules that are continuing to exist in the Python 2.x series (i.e., not deprecated explicitly in the 2.x series), ``warnings.warn3k()`` will be used to issue a DeprecationWarning. Renaming of modules ------------------- For modules that are renamed, stub modules will be created with the original names and be kept in a directory within the stdlib (e.g. like how lib-old was once used). The need to keep the stub modules within a directory is to prevent naming conflicts with case-insensitive filesystems in those cases where nothing but the case of the module is changing. These stub modules will import the module code based on the new naming. The same type of warning being raised by modules being removed will be raised in the stub modules. Support in the 2to3 refactoring tool for renames will also be used [#2to3]_. Import statements will be rewritten so that only the import statement and none of the rest of the code needs to be touched. This will be accomplished by using the ``as`` keyword in import statements to bind in the module namespace to the old name while importing based on the new name. Open Issues =========== Renaming of modules maintained outside of the stdlib ---------------------------------------------------- xml.etree.ElementTree not only does not meet PEP 8 naming standards but it also has an exposed C implementation [#pep-0008]_. It is an externally maintained package, though [#pep-0360]_. 
A request will be made for the maintainer to change the name so that it matches PEP 8 and hides the C implementation. Rejected Ideas ============== Modules that were originally suggested for removal -------------------------------------------------- * asynchat/asyncore + Josiah Carlson has said he will maintain the modules. * audioop/sunau/aifc + Audio modules where the formats are still used. * base64/quopri/uu + All still widely used. + 'codecs' module does not provide as nice of an API for basic usage. * fileinput + Useful when having to work with stdin. * linecache + Used internally in several places. * nis + Testimonials from people that new installations of NIS are still occurring * getopt + Simpler than optparse. * repr + Useful as a basis for overriding. + Used internally. * telnetlib + Really handy for quick-and-dirty remote access. + Some hardware supports using telnet for configuration and querying. * Tkinter + Would prevent IDLE from existing. + No GUI toolkit would be available out of the box. Introducing a new top-level package ----------------------------------- It has been suggested that the entire stdlib be placed within its own package. This PEP will not address this issue as it has its own design issues (naming, does it deserve special consideration in import semantics, etc.). Everything within this PEP can easily be handled if a new top-level package is introduced. Introducing new packages to contain theme-related modules --------------------------------------------------------- During the writing of this PEP it was noticed that certain themes appeared in the stdlib. In the past people have suggested introducing new packages to help collect modules that share a similar theme (e.g., audio). An Open Issue was created to suggest some new packages to introduce. In the end, though, not enough support could be pulled together to warrant moving forward with the idea. Instead name simplification has been chosen as the guiding force for PEPs to create. 
References ========== .. [#pep-0004] PEP 4: Deprecation of Standard Modules (http://www.python.org/dev/peps/pep-0004/) .. [#pep-0008] PEP 8: Style Guide for Python Code (http://www.python.org/dev/peps/pep-0008/) .. [#pep-0324] PEP 324: subprocess -- New process module (http://www.python.org/dev/peps/pep-0324/) .. [#pep-0360] PEP 360: Externally Maintained Packages (http://www.python.org/dev/peps/pep-0360/) .. [#module-index] Python Documentation: Global Module Index (http://docs.python.org/modindex.html) .. [#timing-module] Python Library Reference: Obsolete (http://docs.python.org/lib/obsolete-modules.html) .. [#silly-old-stuff] Python-Dev email: "Py3k release schedule worries" (http://mail.python.org/pipermail/python-3000/2006-December/005130.html) .. [#thread-deprecation] Python-Dev email: Autoloading? (http://mail.python.org/pipermail/python-dev/2005-October/057244.html) .. [#py-dev-summary-2004-11-01] Python-Dev Summary: 2004-11-01 (http://www.python.org/dev/summary/2004-11-01_2004-11-15/#id10) .. [#2to3] 2to3 refactoring tool (http://svn.python.org/view/sandbox/trunk/2to3/) .. [#pyopengl] PyOpenGL (http://pyopengl.sourceforge.net/) .. [#pil] Python Imaging Library (PIL) (http://www.pythonware.com/products/pil/) .. [#twisted] Twisted (http://twistedmatrix.com/trac/) .. [#irix-retirement] SGI Press Release: End of General Availability for MIPS IRIX Products -- December 2006 (http://www.sgi.com/support/mips_irix.html) .. [#irix-forms] FORMS Library by Mark Overmars (ftp://ftp.cs.ruu.nl/pub/SGI/FORMS) .. [#sun-au] Wikipedia: Au file format (http://en.wikipedia.org/wiki/Au_file_format) .. [#appscript] appscript (http://appscript.sourceforge.net/) .. [#ast] _ast module (http://docs.python.org/lib/ast.html) .. [#ast-removal] python-dev email: getting compiler package failures (http://mail.python.org/pipermail/python-3000/2007-May/007615.html) Copyright ========= This document has been placed in the public domain. .. 
Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From tjreedy at udel.edu Tue Apr 29 04:38:22 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 28 Apr 2008 22:38:22 -0400 Subject: [Python-3000] range() issues References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: "Guido van Rossum" wrote in message news:ca471dc20804281621g77332ca0n5173d6c1034c905 at mail.gmail.com... | BTW, if you're looking for a term describing range() that's better | than set or sequence, how about "series"? It's a mathematical word | that matches pretty exactly. (More accurately, I believe it's an | algebraic series.) I believe you are thinking of arithmetic series. But in math (modern, at least), a series is a sum of a sequence or progression of terms. (1, 1/2, 1/4, ... is a sequence, 1+1/2+1/4... is a series.) The output of range constitutes an arithmetic sequence or progression (terms have a constant difference -- the step). Of course, it can also be regarded as a sequence of partial sums of the series start + step + step + ... + step whose underlying sequence is the trivial [step, step, step, ....] with successive differences 0 -- but this is not the standard view or usage. So 'progression' would be an even better near-synonym for 'sequence'. In common usage, a 'series' can be any group of related items ordered in time or space, but the ordering can be rather arbitrary. Most programmers take a sequence of math courses in definite order -- algebra, geometry, trigonometry, calculus, and maybe even differential equations -- and learn a series of programming languages over their career in some order that is mostly haphazard. 
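A minimal sketch of the arithmetic-progression reading of range() described above, assuming Python 3. The helper name step_differences is made up for illustration:

```python
def step_differences(r):
    """Return the set of differences between consecutive terms of r."""
    terms = list(r)
    return {b - a for a, b in zip(terms, terms[1:])}


r = range(3, 20, 4)
print(list(r))              # the progression: [3, 7, 11, 15, 19]
print(step_differences(r))  # a single constant difference, the step: {4}
print(sum(r))               # the matching arithmetic *series* is the sum: 55
```

The terms themselves form the sequence (progression), while the series in the mathematical sense is their sum.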
tjr From fumanchu at aminus.org Tue Apr 29 04:43:37 2008 From: fumanchu at aminus.org (Robert Brewer) Date: Mon, 28 Apr 2008 19:43:37 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: Brett Cannon wrote: > Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup > > [bcc to stdlib-sig] > > After two false starts over the YEARS of trying to cleanup and > reorganize the stdlib, creating a SIG to get this going, having Guido > give the PEP the once-over over the past several days, and creating > two new bugs reports (issues 2715 and 2716), PEP 3108 is finally ready > for public vetting! Whew! That's a big doc. All I have to say at the end of it is: GREAT JOB Brett and all the stdlib-sig! Robert Brewer fumanchu at aminus.org From talin at acm.org Tue Apr 29 04:45:38 2008 From: talin at acm.org (Talin) Date: Mon, 28 Apr 2008 19:45:38 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: <48168BD2.7040305@acm.org> +1 -- Talin From tjreedy at udel.edu Tue Apr 29 05:39:10 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 28 Apr 2008 23:39:10 -0400 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup References: Message-ID: A few comments: Intro to delete: could point out here that anyone who really wants a deleted module can/should update it to work with 3.0 and make available through PyPI. --------- obsolete/popen2 ... "replaces them." 'them' should be 'it': any other popenx modules seem to be gone already --------------- "This will be accomplished by using the ``as`` keyword in import statements to bind in the module namespace to the old name while importing based on the new name. " This should only be done if there is not already an 'as' clause! If there is, just substitute. ----------------------- +1 on everything proposed. This will make the stdlib less confusing, especially for newcomers. 
I might like to see even more packages, but do not have the energy to research/propose/argue for any. tjr cc'ed From janssen at parc.com Tue Apr 29 05:39:54 2008 From: janssen at parc.com (Bill Janssen) Date: Mon, 28 Apr 2008 20:39:54 PDT Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: <08Apr28.203957pdt."58696"@synergy1.parc.xerox.com> Nice job, Brett. I only have two concerns: As you don't quite note, the Mac "ic" module is the interface to the "Internet Configuration" system on the Mac. In particular, it's where proxy information is drawn from. We need to have that replacement code in hand before "ic" is jettisoned. You allude to that, but with no detail. I'd like to hear the detail. > * mhlib > > + Obsolete mailbox format. The "mh" format is far from obsolete; in fact, it's (weirdly enough) coming back into popularity as modern indexing engines start requiring a one-message-per-file mail format. And as modern file systems allow that. Systems like "nmh" and "Sylpheed" are under active development and corresponding use. I would take the code from the current "mhlib" module that manipulates ".mh_profile" and mailbox-specific context and add it to the "mailboxes.MH" module before dumping it. Bill From brett at python.org Tue Apr 29 07:18:41 2008 From: brett at python.org (Brett Cannon) Date: Mon, 28 Apr 2008 22:18:41 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <1924341155841993233@unknownmsgid> References: <1924341155841993233@unknownmsgid> Message-ID: On Mon, Apr 28, 2008 at 8:39 PM, Bill Janssen wrote: > Nice job, Brett. I only have two concerns: > > As you don't quite note, the Mac "ic" module is the interface to the > "Internet Configuration" system on the Mac. In particular, it's where > proxy information is drawn from. We need to have that replacement > code in hand before "ic" is jettisoned. You allude to that, but with > no detail. I'd like to hear the detail. 
> I talked with Ronald and someone on the macpython list agreed to write a simple proxy module to replace it for urllib. So there will be a replacement. > > * mhlib > > > > + Obsolete mailbox format. > > The "mh" format is far from obsolete; in fact, it's (weirdly enough) > coming back into popularity as modern indexing engines start requiring > a one-message-per-file mail format. And as modern file systems allow > that. Systems like "nmh" and "Sylpheed" are under active development > and corresponding use. I would take the code from the current "mhlib" > module that manipulates ".mh_profile" and mailbox-specific context and > add it to the "mailboxes.MH" module before dumping it. > That's fine by me. The individual module just doesn't need to stick around. -Brett From brett at python.org Tue Apr 29 07:23:39 2008 From: brett at python.org (Brett Cannon) Date: Mon, 28 Apr 2008 22:23:39 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: On Mon, Apr 28, 2008 at 8:39 PM, Terry Reedy wrote: > A few comments: > > Intro to delete: could point out here that anyone who really wants a > deleted module can/should update it to work with 3.0 and make available > through PyPI. Guido and I discussed this and he pointed out that if someone went that far we probably should consider keeping the module. But there was originally a comment along those lines. > --------- > > obsolete/popen2 ... "replaces them." > 'them' should be 'it': Yep. I originally had 'commands' and 'popen2' as the same bullet point. > any other popenx modules seem to be gone > already > --------------- > > > "This > will be accomplished by using the ``as`` keyword in import statements > to bind in the module namespace to the old name while importing based > on the new name. > " > This should only be done if there is not already an 'as' clause! > If there is, just substitute. Done. > ----------------------- > +1 on everything proposed. 
This will make the stdlib less confusing, > especially for newcomers. > > I might like to see even more packages, but do not have the energy to > research/propose/argue for any. =) Yeah, it was tough to come up with the ones there, especially trying to keep with the "new packages only when the names are an improvement" guideline. -Brett From phd at phd.pp.ru Tue Apr 29 07:26:40 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Tue, 29 Apr 2008 09:26:40 +0400 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: <1924341155841993233@unknownmsgid> Message-ID: <20080429052640.GA17647@phd.pp.ru> On Mon, Apr 28, 2008 at 10:18:41PM -0700, Brett Cannon wrote: > On Mon, Apr 28, 2008 at 8:39 PM, Bill Janssen wrote: > > Nice job, Brett. I only have two concerns: > > > > As you don't quite note, the Mac "ic" module is the interface to the > > "Internet Configuration" system on the Mac. In particular, it's where > > proxy information is drawn from. We need to have that replacement > > code in hand before "ic" is jettisoned. You allude to that, but with > > no detail. I'd like to hear the detail. > > I talked with Ronald and someone on the macpython list agreed to write > a simple proxy module to replace it for urllib. So there will be a > replacement. "ic" is also used in webbrowser.py (just a reminder). Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From rhamph at gmail.com Tue Apr 29 08:11:16 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 29 Apr 2008 00:11:16 -0600 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: On Mon, Apr 28, 2008 at 8:30 PM, Brett Cannon wrote: > * sched > > + Replaced by threading.Timer. I don't see sched as obsoleted by threading.Timer. It's much simpler to use (no need for locking) and more efficient (no legions of sleeping threads). 
Instead, maybe it should be removed because it's trivial to reimplement as well as being overshadowed by all the other event loops built into bigger systems (tk, qt, gtk, twisted, etc)? -- Adam Olsen, aka Rhamphoryncus From ronaldoussoren at mac.com Tue Apr 29 10:57:59 2008 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Tue, 29 Apr 2008 10:57:59 +0200 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <08Apr28.203957pdt.58696@synergy1.parc.xerox.com> References: <08Apr28.203957pdt.58696@synergy1.parc.xerox.com> Message-ID: <5C8E1D11-8C39-4443-AD6B-79C16D2DC60F@mac.com> On 29 Apr, 2008, at 5:39, Bill Janssen wrote: > Nice job, Brett. I only have two concerns: > > As you don't quite note, the Mac "ic" module is the interface to the > "Internet Configuration" system on the Mac. In particular, it's where > proxy information is drawn from. We need to have that replacement > code in hand before "ic" is jettisoned. You allude to that, but with > no detail. I'd like to hear the detail. Daniel Miller provided an implementation of that functionality on the pythonmac-sig list, I hope to commit a fully fleshed out version of that code sometime this week. Ronald -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2224 bytes Desc: not available URL: From python at rcn.com Tue Apr 29 11:46:31 2008 From: python at rcn.com (Raymond Hettinger) Date: Tue, 29 Apr 2008 02:46:31 -0700 Subject: [Python-3000] [stdlib-sig] PEP 3108 - stdlib reorg/cleanup References: Message-ID: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1> > * UserList/UserString [done: 3.0] Note that these were updated and moved to the collections module in Py3.0. > anydbm dbm.tools [1]_ > whichdb dbm.tools [1]_ Were there any better naming suggestions than dbm.tools? The original names seem much more informative. 
> For modules that are renamed, stub modules will be created with the > original names and be kept in a directory within the stdlib (e.g. like > how lib-old was once used). What is the purpose of the new directory? Are there some use cases for intermixing the new and old names? Is there something that the 2-to-3 converter won't be able to handle? Raymond From thomas at python.org Tue Apr 29 14:01:43 2008 From: thomas at python.org (Thomas Wouters) Date: Tue, 29 Apr 2008 14:01:43 +0200 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> On Tue, Apr 29, 2008 at 8:11 AM, Adam Olsen wrote: > On Mon, Apr 28, 2008 at 8:30 PM, Brett Cannon wrote: > > * sched > > > > + Replaced by threading.Timer. > > I don't see sched as obsoleted by threading.Timer. It's much simpler > to use (no need for locking) and more efficient (no legions of > sleeping threads). Instead, maybe it should be removed because it's > trivial to reimplement as well as being overshadowed by all the other > event loops built into bigger systems (tk, qt, gtk, twisted, etc)? > More importantly, sched doesn't use threads, so replacing it with threading.Timer is inappropriate :) But yes, it should just go. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Apr 29 14:07:26 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 29 Apr 2008 22:07:26 +1000 Subject: [Python-3000] [stdlib-sig] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1> References: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1> Message-ID: <48170F7E.90609@gmail.com> Raymond Hettinger wrote: >> * UserList/UserString [done: 3.0] > > Note that these were updated and moved to the collections module in Py3.0. 
> > >> anydbm dbm.tools [1]_ >> whichdb dbm.tools [1]_ > > Were there any better naming suggestions than dbm.tools? The original > names seem much more informative. Maybe they're more informative if you've been using them for a long time. As a non-DB-API user, anydbm seems just as generic to me as dbm.tools, and whichdb.whichdb is just redundant. dbm.tools.open and dbm.tools.whichdb seem fine as names for the functions. >> For modules that are renamed, stub modules will be created with the >> original names and be kept in a directory within the stdlib (e.g. like >> how lib-old was once used). > > What is the purpose of the new directory? Are there some use > cases for intermixing the new and old names? Is there something > that the 2-to-3 converter won't be able to handle? The reason is noted in the PEP - it's to keep case insensitive filesystems (such as NTFS) from spitting the dummy when we try to put both a ConfigParser.py (old name) and configparser.py (new name) in the Python Lib directory. I'd like to see the PEP address the question of how it is going to deal with getting duplicate copies of modules in sys.modules when some code in an application uses the old name and some code uses the new name. On the proposed name changes themselves - excellent work! Cheers, Nick. 
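One possible way to avoid the duplicate-module problem raised above is to have the old name point at the very same module object in sys.modules. This is only a sketch in Python 3 under that assumption, not what the PEP specifies; `install_alias` is a hypothetical helper, and ConfigParser/configparser are used simply because they are the pair named in this message.

```python
import sys
import configparser  # the new name


def install_alias(old_name, module):
    """Register module under old_name so both names share one module object.

    Hypothetical helper: a real stub module would have to do the equivalent
    from inside its own file.
    """
    sys.modules.setdefault(old_name, module)


install_alias("ConfigParser", configparser)

# The import system checks sys.modules first, so no second copy is loaded
# and the file system (case-insensitive or not) is never consulted.
import ConfigParser

same_object = ConfigParser is configparser
```

With this approach, state stored on the module (caches, registries, monkeypatches) is shared regardless of which name a given piece of code imported. The module's `__name__` stays the new name, which may matter for code that inspects it.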
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ncoghlan at gmail.com Tue Apr 29 14:10:30 2008 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 29 Apr 2008 22:10:30 +1000 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> Message-ID: <48171036.8000701@gmail.com> Thomas Wouters wrote: > > > On Tue, Apr 29, 2008 at 8:11 AM, Adam Olsen > wrote: > > On Mon, Apr 28, 2008 at 8:30 PM, Brett Cannon > wrote: > > * sched > > > > + Replaced by threading.Timer. > > I don't see sched as obsoleted by threading.Timer. It's much simpler > to use (no need for locking) and more efficient (no legions of > sleeping threads). Instead, maybe it should be removed because it's > trivial to reimplement as well as being overshadowed by all the other > event loops built into bigger systems (tk, qt, gtk, twisted, etc)? > > > More importantly, sched doesn't use threads, so replacing it with > threading.Timer is inappropriate :) But yes, it should just go. I agree that "use a real event loop engine" is a better argument for getting rid of sched/mutex than "use threads". Perhaps sched/mutex could be dumped in the Demo directory? Or perhaps we should just get rid of them entirely and see if anyone with a real use case complains - it's not like the modules will be particularly hard to dig out of SVN if we decide we want to keep them after all. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From ishimoto at gembook.org Tue Apr 29 14:20:10 2008 From: ishimoto at gembook.org (atsuo ishimoto) Date: Tue, 29 Apr 2008 21:20:10 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: <87wsmxpc0r.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> <87wsmxpc0r.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <797440730804290520g6982518fkf25ac81bf7ea9260@mail.gmail.com> 2008/4/17 Stephen J. Turnbull : > How about choosing a standard Python repertoire (based on the Unicode > standard, of course) of which characters get a graphic repr and which > ones get \u-escaped, and have a post-hook for repr which gets passed > the string repr proposes to print out? Will the standard repertoire exclude Cyrillic or full-with ASCII? If so, I (Japanese) will disable the hook because full-with ASCII characters are not ambiguous to me. Russian people may not want to use the repertoire, also. I think ambiguity will occur when we meet with unfamiliar characters. So choosing repertoire everybody can accept will be difficult. For Python identifiers, it is good idea to select a repertoire for my project. But the repertoire is best used with such tool as PyChecker. repr() is not necessary to check the repertoire. From aleaxit at gmail.com Tue Apr 29 15:53:53 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 29 Apr 2008 06:53:53 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: Hi Brett, great job -- but I would like to plead for the life of the sched module. 
I love sched and use it often for purposes for which it seems to me that the proposed replacement strategy (via threading.Timer) is not suitable: for example when I'm on a platform without threads (and don't need threads, nor care about them -- I just want events executed serially at specific times), or for simulation purposes (more often than not, I'm passing something other than time.time and time.sleep as the scheduler's callbacks). Alex From alexander.belopolsky at gmail.com Tue Apr 29 15:51:11 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Apr 2008 09:51:11 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: On Mon, Apr 28, 2008 at 7:21 PM, Guido van Rossum wrote: > BTW, if you're looking for a term describing range() that's better > than set or sequence, how about "series"? It's a mathematical word > that matches pretty exactly. No, mathematical series is the sum of a sequence: . An alternative to "sequence" would be "progression." > (More accurately, I believe it's an algebraic series.) It is an "arithmetic progression": . From mkieverpy at tlink.de Tue Apr 29 17:43:33 2008 From: mkieverpy at tlink.de (mkieverpy at tlink.de) Date: Tue, 29 Apr 2008 15:43:33 -0000 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup Message-ID: <20080429134336.08761B964@mail.terralink.de> Hi list, I'm a long time lurker with only a very few contributions in Tkinter. Just one remark about the inclusion of Canvas in the tkinter package: Canvas is marked as obsolete since 2000. See this issue (and the comment at the top of Canvas.py): http://bugs.python.org/issue210677 Cheers, Matthias Kievernagel. 
(mkiever/at/web/dot/de) From aleaxit at gmail.com Tue Apr 29 16:21:22 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 29 Apr 2008 07:21:22 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 5:01 AM, Thomas Wouters wrote: ... > > > * sched > > > > > > + Replaced by threading.Timer. > > > > I don't see sched as obsoleted by threading.Timer. It's much simpler > > to use (no need for locking) and more efficient (no legions of > > sleeping threads). Instead, maybe it should be removed because it's > > trivial to reimplement as well as being overshadowed by all the other > > event loops built into bigger systems (tk, qt, gtk, twisted, etc)? > > More importantly, sched doesn't use threads, so replacing it with > threading.Timer is inappropriate :) But yes, it should just go. Oops, sorry for the duplicate subthread -- I had mailed my objections to sched removal privately to Brett first (trying to avoid the thread explosion he feared) and I thought I had later mailed the whole list but hadn't, so I hit send again first thing this morning. Anyway, I disagree that sched should go away from the Python standard library. It's not trivial to reimplement it _properly_, it's a great way to code a pure-Python cron-substitute that works across all platforms including very limited (e.g. embedded) ones w/o threads and with scarce memory (without the overhead of the bigger systems, which may or may not be well supported on a given limited platform), AND it's an excellent way to do _simulations_ -- I use it regularly for both pure simulations AND for systems that, in production, run on a real timeline, and can also be easily "simulation-tested" by feeding sched with functions that read the simulated timeline (instead of time.time and time.sleep). Please let's keep it! 
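The simulation use described above can be sketched as follows (Python 3). `SimulatedClock`, `job` and the job names are invented for illustration; the only library calls assumed are `sched.scheduler(timefunc, delayfunc)`, `enter()`, `run()` and `empty()`. Instead of time.time and time.sleep, the scheduler is given a fake clock whose "sleep" merely advances a counter, so an hour of scheduled work runs instantly and without threads.

```python
import sched


class SimulatedClock:
    """A fake timeline: sleeping just advances the current time."""

    def __init__(self, start=0.0):
        self.now = start

    def time(self):
        return self.now

    def sleep(self, delay):
        self.now += delay


clock = SimulatedClock()
scheduler = sched.scheduler(clock.time, clock.sleep)
log = []


def job(name):
    # Record when (on the simulated timeline) each job ran.
    log.append((clock.time(), name))


# enter(delay, priority, action, argument); lower priority number runs first
# when two events are due at the same time.
scheduler.enter(3600, 1, job, ("hourly check",))
scheduler.enter(60, 1, job, ("minute check",))
scheduler.enter(60, 0, job, ("urgent minute check",))

scheduler.run()  # events execute serially, in time order
```

Swapping `clock.time` and `clock.sleep` back for `time.time` and `time.sleep` would run the same schedule on the real timeline, which is what makes the simulation-testing approach attractive.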
Alex From alexander.belopolsky at gmail.com Tue Apr 29 16:30:44 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Apr 2008 10:30:44 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: On Mon, Apr 28, 2008 at 7:18 PM, Guido van Rossum wrote: .. > The cost of the extra functionality: writing it, reviewing it, adding > unittests, documenting it, maintaining it, making sure it works on > 64-bit machines, having Python book authors discuss it; and in > addition some extra baggage in the executable that is never needed > (but I think the other reasons are more compelling). There's a reason > the xrange() object didn't have all this extra baggage. > > Remember, one of the goals of Py3k is to *shrink* the language so that > it will fit in your brain again. This thread seems to be going in the > opposite direction. I would say making range return an instance of Sequence will make that feature easier to understand. In the current implementation, given a range r, you can do r[0], but not r[0:2], or x in r; given n > sys.maxsize, you can create range(n), but the result is not indexable or sizable. I would say "range(n) is a memory efficient substitute for [0, 1, ..., n-1]" is easier to fit into one's brain than the current hodgepodge of exceptions. In terms of implementation, slicing support does not add much complexity once indexing is supported. If you really want to simplify the language, I would suggest that range() should simply return an iterator and the only recommended use be list(range(..)) and for x in range(..) constructs. __length_hint__ can be provided for efficiency of list(range(..)) instead of __len__ and that will eliminate range() issues. Note that making range() return an iterator will make it very similar to itertools.count() suggesting that the two should be unified somehow.
(With ellipsis becoming a valid argument, would you consider range(...) for count() and range(start, ...) for count(n) too clever? Alternatively, range() for count() and range(n, None) for count(n)?) In any case, turning range into an Iterator is unlikely to be accepted this late in the 3.0 development stage, so I believe making it a Sequence is cleaner and simpler than the status quo. From qgallet at gmail.com Tue Apr 29 16:39:48 2008 From: qgallet at gmail.com (Quentin Gallet-Gilles) Date: Tue, 29 Apr 2008 16:39:48 +0200 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <20080429134336.08761B964@mail.terralink.de> References: <20080429134336.08761B964@mail.terralink.de> Message-ID: <8b943f2b0804290739p63153166ib3786e2bdadb9ab4@mail.gmail.com> On Tue, Apr 29, 2008 at 5:43 PM, wrote: > Hi list, > > I'm a long time lurker with only a very few contributions > in Tkinter. Just one remark about the inclusion of Canvas > in the tkinter package: > Canvas is marked as obsolete since 2000. > See this issue (and the comment at the top of Canvas.py): > http://bugs.python.org/issue210677 > > Cheers, > Matthias Kievernagel. > (mkiever/at/web/dot/de) Indeed. I remember mentioning it back in December 2007, along with a question about the deprecated status of Tkdnd. With the stdlib-sig list creation, I guess the discussion was lost and I totally forgot to mention it again when the discussion about modules deletions occurred. Sorry Brett. Quentin As a reminder, here's an abstract from the mail back then : """Apart from the awfully inconsistent naming convention, there are a few things that are worth considering for the reorg: 1. FixTk is only called by Tkinter and has no API to expose since it's only a win32 specific piece of code to manage the _tkinter import. It should be renamed _fixtk/_tkfix or merged into Tkinter.py 2. 
Tkconstants is used in several places:
- Tkinter.py does a simple "from Tkconstants import *"
- Tix doesn't import it and its documentation shows an example with "import Tkinter" followed by "from Tkconstants import *". And sure enough, DirList and DirTree (from Demo/tix/samples/) both do that.
- Finally, CodeContext (Lib/idlelib) imports a few constants manually
IMO Tkconstants could be renamed _tkconstants and all imports besides the first one changed to access the constants via Tkinter.
3. Canvas contains a comment saying it's obsolete and that Tkinter.Canvas should be used instead. I've gone ahead and added it in the "Possible Deletions" tab.
4. All those *Dialog modules seem an obvious candidate for merging/deleting/rewriting/whatever but I have no idea which one should go or stay. Doing some quick greps, it appears the pynche tool uses the tk-prefixed versions, IDLE a combination of the two (see IOBinding.py that uses tkFileDialog, tkMessageBox but also SimpleDialog). I haven't seen much love for the non-prefixed versions, by the way.
5. About Tkdnd, Tkinter documentation says: "This is experimental and should become deprecated when it is replaced with the Tk DND". What's the status on this one?
"""
From aleaxit at gmail.com Tue Apr 29 17:05:51 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 29 Apr 2008 08:05:51 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <48171036.8000701@gmail.com> References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com> Message-ID: On Tue, Apr 29, 2008 at 5:10 AM, Nick Coghlan wrote: ... > Perhaps sched/mutex could be dumped in the Demo directory? Or perhaps we > should just get rid of them entirely and see if anyone with a real use case > complains - it's not like the modules will be particularly hard to dig out > of SVN if we decide we want to keep them after all.
I have real use cases and I'm already complaining;-). I'd rather not get into very specific details -- after all for the last 3+ years I've been working at Google in the area we now call "cluster management", so you can imagine that my recent and most important use cases may be systems that play important roles in the innards of Google clusters, and it's not exactly stuff that Google likes to have me chat about. But, somewhat abstractly, suppose that as a part of keeping clusters healthy you are, in a certain generation N of your cluster management infrastructure, periodically running (e.g. via cron) several scripts that perform housekeeping "sysadm" tasks -- check that all vital services are up and running and healthy, allocating replacement machines (to ensure redundancy remains good) if some vital server has keeled over and has been replaced by a "hot spare", ensuring multiply-redundant backups of important data offsite, and the like. At the next generation N+1 you rewrite all that morass of scripts (originally a mix of bash, perl and python) into Python - so far so good. But then you notice that using cron is far from optimal -- tasks are performed at different periodicity and take different times, so sometimes they end up overlapping and that's no good (they should be sequenced...) so you introduce locking between processes... but still that leaves times where several Python processes are alive (all but one waiting on a lock) uselessly consuming machine resources (the machines aren't all that limited, but, they ARE live-running servers, so resources taken by "overhead" sysadm tasks must be minimized). So, the big breakthrough: you rewrite the whole thing as one Python daemon based on sched. 
No more locking, delicate bugs and race conditions just disappear, more steady and predictable resource-consumption footprint (you still want to avoid conditions where the now long-running sysadm daemon grows memory footprint and never shrinks again, but for those rare tasks which risk that you can fork, do the work in the subprocess while the parent waits for it, etc), AND, suddenly and wonderfully, a new higher level of testability -- in a level of tests living between unit-tests and full system-integration tests, you can use a simulated (accelerated) timeline for sched, mock out just the parts that get information from "the outside" and perform actions on it, and exercise the whole system logic and workflow almost as well as a full system-integration test, but orders of magnitude faster AND without requiring a whole cluster to be devoted to the test... Sure, if sched was taken away, I could just take it back and make it part of the specific system rather than using it from the standard library -- but this argument would apply to a vast majority of library modules, particularly pure-Python ones; I think (I hope) we're just "spring-cleaning the cruft", not drastically rethinking the whole idea of "batteries included", right?-) Besides Google, I know that many other shops are using Python for cluster management and system administration tasks -- for example, RackSpace was a sponsor at Pycon and busy trying to hire Pythonistas, because their cluster management infrastructure software also appears to be all-Python. 
Large shops like RackSpace or Google are least affected by having to make sched part of their "own" code (though it WOULD needlessly add one more epsilon to the inevitable resistance that the 2.* -> 3.* migration will of course encounter, particularly among conservative, reliability-is-all types such as sysadms), but a lot of sysadm work happens in far smaller environments, often with "part-time" admins for whom finding out that sched once existed and was perfect for replacing cron in so many jobs, and then was removed and can still be downloaded from X.Y.Z, would be a significant chore. For tasks unrelated to system administration, consider for example the very instructive http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/413137 : the original "home cooked" solution (without sched) was really pretty bad and unreliable -- then Raymond Hettinger came in and saved the day by showing how to trivially use sched to do it RIGHT -- solid, lightweight, reliable, no threads and locks and things, what more could you ask for? And then, if needed, we can discuss pure simulation (as opposed to simulation-testing of systems designed to normally use the "real" sched). But already it seems to me there are plenty of use cases to justify retaining sched in the library...! Alex From p.f.moore at gmail.com Tue Apr 29 17:58:24 2008 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 29 Apr 2008 16:58:24 +0100 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com> Message-ID: <79990c6b0804290858i18dda57eo19c25e0d2173478a@mail.gmail.com> On 29/04/2008, Alex Martelli wrote: > On Tue, Apr 29, 2008 at 5:10 AM, Nick Coghlan wrote: > ... > > Perhaps sched/mutex could be dumped in the Demo directory? 
Or perhaps we > > should just get rid of them entirely and see if anyone with a real use case > > complains - it's not like the modules will be particularly hard to dig out > > of SVN if we decide we want to keep them after all. > > I have real use cases and I'm already complaining;-). Although I don't have use cases at the moment, I find the comments made persuasive. I hadn't noticed the sched module, but now that it's been brought to my attention it feels more like "a little gem I'd missed" than "a useless module I've never needed". Just my view, Paul. From rhamph at gmail.com Tue Apr 29 18:30:06 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 29 Apr 2008 10:30:06 -0600 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com> Message-ID: On Tue, Apr 29, 2008 at 9:05 AM, Alex Martelli wrote: [snippage] > And then, if needed, we can discuss pure simulation (as opposed to > simulation-testing of systems designed to normally use the "real" > sched). But already it seems to me there are plenty of use cases to > justify retaining sched in the library...! Google's codesearch shows dozens of unique users, if not more. (And just as many unrelated modules also called "sched"...) Brett, has sched been discussed before, or is it the one exception? ;) -- Adam Olsen, aka Rhamphoryncus From jbaker at zyasoft.com Tue Apr 29 18:49:23 2008 From: jbaker at zyasoft.com (Jim Baker) Date: Tue, 29 Apr 2008 12:49:23 -0400 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <79990c6b0804290858i18dda57eo19c25e0d2173478a@mail.gmail.com> References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com> <79990c6b0804290858i18dda57eo19c25e0d2173478a@mail.gmail.com> Message-ID: +1 to keep. Simple, reliable, and used. 
On Tue, Apr 29, 2008 at 11:58 AM, Paul Moore wrote:

> On 29/04/2008, Alex Martelli wrote:
> > On Tue, Apr 29, 2008 at 5:10 AM, Nick Coghlan wrote:
> > ...
> > > Perhaps sched/mutex could be dumped in the Demo directory? Or perhaps we
> > > should just get rid of them entirely and see if anyone with a real use case
> > > complains - it's not like the modules will be particularly hard to dig out
> > > of SVN if we decide we want to keep them after all.
> >
> > I have real use cases and I'm already complaining;-).
>
> Although I don't have use cases at the moment, I find the comments
> made persuasive. I hadn't noticed the sched module, but now that it's
> been brought to my attention it feels more like "a little gem I'd
> missed" than "a useless module I've never needed".
>
> Just my view,
> Paul.
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/jbaker%40zyasoft.com
>

--
Jim Baker
jbaker at zyasoft.com

From stephen at xemacs.org Tue Apr 29 20:01:50 2008
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 30 Apr 2008 03:01:50 +0900
Subject: [Python-3000] Displaying strings containing unicode escapes
In-Reply-To: <797440730804290520g6982518fkf25ac81bf7ea9260@mail.gmail.com>
References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> <87wsmxpc0r.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804290520g6982518fkf25ac81bf7ea9260@mail.gmail.com>
Message-ID: <87hcdkzgj5.fsf@uwakimon.sk.tsukuba.ac.jp>

atsuo ishimoto writes:

> 2008/4/17 Stephen J.
Turnbull :
> > How about choosing a standard Python repertoire (based on the Unicode
> > standard, of course) of which characters get a graphic repr and which
> > ones get \u-escaped, and have a post-hook for repr which gets passed
> > the string repr proposes to print out?
>
> Will the standard repertoire exclude Cyrillic or full-width ASCII?

"Exclude"? Nothing is "excluded". In my proposal, compatibility (full-width) "ASCII" will be \u-escaped by repr, yes. Cyrillic characters that can be confused with ASCII characters will be \u-escaped, yes.

> If so, I (Japanese) will disable the hook

[[ You have the way my proposal works backwards. The post-hook may be provided by the user to convert the unambiguous standard representation into one the user prefers. Python may or may not provide a library of convenient functions. ]]

> because full-width ASCII characters are not ambiguous to me.

That depends on the font(s) you use. Many fonts used with word processors make very little distinction and leave it up to the layout manager to create enough space. If you use different fonts for ASCII and JIS as will be the case in many environments (eg, an Emacs shell or python-mode buffer), who knows which will look wider?

> I think ambiguity will occur when we meet with unfamiliar
> characters. So choosing repertoire everybody can accept will be
> difficult.

We already know that. The point is that repr is like quoted-printable encoding in MIME. It should be mostly readable for programmers. There will be situations where it's a horrible choice from the point of view of readability, but the considerations of (1) consistency and (2) removal of ambiguity should be given precedence IMO.
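A rough sketch of the post-hook idea described above, under stated assumptions: nothing like this exists in Python, the names `standard_repr`, `make_hook` and `display_repr` are hypothetical, and the built-in `ascii()` merely stands in for the "standard repertoire", which the proposal leaves undecided. The hook receives the unambiguous escaped text and may turn selected `\uXXXX` escapes back into graphic characters.

```python
import re

# Match either an escaped backslash (left untouched) or a \uXXXX escape.
_ESCAPE = re.compile(r"\\\\|\\u([0-9a-fA-F]{4})")


def standard_repr(s):
    """Assumed 'standard' form: ASCII shown as-is, everything else escaped."""
    return ascii(s)


def make_hook(allowed):
    """Build a post-hook that un-escapes characters accepted by allowed(ch)."""

    def hook(text):
        def fix(match):
            if match.group(1) is None:
                return match.group(0)  # an escaped backslash, keep it
            ch = chr(int(match.group(1), 16))
            return ch if allowed(ch) else match.group(0)

        return _ESCAPE.sub(fix, text)

    return hook


def display_repr(s, hook=None):
    """repr-like function: standard form first, then the optional user hook."""
    text = standard_repr(s)
    return hook(text) if hook is not None else text


# A user who finds full-width forms unambiguous could install this hook.
fullwidth_ok = make_hook(lambda ch: "\uff01" <= ch <= "\uff5e")

if __name__ == "__main__":
    sample = "\uff21\u0430"  # full-width A followed by Cyrillic a
    print(display_repr(sample))
    print(display_repr(sample, fullwidth_ok))
```

The point of the design is the direction of the conversion: the default output is always the unambiguous one, and a user opts in to seeing more characters rather than opting out of escapes.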

From jimjjewett at gmail.com Tue Apr 29 21:00:25 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 29 Apr 2008 15:00:25 -0400
Subject: [Python-3000] Displaying strings containing unicode escapes
In-Reply-To: <87hcdkzgj5.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> <87wsmxpc0r.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804290520g6982518fkf25ac81bf7ea9260@mail.gmail.com> <87hcdkzgj5.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On 4/29/08, Stephen J. Turnbull wrote:
> atsuo ishimoto writes:
>
> > 2008/4/17 Stephen J. Turnbull :
> > > How about choosing a standard Python repertoire (based on the Unicode
> > > standard, of course) of which characters get a graphic repr and which
> > > ones get \u-escaped, and have a post-hook for repr which gets passed
> > > the string repr proposes to print out?

> > Will the standard repertoire exclude Cyrillic or full-width ASCII?

> "Exclude"? Nothing is "excluded". In my proposal, compatibility
> (full-width) "ASCII" will be \u-escaped by repr, yes.

but ...

> [[ You have the way my proposal works backwards. The post-hook
> may be provided by the user ...

I think "standard repertoire based on Unicode" may be confusing the issue. As I understand it, you're saying something like

    For strings, repr will delegate to display_string.

    Users can (and should) supply a display_string function
    appropriate to their own system.

    The default display_string will display ASCII, and
    unicode-escape everything else.
Except that you're leaving wiggle room for refinements like:

    "OK, we'll display all of Latin-1 by default, because we did in the past"

or

    "For security reasons, the control character codes will always be
    escaped, instead of passing them to display_string"

-jJ

From dickinsm at gmail.com Tue Apr 29 22:17:08 2008
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 29 Apr 2008 16:17:08 -0400
Subject: [Python-3000] range() issues
In-Reply-To:
References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com>
Message-ID: <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com>

On Tue, Apr 29, 2008 at 10:30 AM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> or sizable. I would say "range(n) is a memory efficient substitute
> for [0, 1, ... n-1]" is easier to fit into one's brain than the
> current hodgepodge of exceptions.
>

For what it's worth, I'm -1 on any change that makes range(10**10) an error. I'd like to be able to write

    for i in range(n):
        ...

without having to stop and worry about whether n is always going to be small enough to avoid an exception, and what to do if there's a possibility that n is large. The common case of range should have a small mental footprint.

Indexing a range object, or taking its length, are surely much rarer than simply iterating over it; I don't think the problems with indexing and length are a good reason to impose restrictions on the use of range as an iterable.

Mark

From alexander.belopolsky at gmail.com Tue Apr 29 22:30:33 2008
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 29 Apr 2008 16:30:33 -0400
Subject: [Python-3000] range() issues
In-Reply-To: <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com>
References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com>
Message-ID:

On Tue, Apr 29, 2008 at 4:17 PM, Mark Dickinson wrote:
..
> I'd like to be able to write
>
>     for i in range(n):
>         ...
>
> without having to stop and worry about whether n is always going
> to be small enough to avoid an exception, and what to do if there's
> a possibility that n is large.

I would say that if it is possible that n exceeds a few hundred million, it is a good idea to pause and think whether you want to have this loop implemented in Python to begin with. Numpy or a custom C extension (combined with a 64-bit OS) may be a good idea if you routinely iterate over huge datasets.

From brett at python.org Tue Apr 29 22:42:14 2008
From: brett at python.org (Brett Cannon)
Date: Tue, 29 Apr 2008 13:42:14 -0700
Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup
In-Reply-To:
References:
Message-ID:

On Mon, Apr 28, 2008 at 11:11 PM, Adam Olsen wrote:
> On Mon, Apr 28, 2008 at 8:30 PM, Brett Cannon wrote:
> > * sched
> >
> > + Replaced by threading.Timer.
>
> I don't see sched as obsoleted by threading.Timer. It's much simpler
> to use (no need for locking) and more efficient (no legions of
> sleeping threads). Instead, maybe it should be removed because it's
> trivial to reimplement as well as being overshadowed by all the other
> event loops built into bigger systems (tk, qt, gtk, twisted, etc)?

Fair enough. I tweaked the reasons.
-Brett

From guido at python.org Tue Apr 29 22:44:28 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Apr 2008 13:44:28 -0700
Subject: [Python-3000] Removal of os.path.walk
In-Reply-To: <1afaf6160804281521t12d07c73hf64be096882f2b96@mail.gmail.com>
References: <1afaf6160804281521t12d07c73hf64be096882f2b96@mail.gmail.com>
Message-ID:

+1

On 4/28/08, Benjamin Peterson wrote:
> It seems that os.walk has more options and a cleaner interface to
> walking trees than os.path.walk does. Is there support for the removal
> of this in Py3k?
>
> --
> Cheers,
> Benjamin Peterson

--
Sent from Gmail for mobile | mobile.google.com

--Guido van Rossum (home page: http://www.python.org/~guido/)

From brett at python.org Tue Apr 29 22:44:30 2008
From: brett at python.org (Brett Cannon)
Date: Tue, 29 Apr 2008 13:44:30 -0700
Subject: [Python-3000] [stdlib-sig] PEP 3108 - stdlib reorg/cleanup
In-Reply-To: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1>
References: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1>
Message-ID:

On Tue, Apr 29, 2008 at 2:46 AM, Raymond Hettinger wrote:
>
> > * UserList/UserString [done: 3.0]
> >
>
> Note that these were updated and moved to the collections module in Py3.0.
>

Noted.

>
> > anydbm dbm.tools [1]_
> > whichdb dbm.tools [1]_
> >
>
> Were there any better naming suggestions than dbm.tools? The original
> names seem much more informative.
>

But way too much overhead for two modules that only contained one useful function each. As Nick said, if you don't know DB stuff then I don't see any loss of information. If you can come up with a better name I am open to suggestions, but the module merge will happen.
-Brett From brett at python.org Tue Apr 29 22:46:45 2008 From: brett at python.org (Brett Cannon) Date: Tue, 29 Apr 2008 13:46:45 -0700 Subject: [Python-3000] [stdlib-sig] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <48170F7E.90609@gmail.com> References: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1> <48170F7E.90609@gmail.com> Message-ID: On Tue, Apr 29, 2008 at 5:07 AM, Nick Coghlan wrote: > Raymond Hettinger wrote: > > > > > > * UserList/UserString [done: 3.0] > > > > > > > Note that these were updated and moved to the collections module in Py3.0. > > > > > > > anydbm dbm.tools [1]_ > > > whichdb dbm.tools [1]_ > > > > > > > Were there any better naming suggestions than dbm.tools? The original > names seem much more informative. > > > > Maybe they're more informative if you've been using them for a long time. > As a non-DB-API user, anydbm seems just as generic to me as dbm.tools, and > whichdb.whichdb is just redundant. > > dbm.tools.open and dbm.tools.whichdb seem fine as names for the functions. > > > > > > > > For modules that are renamed, stub modules will be created with the > > > original names and be kept in a directory within the stdlib (e.g. like > > > how lib-old was once used). > > > > > > > What is the purpose of the new directory? Are there some use > > cases for intermixing the new and old names? Is there something > > that the 2-to-3 converter won't be able to handle? > > > > The reason is noted in the PEP - it's to keep case insensitive filesystems > (such as NTFS) from spitting the dummy when we try to put both a > ConfigParser.py (old name) and configparser.py (new name) in the Python Lib > directory. > > I'd like to see the PEP address the question of how it is going to deal > with getting duplicate copies of modules in sys.modules when some code in an > application uses the old name and some code uses the new name. 
> There is not much that can be done without introducing a custom importer to handle the mapping of names in sys.modules to the same module object. Otherwise it's going to be done using ``from _ import *``. -Brett From jimjjewett at gmail.com Tue Apr 29 22:57:45 2008 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 29 Apr 2008 16:57:45 -0400 Subject: [Python-3000] [stdlib-sig] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1> <48170F7E.90609@gmail.com> Message-ID: On 4/29/08, Brett Cannon wrote: > On Tue, Apr 29, 2008 at 5:07 AM, Nick Coghlan wrote: > > Raymond Hettinger wrote: > > > > For modules that are renamed, stub modules will be created with the > > > > original names ... > > > What is the purpose of the new directory? Are there some use > > > cases for intermixing the new and old names? Is there something > > > that the 2-to-3 converter won't be able to handle? People can start using the new "proper" names immediately. People can reduce the number of changes for which they rely on 2-to-3. > > I'd like to see the PEP address the question of how it is going to deal > > with getting duplicate copies of modules in sys.modules when some code in an > > application uses the old name and some code uses the new name. > There is not much that can be done without introducing a custom > importer to handle the mapping of names in sys.modules to the same > module object. Otherwise it's going to be done using ``from _ import > *``. 
The following worked on python 2.5

zort.py
---------

import sys
import collections
sys.modules[__name__]=collections

>>> import zort
>>> zort is collections
True

From brett at python.org Tue Apr 29 23:00:38 2008
From: brett at python.org (Brett Cannon)
Date: Tue, 29 Apr 2008 14:00:38 -0700
Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup
In-Reply-To: <20080429134336.08761B964@mail.terralink.de>
References: <20080429134336.08761B964@mail.terralink.de>
Message-ID:

On Tue, Apr 29, 2008 at 8:43 AM, wrote:
> Hi list,
>
> I'm a long time lurker with only a very few contributions
> in Tkinter. Just one remark about the inclusion of Canvas
> in the tkinter package:
> Canvas is marked as obsolete since 2000.
> See this issue (and the comment at the top of Canvas.py):
> http://bugs.python.org/issue210677
>

Thanks for the references, Matthias! I have moved Canvas to the list of modules to remove.

-Brett

> Cheers,
> Matthias Kievernagel.
> (mkiever/at/web/dot/de)
>

From g.brandl at gmx.net Tue Apr 29 23:07:52 2008
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 29 Apr 2008 23:07:52 +0200
Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup
In-Reply-To:
References:
Message-ID:

Brett Cannon wrote:
> [bcc to stdlib-sig]
>
> After two false starts over the YEARS of trying to cleanup and
> reorganize the stdlib, creating a SIG to get this going, having Guido
> give the PEP the once-over over the past several days, and creating
> two new bugs reports (issues 2715 and 2716), PEP 3108 is finally ready
> for public vetting!
>
> While reading this PEP, do remember this is only about either removing
> modules, renaming them, or moving them into a package. Additions are
> > Also realize all of the right people have been consulted on this stuff > (e.g., the web SIG about the urllib package). So please do not think > that something that seems drastic (e.g., the removal of all > Mac-specific modules) was taken lightly when in fact the proper people > were asked and they were okay with what is going on. > > Lastly, I do not want this to turn into a drawn-out thread about how > people think some module should stay because they happen to use it or > suggest some other module to remove. Please think before you propose a > change. I have been through this proposal process for this reorg > before and every time it has gotten way out of control. I do not want > it happen this time. Looks very nice! +1 from me. Georg From brett at python.org Tue Apr 29 23:04:57 2008 From: brett at python.org (Brett Cannon) Date: Tue, 29 Apr 2008 14:04:57 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <8b943f2b0804290739p63153166ib3786e2bdadb9ab4@mail.gmail.com> References: <20080429134336.08761B964@mail.terralink.de> <8b943f2b0804290739p63153166ib3786e2bdadb9ab4@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 7:39 AM, Quentin Gallet-Gilles wrote: > > > > On Tue, Apr 29, 2008 at 5:43 PM, wrote: > > Hi list, > > > > I'm a long time lurker with only a very few contributions > > in Tkinter. Just one remark about the inclusion of Canvas > > in the tkinter package: > > Canvas is marked as obsolete since 2000. > > See this issue (and the comment at the top of Canvas.py): > > http://bugs.python.org/issue210677 > > > > Cheers, > > Matthias Kievernagel. > > (mkiever/at/web/dot/de) > > Indeed. I remember mentioning it back in December 2007, along with a > question about the deprecated status of Tkdnd. With the stdlib-sig list > creation, I guess the discussion was lost and I totally forgot to mention it > again when the discussion about modules deletions occurred. Sorry Brett. > No need to apologize. Sorry for missing your initial email! 
I have only seriously used Tkinter once, so I am not in a good position to judge any of this. What do people think about the suggestions? > Quentin > > > As a reminder, here's an abstract from the mail back then : > """Apart from the awfully inconsistent naming convention, there are a few > things that are worth considering for the reorg: > > 1. FixTk is only called by Tkinter and has no API to expose since it's only > a win32 specific piece of code to manage the _tkinter import. It should be > renamed _fixtk/_tkfix or merged into Tkinter.py > Already suggested to be hidden. > 2. Tkconstants is used in several places : > - Tkinter.py does a simple "from Tkconstants import *" > - Tix doesn't import it and its documentation shows an example with "import > Tkinter" followed by "from Tkconstants import *". And sure enough, DirList > and DirTree (from Demo/tix/samples/) both do that. > - Finally, CodeContext (Lib/idlelib) imports a few constants manually > > IMO Tkconstants could be renamed _tkconstants and all imports besides the > first one changed to access the constants via Tkinter. > What do people think about this? > 3. Canvas contains a comment saying it's obsolete and that Tkinter.Canvas > should be used instead. I've gone ahead and added it in the "Possible > Deletions" tab. > Dealt with. > 4. All those *Dialog modules seems an obvious candidate for > merging/deleting/rewriting > /whatever but I have no idea which one should go or stay. Doing some quick > greps, it appears the pynche tool uses the tk-prefixed versions, IDLE a > combination of the two (see IOBinding.py that uses tkFileDialog, > tkMessageBox but also SimpleDialog). I haven't seen much love for the > non-prefixed versions, by the way. > Don't know about this. > 5. About Tkdnd, Tkinter documentation says: "This is experimental and should > become deprecated when it is replaced with the Tk DND". What's the status on > this one ? """ Don't know about this one either. Thoughts? 
-Brett

From musiccomposition at gmail.com Tue Apr 29 23:08:26 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Tue, 29 Apr 2008 16:08:26 -0500
Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup
In-Reply-To:
References:
Message-ID: <1afaf6160804291408i6bbbd64en5689b22c3470c6a5@mail.gmail.com>

On Mon, Apr 28, 2008 at 9:30 PM, Brett Cannon wrote:
> #. Remove the module.
> #. Remove the tests.
> #. Edit ``Modules/Setup.dist`` and ``setup.py`` if needed.
> #. Remove the docs (if applicable).
> #. Run the regression test suite (using ``-uall``); watch out for
> tests that are skipped because an import failed for the removed
> module.

Why don't we apply the patch at issue 2409, so catching imports is easier?

+1 Overall, I'm very impressed. It's hard to find a PEP (especially this big) that doesn't cause massive bikeshedding.

--
Cheers,
Benjamin Peterson

From brett at python.org Tue Apr 29 23:08:41 2008
From: brett at python.org (Brett Cannon)
Date: Tue, 29 Apr 2008 14:08:41 -0700
Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup
In-Reply-To:
References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com>
Message-ID:

On Tue, Apr 29, 2008 at 8:05 AM, Alex Martelli wrote:

[SNIP - Alex's well-argued reasons to keep sched]

> And then, if needed, we can discuss pure simulation (as opposed to
> simulation-testing of systems designed to normally use the "real"
> sched). But already it seems to me there are plenty of use cases to
> justify retaining sched in the library...!

OK, sched stays. Do you need mutex to stay as-is, get rolled into sched, or can we still ditch that module (at least publicly)?
-Brett

From dickinsm at gmail.com Tue Apr 29 23:09:48 2008
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 29 Apr 2008 17:09:48 -0400
Subject: [Python-3000] range() issues
In-Reply-To:
References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com>
Message-ID: <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com>

On Tue, Apr 29, 2008 at 4:30 PM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> I would say that if it is possible that n exceeds a few hundred
> million, it is a good idea to pause and think whether you want to have
> this loop implemented in Python to begin with.
>

Maybe. But the answer is often going to be yes, if it's a choice between me spending some number of hours translating everything to C, or just letting my computer do the work (however inefficiently) while I do something else.

These numbers aren't ridiculously large. I just tried

    for i in range(2**31): pass

on my (32-bit) laptop: it took 736.8 seconds, or about 12 and a bit minutes. (An aside: in contrast,

    for i in range(2**31-1): pass

took only 131.1 seconds; looks like there's some potential for optimization here....)

Put another way: range(n) currently works, in Py3k, for n > sys.maxsize. What's the rationale for breaking that?

> Numpy or a custom C extension (combined with a 64-bit OS) may be a good idea if you
> routinely iterate over huge datasets.
>

Well, huge datasets (large time *and* space requirements) don't really come into it for my typical use-cases: just long-running processes.

Mark

From brett at python.org Tue Apr 29 23:18:47 2008
From: brett at python.org (Brett Cannon)
Date: Tue, 29 Apr 2008 14:18:47 -0700
Subject: [Python-3000] [stdlib-sig] PEP 3108 - stdlib reorg/cleanup
In-Reply-To:
References: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1> <48170F7E.90609@gmail.com>
Message-ID:

On Tue, Apr 29, 2008 at 1:57 PM, Jim Jewett wrote:
> On 4/29/08, Brett Cannon wrote:

[SNIP]

>
> > > I'd like to see the PEP address the question of how it is going to deal
> > > with getting duplicate copies of modules in sys.modules when some code in an
> > > application uses the old name and some code uses the new name.
>
> > There is not much that can be done without introducing a custom
> > importer to handle the mapping of names in sys.modules to the same
> > module object. Otherwise it's going to be done using ``from _ import
> > *``.
>
> The following worked on python 2.5
>
> zort.py
> ---------
>
> import sys
> import collections
> sys.modules[__name__]=collections
>
> >>> import zort
> >>> zort is collections
> True
>

Huh. That is a much better solution. I just didn't think that was going to work. But both import.c and importlib return the module found in sys.modules in the end, so the trick works. I will update the PEP!

-Brett

From musiccomposition at gmail.com Tue Apr 29 23:18:57 2008
From: musiccomposition at gmail.com (Benjamin Peterson)
Date: Tue, 29 Apr 2008 16:18:57 -0500
Subject: [Python-3000] range() issues
In-Reply-To: <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com>
References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com>
Message-ID: <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com>

On Tue, Apr 29, 2008 at 4:09 PM, Mark Dickinson wrote:
> Put another way: range(n) currently works, in Py3k, for n > sys.maxsize.
> What's the rationale for breaking that?

So we can support other sequence methods. (I think.)

Personally, I think that range should be just an easy way to iterate over a sequence (set, series, or whatever the term is) of integers (even if they're huge). I don't think we need to turn it into a sequence. That fits in your brain. If you need to do something more advanced than counting, there are other things out there! Where are the use cases for range slicing, anyway?

--
Cheers,
Benjamin Peterson

From mike.klaas at gmail.com Tue Apr 29 23:24:23 2008
From: mike.klaas at gmail.com (Mike Klaas)
Date: Tue, 29 Apr 2008 14:24:23 -0700
Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup
In-Reply-To:
References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com>
Message-ID:

On 29-Apr-08, at 2:08 PM, Brett Cannon wrote:
> On Tue, Apr 29, 2008 at 8:05 AM, Alex Martelli wrote:
> [SNIP - Alex's well-argued reasons to keep sched]
>
>> And then, if needed, we can discuss pure simulation (as opposed to
>> simulation-testing of systems designed to normally use the "real"
>> sched). But already it seems to me there are plenty of use cases to
>> justify retaining sched in the library...!
>
> OK, sched stays. Do you need mutex to stay as-is, get rolled into
> sched, or can we still ditch that module (at least publicly)?

Having a module with this (somewhat odd) functionality under the name "mutex" is a confusing wart in the current stdlib. If it is deemed desirable to preserve (publicly), I would advocate renaming it.

Nice work on the PEP.
-Mike From brett at python.org Tue Apr 29 23:26:18 2008 From: brett at python.org (Brett Cannon) Date: Tue, 29 Apr 2008 14:26:18 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: <1afaf6160804291408i6bbbd64en5689b22c3470c6a5@mail.gmail.com> References: <1afaf6160804291408i6bbbd64en5689b22c3470c6a5@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 2:08 PM, Benjamin Peterson wrote: > On Mon, Apr 28, 2008 at 9:30 PM, Brett Cannon wrote: > > #. Remove the module. > > #. Remove the tests. > > #. Edit ``Modules/Setup.dist`` and ``setup.py`` if needed. > > #. Remove the docs (if applicable). > > #. Run the regression test suite (using ``-uall``); watch out for > > tests that are skipped because an import failed for the removed > > module. > > Why don't why apply the patch at issue 2409, so catching imports is easier? > Well, I have been planning implementing that exact function at some point as part of my "let's get Python's testing tool support up to snuff" project (which, of course, will happen between now and when I fall over dead). I honestly didn't know about the patch plus I don't have the time to go through and change every import for every module such that the module being tested is optional (and thus skipped), but that if the support modules are missing the test fails. > > +1 Overall, I'm very impressed. It's hard to find a PEP (especially > this big) that doesn't cause massive bikeshedding. > Thanks for the compliment (and everyone else who has had nice things to say). I am just glad I didn't drop the ball on this one (although this all still needs to be implemented, so I still have a chance to screw up =). Unfortunately I suspect I am going to pay the price for doing this by having importlib slip until 3.1 (I don't think I have enough time to work out the last three failing tests in time for 3.0b1). But that's actually okay as long as Guido doesn't tell me it is 3.0 or bust. 
=) -Brett From musiccomposition at gmail.com Tue Apr 29 23:32:19 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 29 Apr 2008 16:32:19 -0500 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: <1afaf6160804291408i6bbbd64en5689b22c3470c6a5@mail.gmail.com> Message-ID: <1afaf6160804291432p664fd331g8c13cdc3915770d1@mail.gmail.com> On Tue, Apr 29, 2008 at 4:26 PM, Brett Cannon wrote: > Well, I have been planning implementing that exact function at some > point as part of my "let's get Python's testing tool support up to > snuff" project (which, of course, will happen between now and when I > fall over dead). I honestly didn't know about the patch plus I don't > have the time to go through and change every import for every module > such that the module being tested is optional (and thus skipped), but > that if the support modules are missing the test fails. It looks like the patch already changes tests, which have optional imports. -- Cheers, Benjamin Peterson From tjreedy at udel.edu Tue Apr 29 23:34:07 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 29 Apr 2008 17:34:07 -0400 Subject: [Python-3000] range() issues References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> Message-ID: "Alexander Belopolsky" wrote in message news:d38f5330804290730oe40394dr87f9849d89d4851f at mail.gmail.com... | On Mon, Apr 28, 2008 at 7:18 PM, Guido van Rossum wrote: | .. | > The cost of the extra functionality: writing it, reviewing it, adding | > unittests, documenting it, maintaining it, making sure it works on | > 64-bit machines, having Python book authors discuss it; and in | > addition some extra baggage in the executable that is never needed | > (but I think the other reasons are more compelling). There's a reason | > the xrange() object didn't have all this extra baggage. | > | > Remember, one of the goals of Py3k is to *shrink* the language so that | > it will fit in your brain again. 
This thread seems to be going in the
| > opposite direction.
|
| I would say making range return an instance of Sequence will make that
| feature easier to understand.

I agree that 'shrinking' the language means that range should either be a simple iterator or a full sequence. Something in between makes a new concept to learn.

tjr

From brett at python.org Tue Apr 29 23:42:39 2008
From: brett at python.org (Brett Cannon)
Date: Tue, 29 Apr 2008 14:42:39 -0700
Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup
In-Reply-To: <1afaf6160804291432p664fd331g8c13cdc3915770d1@mail.gmail.com>
References: <1afaf6160804291408i6bbbd64en5689b22c3470c6a5@mail.gmail.com> <1afaf6160804291432p664fd331g8c13cdc3915770d1@mail.gmail.com>
Message-ID:

On Tue, Apr 29, 2008 at 2:32 PM, Benjamin Peterson wrote:
> On Tue, Apr 29, 2008 at 4:26 PM, Brett Cannon wrote:
> > Well, I have been planning implementing that exact function at some
> > point as part of my "let's get Python's testing tool support up to
> > snuff" project (which, of course, will happen between now and when I
> > fall over dead). I honestly didn't know about the patch plus I don't
> > have the time to go through and change every import for every module
> > such that the module being tested is optional (and thus skipped), but
> > that if the support modules are missing the test fails.
>
> It looks like the patch already changes tests, which have optional imports.
>

Right, but that is not the extent I was thinking for "optional". I personally want to see it so that if the module being tested is not present, the test is skipped, otherwise failing imports means there is a test error. The patch only covers like five modules (most of which are being removed) which have modules that are optional for some tests.
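
The distinction drawn above (skip when the module under test is absent, report an error when a supporting import fails) might look roughly like this in present-day Python. This is only a sketch: `import_or_skip` is a hypothetical helper, not the function from the issue 2409 patch, and it relies on `unittest.SkipTest` and `ImportError.name`, both of which postdate this thread. It also assumes a top-level (undotted) module name.

```python
import importlib
import unittest


def import_or_skip(name):
    """Import the module under test; skip the calling test if it is absent.

    Hypothetical helper. Any other ImportError (for example a missing
    support module imported by `name`) is re-raised and so shows up as
    a test error rather than a skip.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        if exc.name != name:
            raise  # a supporting import failed: surface it as an error
        raise unittest.SkipTest("module under test %r is not available" % name)


class ExampleTest(unittest.TestCase):
    def test_removed_module(self):
        # Made-up module name, standing in for a module that was removed.
        mod = import_or_skip("no_such_module_for_demo")
        self.assertTrue(mod)  # never reached: the test is skipped

    def test_present_module(self):
        mod = import_or_skip("json")
        self.assertEqual(mod.loads("[1, 2]"), [1, 2])


if __name__ == "__main__":
    unittest.main()
```

With a helper of this shape, a test that is skipped after a module's removal is easy to tell apart from one that broke because something it depends on went missing.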
-Brett From db3l.net at gmail.com Wed Apr 30 00:03:43 2008 From: db3l.net at gmail.com (David Bolen) Date: Tue, 29 Apr 2008 18:03:43 -0400 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup References: Message-ID: "Brett Cannon" writes: > Also realize all of the right people have been consulted on this stuff > (e.g., the web SIG about the urllib package). So please do not think > that something that seems drastic (e.g., the removal of all > Mac-specific modules) was taken lightly when in fact the proper people > were asked and they were okay with what is going on. Are there any thoughts on providing some other distribution or mechanism to build selected Mac modules post-removal? Is it likely to be possible to grab current 2.x source and build as part of 3.x, providing it's not a 64-bit system or UCS-4 configuration unsupported by bgen? I have an application using the QuickTime portion of the Carbon package very successfully in recent code (with a primarily Tiger-based user base), where the higher level Cocoa/ObjC frameworks didn't provide the necessary functionality under Tiger - it would be nice to have some path to maintaining that across a 2.x/3.x transition, even if I had to build something locally. -- David From musiccomposition at gmail.com Wed Apr 30 00:16:01 2008 From: musiccomposition at gmail.com (Benjamin Peterson) Date: Tue, 29 Apr 2008 17:16:01 -0500 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: <1afaf6160804291516h38c541aaod58538d4065569a4@mail.gmail.com> On Tue, Apr 29, 2008 at 5:03 PM, David Bolen wrote: > > I have an application using the QuickTime portion of the Carbon package > very successfully in recent code (with a primarily Tiger-based user > base), where the higher level Cocoa/ObjC frameworks didn't provide the > necessary functionality under Tiger - it would be nice to have some path > to maintaining that across a 2.x/3.x transition, even if I had to build > something locally. 
Well, the modules are still currently in the Py3k tree, so I suppose you could grab a snapshot of them before they were removed. -- Cheers, Benjamin Peterson From aleaxit at gmail.com Wed Apr 30 00:55:25 2008 From: aleaxit at gmail.com (Alex Martelli) Date: Tue, 29 Apr 2008 15:55:25 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com> Message-ID: On Tue, Apr 29, 2008 at 2:08 PM, Brett Cannon wrote: > On Tue, Apr 29, 2008 at 8:05 AM, Alex Martelli wrote: > [SNIP - Alex's well-argued reasons to keep sched] > > > > And then, if needed, we can discuss pure simulation (as opposed to > > simulation-testing of systems designed to normally use the "real" > > sched). But already it seems to me there are plenty of use cases to > > justify retaining sched in the library...! > > OK, sched stays. Do you need mutex to stay as-is, get rolled into > sched, or can we still ditch that module (at least publicly)? I have no use case for mutex (nor anything against it either), so personally I'm +0 on removing it (just on the basis that it confuses me - it's documented as needing to be used with sched, but most examples I can find with google code search use it without sched, etc...). 
But I hope that somebody understanding its use cases better than me speaks!-) Alex From facundobatista at gmail.com Wed Apr 30 01:17:45 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 29 Apr 2008 20:17:45 -0300 Subject: [Python-3000] range() issues In-Reply-To: <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: 2008/4/29, Benjamin Peterson : > On Tue, Apr 29, 2008 at 4:09 PM, Mark Dickinson wrote: > > Put another way: range(n) currently works, in Py3k, for n > sys.maxsize. > > What's the rationale for breaking that? > > So we can support other sequence methods. (I think.) The point is that we're sacrificing a good feature (don't worry about the limit of range(), Python is safe), in favor of rarely used features (indexing a range()... what's the point?; or knowing its length... hey, you created the range, you should know its length). Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From facundobatista at gmail.com Wed Apr 30 01:18:30 2008 From: facundobatista at gmail.com (Facundo Batista) Date: Tue, 29 Apr 2008 20:18:30 -0300 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: <1afaf6160804281521t12d07c73hf64be096882f2b96@mail.gmail.com> References: <1afaf6160804281521t12d07c73hf64be096882f2b96@mail.gmail.com> Message-ID: 2008/4/28, Benjamin Peterson : > It seems that os.walk has more options and a cleaner interface to > walking trees than os.path.walk does. Is there support for the removal > of this in Py3k? +1 -- .
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From guido at python.org Wed Apr 30 01:48:34 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Apr 2008 16:48:34 -0700 Subject: [Python-3000] range() issues In-Reply-To: <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 2:18 PM, Benjamin Peterson wrote: > On Tue, Apr 29, 2008 at 4:09 PM, Mark Dickinson wrote: > > Put another way: range(n) currently works, in Py3k, for n > sys.maxsize. > > What's the rationale for breaking that? > > So we can support other sequence methods. (I think.) > > Personally, I think that range should be just an easy to iterate over > a sequence (set, series, or whatever the term is) of integers (even if > they're huge). I don't think we need to turn it into a sequence. That > fits in your brain. If you need to do something more advanced than > counting, there are other things out there! Where are the use cases > for range slicing, anyway? +1 Let's just stop the discussion here and kill all proposals to add indexing/slicing etc. Sorry, Alexander, but there just isn't anyone besides you in favor, and nobody has brought up a convincing use case. __len__ will always be problematic when there are more values than can be counted in a signed C long; maybe we should do what the Java collections package does: for once, Java chooses practicality over purity, and simply states that if the length doesn't fit, the largest number that does fit is returned (i.e. for us that would be sys.maxsize in 3.0, sys.maxint in 2.x). 
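A rough sketch of the saturating length suggested above, written in pure Python for illustration only. The function names range_length and saturating_len are assumptions made here; they are not part of any range implementation.

```python
import sys


def range_length(start, stop, step=1):
    """Exact number of values in range(start, stop, step), as a Python int."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        # Ceiling division of the span by the step, never below zero.
        return max(0, (stop - start + step - 1) // step)
    return max(0, (start - stop - step - 1) // (-step))


def saturating_len(start, stop, step=1):
    """Length clamped to sys.maxsize when the true value does not fit."""
    return min(range_length(start, stop, step), sys.maxsize)
```

For ordinary ranges the two functions agree; only a range with more than sys.maxsize values reports the clamped figure, which is the practicality-over-purity trade-off described in the message.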
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From brett at python.org Wed Apr 30 02:17:20 2008 From: brett at python.org (Brett Cannon) Date: Tue, 29 Apr 2008 17:17:20 -0700 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: On Tue, Apr 29, 2008 at 3:03 PM, David Bolen wrote: > "Brett Cannon" writes: > > > Also realize all of the right people have been consulted on this stuff > > (e.g., the web SIG about the urllib package). So please do not think > > that something that seems drastic (e.g., the removal of all > > Mac-specific modules) was taken lightly when in fact the proper people > > were asked and they were okay with what is going on. > > Are there any thoughts on providing some other distribution or mechanism > to build selected Mac modules post-removal? Is it likely to be possible > to grab current 2.x source and build as part of 3.x, providing it's not > a 64-bit system or UCS-4 configuration unsupported by bgen? > > I have an application using the QuickTime portion of the Carbon package > very successfully in recent code (with a primarily Tiger-based user > base), where the higher level Cocoa/ObjC frameworks didn't provide the > necessary functionality under Tiger - it would be nice to have some path > to maintaining that across a 2.x/3.x transition, even if I had to build > something locally. > Even when the code is removed it will still be in the svn history so you should be able to grab it easily. 
-Brett From alexander.belopolsky at gmail.com Wed Apr 30 04:16:53 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Apr 2008 22:16:53 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 7:48 PM, Guido van Rossum wrote: .. > Let's just stop the discussion here and kill all proposals to add > indexing/slicing etc. Sorry, Alexander, but there just isn't anyone > besides you in favor, and nobody has brought up a convincing use case. > That's fair, but let me wrap up by rehashing the current state of affairs. 1. Both 2.x xrange and 3.x range support indexing. A comment in py3k branch says "range(...)[x] is necessary for: seq[:] = range(...)," but this is apparently wrong: >>> x = [] >>> x[:] = iter([1,2,3]) >>> x [1, 2, 3] 2. In 3.x, ranges longer than sys.maxsize are allowed, but cannot be indexed even with small indexes, for example, range(2**100)[0] raises an OverflowError. There is little justification for this behavior. A 3-line patch can fix the situation for small indexes and Amaury demonstrated [1] that with some effort arbitrary indexes can be supported. [1] http://bugs.python.org/file10109/anyrange.patch 3. There is an ongoing debate [2] on how comparison and hashing should be implemented for range objects. My point is that current implementation of 3.x is neither here nor there. It is not simple: it does not even do what its documentation says: >>> print(range.__doc__) range([start,] stop[, step]) -> range object Returns an iterator that generates the numbers in the range on demand.
>>> range(10).__next__() Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'range' object has no attribute '__next__' It supports some sequence methods (len and subscripting), but not others (__contains__ and slicing). My use case for making range a Sequence is as follows. I frequently deal with data organized in column-oriented tables. These tables often need a column that represents the row number. A range object would allow an efficient representation of such a column, but having such a virtual column in the table would mean that generic sequence manipulation functions will not work on some columns. This is not a strong itch, though. While virtualizing a row-number column using range() is an attractive solution, in practice memory savings compared to numpy's arange() (or array('i', range(..))) are not that significant. However, if slicing support is axed based on complexity considerations, I don't see how supporting indexing can be justified. Moreover, since indexing and slicing can reuse the same start + i*step computation, the incremental code complexity of slicing support is small, so for me the two go hand in hand. For these reasons, I believe that either of the following alternatives is better than the status quo: 1. Make range(..) return a Sequence. 2. Make range(..) return an Iterator. (While I prefer #1, there are several advantages of this proposal: in the common list(range(..)) and for i in range(..) cases, creation of an intermediate object will go away; we will stop debating what hash(range(..)) should return [2]; and finally we will not need to change the docstring :-).) [2] http://bugs.python.org/issue2603 > __len__ will always be problematic when there are more values than can > be counted in a signed C long; maybe we should do what the Java > collections package does: for once, Java chooses practicality over > purity, and simply states that if the length doesn't fit, the largest > number that does fit is returned (i.e.
for us that would be > sys.maxsize in 3.0, sys.maxint in 2.x). This is another simple way to fix the range(2**100)[0] buglet. From guido at python.org Wed Apr 30 04:36:28 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Apr 2008 19:36:28 -0700 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: I propose to remove the support for indexing; it is a carryover from before Python 2.2 when there was no .next() method. There are good reasons for having range() return an Iterable and not an Iterator; e.g. R = range(N) for i in R: for j in R: .... so here I propose to keep the status quo. Let's also fix __len__() so that it returns sys.{maxint,maxsize} when the result doesn't fit in a Py_ssize_t. I am worried that the debates about repr()/hash()/eq() are similarly stuck in vicious circles; I'll have to think about how to untie those knots, but they're unrelated to the sequence/iterator/iterable debate. --Guido On Tue, Apr 29, 2008 at 7:16 PM, Alexander Belopolsky wrote: > On Tue, Apr 29, 2008 at 7:48 PM, Guido van Rossum wrote: > .. > > > Let's just stop the discussion here and kill all proposals to add > > indexing/slicing etc. Sorry, Alexander, but there just isn't anyone > > besides you in favor, and nobody has brought up a convincing use case. > > > > That's fair, but let me wrap up by rehashing the current state of affairs. > > 1. Both 2.x xrange and 3.x range support indexing. A comment in py3k > branch says "range(...)[x] is necessary for: seq[:] = range(...)," > but this is apparently wrong: > > >>> x = [] > >>> x[:] = iter([1,2,3]) > >>> x > [1, 2, 3] > > 2.
In 3.x, ranges longer than sys.maxsize are allowed, but cannot be > indexed even with small indexes, for example, range(2**100)[0] raises > an OverflowError. There is little justification for this behavior. A > 3-line patch can fix the situation for small indexes and Amaury > demonstrated [1] that with some effort arbitrary indexes can be > supported. > > [1] http://bugs.python.org/file10109/anyrange.patch > > 3. There is an ongoing debate [2] on how comparison and hashing should > be implemented for range objects. > > My point is that current implementation of 3.x is neither here nor > there. It is not simple: it does not even do what its documentation > says: > > >>> print(range.__doc__) > range([start,] stop[, step]) -> range object > > > Returns an iterator that generates the numbers in the range on demand. > >>> range(10).__next__() > > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > AttributeError: 'range' object has no attribute '__next__' > > It supports some sequence methods (len and subscripting), but not > others (__contains__ and slicing). > > My use case for making range a Sequence is as follows. I frequently > deal with data organized in column oriented tables. These tables > often need a column that represents the row number. A range object > would allow an efficient representation of such column, but having > such a virtual column in the table would mean that generic sequence > manipulation functions will not work on some columns. > > This is not a strong itch, though. While virtualizing row number > column using range() is an attractive solution, in practice memory > savings compared to numpy's arange() (or array('i', range(..))) are > not that significant. However, if slicing support is axed based on > complexity considerations, I don't see how supporting indexing can be > justified.
Moreover, since indexing and slicing can reuse the same > start + i*step computation, the incremental code complexity of slicing > support is small, so for me the two go hand in hand. For these > reasons, I believe that either of the following alternatives is better > than the status quo: > > 1. Make range(..) return a Sequence. > > 2. Make range(..) return an Iterator. (While I prefer #1, there are > several advantages of this proposal: in the common list(range(..)) and > for i in range(..) cases, creation of an intermediate object will go > away; we will stop debating what hash(range(..)) should return [2]; > and finally we will not need to change the docstring :-).) > > [2] http://bugs.python.org/issue2603 > > > > > __len__ will always be problematic when there are more values than can > > be counted in a signed C long; maybe we should do what the Java > > collections package does: for once, Java chooses practicality over > > purity, and simply states that if the length doesn't fit, the largest > > number that does fit is returned (i.e. for us that would be > > sys.maxsize in 3.0, sys.maxint in 2.x). > > This is another simple way to fix range(2**100)[0] buglett. > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alexander.belopolsky at gmail.com Wed Apr 30 04:53:09 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Apr 2008 22:53:09 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 10:36 PM, Guido van Rossum wrote: .. > There are good reasons for having range() return an Iterable and not > an Iterator; e.g. > > R = range(N) > for i in R: > for j in R: > .... 
You realize that in the snippet above whatever cycles you save by creating R once, you give away by creating iter(R) twice. So compared to range() returning an iterator and having to write for i in range(N): for j in range(N): ... you have 3 vs. 2 auxiliary objects created. And how often do you see code that will not benefit from being generalized from square to rectangular matrices? Lots of C code will go away if we nix the range object and leave only rangeiterator! From alexander.belopolsky at gmail.com Wed Apr 30 04:58:44 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Apr 2008 22:58:44 -0400 Subject: [Python-3000] range() issues In-Reply-To: <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 5:18 PM, Benjamin Peterson wrote: .. > > Put another way: range(n) currently works, in Py3k, for n > sys.maxsize. > > What's the rationale for breaking that? > > So we can support other sequence methods. (I think.) > This is not true. The missing sequence methods are slicing and __contains__ and neither requires len <= sys.maxsize . The rationale was that a huge range is most likely a programming or input error which is detected early in 2.x while in 3.x may result in strange errors later on. 
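A small sketch supporting the claim that membership testing does not depend on the length fitting in sys.maxsize: it needs only bounds and a remainder check. The name range_contains is hypothetical, the sketch handles integers only, and slicing is not shown.

```python
def range_contains(start, stop, step, value):
    """Return True if value would be produced by range(start, stop, step)."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        in_bounds = start <= value < stop
    else:
        in_bounds = stop < value <= start
    # Python's % takes the sign of the divisor, so this works for step < 0 too.
    return in_bounds and (value - start) % step == 0
```

Nothing here computes a length, so range_contains(0, 2**100, 7, 7 * 10**20) is answered with plain integer arithmetic.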
From alexander.belopolsky at gmail.com Wed Apr 30 05:04:57 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Apr 2008 23:04:57 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: Correction: My calculation below was only correct for N = 1 case. In general, the two alternatives will create N+2 vs. N+1 auxiliary objects. On Tue, Apr 29, 2008 at 10:53 PM, Alexander Belopolsky wrote: > On Tue, Apr 29, 2008 at 10:36 PM, Guido van Rossum wrote: > .. > > There are good reasons for having range() return an Iterable and not > > an Iterator; e.g. > > > > R = range(N) > > for i in R: > > for j in R: > > .... > > You realize that in the snippet above whatever cycles you save by > creating R once, you give away by creating iter(R) twice. So compared > to range() returning an iterator and having to write > > for i in range(N): > for j in range(N): > ... > > you have 3 vs. 2 auxiliary objects created. And how often do you see > code that will not benefit from being generalized from square to > rectangular matrices? > > Lots of C code will go away if we nix the range object and leave only > rangeiterator! > From theaney at gmail.com Wed Apr 30 05:10:00 2008 From: theaney at gmail.com (Tim Heaney) Date: Tue, 29 Apr 2008 23:10:00 -0400 Subject: [Python-3000] Removal of os.path.walk Message-ID: Speaking of this, is it too late to lobby for an iterator version of os.listdir? (Perhaps listdir would not be the best name. :) There is one at http://wxidle.sourceforge.net/projects/xlistdir/ but I think it ought to be in the standard library. Moreover, if we had such a thing, shouldn't os.walk use it instead of lists? 
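One possible shape for an iterator version of listdir is sketched below. It assumes os.scandir, which was only added to the standard library in Python 3.5, long after this thread; the name iter_listdir is made up for this illustration and is not the API of the xlistdir project linked above.

```python
import os


def iter_listdir(path="."):
    """Yield entry names one at a time instead of building a full list."""
    # os.scandir returns a lazy iterator; the with-block closes it promptly.
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name
```

A caller that only wants the first match can stop iterating early, for example with next(name for name in iter_listdir(path) if name.endswith(".py")), though entry order is whatever the operating system returns.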
> It seems that os.walk has more options and a cleaner interface to > walking trees than os.path.walk does. Is there support for the removal > this in Py3k? > > -- > Cheers, > Benjamin Peterson From alexander.belopolsky at gmail.com Wed Apr 30 05:22:27 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 29 Apr 2008 23:22:27 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 10:36 PM, Guido van Rossum wrote: .. > There are good reasons for having range() return an Iterable and not > an Iterator; What would you say to an idea of exposing rangeiter in itertools - say itertools.irange(..) function that returns an iterator? From guido at python.org Wed Apr 30 05:42:07 2008 From: guido at python.org (Guido van Rossum) Date: Tue, 29 Apr 2008 20:42:07 -0700 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: References: Message-ID: On Tue, Apr 29, 2008 at 8:10 PM, Tim Heaney wrote: > Speaking of this, is it too late to lobby for an iterator version of > os.listdir? (Perhaps listdir would not be the best name. :) > > There is one at > > http://wxidle.sourceforge.net/projects/xlistdir/ > > but I think it ought to be in the standard library. Moreover, if we > had such a thing, shouldn't os.walk use it instead of lists? I'm not sure I see the advantage of having it as an iterator; I doubt that there is ever not enough memory to hold the contents of a single directory. Do you have a compelling use case? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From stephen at xemacs.org Wed Apr 30 07:39:34 2008 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed, 30 Apr 2008 14:39:34 +0900 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> <87wsmxpc0r.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804290520g6982518fkf25ac81bf7ea9260@mail.gmail.com> <87hcdkzgj5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87d4o72961.fsf@uwakimon.sk.tsukuba.ac.jp> Jim Jewett writes: > I think "standard repertoire based on Unicode" may be confusing the issue. By "standard repertoire" I mean that all Pythons will show the same characters the same way, while "based on Unicode" is intended to mean looking at TR#36 and TR#39 in picking the repertoires. > As I understand it, you're saying something like > > For strings, repr will delegate to display_string. Er, I'm not familiar with such a function.... What I have in mind is that for string display, repr will have a large, standard set of characters that it sends directly to output, and a set that it \u-escapes for the purpose of avoiding ambiguity. These sets are always defined the same way for any Python. For people for whom the standard display would be painful (eg, Cyrillic users and Greek users), there would be an optional post-processor (basically a codec) which would translate some \u-escapes to characters, and should also translate the conflicting characters (ie, ASCII in the case of Cyrillic and Greek) to \u-escapes. > Users can (and should) supply a display_string function > appropriate to their own system. "Can", yes, but only on a "consenting adults" basis. They should not do so in most cases. > The default display_string will display ASCII, and unicode-escape > everything else. Definitely not. The default should try to display anything that can be displayed unambiguously. 
If we don't do that, *nobody* will use the default except us semi-lingual Americans, and there would be no point in having a standard repertoire. For practical purposes, the only scripts I know of where there will be real problems are Cyrillic and Greek, because they share glyphs with the Latin alphabet, and by default many of their characters would be escaped. I'm sure there are other such scripts, of course, I don't mean to minimize the problem. (Some Japanese will undoubtedly complain about their full-width "ASCII", but I have no sympathy for that particular self-inflicted injury: they are already deprecated in Unicode as compatibility characters.) On the other hand, Unicode was careful to assemble a unified set of Latin characters. Although some like the Angstrom symbol do have compatibility encodings, I don't think that's a major worry. The vast majority of Asian characters (loosely defined, including not only the Han ideographs but the radicals, Korean Hangul, Japanese and Chinese syllabaries, etc) are going to be readable, too (for those with appropriate fonts). From rhamph at gmail.com Wed Apr 30 07:51:34 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 29 Apr 2008 23:51:34 -0600 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: <9e804ac0804290501n30447250p13526e11ae24446f@mail.gmail.com> <48171036.8000701@gmail.com> Message-ID: On Tue, Apr 29, 2008 at 3:08 PM, Brett Cannon wrote: > On Tue, Apr 29, 2008 at 8:05 AM, Alex Martelli wrote: > [SNIP - Alex's well-argued reasons to keep sched] > > > > And then, if needed, we can discuss pure simulation (as opposed to > > simulation-testing of systems designed to normally use the "real" > > sched). But already it seems to me there are plenty of use cases to > > justify retaining sched in the library...! > > OK, sched stays. Do you need mutex to stay as-is, get rolled into > sched, or can we still ditch that module (at least publicly)? 
codesearch shows a few users of mutex, although not nearly as many as sched itself. A couple of those seemed to think it was for threading, which I think is a good reason to at least rename it. ... Actually, I wouldn't be surprised if half the uses mistakenly believe it's a thread-safe mutex. It's disturbingly common to see them loop until .testandset() returns true (which will always be on the first call, or never.) That method shouldn't exist. It's not worth the effort of redesigning such an obscure module, so I say just rip it out. -- Adam Olsen, aka Rhamphoryncus From rhamph at gmail.com Wed Apr 30 08:15:50 2008 From: rhamph at gmail.com (Adam Olsen) Date: Wed, 30 Apr 2008 00:15:50 -0600 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 8:36 PM, Guido van Rossum wrote: > Let's also fix __len__() so that it returns sys.{maxint,maxsize} when > the result doesn't fit in a Py_ssize_t. Why not leave sq_length as is, but have len() bypass it and call .__len__() directly? C code is likely allocating memory for whatever length it's given, so a sq_length overflow just makes it fail earlier, whereas python code could be more creative (such as printing the length of an on-disk container.) The problem with the indexing API also calling sq_length is moot since you've decided to remove it from range. 
-- Adam Olsen, aka Rhamphoryncus From jbarham at gmail.com Wed Apr 30 09:42:00 2008 From: jbarham at gmail.com (John Barham) Date: Wed, 30 Apr 2008 00:42:00 -0700 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: References: Message-ID: <4f34febc0804300042v2d6670fbwa68a07f0f0e6565@mail.gmail.com> On Tue, Apr 29, 2008 at 8:42 PM, Guido van Rossum wrote: > On Tue, Apr 29, 2008 at 8:10 PM, Tim Heaney wrote: > > Speaking of this, is it too late to lobby for an iterator version of > > os.listdir? (Perhaps listdir would not be the best name. :) > > > > There is one at > > > > http://wxidle.sourceforge.net/projects/xlistdir/ > > > > but I think it ought to be in the standard library. Moreover, if we > > had such a thing, shouldn't os.walk use it instead of lists? > > I'm not sure I see the advantage of having it as an iterator; I doubt > that there is ever not enough memory to hold the contents of a single > directory. Do you have a compelling use case? I don't know how compelling it is, but the dirread Plan 9 call to get a directory listing (http://plan9.bell-labs.com/magic/man2html/2/dirread) returns only a subset of the entries in the directory so it effectively acts as an iterator. If it's listing a network shared file system an iterator version of listdir could result in less network traffic depending on what entry you were looking for. I don't know if NFS is the same but I think in general it would be a win for network file systems in terms of efficiency. 
John From martin at v.loewis.de Wed Apr 30 09:58:08 2008 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 30 Apr 2008 09:58:08 +0200 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: <4f34febc0804300042v2d6670fbwa68a07f0f0e6565@mail.gmail.com> References: <4f34febc0804300042v2d6670fbwa68a07f0f0e6565@mail.gmail.com> Message-ID: <48182690.1060508@v.loewis.de> > I don't know how compelling it is, but the dirread Plan 9 call to get > a directory listing > (http://plan9.bell-labs.com/magic/man2html/2/dirread) returns only a > subset of the entries in the directory so it effectively acts as an > iterator. All operating system APIs to read directories work in this way; Plan 9 is not unique (here). > If it's listing a network shared file system an iterator > version of listdir could result in less network traffic depending on > what entry you were looking for. I don't know if NFS is the same but > I think in general it would be a win for network file systems in terms > of efficiency. Still, Guido's question stands: do you have an actual use case where you would want to stop earlier? Even if you glob, you still need to read to the end of the directory. Regards, Martin From ronaldoussoren at mac.com Wed Apr 30 10:49:53 2008 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed, 30 Apr 2008 10:49:53 +0200 Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: On 30 Apr, 2008, at 2:17, Brett Cannon wrote: > On Tue, Apr 29, 2008 at 3:03 PM, David Bolen > wrote: >> "Brett Cannon" writes: >> >>> Also realize all of the right people have been consulted on this >>> stuff >>> (e.g., the web SIG about the urllib package). So please do not think >>> that something that seems drastic (e.g., the removal of all >>> Mac-specific modules) was taken lightly when in fact the proper >>> people >>> were asked and they were okay with what is going on.
>> >> Are there any thoughts on providing some other distribution or >> mechanism >> to build selected Mac modules post-removal? Is it likely to be >> possible >> to grab current 2.x source and build as part of 3.x, providing it's >> not >> a 64-bit system or UCS-4 configuration unsupported by bgen? >> >> I have an application using the QuickTime portion of the Carbon >> package >> very successfully in recent code (with a primarily Tiger-based user >> base), where the higher level Cocoa/ObjC frameworks didn't provide >> the >> necessary functionality under Tiger - it would be nice to have some >> path >> to maintaining that across a 2.x/3.x transition, even if I had to >> build >> something locally. >> > > Even when the code is removed it will still be in the svn history so > you should be able to grab it easily. More importantly: there is no reason why the Carbon stuff couldn't be released as a standalone package. IMHO that's much better in the long run anyway because this allows updates to the Carbon bindings independent of Python releases. Ronald From greg.ewing at canterbury.ac.nz Wed Apr 30 11:24:18 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Apr 2008 21:24:18 +1200 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: <48183AC2.4090900@canterbury.ac.nz> Alexander Belopolsky wrote: > On Tue, Apr 29, 2008 at 10:36 PM, Guido van Rossum wrote: > .. >> R = range(N) >> for i in R: >> for j in R: >> ....
> > You realize that in the snippet above whatever cycles you save by > creating R once, you give away by creating iter(R) twice. I'm not so sure about that. Evaluating range(N) involves a global lookup and Python function call, whereas extracting an iterator from a C-implemented iterable doesn't. The difference is likely to be even greater when there are more arguments involved, such as range(m, n, s). Also, the range may be getting computed elsewhere. There are notational conveniences to being able to wrap the description of a range up into a single object that can be passed around and iterated over easily. -- Greg From greg.ewing at canterbury.ac.nz Wed Apr 30 11:27:14 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Apr 2008 21:27:14 +1200 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: References: Message-ID: <48183B72.8020109@canterbury.ac.nz> Tim Heaney wrote: > Speaking of this, is it too late to lobby for an iterator version of > os.listdir? (Perhaps listdir would not be the best name. :) There was discussion about an opendir() function a while back that would return an iterable, but I don't think anything came of it. -- Greg From greg.ewing at canterbury.ac.nz Wed Apr 30 11:42:37 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 30 Apr 2008 21:42:37 +1200 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: <48182690.1060508@v.loewis.de> References: <4f34febc0804300042v2d6670fbwa68a07f0f0e6565@mail.gmail.com> <48182690.1060508@v.loewis.de> Message-ID: <48183F0D.1040206@canterbury.ac.nz> Martin v. Löwis wrote: > Still, Guido's question stands: do you have an actual use case where > you would want to stop earlier? It just seems a bit disappointing to me that the underlying OS has the ability to read directories an item at a time, but this is not made available to the Python programmer.
-- Greg From guido at python.org Wed Apr 30 16:02:20 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 30 Apr 2008 07:02:20 -0700 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 7:53 PM, Alexander Belopolsky wrote: > On Tue, Apr 29, 2008 at 10:36 PM, Guido van Rossum wrote: > .. > > There are good reasons for having range() return an Iterable and not > > an Iterator; e.g. > > > > R = range(N) > > for i in R: > > for j in R: > > .... > > You realize that in the snippet above whatever cycles you save by > creating R once, you give away by creating iter(R) twice. So compared > to range() returning an iterator and having to write > > for i in range(N): > for j in range(N): > ... > > you have 3 vs. 2 auxiliary objects created. And how often do you see > code that will not benefit from being generalized from square to > rectangular matrices? > > Lots of C code will go away if we nix the range object and leave only > rangeiterator! That's completely beside the point. The point of the example is that the *Python* code doesn't have to write range(N) twice.
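The distinction drawn above between an Iterable and an Iterator can be checked with a minimal sketch. It assumes Python 3 semantics for range(); the variable names are illustrative only and do not come from the thread.

```python
# range() returns a reusable iterable: each for-loop asks it for a fresh
# iterator, so one object can drive both the outer and the inner loop.
N = 3
R = range(N)
pairs = [(i, j) for i in R for j in R]

# An iterator is one-shot: once consumed, a second pass yields nothing.
it = iter(range(N))
first_pass = list(it)
second_pass = list(it)

# Sharing a single iterator between nested loops therefore goes wrong:
# the inner loop drains the items the outer loop was supposed to see.
shared = iter(range(N))
broken_pairs = [(i, j) for i in shared for j in shared]
```

Here `pairs` holds the full N*N grid, while `broken_pairs` contains only the entries produced before the shared iterator ran dry.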
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Apr 30 16:05:20 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 30 Apr 2008 07:05:20 -0700 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Tue, Apr 29, 2008 at 8:22 PM, Alexander Belopolsky wrote: > On Tue, Apr 29, 2008 at 10:36 PM, Guido van Rossum wrote: > .. > > There are good reasons for having range() return an Iterable and not > > an Iterator; > > What would you say to an idea of exposing rangeiter in itertools - say > itertools.irange(..) function that returns an iterator? You're kidding right? If you *want* the iterator, what's wrong with iter(range(N))? It's even less characters than itertools.irange(N). :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Wed Apr 30 16:15:13 2008 From: janssen at parc.com (Bill Janssen) Date: Wed, 30 Apr 2008 07:15:13 PDT Subject: [Python-3000] PEP 3108 - stdlib reorg/cleanup In-Reply-To: References: Message-ID: <08Apr30.071523pdt."58696"@synergy1.parc.xerox.com> > I have an application using the QuickTime portion of the Carbon package > very successfully in recent code (with a primarily Tiger-based user > base), where the higher level Cocoa/ObjC frameworks didn't provide the > necessary functionality under Tiger I've got the same issue for the Spotlight API; the Objective-C API is a dumbed-down version of the C API. However, I can get at the C API using ctypes. The only real problem is generating type definitions for ctypes from the Mac header files. If we had a standard way of doing that, I'd say between PyObjC and ctypes, you'd be covered. 
Bill From aahz at pythoncraft.com Wed Apr 30 16:48:04 2008 From: aahz at pythoncraft.com (Aahz) Date: Wed, 30 Apr 2008 07:48:04 -0700 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: References: Message-ID: <20080430144804.GA26439@panix.com> On Tue, Apr 29, 2008, Guido van Rossum wrote: > On Tue, Apr 29, 2008 at 8:10 PM, Tim Heaney wrote: >> >> Speaking of this, is it too late to lobby for an iterator version of >> os.listdir? (Perhaps listdir would not be the best name. :) >> >> There is one at >> >> http://wxidle.sourceforge.net/projects/xlistdir/ >> >> but I think it ought to be in the standard library. Moreover, if we >> had such a thing, shouldn't os.walk use it instead of lists? > > I'm not sure I see the advantage of having it as an iterator; I doubt > that there is ever not enough memory to hold the contents of a single > directory. Do you have a compelling use case?
There's a big difference between "not enough memory" and "directory consumes lots of memory". My company has some directories with several hundred thousand entries, so using an iterator would be appreciated (although by the time we upgrade to Python 3.x, we probably will have fixed that architecture). But even then, we're talking tens of megabytes at worst, so it's not a killer -- just painful. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Help a hearing-impaired person: http://rule6.info/hearing.html From guido at python.org Wed Apr 30 17:02:28 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 30 Apr 2008 08:02:28 -0700 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: <20080430144804.GA26439@panix.com> References: <20080430144804.GA26439@panix.com> Message-ID: On Wed, Apr 30, 2008 at 7:48 AM, Aahz wrote: > > On Tue, Apr 29, 2008, Guido van Rossum wrote: > > On Tue, Apr 29, 2008 at 8:10 PM, Tim Heaney wrote: > >> > >> Speaking of this, is it too late to lobby for an iterator version of > >> os.listdir? (Perhaps listdir would not be the best name. :) > >> > >> There is one at > >> > >> http://wxidle.sourceforge.net/projects/xlistdir/ > >> > >> but I think it ought to be in the standard library. Moreover, if we > >> had such a thing, shouldn't os.walk use it instead of lists? > > > > I'm not sure I see the advantage of having it as an iterator; I doubt > > that there is ever not enough memory to hold the contents of a single > > directory. Do you have a compelling use case? > > There's a big difference between "not enough memory" and "directory > consumes lots of memory". My company has some directories with several > hundred thousand entries, so using an iterator would be appreciated > (although by the time we upgrade to Python 3.x, we probably will have > fixed that architecture). > > But even then, we're talking tens of megabytes at worst, so it's not a > killer -- just painful. Wow. 
And the filesystem isn't impossibly slow when accessing the last file in such a directory? Anyway, I'd be fine with a separate os.opendir() call that returns an iterator. The iterator object should also have an optional close() method which explicitly frees the underlying file descriptor (or whatever is used on Windows). But I don't think that changing os.listdir() is worth the pain it's going to cause. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alexander.belopolsky at gmail.com Wed Apr 30 16:20:01 2008 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 30 Apr 2008 10:20:01 -0400 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: On Wed, Apr 30, 2008 at 10:05 AM, Guido van Rossum wrote: .. > > > > What would you say to an idea of exposing rangeiter in itertools - say > > itertools.irange(..) function that returns an iterator? > > You're kidding right? If you *want* the iterator, what's wrong with > iter(range(N))? It's even less characters than itertools.irange(N). > :-) No, I was not kidding (but I may be acting as a performance freak:-). Since you cannot reuse the result of iter(range(N)), using an explicit iter() call over the implicit one does not save much. I would be happy with itertools.count(..) getting an optional stop argument instead of adding irange().
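For readers comparing the two spellings discussed in this message, here is a rough sketch using only calls that exist in the standard library. The helper names `irange` and `count_until` are hypothetical; neither `itertools.irange` nor a stop argument to `itertools.count` is assumed to exist.

```python
import itertools


def irange(stop):
    """Hypothetical helper: a one-shot iterator over 0 .. stop-1.

    This is just the iter(range(N)) spelling suggested in the quoted reply.
    """
    return iter(range(stop))


def count_until(stop, start=0, step=1):
    """Hypothetical stand-in for a count() that accepts a stop value.

    Only meaningful for a positive step, since takewhile() ends the
    sequence at the first value that is not below stop.
    """
    return itertools.takewhile(lambda n: n < stop,
                               itertools.count(start, step))


plain = list(irange(4))
bounded = list(count_until(4))
stepped = list(count_until(10, start=2, step=3))
```

Both helpers return one-shot iterators, so neither can be reused across nested loops the way a range object can.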
From guido at python.org Wed Apr 30 19:34:56 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 30 Apr 2008 10:34:56 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: <797440730804181935p1f618e90ob1b8b9efb48932c3@mail.gmail.com> References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> <797440730804181935p1f618e90ob1b8b9efb48932c3@mail.gmail.com> Message-ID: On Fri, Apr 18, 2008 at 7:35 PM, atsuo ishimoto wrote: > - io.TextIOWrapper doesn't provide an interface to change encoding > and error-handler after it was created. This feature is supported > in PEP-3116, but isn't implemented at this time. Will it be > implemented? It should be implemented. It may be a little tricky if there's codec state, but I'm okay with raising an exception in that case or doing something else that's sensible. > It would be nice if we have optional encoding and errors args for print() > and TextIOWrapper.write(), so people can write > print(repr(obj), 'koi8-r', 'backslashescape'). This should be done with a new function, not added to print. Once you specify an encoding, you have to write to sys.stdout.buffer, which is the underlying binary stream; but you'd have to flush the TextIOWrapper and deal with incomplete codec state, and in general I don't think it's a good idea.
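A minimal sketch of the "new function" route mentioned above, under stated assumptions: `write_encoded` is a hypothetical name, an `io.BytesIO` object stands in for `sys.stdout.buffer`, and the error handler used is `backslashreplace`, which is the registered codec error handler name (the `backslashescape` spelling in the quoted text is not one).

```python
import io


def write_encoded(text, binary_stream, encoding, errors="backslashreplace"):
    """Hypothetical helper: encode text explicitly and write raw bytes.

    With a real terminal one would call sys.stdout.flush() first and pass
    sys.stdout.buffer, so that text already queued in the TextIOWrapper is
    not reordered relative to the bytes written here.
    """
    data = (text + "\n").encode(encoding, errors)
    binary_stream.write(data)
    return data


buf = io.BytesIO()  # stand-in for sys.stdout.buffer
written = write_encoded("caf\u00e9 \u4e2d", buf, "latin-1")
```

The character that latin-1 cannot represent comes out as a backslash escape instead of raising UnicodeEncodeError. Stateful codecs and partially written characters are deliberately out of scope here, which is the difficulty the message points at.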
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Apr 30 19:36:22 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 30 Apr 2008 10:36:22 -0700 Subject: [Python-3000] Displaying strings containing unicode escapes In-Reply-To: <4807C3C1.6010602@v.loewis.de> References: <87ve2jdo17.fsf@uwakimon.sk.tsukuba.ac.jp> <797440730804160249j35a98e5xd9303fbe71430568@mail.gmail.com> <4805ECE1.6040501@gmail.com> <20080416124529.GC8598@phd.pp.ru> <4805FD56.6070902@gmail.com> <20080416133046.GB16087@phd.pp.ru> <480612CE.1010300@gmail.com> <4807C3C1.6010602@v.loewis.de> Message-ID: I still like this proposal. I don't quite understand the competing (?) proposal by Stephen Turnbull; perhaps Stephen can compare and contrast the two proposals? And where does Atsuo fall? On Thu, Apr 17, 2008 at 2:40 PM, "Martin v. Löwis" wrote: > > I do think we should use some kind of Unicode-standard-endorsed > > definition of "printable" (as long as it excludes all ASCII escapes), > > I think > > unicodedata.category(c)[0] != "C" > > is fairly close. That excludes control characters (Cc), format > characters (Cf), surrogates (Cs), private-use (Co) and unassigned > characters (Cn). We should then also escape \, ' and ", following > the traditional algorithm. > > Printable then would be all letters, numbers, punctuation, symbols, > but also marks (e.g. TILDE, COMBINING RIGHT HARPOON ABOVE) and > separators (SPACE, NO-BREAK SPACE, THREE-PER-EM SPACE, LINE SEPARATOR, > PARAGRAPH SEPARATOR). It might be reasonable to also exclude line > separators (Zl) and paragraph separators (Zp), each category having > only one character in them.
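The category test quoted above can be tried out directly. The sketch below is an illustration only: `is_printable` and `escape_unprintable` are hypothetical names, quote escaping is left out for brevity, and the optional exclusion of Zl and Zp is not applied.

```python
import unicodedata


def is_printable(ch):
    """The quoted rule: printable means not in any 'C*' general category."""
    return unicodedata.category(ch)[0] != "C"


def escape_unprintable(text):
    """Backslash-escape the backslash itself and every non-printable char."""
    out = []
    for ch in text:
        if ch == "\\":
            out.append("\\\\")
        elif is_printable(ch):
            out.append(ch)
        else:
            cp = ord(ch)
            if cp <= 0xFF:
                out.append("\\x%02x" % cp)
            elif cp <= 0xFFFF:
                out.append("\\u%04x" % cp)
            else:
                out.append("\\U%08x" % cp)
    return "".join(out)


sample = escape_unprintable("caf\u00e9\tok\u200b")
```

Under this rule the accented letter is kept as-is, while the tab (Cc) and the zero width space (Cf) are escaped. LINE SEPARATOR (Zl) still counts as printable unless the stricter variant mentioned at the end of the quote is adopted.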
> > Regards, > Martin > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Wed Apr 30 19:48:13 2008 From: barry at python.org (Barry Warsaw) Date: Wed, 30 Apr 2008 13:48:13 -0400 Subject: [Python-3000] gettext In-Reply-To: References: <1afaf6160804230732t266c8285la97a1fac62f96a8d@mail.gmail.com> Message-ID: <35FDD892-1F6B-42DA-B5DB-FF5DC6992D46@python.org> On Apr 24, 2008, at 6:18 PM, Guido van Rossum wrote: > Care to comment? Or know who should comment? > > ---------- Forwarded message ---------- > From: Benjamin Peterson > Date: Wed, Apr 23, 2008 at 7:32 AM > Subject: [Python-3000] gettext > To: Python 3000 > > > [I'm not a gettext expert, so sorry if the following is totally > wrong. :)] > > Are we going to want to keep the "u" variants of the gettext APIs > around in 3.0? Also, the unicode parameters (for .install methods) > don't make much sense in 3.0. > > I don't see how we could remove them in 3.0, but perhaps rename them > to their non-"u" variants and deprecate? I wonder if it makes more sense to keep a unicode version and a bytes version. The simplest solution then would be to change gettext() to return an encoded bytes and leave ugettext() to return the unicode string. I don't have a sense for how useful an encoded translated bytes will be in the real world, and I do think that the unicode translation will be far more likely. That might argue for renaming ugettext() to gettext() and adding something like an egettext() or bgettext() method. OTOH, the current names are inspired from GNU gettext so it seems to me there's not much value in renaming our methods, except to increase confusion and break backward compatibility.
-Barry From unknown_kev_cat at hotmail.com Wed Apr 30 08:33:24 2008 From: unknown_kev_cat at hotmail.com (Joe Smith) Date: Wed, 30 Apr 2008 02:33:24 -0400 Subject: [Python-3000] [stdlib-sig] PEP 3108 - stdlib reorg/cleanup References: <001401c8a9dd$e82e63c0$c600a8c0@RaymondLaptop1> Message-ID: "Brett Cannon" wrote in message news:bbaeab100804291344g1ca48af9s3b5bbdacf516b8d7 at mail.gmail.com... > On Tue, Apr 29, 2008 at 2:46 AM, Raymond Hettinger wrote: >> >> > * UserList/UserString [done: 3.0] >> > >> >> Note that these were updated and moved to the collections module in >> Py3.0. >> > > Noted. > >> >> >> > anydbm dbm.tools [1]_ >> > whichdb dbm.tools [1]_ >> > >> >> Were there any better naming suggestions than dbm.tools? The original >> names seem much more informative. >> > > But way too much overhead for two modules that only contained one > useful function each. As Nick said, if you don't know DB stuff then I > don't see any loss of information. > > If you can come up with a better name I am open to suggestions, but > the module merge will happen. Is there a problem having the functions be just dbm.open() and dbm.whichdb()? As a user the latter one seems especially logical, as it is a tool to help me select which "submodule" I want to use. From bronger at physik.rwth-aachen.de Wed Apr 30 20:41:55 2008 From: bronger at physik.rwth-aachen.de (Torsten Bronger) Date: Wed, 30 Apr 2008 20:41:55 +0200 Subject: [Python-3000] gettext References: <1afaf6160804230732t266c8285la97a1fac62f96a8d@mail.gmail.com> <35FDD892-1F6B-42DA-B5DB-FF5DC6992D46@python.org> Message-ID: <87d4o7chho.fsf@physik.rwth-aachen.de> Hallöchen!
Barry Warsaw writes: > On Apr 24, 2008, at 6:18 PM, Guido van Rossum wrote: > >> [...] >> >> ---------- Forwarded message ---------- >> From: Benjamin Peterson >> Date: Wed, Apr 23, 2008 at 7:32 AM >> Subject: [Python-3000] gettext >> To: Python 3000 >> >> [...] >> >> Are we going to want to keep the "u" variants of the gettext APIs >> around in 3.0? Also, the unicode parameters (for .install >> methods) don't make much sense in 3.0. >> >> I don't see how we could remove them in 3.0, but perhaps rename >> them to their non-"u" variants and deprecate? > > I wonder if it makes more sense to keep a unicode version and a > bytes version. The simplest solution then would be to change > gettext() to return an encoded bytes and leave ugettext() to > return the unicode string. I don't have a sense for how useful an > encoded translated bytes will be in the real world, and I do think > that the unicode translation will be far more likely. Indeed. From today's perspective, I see no use case for getting human text snippets in byte strings encoded with the same encoding that just happened to be used in the .mo file, or with the "preferred system encoding". So it is only a question of how much hassle a renaming/deprecation generates for existing code. > That might argue for renaming ugettext() to gettext() and adding > something like a egettext() or bgettext() method. Okay. But I think it's not much of an advantage to have the "encoded" functions under new names, given that instead of renaming, you can also easily use ugettext to mimic their behaviour. > OTOH, the current names are inspired from GNU gettext so it seems > to me there's not much value in renaming our methods, except to > increase confusion and break backward compatibility . Well, this is hard to evaluate. However, I think that if there is no danger of getting silent errors, then the module should switch to unicode, possibly even unicode-only.
After all, the results of gettext are likely to be passed to higher-level functions that use (or will switch to) unicode, too. As for "gettext" returning a unicode string: if clearly documented, I don't see much harm in using a different type scheme than C gettext; this should be acceptable in a reimplementation in another language. Just my 2c. Tschö, Torsten. -- Torsten Bronger, aquisgrana, europa vetus Jabber ID: bronger at jabber.org (See http://ime.webhop.org for further contact info.) From qrczak at knm.org.pl Wed Apr 30 21:14:14 2008 From: qrczak at knm.org.pl (Marcin ‘Qrczak’ Kowalczyk) Date: Wed, 30 Apr 2008 21:14:14 +0200 Subject: [Python-3000] range() issues In-Reply-To: References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> Message-ID: <1209582854.1924.7.camel@qrnik> On Tue, 29-04-2008 at 19:36 -0700, Guido van Rossum wrote: > Let's also fix __len__() so that it returns sys.{maxint,maxsize} when > the result doesn't fit in a Py_ssize_t. Is this official? What should sq_length do when the real size doesn't fit in a Py_ssize_t? It should be documented. Either return maxsize or fail, with OverflowError probably. I admit that the only case I have in mind is some virtual sequence analogous to range (wrapped from my language in a Python object).
-- __("< Marcin Kowalczyk \__/ qrczak at knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/ From guido at python.org Wed Apr 30 21:18:20 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 30 Apr 2008 12:18:20 -0700 Subject: [Python-3000] range() issues In-Reply-To: <1209582854.1924.7.camel@qrnik> References: <1afaf6160804252025q3f36dba5h4164e50785a40926@mail.gmail.com> <5c6f2a5d0804291317k424a4a89v37f9b1691d9f76ad@mail.gmail.com> <5c6f2a5d0804291409re370ce2ifb913a6e1e4f1987@mail.gmail.com> <1afaf6160804291418s7723dd8cqb37e495a043a4723@mail.gmail.com> <1209582854.1924.7.camel@qrnik> Message-ID: I would like to see the following: - sq_length should return maxsize if the actual value doesn't fit - if __len__ is implemented in Python, it may return a value > maxsize, but calling len() will call sq_length, and the sq_length wrapper that calls __len__ must truncate the value to maxsize - if a user wants to get the untruncated length of something that implements __len__ in Python and could return a value > maxsize, they should call the __len__ method directly (not a very common use case) --Guido On Wed, Apr 30, 2008 at 12:14 PM, Marcin 'Qrczak' Kowalczyk wrote: > On Tue, 29-04-2008 at 19:36 -0700, Guido van Rossum wrote: > > > > Let's also fix __len__() so that it returns sys.{maxint,maxsize} when > > the result doesn't fit in a Py_ssize_t. > > Is this official? What should sq_length do when the real size doesn't > fit in a Py_ssize_t? It should be documented. Either return maxsize or > fail, with OverflowError probably. > > I admit that the only case I have in mind is some virtual sequence > analogous to range (wrapped from my language in a Python object).
> > -- > __("< Marcin Kowalczyk > \__/ qrczak at knm.org.pl > ^^ http://qrnik.knm.org.pl/~qrczak/ > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gnewsg at gmail.com Wed Apr 30 14:57:55 2008 From: gnewsg at gmail.com (Giampaolo Rodola') Date: Wed, 30 Apr 2008 05:57:55 -0700 (PDT) Subject: [Python-3000] Removal of os.path.walk In-Reply-To: <48183B72.8020109@canterbury.ac.nz> References: <48183B72.8020109@canterbury.ac.nz> Message-ID: <6212a3c5-5334-4b9b-828c-9cc173b3dcfe@d45g2000hsc.googlegroups.com> On 30 Apr, 11:27, Greg Ewing wrote: > Tim Heaney wrote: > > Speaking of this, is it too late to lobby for an iterator version of > > os.listdir? (Perhaps listdir would not be the best name. :) > > There was discussion about an opendir() function a while > back that would return an iterable, but I don't think > anything came of it. > > -- > Greg Here it is: http://groups.google.com/group/python-ideas/browse_thread/thread/3733d3c3f2c602e5/b1238f081e3e5689?lnk=gst&q=listdir --- Giampaolo http://code.google.com/p/pyftpdlib/ From mwm at mired.org Wed Apr 30 17:10:59 2008 From: mwm at mired.org (Mike Meyer) Date: Wed, 30 Apr 2008 11:10:59 -0400 Subject: [Python-3000] Removal of os.path.walk In-Reply-To: References: <20080430144804.GA26439@panix.com> Message-ID: <20080430111059.5329437c@mbook-fbsd> On Wed, 30 Apr 2008 08:02:28 -0700 "Guido van Rossum" wrote: > On Wed, Apr 30, 2008 at 7:48 AM, Aahz wrote: > > > > On Tue, Apr 29, 2008, Guido van Rossum wrote: > > > On Tue, Apr 29, 2008 at 8:10 PM, Tim Heaney wrote: > > >> > > >> Speaking of this, is it too late to lobby for an iterator version of > > >> os.listdir? (Perhaps listdir would not be the best name.
:) > > >> > > >> There is one at > > >> > > >> http://wxidle.sourceforge.net/projects/xlistdir/ > > >> > > >> but I think it ought to be in the standard library. Moreover, if we > > >> had such a thing, shouldn't os.walk use it instead of lists? > > > > > > I'm not sure I see the advantage of having it as an iterator; I doubt > > > that there is ever not enough memory to hold the contents of a single > > > directory. Do you have a compelling use case? > > > > There's a big difference between "not enough memory" and "directory > > consumes lots of memory". My company has some directories with several > > hundred thousand entries, so using an iterator would be appreciated > > (although by the time we upgrade to Python 3.x, we probably will have > > fixed that architecture). > > > > But even then, we're talking tens of megabytes at worst, so it's not a > > killer -- just painful. > > Wow. And the filesystem isn't impossibly slow when accessing the last > file in such a directory? Modern file systems hash directory entries, so access time by name is essentially O(1) out to well beyond 50K files in a directory. > Anyway, I'd be fine with a separate os.opendir() call that returns an > iterator. The iterator object should also have an optional close() > method which explicitly frees the underlying file descriptor (or > whatever is used on Windows). I think the real win here will be on file systems that return the files in some well-defined order. If you have to process them all, you can save on memory, but if you can use the order to skip looking at some of them completely, that saves disk I/O. Since this is file-system dependent, it would be nice if os.opendir() was required to preserve the ordering semantics (if any) of the underlying system. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information.
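The early-stop use case debated in this thread can be sketched with `os.scandir()`, which later Python versions (3.5 and up, with context-manager support from 3.6) provide as a lazy directory iterator with a `close()` method. It is not the `os.opendir()` named above, and `first_match` is a hypothetical helper introduced only for illustration.

```python
import os
import tempfile


def first_match(path, predicate):
    """Return the first entry name accepted by predicate, or None.

    Entries are read lazily, so the scan stops as soon as a match is
    found; the with-block closes the underlying directory handle.
    """
    with os.scandir(path) as entries:
        for entry in entries:
            if predicate(entry.name):
                return entry.name
    return None


# Small self-contained demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.log", "c.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = first_match(tmp, lambda n: n.endswith(".log"))
    missing = first_match(tmp, lambda n: n.endswith(".md"))
```

Entries come back in whatever order the operating system reports them, so any saving from skipping entries depends on the file system, as the message above notes.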