From ziade.tarek at gmail.com Sun Jul 4 12:47:03 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Sun, 4 Jul 2010 12:47:03 +0200
Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org
Message-ID:

Hello,

If you follow python-checkins, you have probably noticed, and been annoyed by, my 100+ checkin mails in distutils2 this morning. I was lagging a bit on getting the GSoC students' work pulled in, and, with the DVCS effect, you get two or three weeks of work landing in hg.python.org in a minute. :)

Once CPython itself is in Mercurial, we will probably have the same problem when people are pulling contributions. If you use a "hg pull" command it will get all commits from the third party, even if some of those commits are unnecessary noise, like "I have removed this file. Oops, I am putting the file back in...".

And it's not so easy to edit the incoming changelog once the changes are committed. It's not easy either to use "hg incoming", because most of the time the third-party clone has many unrelated changes. I think we should work with queues and patches everywhere to solve this.

The idea is to have contributors handling hg patches in bugs.python.org, one patch per feature. They can use mq for that, and the benefit will be a very clean history in all repositories. A good thing about hg patches is that, unlike simple diffs, the contributor's name and comment appear in the final changelog.

I would like to propose a policy for hg.python.org, based on Mercurial queues + bugs.python.org, and I would like to contribute a small guide about it in python.org/dev.

Regards
Tarek

--
Tarek Ziadé | http://ziade.org

From solipsis at pitrou.net Sun Jul 4 12:57:00 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 4 Jul 2010 12:57:00 +0200
Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org
References:
Message-ID: <20100704125700.3edc8fa3@pitrou.net>

On Sun, 4 Jul 2010 12:47:03 +0200 Tarek Ziadé
wrote:
>
> I would like to propose a policy for hg.python.org, based on mercurial
> queues + bugs.python.org, and I would like to contribute a small guide
> about it in python.org/dev.

Sounds good. We can probably make mq optional, since regular diffs would work as well (except that they wouldn't contain the original committer name, but that isn't different from what we have today).

Regards
Antoine.

From merwok at netwok.org Sun Jul 4 13:14:32 2010
From: merwok at netwok.org (Éric Araujo)
Date: Sun, 04 Jul 2010 13:14:32 +0200
Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org
In-Reply-To: <20100704125700.3edc8fa3@pitrou.net>
References: <20100704125700.3edc8fa3@pitrou.net>
Message-ID: <4C306D18.3050100@netwok.org>

[Antoine Pitrou]
> [Tarek Ziadé]
>> I would like to propose a policy for hg.python.org, based on mercurial
>> queues + bugs.python.org, and I would like to contribute a small guide
>> about it in python.org/dev.
>
> Sounds good. We can probably make mq optional, since regular diffs
> would work as well (except that they wouldn't contain the
> original committer name, but that isn't different from what we have
> today).

Agreed. The policy can just require patches and thus let people choose their local workflow (many commits in a named branch/bookmark/pbranch and then diff, MQ for moar power, or just edit things to get a diff without using a fancy command (like now)).

The policy should say something about authorship attribution. Mercurial-made diffs contain the user name in a special comment which is used by hg import; plain diffs can be applied with patch and then committed with 'hg commit --user "Bill "', and if a patch is edited before commit, then use the current style (core dev as committer, original patch author in the commit message). First-class authorship acknowledgment is a nice feature of DVCSes.

The policy should also allow pulling from another repo if it contains changesets that aren't crufty.
In that case, a pusher (new name for a svn committer) can just pull from Bill and push to the main repo, adding extra export-to-patch import-from-patch steps is unnecessary. Cheers From dirkjan at ochtman.nl Sun Jul 4 15:46:53 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Sun, 04 Jul 2010 15:46:53 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: Message-ID: <4C3090CD.7020909@ochtman.nl> On 2010-07-04 12:47, Tarek Ziad? wrote: > Once CPython itself is in mercurial, we will probably have the same > problem when people are pulling contributions. If you use a "hg pull" > command it will get all commits from the third party, even if some if > those commits are unnecessary noise, like > "I have removed this file. OOps I am putting the file back in..". > > And it's not so easy to edit the incoming changelog once they are > commited. It's not easy either to use "hg incoming" because most of > the time, the third party clone has many unrelated changes. I think we > should work with queues and patches everywhere to solve this. > > The idea is to have contributors handling hg patches in > bug.python.org, one patch per feature. They can use mq for that, and > the benefit will be to have a very clean history in all repositories. > A good thing about hg patches is that unlike simple diffs, the > contributor name and comment appears in the final changelog. Hmm, I don't think I agree on what you're saying. First, a changeset is a changeset is a changeset. If you exchange them as patches or in some other way (by pulling or pushing or whatever) shouldn't really matter. This is one of the things DVCS is good at, you can move csets around different clones in many ways, and all clones are created equal. Second, a noisy history is never good. So yes, pulling some kind of messy history and pushing it to a central repo as-is is not a good idea. 
People should polish their changesets so that each changeset can stand on its own. So yes, somewhere between it being a messy history of actual development and it going into a central repo, it should be cleaned up. Ideally, the original author should do that, but if he's not in a position to do so, the committer should do it. Third, if the result of cleaning up is a single cset, it should probably be rebased before getting pushed to a central repo. If it's two or three csets, rebase it. On the other hand, if it's 10 csets, actually doing an explicit merge makes sense. The idea is not to clutter up the history with a merge every other cset, but if the merge is hard/non-trivial it can make sense to leave it explicit. Fourth, one-patch-per-issue is too restrictive. Small commits are useful because they're way easier to review. Concatenate several small commits leading up to a single issue fix into a single patch and it gets much harder to read. Easy reviews are important, because a lot of valuable time is spent reviewing. The simple example is a chain like refactor-refactor-fix (which is IME quite common). Ideally each stage keeps the test suite passing and is internally consistent, but moving towards a common goal (to fix the issue). So, I find your proposed policy somewhat vague and also not that attractive. Cleaning up the history is certainly a good thing, but I don't think we have to mandate a way for things to get into the repo. Mandating the use of issues as a reference for each fix or enhancement could be useful, but seems unnecessary. Cheers, Dirkjan From ziade.tarek at gmail.com Sun Jul 4 15:53:19 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 4 Jul 2010 15:53:19 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <4C308B38.2050800@ochtman.nl> References: <4C308B38.2050800@ochtman.nl> Message-ID: On Sun, Jul 4, 2010 at 3:23 PM, Dirkjan Ochtman wrote: ... 
>
> Hmm, I don't think I agree on what you're saying.
>
> First, a changeset is a changeset is a changeset. If you exchange them as
> patches or in some other way (by pulling or pushing or whatever) shouldn't
> really matter. This is one of the things DVCS is good at, you can move csets
> around different clones in many ways, and all clones are created equal.

As far as I have experienced, there's a back-and-forth game between the contributor and the committer, leading to clone hell, unless the contributor uses mq and applies the changes in his clone right before the committer pulls them in.

Using patches keeps changes separate from the history until they are merged. And they are easy to read, review and understand.

> Second, a noisy history is never good. So yes, pulling some kind of messy
> history and pushing it to a central repo as-is is not a good idea. People
> should polish their changesets so that each changeset can stand on its own.

When you work on a feature, how do you polish a changeset without polluting your history or doing many clones? (A genuine question; I've been looking for that.)

> So yes, somewhere between it being a messy history of actual development and
> it going into a central repo, it should be cleaned up. Ideally, the original
> author should do that, but if he's not in a position to do so, the committer
> should do it.

Do you have an easy way to perform this cleanup? Could you propose a process here?

I am a bit skeptical that contributors will do this, whereas a patch policy makes sure we don't have to deal with this, and avoids asking people to have a high mercurial-fu. I am also skeptical that contributors are willing to dig into a clone to get what they want and/or check that it's fine to pull.

It seems to me that patches are the universal format to propose a change, and they are easy to produce and consume. Contributors can use any process they want to create them, even without using mercurial.
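For instance, the whole contributor-side flow with mq could be as small as this (the patch and file names are just made up for the example):

```
hg qnew fix-sysconfig          # start a new mq patch
# ... edit, run the tests ...
hg qrefresh -m "Fix sysconfig on win32"    # store the changes and message in the patch
hg export qtip > fix-sysconfig.patch       # the exported patch carries my name and comment
```

Then fix-sysconfig.patch gets attached to the issue, and the committer applies it with "hg import", which keeps the authorship.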
>
> Third, if the result of cleaning up is a single cset, it should probably be
> rebased before getting pushed to a central repo. If it's two or three csets,
> rebase it. On the other hand, if it's 10 csets, actually doing an explicit
> merge makes sense. The idea is not to clutter up the history with a merge
> every other cset, but if the merge is hard/non-trivial it can make sense to
> leave it explicit.
>
> Fourth, one-patch-per-issue is too restrictive. Small commits are useful
> because they're way easier to review. Concatenate several small commits
> leading up to a single issue fix into a single patch and it gets much harder
> to read. Easy reviews are important, because a lot of valuable time is spent
> reviewing. The simple example is a chain like refactor-refactor-fix (which
> is IME quite common). Ideally each stage keeps the test suite passing and is
> internally consistent, but moving towards a common goal (to fix the issue).
>
> So, I find your proposed policy somewhat vague and also not that attractive.
> Cleaning up the history is certainly a good thing, but I don't think we have
> to mandate a way for things to get into the repo. Mandating the use of
> issues as a reference for each fix or enhancement could be useful, but seems
> unnecessary.

Yes, it's vague; I don't have a clear idea yet, and I am not that experienced in hg's latest features, so I am probably doing some steps wrong or ignoring some shortcuts.

But I have the strong feeling that without patches we are heading for extra work for all parties, unless we have a strong tutorial on how to contribute through hg.python.org, one that is proven to be very simple.

side note: I am replying to the gmane emails but I don't know if this works with the mailman thread as well..
Tarek From solipsis at pitrou.net Sun Jul 4 17:26:56 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 4 Jul 2010 17:26:56 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org References: <4C3090CD.7020909@ochtman.nl> Message-ID: <20100704172656.1dae21c1@pitrou.net> On Sun, 04 Jul 2010 15:46:53 +0200 Dirkjan Ochtman wrote: > > Fourth, one-patch-per-issue is too restrictive. Small commits are useful > because they're way easier to review. Concatenate several small commits > leading up to a single issue fix into a single patch and it gets much > harder to read. I don't agree with that. The commits obviously won't be independent because they will be motivated by each other (or even dependent on each other), therefore you have to remember what the other commits do when reviewing one of them. What's more, when reading "hg log" months or years later, it is hard to make sense of a single commit because you don't really know what issue it was meant to contribute to fix. I know that's how Mercurial devs do things, but I don't really like it. Regards Antoine. From thomas at jollans.com Sun Jul 4 18:06:45 2010 From: thomas at jollans.com (Thomas Jollans) Date: Sun, 04 Jul 2010 18:06:45 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C308B38.2050800@ochtman.nl> Message-ID: <4C30B195.1080804@jollans.com> On 07/04/2010 03:53 PM, Tarek Ziad? wrote: > > > On Sun, Jul 4, 2010 at 3:23 PM, Dirkjan Ochtman wrote: > ... >> >> Hmm, I don't think I agree on what you're saying. >> >> First, a changeset is a changeset is a changeset. If you exchange them as >> patches or in some other way (by pulling or pushing or whatever) shouldn't >> really matter. This is one of the things DVCS is good at, you can move csets >> around different clones in many ways, and all clones are created equal. 
> > As far as I have experienced, there's a back and forth game between > the contributor and the commiter, leading to clone hell, unless the > contributor uses mq, then apply it in his clone > right before the contributor pulls the changes in. > > Using patches makes changes separated from the history for the time > being, until they are merged. And are easy to read, review and > understand. > >> Second, a noisy history is never good. So yes, pulling some kind of messy >> history and pushing it to a central repo as-is is not a good idea. People >> should polish their changesets so that each changeset can stand on its own. > > When you work on a feature, how do you polish a changeset without > polluting your history or doing many clones ? (true question, I've > been looking for that) mq is a good method. If a changeset that only exists locally has to be changed, you can convert it to a patch, make some changes, and re-commit. If the changesets are relatively clean in the first place, you can rebase/transplant/strip your way around too big a mess. > >> So yes, somewhere between it being a messy history of actual development and >> it going into a central repo, it should be cleaned up. Ideally, the original >> author should do that, but if he's not in a position to do so, the committer >> should do it. > > Do you have an easy way to perform this cleanup ? Could you propose a > process here ? > > I am bit skeptical that contributors will do this, whereas a patch > policy makes sure we don't have to deal with this, and avoid asking > people to have a high mercurial-fu. I am also skeptical that > contributors are willing to digg into a clone to get what they want > and/or check that it's fine to pull. > > It seems to me that patches are the universal format to propose a > change and are easy to produce and consume. Contributors can use any > process they want to create it, even without using mercurial. 
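To spell out the mq route I mentioned above, the round-trip for fixing up a local-only changeset could look like this (assuming mq is enabled; the revision is whatever your topmost draft commit happens to be):

```
hg qimport -r tip      # turn the topmost local commit into an mq patch
# ... edit the files to fix the changeset ...
hg qrefresh            # fold the corrections back into the patch
hg qfinish qtip        # turn the polished patch into a regular changeset again
```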
There's no reason to force those of us capable of producing clean hg branches back into the world of patches. I can see why you'd want to be able to say "no, this repo is a mess. Submit something presentable, like a patch." Some "how to contribute" document might recommend using mq, but it shouldn't be a requirement - pulling comes naturally with DVCS. Python should use it. Accept patches - sure - not everyone uses mercurial. Require patches - please don't! > >> >> Third, if the result of cleaning up is a single cset, it should probably be >> rebased before getting pushed to a central repo. If it's two or three csets, >> rebase it. On the other hand, if it's 10 csets, actually doing an explicit >> merge makes sense. The idea is not to clutter up the history with a merge >> every other cset, but if the merge is hard/non-trivial it can make sense to >> leave it explicit. >> >> Fourth, one-patch-per-issue is too restrictive. Small commits are useful >> because they're way easier to review. Concatenate several small commits >> leading up to a single issue fix into a single patch and it gets much harder >> to read. Easy reviews are important, because a lot of valuable time is spent >> reviewing. The simple example is a chain like refactor-refactor-fix (which >> is IME quite common). Ideally each stage keeps the test suite passing and is >> internally consistent, but moving towards a common goal (to fix the issue). >> >> So, I find your proposed policy somewhat vague and also not that attractive. >> Cleaning up the history is certainly a good thing, but I don't think we have >> to mandate a way for things to get into the repo. Mandating the use of >> issues as a reference for each fix or enhancement could be useful, but seems >> unnecessary. > > Yes, it's vague, I don't have a clear idea yet and I am not that > experienced in hg latest features, so I am probably doing some steps > wrong or ignore some shortcuts. 
> > But I have the strong feeling that without patches, we are heading for > extra work for all parties > unless we have a strong tutorial on how to contribute with > hg.python.org, and that is proven to be very simple. > > side note: I am replying to the gname emails but I don't know if this > works with the mailman thread as well.. > > Tarek > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From g.brandl at gmx.net Sun Jul 4 18:56:11 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 04 Jul 2010 18:56:11 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <20100704172656.1dae21c1@pitrou.net> References: <4C3090CD.7020909@ochtman.nl> <20100704172656.1dae21c1@pitrou.net> Message-ID: Am 04.07.2010 17:26, schrieb Antoine Pitrou: > On Sun, 04 Jul 2010 15:46:53 +0200 > Dirkjan Ochtman wrote: >> >> Fourth, one-patch-per-issue is too restrictive. Small commits are useful >> because they're way easier to review. Concatenate several small commits >> leading up to a single issue fix into a single patch and it gets much >> harder to read. > > I don't agree with that. The commits obviously won't be independent > because they will be motivated by each other (or even dependent on each > other), therefore you have to remember what the other commits do when > reviewing one of them. What's more, when reading "hg log" months or > years later, it is hard to make sense of a single commit because you > don't really know what issue it was meant to contribute to fix. > > I know that's how Mercurial devs do things, but I don't really like > it. I think the best of both worlds is to encourage contributors to send more complicated patches in a series of easy-to-review steps, but when committing to Python, make one changeset out of them. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. 
Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From ziade.tarek at gmail.com Sun Jul 4 19:09:36 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 4 Jul 2010 19:09:36 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C3090CD.7020909@ochtman.nl> <20100704172656.1dae21c1@pitrou.net> Message-ID: On Sun, Jul 4, 2010 at 6:56 PM, Georg Brandl wrote: > Am 04.07.2010 17:26, schrieb Antoine Pitrou: >> On Sun, 04 Jul 2010 15:46:53 +0200 >> Dirkjan Ochtman wrote: >>> >>> Fourth, one-patch-per-issue is too restrictive. Small commits are useful >>> because they're way easier to review. Concatenate several small commits >>> leading up to a single issue fix into a single patch and it gets much >>> harder to read. >> >> I don't agree with that. The commits obviously won't be independent >> because they will be motivated by each other (or even dependent on each >> other), therefore you have to remember what the other commits do when >> reviewing one of them. What's more, when reading "hg log" months or >> years later, it is hard to make sense of a single commit because you >> don't really know what issue it was meant to contribute to fix. >> >> I know that's how Mercurial devs do things, but I don't really like >> it. > > I think the best of both worlds is to encourage contributors to send > more complicated patches in a series of easy-to-review steps, but when > committing to Python, make one changeset out of them. Exactly, so one bugfix or one feature comes in a single changeset that contains ideally the code change + the doc change + the tests. Like Thomas has suggested, I'll start a "how to contribute" wiki page with the best practices, and give the url here so everyone can contribute/correct it. Tarek -- Tarek Ziad? 
| http://ziade.org From mal at egenix.com Mon Jul 5 09:30:54 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 05 Jul 2010 09:30:54 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <4C30B195.1080804@jollans.com> References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> Message-ID: <4C318A2E.1070303@egenix.com> Thomas Jollans wrote: > On 07/04/2010 03:53 PM, Tarek Ziad? wrote: >> >> Do you have an easy way to perform this cleanup ? Could you propose a >> process here ? >> >> I am bit skeptical that contributors will do this, whereas a patch >> policy makes sure we don't have to deal with this, and avoid asking >> people to have a high mercurial-fu. I am also skeptical that >> contributors are willing to digg into a clone to get what they want >> and/or check that it's fine to pull. >> >> It seems to me that patches are the universal format to propose a >> change and are easy to produce and consume. Contributors can use any >> process they want to create it, even without using mercurial. > > There's no reason to force those of us capable of producing clean hg > branches back into the world of patches. I can see why you'd want to be > able to say "no, this repo is a mess. Submit something presentable, like > a patch." Some "how to contribute" document might recommend using mq, > but it shouldn't be a requirement - pulling comes naturally with DVCS. > Python should use it. > > Accept patches - sure - not everyone uses mercurial. Require patches - > please don't! I'm with Tarek here: the only way for core developers to be able to review checkins on the checkins list is by looking at the patches that go in. Having to look at 10+ checkin emails for a single "patch" will break this review process - no-one will be able to follow what a particular pulled set of changes will do in the end, compared to what we had in the repo before the pull. As a result, the review process will no longer be possible. 
As an example, see Tarek's pull/push of the distutils2 work. Those checkin emails will just rush by and not get a second or third review.

OTOH, I don't think that requiring a ticket to be opened on the tracker for everything is needed either.

Aside 1: Isn't it interesting that the more we actually think about moving to Mercurial, the more we find that the existing Subversion model of working is actually a very workable model for a large open source project?!

Aside 2: This thread should really be moved to python-dev, where it belongs.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jul 05 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                13 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From thomas at jollans.com Mon Jul 5 11:42:25 2010
From: thomas at jollans.com (Thomas Jollans)
Date: Mon, 05 Jul 2010 11:42:25 +0200
Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org
In-Reply-To: <4C318A2E.1070303@egenix.com>
References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com>
Message-ID: <4C31A901.2000800@jollans.com>

On 07/05/2010 09:30 AM, M.-A. Lemburg wrote:
> Thomas Jollans wrote:
>> On 07/04/2010 03:53 PM, Tarek Ziadé wrote:
>>>
>>> Do you have an easy way to perform this cleanup ? Could you propose a
>>> process here ?
>>> >>> I am bit skeptical that contributors will do this, whereas a patch >>> policy makes sure we don't have to deal with this, and avoid asking >>> people to have a high mercurial-fu. I am also skeptical that >>> contributors are willing to digg into a clone to get what they want >>> and/or check that it's fine to pull. >>> >>> It seems to me that patches are the universal format to propose a >>> change and are easy to produce and consume. Contributors can use any >>> process they want to create it, even without using mercurial. >> >> There's no reason to force those of us capable of producing clean hg >> branches back into the world of patches. I can see why you'd want to be >> able to say "no, this repo is a mess. Submit something presentable, like >> a patch." Some "how to contribute" document might recommend using mq, >> but it shouldn't be a requirement - pulling comes naturally with DVCS. >> Python should use it. >> >> Accept patches - sure - not everyone uses mercurial. Require patches - >> please don't! > > I'm with Tarek here: the only way for core developers to be able to > review checkins on the checkins list is by looking at the patches > that go in. > > Having to look at 10+ checkin emails for a single "patch" will > break this review process - no-one will be able to follow what > a particular pulled set of changes will do in the end, compared > to what we had in the repo before the pull. As a result, the > review process will no longer be possible. If the problem is the amount of changesets per "patch", then it has to be the responsibility of the person committing - be that a core dev or an external contributor - to make sure it's only a single changeset. OTOH, I don't think being that strict about it is a good idea - in many cases, having a handful of changesets is, IMHO, better, with Mercurial. 
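For the cases where a single changeset really is wanted, the rebase extension can do the squashing in one step before the push (the revision number here is hypothetical):

```
hg rebase --collapse --source 1234 --dest default   # fold 1234 and its descendants into one cset on default
```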
Either way, if there is some sort of policy stating how changes should look, I for one would be happy to publish a branch on bitbucket or my own hgweb instance in that format. Permitting text patches is a must, but requiring text patches when we have actual distributed branching is quite the anachronism. > > As example, see Tarek's pull/push of the distutils2 work. Those > checkin email will just rush by and not get a second or third > review. If the problem is the amount of emails per "patch" then, for god's sake, change the script that writes the emails to send a mail per push, instead of a mail per commit ! DVCSs allow one to have small, atomic (but, of course, inter-dependent) commits, and push them later. I myself feel that this property should be valued, not feared. > OTOH, I don't think that requiring to open a ticket on the tracker > for everything is needed either. > > Aside 1: Isn't it interesting that the more we actually think about > moving to Mercurial, the more we find that the existing Subversion > model of working is actually a very workable model for a large > open source project ?! It's all a question of how changes are reviewed and synchronised. Of course, the Python subversion model works, no question. The specific approach "turn every commit into an email for proof reading" appears to work well with it. It may not work as well with hg. It may work better if you s/commit/push/ instead of s/commit/changeset/. Other projects work in a more distributed fashion, with developers' private repositories, changes being reviewed before pulling/merging. Linux is, of course, a prominent example. If this approach is for Python, I don't know. I doubt it, at least for the time being. But a suitable workflow will surely be found. Ah well, we'll see what happens. Thomas From mal at egenix.com Mon Jul 5 12:11:06 2010 From: mal at egenix.com (M.-A. 
Lemburg) Date: Mon, 05 Jul 2010 12:11:06 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <4C31A901.2000800@jollans.com> References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> <4C31A901.2000800@jollans.com> Message-ID: <4C31AFBA.9080800@egenix.com> Thomas Jollans wrote: > On 07/05/2010 09:30 AM, M.-A. Lemburg wrote: >> Thomas Jollans wrote: >>> On 07/04/2010 03:53 PM, Tarek Ziad? wrote: >>>> >>>> Do you have an easy way to perform this cleanup ? Could you propose a >>>> process here ? >>>> >>>> I am bit skeptical that contributors will do this, whereas a patch >>>> policy makes sure we don't have to deal with this, and avoid asking >>>> people to have a high mercurial-fu. I am also skeptical that >>>> contributors are willing to digg into a clone to get what they want >>>> and/or check that it's fine to pull. >>>> >>>> It seems to me that patches are the universal format to propose a >>>> change and are easy to produce and consume. Contributors can use any >>>> process they want to create it, even without using mercurial. >>> >>> There's no reason to force those of us capable of producing clean hg >>> branches back into the world of patches. I can see why you'd want to be >>> able to say "no, this repo is a mess. Submit something presentable, like >>> a patch." Some "how to contribute" document might recommend using mq, >>> but it shouldn't be a requirement - pulling comes naturally with DVCS. >>> Python should use it. >>> >>> Accept patches - sure - not everyone uses mercurial. Require patches - >>> please don't! >> >> I'm with Tarek here: the only way for core developers to be able to >> review checkins on the checkins list is by looking at the patches >> that go in. 
>>
>> Having to look at 10+ checkin emails for a single "patch" will
>> break this review process - no-one will be able to follow what
>> a particular pulled set of changes will do in the end, compared
>> to what we had in the repo before the pull. As a result, the
>> review process will no longer be possible.
>
> If the problem is the amount of changesets per "patch", then it has to
> be the responsibility of the person committing - be that a core dev or
> an external contributor - to make sure it's only a single changeset.
> OTOH, I don't think being that strict about it is a good idea - in many
> cases, having a handful of changesets is, IMHO, better, with Mercurial.
>
> Either way, if there is some sort of policy stating how changes should
> look, I for one would be happy to publish a branch on bitbucket or my
> own hgweb instance in that format. Permitting text patches is a must,
> but requiring text patches when we have actual distributed branching is
> quite the anachronism.

You need those patches anyway, since that's how we review things on the issue tracker.

The point I wanted to make was that (at least some of) the core devs do monitor the checkins list for new code and/or changes to existing code going in. This would no longer reasonably work if you start pushing revisions of patches down the list as well.

The history of those patches is not all that interesting to Python developers. It's the final outcome that makes the difference.

>> As an example, see Tarek's pull/push of the distutils2 work. Those
>> checkin emails will just rush by and not get a second or third
>> review.
>
> If the problem is the amount of emails per "patch" then, for god's sake,
> change the script that writes the emails to send a mail per push,
> instead of a mail per commit !
>
> DVCSs allow one to have small, atomic (but, of course, inter-dependent)
> commits, and push them later. I myself feel that this property should be
> valued, not feared.
This is not a matter of receiving the patch in 10+ emails, or lumping everything into one email. I simply don't see any benefit in having to follow the path of development of a patch. Much to the contrary: it only adds noise that distracts from the important bits. The discussion of a patch is recorded on the issue tracker anyway and in a form that is more easily comprehensible than a set of checkin messages. >> OTOH, I don't think that requiring to open a ticket on the tracker >> for everything is needed either. >> >> Aside 1: Isn't it interesting that the more we actually think about >> moving to Mercurial, the more we find that the existing Subversion >> model of working is actually a very workable model for a large >> open source project ?! > > It's all a question of how changes are reviewed and synchronised. Of > course, the Python subversion model works, no question. The specific > approach "turn every commit into an email for proof reading" appears to > work well with it. It may not work as well with hg. It may work better > if you s/commit/push/ instead of s/commit/changeset/. Other projects > work in a more distributed fashion, with developers' private > repositories, changes being reviewed before pulling/merging. Linux is, > of course, a prominent example. If this approach is for Python, I don't > know. I doubt it, at least for the time being. But a suitable workflow > will surely be found. > > > Ah well, we'll see what happens. Certainly. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 05 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2010-07-19: EuroPython 2010, Birmingham, UK 13 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! 
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From p.f.moore at gmail.com Mon Jul 5 14:20:43 2010 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 5 Jul 2010 13:20:43 +0100 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <4C31AFBA.9080800@egenix.com> References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> <4C31A901.2000800@jollans.com> <4C31AFBA.9080800@egenix.com> Message-ID: On 5 July 2010 11:11, M.-A. Lemburg wrote: > The point I wanted to make was that (at least some of) the core > devs do monitor the checkins list for new code and/or changes > to existing code going in. This would no longer reasonably > work, if you start pushing revisions of patches down the list > as well. I agree entirely that commits should be made up of "completed" patches, not of "work in progress" (patch 2 fixing a badly named variable in patch 1, etc). But there may be merit in breaking a large patch into a series of self-contained, incremental changes - which *can* be reviewed independently, but which make sense as a group. For example, one patch that introduces set literals, a second which updates the standard library code to use them. As a more radical possibility, a patch could be broken up into 3, one with the code changes, one with the tests and one with the documentation. That may be less acceptable, although it does allow for the possibility of someone with little C experience to contribute by reviewing the docs and tests without having to worry about the code. Ultimately, it's for the core devs to decide, though. Paul.
From dirkjan at ochtman.nl Mon Jul 5 15:46:39 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Mon, 05 Jul 2010 15:46:39 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <4C31AFBA.9080800@egenix.com> References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> <4C31A901.2000800@jollans.com> <4C31AFBA.9080800@egenix.com> Message-ID: On 2010-07-05 12:11, M.-A. Lemburg wrote: > The point I wanted to make was that (at least some of) the core > devs do monitor the checkins list for new code and/or changes > to existing code going in. This would no longer reasonably > work, if you start pushing revisions of patches down the list > as well. That was not what I meant at all. You don't send different patch revisions, or incremental improvements to a single change into a single repository. You send in chunks of changes that can stand on their own (for example in the test suite), instead of a single large patch that's much harder to review, which contains everything needed to fix a single issue. Cheers, Dirkjan From tjreedy at udel.edu Mon Jul 5 21:20:41 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 05 Jul 2010 15:20:41 -0400 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> <4C31A901.2000800@jollans.com> <4C31AFBA.9080800@egenix.com> Message-ID: On 7/5/2010 8:20 AM, Paul Moore wrote: > On 5 July 2010 11:11, M.-A. Lemburg wrote: >> The point I wanted to make was that (at least some of) the core >> devs do monitor the checkins list for new code and/or changes >> to existing code going in. This would no longer reasonably >> work, if you start pushing revisions of patches down the list >> as well.
> > I agree entirely that commits should be made up of "completed" > patches, not of "work in progress" (patch 2 fixing a badly named > variable in patch 1, etc). > > But there may be merit in breaking a large patch into a series of > self-contained, incremental changes - which *can* be reviewed > independently, but which make sense as a group. For example, one patch > that introduces set literals, a second which updates the standard > library code to use them. Devs have occasionally asked a submitter of a large patch to split it into reviewable pieces. But that should be a special-case decision of a committer/reviewer. > As a more radical possibility, a patch could > be broken up into 3, one with the code changes, one with the tests and > one with the documentation. That may be less acceptable, although it > does allow for the possibility of someone with little C experience to > contribute by reviewing the docs and tests without having to worry > about the code. I do not see that as being so useful. Patches have a section for each file and I have no trouble not reading a file section. Part of review is checking that doc and code changes match. Also, the test and code patches have to be applied together. -- Terry Jan Reedy From brett at python.org Mon Jul 5 22:11:21 2010 From: brett at python.org (Brett Cannon) Date: Mon, 5 Jul 2010 13:11:21 -0700 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <4C318A2E.1070303@egenix.com> References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> Message-ID: On Mon, Jul 5, 2010 at 00:30, M.-A. Lemburg wrote: [SNIP] > Aside 1: Isn't it interesting that the more we actually think about > moving to Mercurial, the more we find that the existing Subversion > model of working is actually a very workable model for a large > open source project ?! Not really. The current system works and is understood without retraining.
The switch to hg has never been about tweaking the workflow of committers, but that of contributors. From ncoghlan at gmail.com Mon Jul 5 22:41:19 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 6 Jul 2010 06:41:19 +1000 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> Message-ID: On Tue, Jul 6, 2010 at 6:11 AM, Brett Cannon wrote: > On Mon, Jul 5, 2010 at 00:30, M.-A. Lemburg wrote: > [SNIP] >> Aside 1: Isn't it interesting that the more we actually think about >> moving to Mercurial, the more we find that the existing Subversion >> model of working is actually a very workable model for a large >> open source project ?! > > Not really. The current system works and is understood without > retraining. The switch to hg has never been about tweaking the > workflow of committers, but that of contributors. Although, as with the CVS to SVN transition, the workflows of committers will likely change over time as we become more adept at exploiting the more powerful tool. I liked Joel Spolsky's observation that in moving from a centralised VCS to a distributed VCS, the key idea to wrap your head around is the shift from managing file (and repository) revisions to coherent changesets. I suspect that's something that can only happen properly by using a DVCS for a while. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Tue Jul 6 02:15:17 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Jul 2010 02:15:17 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> Message-ID: <20100706021517.2a6c7aad@pitrou.net> On Tue, 6 Jul 2010 06:41:19 +1000 Nick Coghlan wrote: > > Although, as with the CVS to SVN transmissions, the workflows of > committers will likely change over time as we become more adept at > exploiting the more powerful tool. > > I liked Joel Spolsky's observation that in moving from a centralised > VCS to a distributed VCS, the key idea to wrap your head around is the > shift from managing file (and repository) revisions to coherent > changesets. I suspect Spolsky has skipped on SVN then, because SVN already allows for coherent changesets (that's how we use it most of the time anyway). Regards Antoine. From stephen at xemacs.org Tue Jul 6 07:07:15 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 06 Jul 2010 14:07:15 +0900 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <20100706021517.2a6c7aad@pitrou.net> References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> <20100706021517.2a6c7aad@pitrou.net> Message-ID: <87pqz19p18.fsf@uwakimon.sk.tsukuba.ac.jp> Antoine Pitrou writes: > SVN already allows for coherent changesets (that's how we use it > most of the time anyway). Indeed. That's one of the things I really like about this project. From stephen at xemacs.org Tue Jul 6 07:16:33 2010 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Tue, 06 Jul 2010 14:16:33 +0900 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C3090CD.7020909@ochtman.nl> <20100704172656.1dae21c1@pitrou.net> Message-ID: <87ocel9olq.fsf@uwakimon.sk.tsukuba.ac.jp> Georg Brandl writes: > Am 04.07.2010 17:26, schrieb Antoine Pitrou: > > On Sun, 04 Jul 2010 15:46:53 +0200 > > Dirkjan Ochtman wrote: > >> > >> Fourth, one-patch-per-issue is too restrictive. Small commits are useful > >> because they're way easier to review. Concatenate several small commits > >> leading up to a single issue fix into a single patch and it gets much > >> harder to read. > > > > I don't agree with that. The commits obviously won't be independent > > because they will be motivated by each other (or even dependent on each > > other), therefore you have to remember what the other commits do when > > reviewing one of them. What's more, when reading "hg log" months or > > years later, it is hard to make sense of a single commit because you > > don't really know what issue it was meant to contribute to fix. > > > > I know that's how Mercurial devs do things, but I don't really like > > it. > > I think the best of both worlds is to encourage contributors to send > more complicated patches in a series of easy-to-review steps, but when > committing to Python, make one changeset out of them. I don't see how this addresses Antoine's problem of connecting commits to issues at all. Some ways to address it are (1) require issue numbers in log messages, if there is an applicable issue (for non-committers, there should be a patch issue on the tracker, right?) and (2) require that the commits addressing a single issue be done on a single separate branch, then merged (which doesn't connect issues to commits, but does connect a series of commits). I don't really see why commits should take place in a lump, either. That makes bisecting less accurate, for one thing. 
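Stephen's aside about bisection accuracy can be made concrete with a toy sketch (plain Python, not Mercurial's actual bisect machinery; the helper name `first_bad` and the synthetic histories are made up for illustration). Bisecting a linear history finds the first bad commit in O(log n) probes, but if k commits were squashed into one, the answer can only be localised to a k-commit lump:

```python
def first_bad(history, is_bad):
    # Binary search, assuming is_bad is monotone: once a commit is bad,
    # every later commit is bad too (the usual bisect precondition).
    lo, hi = 0, len(history)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(history[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Ten fine-grained commits; the bug arrived in commit 6.
fine = list(range(10))
assert first_bad(fine, lambda c: c >= 6) == 6

# The same work squashed into pairs: bisection can only narrow the bug
# down to squashed commit 3, i.e. somewhere in original commits 6-7.
squashed = [(i, i + 1) for i in range(0, 10, 2)]
assert first_bad(squashed, lambda pair: pair[1] >= 6) == 3
```

The coarser the commits, the coarser the answer a bisect run can give, which is the accuracy loss Stephen is pointing at.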
Nor does it help with review; the review is already done by the time the commit takes place, no? OTOH, people who have a specific interest and want to review ex post are often going to want the bite-size patches, just as the original reviewer did, no? From ncoghlan at gmail.com Tue Jul 6 12:56:00 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 6 Jul 2010 20:56:00 +1000 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <20100706021517.2a6c7aad@pitrou.net> References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> <20100706021517.2a6c7aad@pitrou.net> Message-ID: On Tue, Jul 6, 2010 at 10:15 AM, Antoine Pitrou wrote: > On Tue, 6 Jul 2010 06:41:19 +1000 > Nick Coghlan wrote: >> >> Although, as with the CVS to SVN transition, the workflows of >> committers will likely change over time as we become more adept at >> exploiting the more powerful tool. >> >> I liked Joel Spolsky's observation that in moving from a centralised >> VCS to a distributed VCS, the key idea to wrap your head around is the >> shift from managing file (and repository) revisions to coherent >> changesets. > > I suspect Spolsky has skipped on SVN then, because SVN already allows > for coherent changesets (that's how we use it most of the time anyway). No it doesn't. It has atomic commits (as do many other centralised version control systems), but it still only manages file revisions. The mental conversion Spolsky was talking about was specifically from SVN to Hg, the same one we're looking at. A DVCS isn't written in terms of file revisions the way SVN is, it's written in terms of a directed acyclic graph of changesets. If anyone wants to see what he actually wrote, rather than my hacked up paraphrase of it, it's the last programming article he did for Joel on Software: http://www.joelonsoftware.com/items/2010/03/17.html Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Tue Jul 6 13:49:28 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 06 Jul 2010 20:49:28 +0900 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> <20100706021517.2a6c7aad@pitrou.net> Message-ID: <87d3v0akzb.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > The mental conversion Spolsky was talking about was specifically from > SVN to Hg, the same one we're looking at. A DVCS isn't written in > terms of file revisions the way SVN is, it's written in terms of a > directed acyclic graph of changesets. Sure. But I think Antoine's right. So is the Python workflow. At any given time, you've got dozens of patches in active development in people's workspaces and on the tracker. As they get baked, you pull in a coherent set and commit it. Here's what Joel says: In Subversion, you might think, "bring my version up to date with the main version" or "go back to the previous version." In Mercurial, you think, "get me Jacob's change set" or "let's just forget that change set." While it's certainly true that to work with Python's Subversion repo you need to translate to terms of a fairly linear progression of versions, I don't see people thinking that way about the workflow. I think people do expect commits to the svn repo to be coherent, and by and large they are. I personally expect this migration to make a big difference to the core committers, because it gives them that much more flexibility. Casual committers and pull-only tester types may have some trouble adjusting, but I really don't think it will be that bad. 
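Nick's distinction between per-file revisions and a DAG of changesets can be sketched with a toy model (the `Changeset` class and names here are hypothetical, not hg's internals): each changeset records only its parent changesets, and a merge is simply a changeset with two parents.

```python
class Changeset:
    """Toy changeset: a node in a history DAG, knowing only its parents."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)

    def ancestors(self):
        # Walk the parent links to collect every reachable ancestor name.
        seen = set()
        stack = list(self.parents)
        while stack:
            cs = stack.pop()
            if cs.name not in seen:
                seen.add(cs.name)
                stack.extend(cs.parents)
        return seen

root = Changeset("root")
feature = Changeset("feature-work", [root])
mainline = Changeset("mainline-fix", [root])
merge = Changeset("merge", [feature, mainline])  # a merge: two parents

assert merge.ancestors() == {"root", "feature-work", "mainline-fix"}
```

Nothing in the model mentions file revision numbers at all; history is just the graph, which is the mental shift Spolsky (and Nick) describe.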
From daniel at stutzbachenterprises.com Tue Jul 6 16:58:08 2010 From: daniel at stutzbachenterprises.com (Daniel Stutzbach) Date: Tue, 6 Jul 2010 09:58:08 -0500 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C308B38.2050800@ochtman.nl> <4C30B195.1080804@jollans.com> <4C318A2E.1070303@egenix.com> Message-ID: On Mon, Jul 5, 2010 at 3:11 PM, Brett Cannon wrote: > The switch to hg has never been about tweaking the > workflow of committers, but that of contributors. > I've always thought of it as tweaking the workflow of collaboration. As an individual contributor and non-committer, the server switch isn't going to impact my workflow much. I use a DVCS locally to manage my work and then I submit a patch on the bug tracker. After the server switch, I'll do the same. A DVCS server will help a lot when I'm collaborating on a patch with others. As a concrete example, a few months ago I wrote a patch to speed up math.factorial (issue8692). Alexander Belopolsky and Mark Dickinson found a few corner-case flaws, suggested code-cleanup improvements, and some algorithmic alternatives. We went back and forth with several variations of patches before settling on a final patch. When Python is natively hosted in Mercurial, then the tools can explicitly track the relationship between all of the experimental patches. When just pushing patch files around, it's pretty hard to see that factorial-precompute-partials.patch is based on factorial-no-recursion.patch if you haven't been following the issue closely. It's also hard to examine the incremental changes between the two, which makes it hard to review an updated patch after reviewing the original. All of that would be a lot easier if I had started my patch as a clone of py3k on bitbucket. At the end of the process, the final committer can consolidate the changes into a single patch to keep the core repository clean. -- Daniel Stutzbach, Ph.D. 
President, Stutzbach Enterprises, LLC From g.brandl at gmx.net Wed Jul 7 19:31:54 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 07 Jul 2010 19:31:54 +0200 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: <87ocel9olq.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4C3090CD.7020909@ochtman.nl> <20100704172656.1dae21c1@pitrou.net> <87ocel9olq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Am 06.07.2010 07:16, schrieb Stephen J. Turnbull: > Georg Brandl writes: > > Am 04.07.2010 17:26, schrieb Antoine Pitrou: > > > On Sun, 04 Jul 2010 15:46:53 +0200 > > > Dirkjan Ochtman wrote: > > >> > > >> Fourth, one-patch-per-issue is too restrictive. Small commits are useful > > >> because they're way easier to review. Concatenate several small commits > > >> leading up to a single issue fix into a single patch and it gets much > > >> harder to read. > > > > > > I don't agree with that. The commits obviously won't be independent > > > because they will be motivated by each other (or even dependent on each > > > other), therefore you have to remember what the other commits do when > > > reviewing one of them. What's more, when reading "hg log" months or > > > years later, it is hard to make sense of a single commit because you > > > don't really know what issue it was meant to contribute to fix. > > > > > > I know that's how Mercurial devs do things, but I don't really like > > > it. > > > > I think the best of both worlds is to encourage contributors to send > > more complicated patches in a series of easy-to-review steps, but when > > committing to Python, make one changeset out of them. > > I don't see how this addresses Antoine's problem of connecting commits > to issues at all. I wasn't addressing Antoine's original problem, rather his reply to Dirkjan. Georg From stephen at xemacs.org Thu Jul 8 01:22:00 2010 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Thu, 08 Jul 2010 08:22:00 +0900 Subject: [Python-ideas] Using only patches for pulling changes in hg.python.org In-Reply-To: References: <4C3090CD.7020909@ochtman.nl> <20100704172656.1dae21c1@pitrou.net> <87ocel9olq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87lj9mkhd3.fsf@uwakimon.sk.tsukuba.ac.jp> Georg Brandl writes: > Am 06.07.2010 07:16, schrieb Stephen J. Turnbull: > > Georg Brandl writes: > > > Am 04.07.2010 17:26, schrieb Antoine Pitrou: > > > > On Sun, 04 Jul 2010 15:46:53 +0200 > > > > Dirkjan Ochtman wrote: > > > >> > > > >> Fourth, one-patch-per-issue is too restrictive. Small > > > >> commits are useful because they're way easier to > > > >> review. Concatenate several small commits leading up to a > > > >> single issue fix into a single patch and it gets much > > > >> harder to read. > > > > > > > > I don't agree with that. The commits obviously won't be > > > > independent because they will be motivated by each other (or > > > > even dependent on each other), therefore you have to > > > > remember what the other commits do when reviewing one of > > > > them. What's more, when reading "hg log" months or years > > > > later, it is hard to make sense of a single commit because > > > > you don't really know what issue it was meant to contribute > > > > to fix. > > > > > > > > I know that's how Mercurial devs do things, but I don't > > > > really like it. > > > > > > I think the best of both worlds is to encourage contributors > > > to send more complicated patches in a series of easy-to-review > > > steps, but when committing to Python, make one changeset out > > > of them. > > > > I don't see how this addresses Antoine's problem of connecting > > commits to issues at all. > > I wasn't addressing Antoine's original problem, rather his reply to > Dirkjan. Huh? Are you referring to something other than the part of his post that you quoted? 
Antoine writes "you have to remember what the other commits do when reviewing them" and "it is hard to make sense of a single commit [in a series] because you don't know what issue it was meant to fix". I admit I'm not really sure what his issue is. It seems to me that connecting commits is what a feature branch (in conjunction with rebase) is designed to achieve. If you don't like rebase, you can either work fast enough that your whole sequence is done before the mainline moves on significantly, or you can refrain from updating until done (and have a potentially messy merge), or you can use MQ (which is really just a way of rebasing without the shame ;-). I'll have to test it, but AFAIK in all of the above strategies, as long as you don't push to the public repo until done, the logs of the commits on the feature branch should all be adjacent in the natural order of hg log. That seems to me to be the optimal strategy, in combination with reading long parts of history in a graphical DAG browser. Of course, that assumes that random pieces of the fix aren't dispersed among commits. In that case the logs will still be hard to read and understand, as will the diffs. People who like to commit early and often should indeed be encouraged to edit their feature branches to make each individual commit make sense to reviewers. (MQ helps to address this, as does Bazaar's loom feature or StGit.) Feature branches don't automatically organize commits in an intelligible way, that requires an intelligence driving the process. But they do make it possible. Once you have feature branches, then there's a question of the external issue. Here reviewers should pay attention to the log message, and make sure it describes the problem well, and includes cross references to any documentation (tracker issue or ML thread). But that's no different from the current process. 
I think that in many cases the process of coming up with coherent changesets that are reviewable will indeed result in a single commit to address the whole issue. But there will also be multicommit patterns that make sense, such as "refactor API and update current clients -> use new feature in a few places". The thing to remember is that DVCSes not only record a frozen view of history accurately, but can also be used to flexibly reorganize the presentation of that history "as it should have happened". I think of these workflows as opportunities to *improve* the quality of information presented by the history. But they aren't mandated by adopting hg. Contributors and reviewers who are satisfied with the current process should continue to refine a set of changes to a single commit. hg is certainly flexible enough to allow that, with several different workflows. And Antoine's worries (AIUI) are not unfounded. Eg, we should not allow people to be lazy and submit a feature branch with changes randomly assigned to different commits and log messages like "Lunch time! commit progress to date." But that's a social problem; I think that conventions will quickly evolve *from* the one patch per issue workflow to a *well-organized* feature branch per issue (as appropriate) because python-dev reviewers will demand it. From jacobidiego at gmail.com Sun Jul 11 19:58:43 2010 From: jacobidiego at gmail.com (Diego Jacobi) Date: Sun, 11 Jul 2010 14:58:43 -0300 Subject: [Python-ideas] pop multiple elements of a list at once Message-ID: Hi. As recommended here: http://bugs.python.org/issue9218 I am posting this to this list. I am currently working with a buffer on a USB device and pyusb. So when I read from the buffer of an endpoint, I get an array.Array() list. I handle this chunk of data with a thread to send and receive the information that I need.
In this thread, I load a list with all the information that is read from the USB device, and another layer will pop this information from the thread's buffer. The thing I found is that, to pop a variable chunk of data from this buffer without copying it and deleting the elements, I have to pop one element at a time. def get_chunk(self, size): for x in range(size): yield self.recv_buffer.pop() I guess that it would be improved if I could just pop a defined number of elements, like this: pop self.recv_buffer[:-size] or self.recv_buffer.pop(,-size) That would be... "pop from (the last element minus size) to (the last element)" in that way there is only one memory transaction. The new list (or maybe a tuple) points to the old memory address and the recv_buffer is advanced to a new address. Data is not moved. Note that I like the idea of using "pop" as the "del" operator for lists, but I am aware that this would not be backward compatible. Thanks. Diego From brett at python.org Sun Jul 11 20:47:53 2010 From: brett at python.org (Brett Cannon) Date: Sun, 11 Jul 2010 11:47:53 -0700 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: References: Message-ID: On Sun, Jul 11, 2010 at 10:58, Diego Jacobi wrote: > Hi. > As recommended here: http://bugs.python.org/issue9218 > I am posting this to this list. > > > > I am currently working with a buffer on a USB device and pyusb. > So when I read from the buffer of an endpoint, I get an array.Array() list. > I handle this chunk of data with a thread to send and receive the > information that I need. > In this thread, I load a list with all the information that is read > from the USB device, and another layer will pop this information from > the thread's buffer. > > The thing I found is that, to pop a variable chunk of data from this > buffer without copying it and deleting the elements, I have to pop one > element at a time. > > def get_chunk(self, size): > for x in range(size): > yield self.recv_buffer.pop() > > I guess that it would be improved if I could just pop a defined number > of elements, like this: > > pop self.recv_buffer[:-size] > or > self.recv_buffer.pop(,-size) > > That would be... "pop from (the last element minus size) to (the last element)" > in that way there is only one memory transaction. > The new list (or maybe a tuple) points to the old memory address and > the recv_buffer is advanced to a new address. Data is not moved. Why can't you do ``del self.recv_buffer[-size:]``? > > Note that I like the idea of using "pop" as the "del" operator for > lists, but I am aware that this would not be backward compatible. Too specialized, so that will never fly. -Brett From ncoghlan at gmail.com Sun Jul 11 23:18:49 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Jul 2010 07:18:49 +1000 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: References: Message-ID: I think you overestimate how standardised we could make this across all platforms and data structures. Under the hood, any such expansion to the .pop API would almost certainly be defined as equivalent to: def pop(self, index): result = self[index] del self[index] return result such that slice objects could be passed in as well as integers (or integer equivalents). (Currently pop on builtin objects rejects slice objects, as it only works with integers) In the meantime, if you want to manipulate memory while minimising copying, then the 2.7 memoryview object may be for you (assuming you can switch to the later version). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cmjohnson.mailinglist at gmail.com Mon Jul 12 05:39:42 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sun, 11 Jul 2010 17:39:42 -1000 Subject: [Python-ideas] explicitation lines in python ?
In-Reply-To: References: <4C24FEAF.4030304@gmail.com> <4C26F710.4030902@gmail.com> Message-ID: On Sun, Jun 27, 2010 at 8:25 PM, Nick Coghlan wrote: > The availability of "nonlocal" binding semantics also makes the > semantics much easier to define than they were in those previous > discussions (the lack of clear semantics for name binding statements > with an attached local namespace was the major factor blocking > creation of a reference implementation for this proposal back then). > > For example: > > c = sqrt(a*a + b*b) where: > a = retrieve_a() > b = retrieve_b() > > could translate to something like: > > def _anon(): # *(see below) > nonlocal c > a = retrieve_a() > b = retrieve_b() > c = sqrt(a*a + b*b) > _anon() > > *(unlike Python code, the compiler can make truly anonymous functions > by storing them solely on the VM stack. It already does this when > executing class definitions): I like this idea, but I would tweak it slightly. Maybe we should say EXPRESSION where: BLOCK is equivalent to def _(): BLOCK return EXPRESSION _() That way, c = a where: a = 7 would be equivalent to def _(): a = 7 return a c = _() One advantage of this equivalence is it would make it easier to work around a longstanding scoping gotcha. A naïve coder might expect this code to print out numbers 0 to 4: >>> fs = [] >>> for n in range(5): ... def f(): ... print(n) ... fs.append(f) ... >>> [f() for f in fs] 4 4 4 4 4 [None, None, None, None, None] I think we all have enough experience to know this isn't a totally unrealistic scenario. I personally stumbled into it when I was trying to create a class by looping through a set of method names. To get around it, one could use a where clause like so: fs = [] for n in range(5): fs.append(f) where: shadow = n def f(): print(shadow) This would print out 0 to 4 as expected and be equivalent to >>> fs = [] >>> for n in range(5): ... def _(): ... shadow = n ... def f(): ... print(shadow) ... fs.append(f) ... _() ...
>>> [f() for f in fs] 0 1 2 3 4 [None, None, None, None, None] I think a where-clause with def-like namespace semantics would be a positive addition to Python, once the moratorium is up. -- Carl Johnson From tjreedy at udel.edu Mon Jul 12 07:29:01 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 12 Jul 2010 01:29:01 -0400 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: References: Message-ID: On 7/11/2010 1:58 PM, Diego Jacobi wrote: > The thing I found is that, to pop a variable chunk of data from this > buffer without copying it and deleting the elements, I have to pop one > element at a time. In CPython, popping copies a reference and then deletes it from the list. The item popped is not copied. It is a convenience, which I proposed, but not a necessity. You can easily write a function that returns a slice after deleting it. def pop_slice(lis, n): tem = lis[-n:] del lis[-n:] return tem I expect this to run faster than popping more than a few items one at a time. -- Terry Jan Reedy From guido at python.org Mon Jul 12 08:35:13 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Jul 2010 08:35:13 +0200 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: References: Message-ID: On Sun, Jul 11, 2010 at 7:58 PM, Diego Jacobi wrote: > I guess that it would be improved if I could just pop a defined number > of elements, like this: > > pop self.recv_buffer[:-size] > or > self.recv_buffer.pop(,-size) > > That would be... "pop from (the last element minus size) to (the last element)" > in that way there is only one memory transaction. > The new list (or maybe a tuple) points to the old memory address and > the recv_buffer is advanced to a new address. Data is not moved. I think you misunderstand the implementation of lists (and the underlying malloc()). You can't break the memory used for the list elements into two pieces and give new ownership to a (leading) section of it.
However you also seem to be worrying about "copying" too much -- the only things that get copied are the *pointers* to the objects popped off the stack, which is very cheap compared to the rest of the operation. It is true that to pop off a whole slice there is a more efficient way than calling pop() repeatedly -- but there's no need for a new primitive operation, as it can already be done by copying and then deleting the slice (again, the copying only copies the pointers). Try reading up on Python's memory model for objects, it will be quite enlightening. -- --Guido van Rossum (python.org/~guido) From cmjohnson.mailinglist at gmail.com Mon Jul 12 08:50:28 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sun, 11 Jul 2010 20:50:28 -1000 Subject: [Python-ideas] explicitation lines in python ? In-Reply-To: References: <4C24FEAF.4030304@gmail.com> <4C26F710.4030902@gmail.com> Message-ID: One more quick thought about the advantage of a where-clause. Often times, there is thought of creating the equivalent of Haskell-style pattern matching using decorators. For example PJE has worked on creating "generic functions." One problem with using a decorator for this is that for use cases more complicated than just matching on type, the matcher itself needs to be a function that looks at the arguments then returns true or false based on whether they match a pattern. So, a proper decorator would need to take *two* functions, one to do the matching and one to actually be the body of the function. You can do this to some extent with lambdas or decorating and redecorating, but it quickly becomes a little tedious. With a where-clause one might instead write: fib = base(cond, action) where: def cond(n): return n in (0, 1) def action(n): return 1 fib.add_match(cond, action) where: def cond(n): return isinstance(n, int) and n > 1 def action(n): return n + fib(n - 1) And also for the property decorator. 
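The `base`/`add_match` names in Carl's example are hypothetical, but a minimal dispatcher of that shape is easy to sketch in plain Python today, without the where-clause sugar (using the standard Fibonacci recurrence for the recursive body, rather than the `n + fib(n - 1)` shorthand above):

```python
def base(cond, action):
    """Build a dispatcher from an initial (matcher, body) pair.

    Hypothetical API, sketched only to illustrate the shape the
    where-clause example above assumes.
    """
    matchers = [(cond, action)]

    def dispatch(*args):
        # Try each registered matcher in order; run the first body
        # whose condition accepts the arguments.
        for c, a in matchers:
            if c(*args):
                return a(*args)
        raise TypeError("no matching pattern for %r" % (args,))

    dispatch.add_match = lambda c, a: matchers.append((c, a))
    return dispatch

fib = base(lambda n: n in (0, 1), lambda n: 1)
fib.add_match(lambda n: isinstance(n, int) and n > 1,
              lambda n: fib(n - 1) + fib(n - 2))

print(fib(5))  # -> 8
```

The where-clause version in the email would simply let the `cond`/`action` pairs be written as named `def` statements instead of lambdas.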
GvR made up a nice way of redecorating with properties, so this is a moot point now, but if we had had a where-clause before that, we could have instead written:

myprop = property(getter, setter, deleter) where:
    def getter(self):
        etc. etc.

OK, that's as much advocacy as I feel like doing. See you all again in 6 months, when something like this is proposed again. ;-)

-- Carl

From ncoghlan at gmail.com Mon Jul 12 14:45:10 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Jul 2010 22:45:10 +1000 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: References: Message-ID:

On Mon, Jul 12, 2010 at 4:35 PM, Guido van Rossum wrote:
> I think you misunderstand the implementation of lists (and the
> underlying malloc()). You can't break the memory used for the list
> elements into two pieces and give new ownership to a (leading) section
> of it. However you also seem to be worrying about "copying" too much
> -- the only things that get copied are the *pointers* to the objects
> popped off the stack, which is very cheap compared to the rest of the
> operation. It is true that to pop off a whole slice there is a more
> efficient way than calling pop() repeatedly -- but there's no need for
> a new primitive operation, as it can already be done by copying and
> then deleting the slice (again, the copying only copies the pointers).

Note that the original poster was apparently talking about array.array() rather than an actual list (at least, that's the way I interpreted the phrase "array,Array() list"). In that context, the desire to avoid copying when invoking pop() makes a lot more sense than it does when using a builtin list. I agree that the suggestion of reassigning ownership of a chunk of an array is based on a misunderstanding of the way memory allocation works at the pymalloc and OS levels though.
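While ownership of part of an allocation can't be reassigned, a zero-copy window over an array's storage is available via memoryview. A quick illustration (Python 3 shown; under 2.7, indexing such a view returns a byte string rather than an int):

```python
from array import array

buf = array('b', range(20))   # 20 signed bytes in contiguous storage
view = memoryview(buf)        # a window onto buf's memory -- no copy

packet = view[:5]             # slicing the view is also copy-free
rest = view[5:]

buf[0] = 99                   # mutate the underlying array...
print(packet[0], len(rest))   # -> 99 15: the view sees the change
```

Because `packet` shares storage with `buf`, mutating the array is visible through the view, which is exactly the "window onto memory owned by another object" behaviour described below.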
For the record, neither pymalloc nor the OS supports breaking a chunk of already allocated memory in two that way - you need some master object to maintain control of it, and then use other pointers to look at subsections. Since memoryview objects in 3.x and 2.7 are designed specifically to provide a window onto a chunk of memory owned by another object (such as the storage underlying an array object) without copying, it seems like that is the kind of thing the original poster is actually looking for.

(That said, supporting slice objects in pop() still doesn't strike me as an insane idea, although I'd probably want to see some use cases before we went to the hassle of adding it).

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net Mon Jul 12 14:59:56 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 12 Jul 2010 14:59:56 +0200 Subject: [Python-ideas] pop multiple elements of a list at once References: Message-ID: <20100712145956.28695535@pitrou.net>

On Mon, 12 Jul 2010 22:45:10 +1000 Nick Coghlan wrote:
>
> For the record, neither pymalloc nor the OS supports breaking a chunk
> of already allocated memory in two that way - you need some master
> object to maintain control of it, and then use other pointers to look
> at subsections. Since memoryview objects in 3.x and 2.7 are designed
> specifically to provide a window onto a chunk of memory owned by
> another object (such as the storage underlying an array object)
> without copying, it seems like that is the kind of thing the original
> poster is actually looking for.

memoryviews don't provide a high-level view over their chunk of memory, though, only bytes-level. (they were specified to provide such a view, but it was never implemented)

From ncoghlan at gmail.com Mon Jul 12 15:06:27 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Jul 2010 23:06:27 +1000 Subject: [Python-ideas] explicitation lines in python ?
In-Reply-To: References: <4C24FEAF.4030304@gmail.com> <4C26F710.4030902@gmail.com> Message-ID:

On Mon, Jul 12, 2010 at 1:39 PM, Carl M. Johnson wrote:
> I like this idea, but I would tweak it slightly. Maybe we should say
>
> EXPRESSION where:
>     BLOCK
>
> is equivalent to
>
> def _():
>     BLOCK
>     return EXPRESSION
> _()

Implement it that way (or find someone who can), then get back to me* :)

That said, my suggested semantics still have the desired effect in your use case, since your expression does not contain a name binding operation, so it makes no difference whether name binding would have been handled via a return value (your suggestion, which I tried and failed to implement last time) or via nonlocal name bindings (my suggestion this time around).

Cheers, Nick.

*P.S. There's a reason I stopped pushing this idea back then: the absolute nightmare that was trying to implement it without ready access to nonlocal variable definitions (trying to figure out what the return value should be and how it should be unpacked in the surrounding scope was seriously ugly). Using nonlocal semantics instead should make it relatively straightforward (fairly similar to a class definition in fact, although the compilation options for the nested code object will be different and there'll be a bit of additional dancing during the symbol pass to figure out any implicit nonlocal declarations for the inner scope).

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon Jul 12 15:32:51 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Jul 2010 23:32:51 +1000 Subject: [Python-ideas] explicitation lines in python ?
In-Reply-To: References: <4C24FEAF.4030304@gmail.com> <4C26F710.4030902@gmail.com> Message-ID: > That said, my suggested semantics still have the desired effect in > your use case, since your expression does not contain a name binding > operation, so it makes no difference whether name binding would have > been handled via a return value (your suggestion, which I tried and > failed to implement last time) or via nonlocal name bindings (my > suggestion this time around). Bleh, I just remembered why nonlocal semantics won't work for this use case: nonlocal only looks at function namespaces, so class and module namespaces don't count. That behaviour would be unacceptable for a where clause implementation. So this suggestion going anywhere post-moratorium is firstly dependent on someone figuring out how to properly split name binding operations across the two namespaces (such that the values are generated in the inner scope, but assigned in the outer scope). As an example of the kind of thing that actually makes this a nightmare: x = b[index] = value where: index = calc_target_index() value = calc_value() It turns out that name binding is only part of the problem though. Variable *lookup* actually shares one of the problems of nonlocal name binding: it skips over class scopes, so the inner scope can't see class level names. Generator expressions and most comprehensions (all bar 2.x list comprehensions) already have this problem - at class scope, only the outermost iterator can see names defined in the class body, since everything else is in a nested scope where name lookup skips over the class due to the name lookup semantics that were originally designed for method implementations (i.e. before we had things like generator expressions that implicitly created new scopes). 
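The class-scope comprehension quirk described above can be demonstrated directly in current Python 3 (a sketch: only the outermost iterable of a comprehension is evaluated in the class body, so any other reference to a class-level name from inside the implicit scope fails):

```python
class Demo:
    names = ['a', 'b', 'c']

    # Works: the outermost iterable is evaluated in the class scope
    # and passed into the comprehension's hidden function.
    upper = [n.upper() for n in names]

    # Fails: 'names' inside the expression is looked up from the
    # comprehension's nested scope, which skips the class body.
    try:
        pairs = [(n, names.index(n)) for n in names]
    except NameError as exc:
        pairs = str(exc)

print(Demo.upper)   # -> ['A', 'B', 'C']
print(Demo.pairs)   # a NameError message mentioning 'names'
```

This is precisely the lookup rule a where-clause implementation would have to work around for class bodies.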
It took a while for all these evil variable referencing semantic problems to come back to me, but they're the kind of thing that needs to be addressed before a where clause proposal can be taken seriously. As I noted in my last message, I *did* try to implement this years ago and I now remember that the only way I can see it working is to define a completely new means of compiling a code object such that variable lookup and nonlocal namebinding can "see" an immediately surrounding class scope (as well as outer function scopes) and still fall back to global semantics if the name is not found explicitly in the symbol table. I believe such an addition would actually be beneficial in other ways, as I personally consider the current name lookup quirks in generator expressions to be something of a wart and these new semantics for implicit scopes could potentially be used to fix that (although perhaps not until Py4k). However, adding such lookup semantics is a seriously non-trivial task (I've been working with the current compiler since I helped get it ready for inclusion in 2.5 back when it was still on the ast-compiler branch and I'm not sure where I would even start on a project like that. It wouldn't just be the compiler either, the VM itself would almost certainly need some changes). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Mon Jul 12 15:35:57 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Jul 2010 23:35:57 +1000 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: <20100712145956.28695535@pitrou.net> References: <20100712145956.28695535@pitrou.net> Message-ID: On Mon, Jul 12, 2010 at 10:59 PM, Antoine Pitrou wrote: > memoryviews don't provide a high-level view over their chunk of memory, > though, only bytes-level. 
> (they were specified to provide such a view, but it was never
> implemented)

True, but the use case the original poster mentioned was for a bytes level view.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jacobidiego at gmail.com Mon Jul 12 17:13:34 2010 From: jacobidiego at gmail.com (Diego Jacobi) Date: Mon, 12 Jul 2010 12:13:34 -0300 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: References: <20100712145956.28695535@pitrou.net> Message-ID:

Hi. I apologize if I am having difficulty explaining myself in English. Also, I just realised that I wasn't answering to the list, but directly to Brett Cannon. Sorry for that.

I also take into account that you are right, and that my concept of how memory is handled for lists here is very different from how I am used to working in firmware.

Anyway, the concept behind my idea comes from my understanding of a low-level array. I do understand that popping returns only a pointer to that element. I didn't understand that every element inside a list is also a pointer to a Python type, so the data is not copied, but the pointer is. The thing is that the elements in my list are just bytes (unsigned char), and I think that an array of this type of data is called immutable in Python, which means that I may not be using the right datatype.

Anyway, slice support on pop(), and/or the ability to skip elements in a for loop without restarting the iteration, would clear up a lot of code. If my scenario is not yet clear, I give some more details below.

When I think of how this will work in memory:

def pop_slice(lis, n):
    tem = lis[:-n]
    del lis[:-n]
    return tem

I think of "copying the elements of an array":

BYTE* pop_slice(BYTE* list, unsigned int n){
    BYTE* temp = malloc(n*sizeof(BYTE));
    int i;
    for(i=0 ; i < n ; i++) {
        temp[i] = list[i];
    }
    free(list, n);
    return temp;
}

Most Python books and tutorials clearly say that the operation L[start:end] copies the elements requested.
And being copy i understand the above behavior, which is less efficient than advancing the pointer. But i wanted to do (with "pop multiple bytes at once") is: typedef unsigned char BYTE; BYTE array[BIG_SIZE]; BYTE* incomming_buffer_pointer = &array[0]; BYTE* incomming_packet_pointer = &array[0]; BYTE* pop_slice(BYTE* list, unsigned int n){ BYTE* temp; temp = list; list = &list[n]; return temp; } .. incomming_packet_pointer = pop_slice( incomming_buffer_pointer , PACKET_SIZE) if (parse_packet_is_not_corrupt( incomming_packet_pointer ) ) parse_new_packet( incomming_packet_pointer ); else .... .. Thanks for analizing the idea. Jacobi Diego From ncoghlan at gmail.com Mon Jul 12 17:41:48 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Jul 2010 01:41:48 +1000 Subject: [Python-ideas] pop multiple elements of a list at once In-Reply-To: References: <20100712145956.28695535@pitrou.net> Message-ID: On Tue, Jul 13, 2010 at 1:13 AM, Diego Jacobi wrote: > Hi. > I apologize if i am having difficulties to explain myself in English. > Also i just realise that i wasnt answering to the list, but directly > to Brett Cannon. Sorry for that. > > Also i am taking into account that you are right and my concept of how > memory is handled here for lists is way too different as how i am more > used to work in firmwares. > > Anyway, the concept of my idea comes of my understanding of an > low-leveled array. For builtin lists and most other Python container types, think more along the lines of a list of pointers rather than a list of numbers. However, for the behaviour of array.array (rather than the builtin list), your understanding of the cost of slicing was actually pretty close to correct (since storing actual values rather than pointers is the way array.array gains its comparative memory efficiency). Python doesn't actually excel at memory efficiency when manipulating large data sets using conventional syntax (e.g. slicing). 
The old buffer objects were an initial attempt at providing memory efficient access to segments of data buffers, while the recently added memoryview objects are a more sophisticated (and safer) approach to the same idea. The NumPy extension, on the other hand, is able to very efficiently provide multiple views of the same region of memory without requiring copying (NumPy itself isn't particularly small though). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From shane at hathawaymix.org Sat Jul 17 10:36:28 2010 From: shane at hathawaymix.org (Shane Hathaway) Date: Sat, 17 Jul 2010 02:36:28 -0600 Subject: [Python-ideas] Looking for a "batch" function Message-ID: <4C416B8C.1070701@hathawaymix.org> Hi all, An operation I often want in my Python code is some kind of simple batch function. The batch function would take an iterator and return same-size batches from it, except the last batch, which could be smaller. Two parameters would be required: the iterator and the size of each batch. Here are some examples of what I would expect this batch function to do. Get batches from a list: >>> list(batch([1,2,3,4,5], 2)) [[1, 2], [3, 4], [5]] Get batches from a string: >>> list(batch('one two six ten', 4)) ['one ', 'two ', 'six ', 'ten'] Organize a stream of objects into a table: >>> list(batch(['Somewhere', 'CA', 90210, 'New York', 'NY', 10001], 3)) [['Somewhere', 'CA', 90210], ['New York', 'NY', 10001]] My intuition tells me that such a function should exist in Python, but I have not found it in the builtin functions, slice operators, or itertools. Did I miss it? Here is an implementation that satisfies all of the above examples, but requires a sliceable sequence as input, not just an iterator: def batch(input, batch_size): while input: yield input[:batch_size] input = input[batch_size:] Obviously, I can just include that function in my projects, but I wonder if there is some built-in version of it. If there isn't, maybe there should be. 
Shane From pyideas at rebertia.com Sat Jul 17 10:52:59 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Sat, 17 Jul 2010 01:52:59 -0700 Subject: [Python-ideas] Looking for a "batch" function In-Reply-To: <4C416B8C.1070701@hathawaymix.org> References: <4C416B8C.1070701@hathawaymix.org> Message-ID: On Sat, Jul 17, 2010 at 1:36 AM, Shane Hathaway wrote: > Hi all, > > An operation I often want in my Python code is some kind of simple batch > function. ?The batch function would take an iterator and return same-size > batches from it, except the last batch, which could be smaller. ?Two > parameters would be required: the iterator and the size of each batch. ?Here > are some examples of what I would expect this batch function to do. > > Get batches from a list: > >>>> list(batch([1,2,3,4,5], 2)) > [[1, 2], [3, 4], [5]] > > Get batches from a string: > >>>> list(batch('one two six ten', 4)) > ['one ', 'two ', 'six ', 'ten'] > > Organize a stream of objects into a table: > >>>> list(batch(['Somewhere', 'CA', 90210, 'New York', 'NY', 10001], 3)) > [['Somewhere', 'CA', 90210], ['New York', 'NY', 10001]] > > My intuition tells me that such a function should exist in Python, but I > have not found it in the builtin functions, slice operators, or itertools. > ?Did I miss it? IMO, yes. > Obviously, I can just include that function in my projects, but I wonder if > there is some built-in version of it. ?If there isn't, maybe there should > be. 
See the "grouper" recipe in itertools: http://docs.python.org/library/itertools.html#recipes It does almost exactly what you want: grouper(3, 'ABCDEFG', 'x') --> ['A','B','C'], ['D','E','F'], ['G','x','x'] Cheers, Chris -- http://blog.rebertia.com From shane at hathawaymix.org Sat Jul 17 20:50:54 2010 From: shane at hathawaymix.org (Shane Hathaway) Date: Sat, 17 Jul 2010 12:50:54 -0600 Subject: [Python-ideas] Looking for a "batch" function In-Reply-To: References: <4C416B8C.1070701@hathawaymix.org> Message-ID: <4C41FB8E.5040907@hathawaymix.org> On 07/17/2010 02:52 AM, Chris Rebert wrote: > See the "grouper" recipe in itertools: > http://docs.python.org/library/itertools.html#recipes > It does almost exactly what you want: > grouper(3, 'ABCDEFG', 'x') --> ['A','B','C'], ['D','E','F'], ['G','x','x'] Interesting, but I have a few concerns with that answer: - It ignores the type of the container. If I provide a string as input, I expect an iterable of strings as output. - If I give a batch size of 1000000, grouper() is going to be rather inefficient. Even worse would be to allow users to specify the batch size. - Since grouper() is not actually in the standard library and it doesn't do quite what I need, it's rather unlikely that I'll use it. Another possible name for this functionality I am describing is packetize(). Computers always packetize data for transmission, storage, and display to users. Packetizing seems like such a common need that I think it should be built in to Python. 
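A slice-based variant that addresses the type concern (and avoids re-slicing the remainder on every step, which makes the original batch() quadratic for long inputs) might look like:

```python
def packetize(seq, size):
    """Yield non-overlapping chunks of seq, preserving its type.

    Works for any sliceable sequence (list, str, tuple, bytes, ...);
    the final chunk may be shorter than size.
    """
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

print(list(packetize('one two six ten', 4)))
# -> ['one ', 'two ', 'six ', 'ten']
print(list(packetize([1, 2, 3, 4, 5], 2)))
# -> [[1, 2], [3, 4], [5]]
```

Because each chunk is produced by a single slice of the original sequence, a string input yields strings and a tuple input yields tuples, with no padding.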
Shane

From taleinat at gmail.com Sat Jul 17 22:30:30 2010 From: taleinat at gmail.com (Tal Einat) Date: Sat, 17 Jul 2010 23:30:30 +0300 Subject: [Python-ideas] Looking for a "batch" function In-Reply-To: <4C41FB8E.5040907@hathawaymix.org> References: <4C416B8C.1070701@hathawaymix.org> <4C41FB8E.5040907@hathawaymix.org> Message-ID:

On Sat, Jul 17, 2010 at 9:50 PM, Shane Hathaway wrote:
> On 07/17/2010 02:52 AM, Chris Rebert wrote:
>>
>> See the "grouper" recipe in itertools:
>> http://docs.python.org/library/itertools.html#recipes
>> It does almost exactly what you want:
>> grouper(3, 'ABCDEFG', 'x') --> ['A','B','C'], ['D','E','F'],
>> ['G','x','x']
>
> Interesting, but I have a few concerns with that answer:
>
> - It ignores the type of the container. If I provide a string as input, I
> expect an iterable of strings as output.
>
> - If I give a batch size of 1000000, grouper() is going to be rather
> inefficient. Even worse would be to allow users to specify the batch size.
>
> - Since grouper() is not actually in the standard library and it doesn't do
> quite what I need, it's rather unlikely that I'll use it.
>
> Another possible name for this functionality I am describing is packetize().
> Computers always packetize data for transmission, storage, and display to
> users. Packetizing seems like such a common need that I think it should be
> built in to Python.

This reminds me of discussions about a "flatten" function. This kind of operation often has slightly different requirements in different scenarios. It is very simple to implement a version of this to meet your exact needs. Sometimes in these kinds of situations it is better not to have a built-in generic function, to force programmers to decide explicitly how they want it to work.

You mentioned efficiency; to do this kind of operation efficiently one really needs to know what kind of sequence/iterator is being "packetized", and implement accordingly.
- Tal Einat From ncoghlan at gmail.com Sun Jul 18 03:02:15 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 18 Jul 2010 11:02:15 +1000 Subject: [Python-ideas] Looking for a "batch" function In-Reply-To: References: <4C416B8C.1070701@hathawaymix.org> <4C41FB8E.5040907@hathawaymix.org> Message-ID: On Sun, Jul 18, 2010 at 6:30 AM, Tal Einat wrote: > This kind of operation often has slightly different requirements in > different scenarios. It is very simple to implement a version of this > to meet your exact needs. Sometimes in these kinds of situations it is > better not to have a built-in generic function, to force programmers > to decide explicitly how they want it to work. > > You mentioned efficiency; to do this kind of operation efficiently > ones really needs to know what kind of sequence/iterator is being > "packetized", and implement accordingly. Indeed. There's actually a reasonably decent general windowing recipe on ASPN (http://code.activestate.com/recipes/577196-windowing-an-iterable-with-itertools/), but even that isn't appropriate for every use case. The OP, for example, has rather different requirements to what is implemented there: - non-overlapping windows, so tee() isn't needed - return type should match original container type A custom generator for that task is actually pretty trivial (note: untested, so may contain typos): def windowed(seq, window_len): for slice_start in range(0, len(seq), window_len): # use xrange() in 2.x slice_end = slice_start + window_len yield seq[slice_start:slice_end] Even adding support for overlapped windows is fairly easy: def windowed(seq, window_len, overlap=0): slice_step = window_len - overlap for slice_start in range(0, len(seq), slice_step): # use xrange() in 2.x slice_end = slice_start + window_len yield seq[slice_start:slice_end] However, those approaches don't support arbitrary iterators (i.e. those without __len__), they only support sequences. 
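One simple way to handle arbitrary iterators for the non-overlapping case is to repeatedly drain itertools.islice (a sketch; note it yields lists rather than preserving the input container type):

```python
from itertools import islice

def batched(iterable, n):
    """Yield successive lists of up to n items from any iterable."""
    it = iter(iterable)
    while True:
        # islice consumes exactly n items (or fewer at the end)
        # from the shared iterator on each pass.
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk

print(list(batched(iter(range(7)), 3)))  # -> [[0, 1, 2], [3, 4, 5], [6]]
```

This works for generators and other one-shot iterators, at the cost of materialising each batch as a list.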
To support arbitrary iterators, you'd need to do something fancier with collections.deque (either directly or via itertools.tee), but again, the most appropriate approach is going to be application specific (for byte data, you're probably going to want to use buffer or memoryview rather than the original container type).

It isn't that this is an uncommon problem - it's that any appropriately general solution is going to be suboptimal in many specific applications, while an optimal solution for specific applications is going to be insufficiently general to be appropriate for the standard library.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz Sun Jul 18 03:31:53 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 18 Jul 2010 13:31:53 +1200 Subject: [Python-ideas] Looking for a "batch" function In-Reply-To: <4C41FB8E.5040907@hathawaymix.org> References: <4C416B8C.1070701@hathawaymix.org> <4C41FB8E.5040907@hathawaymix.org> Message-ID: <4C425989.7010107@canterbury.ac.nz>

Shane Hathaway wrote:

> - It ignores the type of the container. If I provide a string as input,
> I expect an iterable of strings as output.

Fine, but...

> - If I give a batch size of 1000000, grouper() is going to be rather
> inefficient.

I guess you would prefer each batch to be a lazy iterator over part of the original sequence -- but that would conflict with the previous requirement.
-- Greg From shane at hathawaymix.org Sun Jul 18 03:54:43 2010 From: shane at hathawaymix.org (Shane Hathaway) Date: Sat, 17 Jul 2010 19:54:43 -0600 Subject: [Python-ideas] Looking for a "batch" function In-Reply-To: References: <4C416B8C.1070701@hathawaymix.org> <4C41FB8E.5040907@hathawaymix.org> Message-ID: <4C425EE3.80905@hathawaymix.org> On 07/17/2010 07:02 PM, Nick Coghlan wrote: > It isn't that this is an uncommon problem - it's that any > appropriately general solution is going to be suboptimal in many > specific applications, while an optimal solution for specific > applications is going to be insufficiently general to be appropriate > for the standard library. Well, I feel like there is in fact a general solution that would be near optimal for many applications, but I would rather spend time refining the idea in real projects rather than get much into theory at the moment. Thanks for the feedback. Shane From sergdavis at gmail.com Tue Jul 20 03:45:27 2010 From: sergdavis at gmail.com (Sergio Davis) Date: Mon, 19 Jul 2010 21:45:27 -0400 Subject: [Python-ideas] 'where' statement in Python? Message-ID: Dear members of the python-ideas mailing list, I'm not quite sure if this is the right place to ask for feedback about the idea, I apologize if this is not the case. I'm considering the following extension to Python's grammar: adding the 'where' keyword, which would work as follows: where_expr : expr 'where' NAME '=' expr The idea is to be able to write something like: a = (z**2+5) where z=2 being equivalent to (current Python syntax): a = (lambda z: z**2+5)(z=2) I thinkg this would be especially powerful in cases where the variable to be substituted ('z' in the example) comes in turn from a complicated expression, which makes it confusing to "embed" it in the main expression (the body of the 'lambda'), or in cases where the substitution must be performed more than once, and it may be more efficient to evaluate 'z' once. 
A more complicated example: vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() where par_pos=decl.find('(') equivalent to (current Python syntax): vtype = (lambda par_pos: decl[par_pos+1:FindMatching(par_pos,decl)].strip())(par_pos=decl.find('(')) Extending this syntax to several assignments after the 'where' keyword could be implemented as: where_expr: expr 'where' NAME '=' expr (',' NAME '=' expr )* or (which I think may be more "pythonic"): where_expr: expr 'where' NAME (',' NAME)* '=' expr (',' expr)* as it mimics the same syntax for unpacking tuples. I would appreciate any feedback on the idea, especially if it has some obvious flaw or if it's redundant (meaning there is a clearer way of doing this 'trick' which I don't know about). best regards, Sergio Davis -- Sergio Davis Irarr?zabal Grupo de NanoMateriales, Universidad de Chile http://www.gnm.cl/~sdavis -------------- next part -------------- An HTML attachment was scrubbed... URL: From jackdied at gmail.com Tue Jul 20 03:52:47 2010 From: jackdied at gmail.com (Jack Diederich) Date: Mon, 19 Jul 2010 21:52:47 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: Message-ID: On Mon, Jul 19, 2010 at 9:45 PM, Sergio Davis wrote: > Dear members of the python-ideas mailing list, > > I'm not quite sure if this is the right place to ask for feedback about the > idea, I apologize if this is not the case. 
> > I'm considering the following extension to Python's grammar: adding the > 'where' keyword, which would work as follows: > > where_expr : expr 'where' NAME '=' expr > > The idea is to be able to write something like: > > a = (z**2+5) where z=2 > > being equivalent to (current Python syntax): > > a = (lambda z: z**2+5)(z=2) > > I thinkg this would be especially powerful in cases where the variable to be > substituted ('z' in the example) comes in turn from a complicated > expression, which makes it confusing to "embed" it in the main expression > (the body of the 'lambda'), or in cases where the substitution must be > performed more than once, and it may be more efficient to evaluate 'z' once. > A more complicated example: > > vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() where > par_pos=decl.find('(') > > equivalent to (current Python syntax): > > vtype = (lambda par_pos: > decl[par_pos+1:FindMatching(par_pos,decl)].strip())(par_pos=decl.find('(')) > > Extending this syntax to several assignments after the 'where' keyword could > be implemented as: > > where_expr: expr 'where' NAME '=' expr (',' NAME '=' expr )* > > or (which I think may be more "pythonic"): > > where_expr: expr 'where' NAME (',' NAME)* '=' expr (',' expr)* > > as it mimics the same syntax for unpacking tuples. > > I would appreciate any feedback on the idea, especially if it has some > obvious flaw or if it's redundant (meaning there is a clearer way of doing > this 'trick' which I don't know about). > I think the "trick" to making it readable is putting the assignment first. par_pos = decl.find('(') vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() versus: vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() where par_pos=decl.find('(') -Jack From stephen at xemacs.org Tue Jul 20 07:29:01 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 20 Jul 2010 14:29:01 +0900 Subject: [Python-ideas] 'where' statement in Python? 
In-Reply-To: References: Message-ID: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Sergio Davis writes: > I'm considering the following extension to Python's grammar: adding the > 'where' keyword, which would work as follows: > > where_expr : expr 'where' NAME '=' expr We just had a long thread about this. http://mail.python.org/pipermail/python-ideas/2010-June/007476.html The sentiment was about -0 to -0.5 on the idea in general, although a couple of people with experience in implementing Python syntax expressed sympathy for it. There have also been threads on earlier variations of the idea, referenced in the thread above. HTH From ncoghlan at gmail.com Tue Jul 20 15:27:49 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 20 Jul 2010 23:27:49 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Jul 20, 2010 at 3:29 PM, Stephen J. Turnbull wrote: > We just had a long thread about this. > > http://mail.python.org/pipermail/python-ideas/2010-June/007476.html > > The sentiment was about -0 to -0.5 on the idea in general, although a > couple of people with experience in implementing Python syntax > expressed sympathy for it. For the record, I am personally +1 on the idea (otherwise I wouldn't have put so much thought into it over the years). It's just a *lot* harder to define complete and consistent semantics for the concept than people often realise. However, having the question come up twice within the last month finally inspired me to write the current status of the topic down in a deferred PEP: http://www.python.org/dev/peps/pep-3150/ Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From 8mayday at gmail.com Tue Jul 20 17:01:39 2010 From: 8mayday at gmail.com (Andrey Popp) Date: Tue, 20 Jul 2010 19:01:39 +0400 Subject: [Python-ideas] 'where' statement in Python? 
In-Reply-To:
References:
Message-ID:

Also, what about some alternative for working around the following:

> Out of Order Execution: the where clause makes execution jump around a little strangely, as
> the body of the where clause is executed before the simple statement in the clause header. The
> closest any other part of Python comes to this before is the out of order evaluation in
> conditional expressions.

    result = let:
        a = retrieve_a()
        b = retrieve_b()
    in:
        a*a + b*b

On Tue, Jul 20, 2010 at 6:57 PM, Andrey Popp <8mayday at gmail.com> wrote:
> Hello,
>
> PEP-3150 is awesome, just a small addition -- why not to allow
> one-liners `where`s:
>
>     a = (b, b) where b = 43
>
> And that also make sense for generator/list/set/dict comprehensions:
>
>     mylist = [y for y in another_list if y < 5 where y = f(x)]
>
> On Tue, Jul 20, 2010 at 5:27 PM, Nick Coghlan wrote:
>> On Tue, Jul 20, 2010 at 3:29 PM, Stephen J. Turnbull wrote:
>>> We just had a long thread about this.
>>>
>>> http://mail.python.org/pipermail/python-ideas/2010-June/007476.html
>>>
>>> The sentiment was about -0 to -0.5 on the idea in general, although a
>>> couple of people with experience in implementing Python syntax
>>> expressed sympathy for it.
>>
>> For the record, I am personally +1 on the idea (otherwise I wouldn't
>> have put so much thought into it over the years). It's just a *lot*
>> harder to define complete and consistent semantics for the concept
>> than people often realise.
>>
>> However, having the question come up twice within the last month
>> finally inspired me to write the current status of the topic down in a
>> deferred PEP: http://www.python.org/dev/peps/pep-3150/
>>
>> Cheers,
>> Nick.
>>
>> --
>> Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > > > -- > Andrey Popp > > phone: +7 911 740 24 91 > e-mail: 8mayday at gmail.com > -- Andrey Popp phone: +7 911 740 24 91 e-mail: 8mayday at gmail.com From 8mayday at gmail.com Tue Jul 20 16:57:15 2010 From: 8mayday at gmail.com (Andrey Popp) Date: Tue, 20 Jul 2010 18:57:15 +0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Hello, PEP-3150 is awesome, just a small addition ? why not to allow one-liners `where`s: a = (b, b) where b = 43 And that also make sense for generator/list/set/dict comprehensions: mylist = [y for y in another_list if y < 5 where y = f(x)] On Tue, Jul 20, 2010 at 5:27 PM, Nick Coghlan wrote: > On Tue, Jul 20, 2010 at 3:29 PM, Stephen J. Turnbull wrote: >> We just had a long thread about this. >> >> http://mail.python.org/pipermail/python-ideas/2010-June/007476.html >> >> The sentiment was about -0 to -0.5 on the idea in general, although a >> couple of people with experience in implementing Python syntax >> expressed sympathy for it. > > For the record, I am personally +1 on the idea (otherwise I wouldn't > have put so much thought into it over the years). It's just a *lot* > harder to define complete and consistent semantics for the concept > than people often realise. > > However, having the question come up twice within the last month > finally inspired me to write the current status of the topic down in a > deferred PEP: http://www.python.org/dev/peps/pep-3150/ > > Cheers, > Nick. > > -- > Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
Andrey Popp

phone: +7 911 740 24 91
e-mail: 8mayday at gmail.com

From stephen at xemacs.org  Tue Jul 20 17:22:27 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 21 Jul 2010 00:22:27 +0900
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <87zkxmxjnw.fsf@uwakimon.sk.tsukuba.ac.jp>

Nick Coghlan writes:
> On Tue, Jul 20, 2010 at 3:29 PM, Stephen J. Turnbull wrote:

> For the record, I am personally +1 on the idea

Gee, and here I had you down as a +0.95.  Can you forgive me? :-)

> However, having the question come up twice within the last month
> finally inspired me to write the current status of the topic down in a
> deferred PEP: http://www.python.org/dev/peps/pep-3150/

Way cool!  Thanks!

From daniel at stutzbachenterprises.com  Tue Jul 20 17:24:56 2010
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Tue, 20 Jul 2010 10:24:56 -0500
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Tue, Jul 20, 2010 at 8:27 AM, Nick Coghlan wrote:
> For the record, I am personally +1 on the idea (otherwise I wouldn't
> have put so much thought into it over the years). It's just a *lot*
> harder to define complete and consistent semantics for the concept
> than people often realise.
>
> However, having the question come up twice within the last month
> finally inspired me to write the current status of the topic down in a
> deferred PEP: http://www.python.org/dev/peps/pep-3150/

There was a related discussion on python-ideas in July 2009, spanning
two threads, that you may want to additionally reference.  A lot of
corner cases were also brought up in that thread.
Here are the starts of the two threads:

http://mail.python.org/pipermail/python-ideas/2009-July/005082.html
http://mail.python.org/pipermail/python-ideas/2009-July/005132.html

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From john.arbash.meinel at gmail.com  Tue Jul 20 17:28:40 2010
From: john.arbash.meinel at gmail.com (John Arbash Meinel)
Date: Tue, 20 Jul 2010 10:28:40 -0500
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4C45C0A8.8060407@gmail.com>

Andrey Popp wrote:
> Hello,
>
> PEP-3150 is awesome, just a small addition -- why not to allow
> one-liners `where`s:
>
>     a = (b, b) where b = 43
>
> And that also make sense for generator/list/set/dict comprehensions:
>
>     mylist = [y for y in another_list if y < 5 where y = f(x)]

Do you mean:

  mylist = [y for x in another_list if y < 5 where y = f(x)]

John
=:->

From eric.twilegar at gmail.com  Tue Jul 20 19:03:07 2010
From: eric.twilegar at gmail.com (et gmail)
Date: Tue, 20 Jul 2010 12:03:07 -0500
Subject: [Python-ideas] first() and last() tests in for x in y loops
In-Reply-To: <4c4365f1.5429e70a.4ba3.ffff8601@mx.google.com>
References: <4c4365f1.5429e70a.4ba3.ffff8601@mx.google.com>
Message-ID: <4c45d6e1.5429e70a.5067.ffffd4b5@mx.google.com>

While doing "for x in y" loops I often need to know if I'm working on
the first item or the last item in the list.

For example, imagine you are building a list of values separated by
","s. On the last iteration you need to suppress the ",". One
workaround is to just take the last character off at the end, but you
get the idea.

I could see the code looking something like this:

    for item in List:
        if __first__:
            print 'we are in the first loop'
        doSomething()
        if __last__ is False:
            print ','

Sorry if the formatting is a little off. Does something like this
already exist and I'm just being a newb?
I am fairly new to the language.

Also it would be nice if there was an auto counter like

    For item in List with count x

... where x would then be an auto counter that incremented every
iteration of the loop.

Thanks
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexander.belopolsky at gmail.com  Tue Jul 20 19:13:13 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 20 Jul 2010 13:13:13 -0400
Subject: [Python-ideas] first() and last() tests in for x in y loops
In-Reply-To: <4c45d6e1.5429e70a.5067.ffffd4b5@mx.google.com>
References: <4c4365f1.5429e70a.4ba3.ffff8601@mx.google.com>
	<4c45d6e1.5429e70a.5067.ffffd4b5@mx.google.com>
Message-ID:

On Tue, Jul 20, 2010 at 1:03 PM, et gmail wrote:
..
> Also it would be nice if there was an auto counter like
>
> For item in List with count x ... where x would then be an auto counter that
> incremented every iteration of the loop.

This feature is already there:

    for num, item in enumerate(List):
        ...

I would like to suggest that when you don't know how to achieve a
certain result in Python, you first ask on python-list or the #python
IRC channel. It is best to propose new features after users agree that
the feature is not present.

From phd at phd.pp.ru  Tue Jul 20 19:19:01 2010
From: phd at phd.pp.ru (Oleg Broytman)
Date: Tue, 20 Jul 2010 21:19:01 +0400
Subject: [Python-ideas] first() and last() tests in for x in y loops
In-Reply-To: <4c45d6e1.5429e70a.5067.ffffd4b5@mx.google.com>
References: <4c4365f1.5429e70a.4ba3.ffff8601@mx.google.com>
	<4c45d6e1.5429e70a.5067.ffffd4b5@mx.google.com>
Message-ID: <20100720171901.GB32158@phd.pp.ru>

On Tue, Jul 20, 2010 at 12:03:07PM -0500, et gmail wrote:
> While doing "for x in y" loops I often need to know if I'm working on the
> first item or the last item in the list.
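Without new syntax, both the first/last flags and the auto counter can
be built from enumerate() and len() today. A sketch (the list and
variable names are made up for illustration):

```python
items = ["a", "b", "c"]
out = []
for i, item in enumerate(items):
    first = (i == 0)                 # True only on the first iteration
    last = (i == len(items) - 1)     # True only on the last iteration
    if first:
        out.append("first:")
    out.append(item)
    if not last:                     # suppress the trailing ","
        out.append(",")
print("".join(out))                  # -> first:a,b,c
```

For the separator use case specifically, ",".join(items) sidesteps the
flags entirely.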
See http://ppa.cvs.sourceforge.net/ppa/misc/Repeat.py?rev=HEAD&content-type=text/vnd.viewcvs-markup
(License: Python)

> Also it would be nice if there was an auto counter like

   Find built-in function enumerate() in the docs.

Oleg.
--
     Oleg Broytman            http://phd.pp.ru/           phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From sturla at molden.no  Tue Jul 20 19:18:57 2010
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jul 2010 19:18:57 +0200
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4C45DA81.5020009@molden.no>

On 20.07.2010 16:57, Andrey Popp wrote:
>     a = (b, b) where b = 43
>

I am +1 for a where module and -1 for a where keyword, and here is the
reason:

In MATLAB, we have the "find" function that serves the role of where.
In NumPy, we have a function numpy.where and also masked arrays.

The above statement with NumPy ndarrays would be:

    idx, = np.where(b == 47)
    a = (b[idx], b[idx])

or we could simply do this:

    a = (b[b==47], b[b==47])

And if we look at this proposed expression,

    mylist = [y for y in another_list if y < 5 where y == f(x)]

Using NumPy, we can proceed like this:

    idx, = np.where(another_array == f(x))
    mylist = [y for y in another_array[idx] if y < 5]

The intention is just as clear, and it avoids a new "where" keyword. It
is also similar to NumPy and Matlab.

Not to mention that the "where keyword" in the above expression could be
replaced with an "and", so it serves no real purpose:

    mylist = [y for y in another_list if (y < 5 and y == f(x))]

Why have a where keyword here? It is just redundant.

So I'd rather speak of something useful instead: NumPy's "Fancy indexing".
"Fancy indexing" (NumPy jargon) will in this context mean that we allow
indexes to be an iterable, not just integers:

    mylist[(1,2,3)] == mylist[1,2,3]
    mylist[iterable] == [mylist[i] for i in iterable]

That is what NumPy and Matlab do, as well as Fortran 90 (and certain C++
libraries such as Blitz++). It has all the power of the "where keyword",
while being more flexible to use, and the intention is more explicit. It
is also well tested syntax.

Thus with "fancy indexing":

    alist[iterable] == [alist[i] for i in iterable]

That is what we really need!

Note that this is not a language syntax change, it is just a change of
how __setitem__ and __getitem__ work for certain container types. NumPy
already does this, so the syntax itself is completely valid Python. And
as for "where", it is just a function.

Andrey's proposed where keyword is a crippled tool in comparison. That
is, the real power of a list of indexers is that it can be obtained and
manipulated with any conceivable method, e.g. slicing. It also allows
numpy to have an "argsort" function, since an index list can be reused
on multiple arrays:

    idx = np.argsort(array_a)
    sorteda = array_a[idx]
    sortedb = array_b[idx]

is the same as

    tmp = sorted((a, i) for i, a in enumerate(lista))
    sorteda = [a for a, i in tmp]
    sortedb = [listb[i] for a, i in tmp]

Which is the more readable?
We should rather implement a where function and overload the mentioned
container types. The where function should go in the same module.

So all in all, I am +1 for a "where module" and -1 for a "where keyword".

P.S. I'll admit that dict and set might add to some confusion, since
"fancy indexing" would be ambiguous for them.

Regards,
Sturla

From guido at python.org  Tue Jul 20 20:16:04 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 20 Jul 2010 19:16:04 +0100
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: <4C45DA81.5020009@molden.no>
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4C45DA81.5020009@molden.no>
Message-ID:

On Tue, Jul 20, 2010 at 6:18 PM, Sturla Molden wrote:
> On 20.07.2010 16:57, Andrey Popp wrote:
>>
>>     a = (b, b) where b = 43
>>
>
> I am +1 for a where module and -1 for a where keyword, and here is the
> reason:
>
> In MATLAB, we have the "find" function that serves the role of where. In
> NumPy, we have a function numpy.where and also masked arrays.
>
> The above statement with NumPy ndarrays would be:
>
>   idx, = np.where(b == 47)
>   a = (b[idx], b[idx])
>
> or we could simply do this:
>
>   a = (b[b==47], b[b==47])

It looks like NumPy's "where" is more like SQL's, while Nick's is more
like Haskell's. These are totally different: in SQL it's a dynamic
query (and its argument is a condition), whereas in Haskell it's
purely a syntactic construct for defining some variables to be used as
shorthands in an expression.

Given the large number of Python users familiar with SQL compared to
those familiar with Haskell, I think we'd be wise to pick a different
keyword instead of 'where'. I can't think of one right now though.

Your proposal is completely orthogonal to Nick's; the best thing to do
is probably to start a different thread for yours.

Note that Microsoft's LINQ is also similar to your suggestion.
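The two readings can be contrasted in plain Python (a sketch; the data
and function names are illustrative only):

```python
data = [1, 5, 8, 12]

# SQL/NumPy-style "where": a dynamic filter whose argument is a condition
selected = [x for x in data if x > 4]

# Haskell-style "where": purely local shorthands for subexpressions
def hypot(a, b):
    s = a * a + b * b    # the kind of local binding a 'where' clause would supply
    return s ** 0.5

print(selected)      # -> [5, 8, 12]
print(hypot(3, 4))   # -> 5.0
```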
-- --Guido van Rossum (python.org/~guido) From daniel at stutzbachenterprises.com Tue Jul 20 20:29:04 2010 From: daniel at stutzbachenterprises.com (Daniel Stutzbach) Date: Tue, 20 Jul 2010 13:29:04 -0500 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> Message-ID: On Tue, Jul 20, 2010 at 1:16 PM, Guido van Rossum wrote: > Given the large number of Python users familiar with SQL compared to > those familiar with Haskell, I think we'd do wise to pick a different > keyword instead of 'where'. I can't think of one right now though. > > Taking a cue from mathematics, how about "given"? c = sqrt(a*a + b*b) given: a = retrieve_a() b = retrieve_b() -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC -------------- next part -------------- An HTML attachment was scrubbed... URL: From george.sakkis at gmail.com Tue Jul 20 20:35:20 2010 From: george.sakkis at gmail.com (George Sakkis) Date: Tue, 20 Jul 2010 20:35:20 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> Message-ID: On Tue, Jul 20, 2010 at 8:29 PM, Daniel Stutzbach wrote: > On Tue, Jul 20, 2010 at 1:16 PM, Guido van Rossum wrote: >> >> Given the large number of Python users familiar with SQL compared to >> those familiar with Haskell, I think we'd do wise to pick a different >> keyword instead of 'where'. I can't think of one right now though. > > Taking a cue from mathematics, how about "given"? > > c = sqrt(a*a + b*b) given: > ?? ?a = retrieve_a() > ?? ?b = retrieve_b() Or if we'd rather overload an existing keyword than add a new one, "with" reads well too. George From scialexlight at gmail.com Tue Jul 20 20:56:38 2010 From: scialexlight at gmail.com (Alex Light) Date: Tue, 20 Jul 2010 14:56:38 -0400 Subject: [Python-ideas] 'where' statement in Python? 
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4C45DA81.5020009@molden.no>
Message-ID:

i would suggest overloading the 'with', 'as' combo

    c = sqrt(a*a + b*b) with:
        get_a(), get_b() as a, b

or

    c = sqrt(a*a + b*b) with:
        get_a() as a
        get_b() as b

it reads well, and it follows naturally, since this statement acts
similarly to a regular with/as statement, with the behavior of the
context manager being (in pseudocode):

    set up: store original value of variable, if any, and set variable
    to new value.
    tear down: set value back to original, or delete from local
    namespace if it never had one

additionally we do not need to introduce any new keywords

any way that this is implemented, though, it would be quite useful

On Tue, Jul 20, 2010 at 2:35 PM, George Sakkis wrote:
> On Tue, Jul 20, 2010 at 8:29 PM, Daniel Stutzbach wrote:
>
>> On Tue, Jul 20, 2010 at 1:16 PM, Guido van Rossum wrote:
>>>
>>> Given the large number of Python users familiar with SQL compared to
>>> those familiar with Haskell, I think we'd be wise to pick a different
>>> keyword instead of 'where'. I can't think of one right now though.
>>
>> Taking a cue from mathematics, how about "given"?
>>
>> c = sqrt(a*a + b*b) given:
>>     a = retrieve_a()
>>     b = retrieve_b()
>
> Or if we'd rather overload an existing keyword than add a new one,
> "with" reads well too.
>
> George
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From python at mrabarnett.plus.com  Tue Jul 20 21:15:04 2010
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 20 Jul 2010 20:15:04 +0100
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> Message-ID: <4C45F5B8.6080606@mrabarnett.plus.com> Alex Light wrote: > i would suggest overloading the 'with', 'as' combo > > c = sqrt(a*a + b*b) with: > get_a(), get_b() as a, b > > or > > > c = sqrt(a*a + b*b) with: > get_a() as a > get_b() as b > Why use 'as'? Why not: c = sqrt(a*a + b*b) with: a = get_a() b = get_b() which is like: def _(): a = get_a() b = get_b() return sqrt(a*a + b*b) c = _() del _ > > it reads well and plus it follows since this statement acts similarly to > a regular with and as statement with the behavior of the context manager > being (in psudocode): > set up: store original value of variable if any and set variable to new > value. > tear down: set value back to original or delete from local namespace if > it never had one > > additionally we do not need to introduce any new keywords > > any way that this is implemented though it would be quite useful > > On Tue, Jul 20, 2010 at 2:35 PM, George Sakkis > wrote: > > On Tue, Jul 20, 2010 at 8:29 PM, Daniel Stutzbach > > wrote: > > > On Tue, Jul 20, 2010 at 1:16 PM, Guido van Rossum > > wrote: > >> > >> Given the large number of Python users familiar with SQL compared to > >> those familiar with Haskell, I think we'd do wise to pick a > different > >> keyword instead of 'where'. I can't think of one right now though. > > > > Taking a cue from mathematics, how about "given"? > > > > c = sqrt(a*a + b*b) given: > > a = retrieve_a() > > b = retrieve_b() > > Or if we'd rather overload an existing keyword than add a new one, > "with" reads well too. > From bruce at leapyear.org Tue Jul 20 21:17:18 2010 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 20 Jul 2010 12:17:18 -0700 Subject: [Python-ideas] fancy indexing Message-ID: [changing the subject; was: 'where' statement in Python?] I think this is an interesting idea (whether worth adding is a different question). 
I think it would be confusing that a[x] = (y,z) does something entirely different when x is 1 or (1,2). If python *were* to add something like this, I think perhaps a different syntax should be considered: a[[x]] = y y = a[[x]] which call __setitems__ and __getitems__ respectively. This makes it clear that something different is going on and eliminates the ambiguity for dicts. --- Bruce http://www.vroospeak.com http://google-gruyere.appspot.com On Tue, Jul 20, 2010 at 10:18 AM, Sturla Molden wrote: > So I'd rather speak of something useful instead: NumPy's "Fancy indexing". > > "Fancy indexing" (NumPy jargon) will in this context mean that we allow > indexes to be an iterable, not just integers: > > mylist[(1,2,3)] == mylist[1,2,3] > mylist[iterable] == [a(i) for i in iterable] > > That is what NumPy and Matlab do, as well as Fortran 90 (and certain C++ > libraries such as Blitz++). It has all the power of the "where keyword", > while being more flexible to use, and intention is more explicit. It is also > well tested syntax. > > Thus with "fancy indexing": > > alist[iterable] == [alist[i] for i in iterable] > > That is what we really need! > > Note that this is not a language syntax change, it is just a change of how > __setitem__ and __getitem__ works for certain container types. NumPy already > does this, so the syntax itself is completely valid Python. And as for > "where", it is just a function. > > Andrey's proposed where keyword is a crippled tool in comparison. That is, > the real power of a list of indexers is that it can be obtained and > manipulated with any conceivable method, e.g. slicing. It also allows numpy > to have an "argsort" function, since an index list can be reused on multiple > arrays: > > idx = np.argsort(array_a) > sorteda = array_a[idx] > sortedb = array_b[idx] > > is the same as > > tmp = sorted([a,i for i,a in enumerate(lista)]) > sorteda = [a for a,i in tmp] > sortedb = [listb[i] for a,i in tmp] > > Which is the more readable? 
>
> Implementing a generic "where function" can be achieved with a lambda:
>
>     idx = where(lambda x: x == 47, alist)
>
> or a list comprehension (this would be very similar to NumPy):
>
>     idx = where([x==47 for x in alist])
>
> But to begin with, I think we should get NumPy style "fancy indexing" to
> standard container types like list, tuple, string, bytes, bytearray, array
> and deque. That would just be a handful of subclasses, and I think they
> should (initially) be put in a standard library module, and possibly replace
> the current containers in Python 4000.
>
> But as for a where keyword: My opinion is a big -1, if I have the right to
> vote. We should rather implement a where function and overload the mentioned
> container types. The where function should go in the same module.
>
> So all in all, I am +1 for a "where module" and -1 for a "where keyword".
>
> P.S. I'll admit that dict and set might add to some confusion, since "fancy
> indexing" would be ambiguous for them.
>
> Regards,
> Sturla
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla at molden.no  Tue Jul 20 21:20:32 2010
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jul 2010 21:20:32 +0200
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4C45DA81.5020009@molden.no>
Message-ID: <4C45F700.9040902@molden.no>

On 20.07.2010 20:16, Guido van Rossum wrote:
> It looks like NumPy's "where" is more like SQL's,

Yes, it is roughly like a WHERE statement in SQL or Fortran 90, or
Python's built-in "filter" function (albeit more flexible).

I am not sure I miss "fancy indexing" for Python lists because it is
useful, or because Fortran has crippled my mind.
This is of course easy to achieve with two utility functions, also demonstrating what NumPy's fancy indexing does. This is with a lambda: def where(cond, alist): return [i for i,a in enumerate(alist) if cond(a)] def fancyindex(alist, index): return [alist[i] for i in index] > Microsoft's LINQ is also similar to your suggestion. > LINQ is just to compensate for lack of duck-typing in C# ;-) Also when used to Pythons list comprehensions, LINQ syntax feels like thinking backwards (which can be quite annoying) :-( Sturla From sturla at molden.no Tue Jul 20 21:32:25 2010 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jul 2010 21:32:25 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> Message-ID: <4C45F9C9.3040808@molden.no> Den 20.07.2010 20:56, skrev Alex Light: > i would suggest overloading the 'with', 'as' combo > > c = sqrt(a*a + b*b) with: > get_a(), get_b() as a, b > That will not work, the parser would think like this: c = sqrt(a*a + b*b) with: get_a(), (get_b() as a), b > > c = sqrt(a*a + b*b) with: > get_a() as a > get_b() as b > I think it tastes too much like functional programming. Specifically: It forces you to think backwards: it does not evaluate the lines in the order they read. The block below is evaluated before the expression above. It really means: a = get_a() b = get_b() c = sqrt(a*a + b*b) with: del a,b That I think is very annoying. Why not just use context managers instead? with get_a() as a, get_b() as b: c = sqrt(a*a + b*b) Now it reads the right way: top down, not bottom up. Regards, Sturla From bruce at leapyear.org Tue Jul 20 21:42:15 2010 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 20 Jul 2010 12:42:15 -0700 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: List comprehensions are the one place where I might find this useful. 
I can do this with map though: [y for x in another_list if y < 5 where y = f(x)] [y for x in map(f, another_list) if x < 5] [x for x in another_list if y < 5 where y = f(x)] [x for (x,y) in map(lambda x: (x,f(x)), another_list) if y < 5] I think the variant with 'where' or 'with' would be a bit more readable but is it valuable enough? --- Bruce http://www.vroospeak.com http://google-gruyere.appspot.com On Tue, Jul 20, 2010 at 7:57 AM, Andrey Popp <8mayday at gmail.com> wrote: > > > mylist = [y for y in another_list if y < 5 where y = f(x)] > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Jul 20 21:44:18 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 20 Jul 2010 20:44:18 +0100 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: <4C45F9C9.3040808@molden.no> References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> <4C45F9C9.3040808@molden.no> Message-ID: On Tue, Jul 20, 2010 at 8:32 PM, Sturla Molden wrote: > Den 20.07.2010 20:56, skrev Alex Light: >> >> i would suggest overloading the 'with', 'as' combo >> >> c = sqrt(a*a + b*b) with: >> ? ?get_a(), get_b() as a, b >> > > That will not work, the parser would think like this: > > c = sqrt(a*a + b*b) with: > ? ?get_a(), (get_b() as a), b Not true, we can define the grouping as we like. However if we do something like this I'd rather use 'var = expr' instead of 'expr as var'. >> c = sqrt(a*a + b*b) with: >> ? ?get_a() as a >> ? ?get_b() as b >> > > I think it tastes too much like functional programming. Specifically: > > It forces you to think backwards: it does not evaluate the lines in the > order they read. The block below is evaluated before the expression above. > It really means: > > a = get_a() > b = get_b() > c = sqrt(a*a + b*b) with: > del a,b > > That I think is very annoying. 
Like it or not, except for the keyword to use and the 'as' issue,
that's exactly the proposal (please read the PEP:
http://www.python.org/dev/peps/pep-3150/ ). I personally like it; plus
the "think backwards" idea is already used in other parts of Python's
syntax, e.g. list comprehensions. And of course forward references
work for method calls too.

> Why not just use context managers instead?
>
> with get_a() as a, get_b() as b:
>     c = sqrt(a*a + b*b)
>
> Now it reads the right way: top down, not bottom up.

Because this means something different and requires that get_a()
return something that obeys the context manager protocol.

--
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Tue Jul 20 21:49:49 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 20 Jul 2010 15:49:49 -0400
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On 7/20/2010 1:29 AM, Stephen J. Turnbull wrote:
> Sergio Davis writes:
>
>> I'm considering the following extension to Python's grammar: adding the
>> 'where' keyword, which would work as follows:
>>
>> where_expr : expr 'where' NAME '=' expr
>
> We just had a long thread about this.
>
> http://mail.python.org/pipermail/python-ideas/2010-June/007476.html
>
> The sentiment was about -0 to -0.5 on the idea in general,

I did not comment then because I thought the idea of cluttering Python
with augmented local namespace blocks, with no functional gain, had
been rejected and was dead, and hence needed no comment. -10

For me, the idea would come close to destroying (what remains of) the
simplicity that makes Python relatively easy to learn. It seems to be
associated with the (to me, cracked) idea that names are pollution.

I agree with Jack Diederich:

> I think the "trick" to making it readable
> is putting the assignment first.
> par_pos = decl.find('(')
> vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip()
> versus:
> vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() where
> par_pos=decl.find('(')

The real horror would come with multiple assignments with multiple and
nested where or whatever clauses.

--
Terry Jan Reedy

From sturla at molden.no  Tue Jul 20 21:57:02 2010
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jul 2010 21:57:02 +0200
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

> [x for x in another_list if y < 5 where y = f(x)]
> [x for (x,y) in map(lambda x: (x,f(x)), another_list) if y < 5]

Or allow nested list comprehensions?

    [x for (x,y) in [(x, f(x)) for x in another_list] if y < 5]

This is getting closer to LINQ... We could use line-breaks for
readability:

    [x for (x,y) in
        [(x, f(x)) for x in another_list]
            if y < 5]

S.

From 8mayday at gmail.com  Tue Jul 20 22:02:07 2010
From: 8mayday at gmail.com (Andrey Popp)
Date: Wed, 21 Jul 2010 00:02:07 +0400
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: <4C45C0A8.8060407@gmail.com>
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4C45C0A8.8060407@gmail.com>
Message-ID:

Yes, exactly this, it was a typo.

On Tue, Jul 20, 2010 at 7:28 PM, John Arbash Meinel wrote:
> Andrey Popp wrote:
>> Hello,
>>
>> PEP-3150 is awesome, just a small addition -- why not to allow
>> one-liners `where`s:
>>
>>     a = (b, b) where b = 43
>>
>> And that also make sense for generator/list/set/dict comprehensions:
>>
>>
mylist = [y for y in another_list if y < 5 where y = f(x)]
>
> Do you mean:
>
>  mylist = [y for x in another_list if y < 5 where y = f(x)]
>
> John
> =:->

--
Andrey Popp

phone: +7 911 740 24 91
e-mail: 8mayday at gmail.com

From sturla at molden.no  Tue Jul 20 22:07:21 2010
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jul 2010 22:07:21 +0200
Subject: [Python-ideas] fancy indexing
In-Reply-To: <4C45F700.9040902@molden.no>
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4C45DA81.5020009@molden.no>
	<4C45F700.9040902@molden.no>
Message-ID: <925d09c162cc4a9eca363b01403edda9.squirrel@webmail.uio.no>

> On 20.07.2010 20:16, Guido van Rossum wrote:
> Yes, it is roughly like a WHERE statement in SQL or Fortran 90, or
> Python's built-in "filter" function (albeit more flexible).

Fancy indexing is actually more like a join. Since the result of
numpy.where on one array can be used to filter another array, fancy
indexing would be like a join between tables in a relational database:

    def join(blist, index):
        return [blist[i] for i in index]

The big difference between fancy indexing and a join method is of
course that indexing can appear on the left side of an expression.

Sturla

From ianb at colorstudy.com  Tue Jul 20 22:09:02 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 20 Jul 2010 15:09:02 -0500
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: <4C45F700.9040902@molden.no>
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4C45DA81.5020009@molden.no>
	<4C45F700.9040902@molden.no>
Message-ID:
This is something NumPy and database abstraction layers both need, and they both currently use method override hacks that have certain limitations (e.g., you capture ==, but you can't capture "and"). If you work really hard you can decompile the bytecodes (DejaVu did this for lambdas, but not generator expressions). I don't think I've even seen a language proposal that actually tries to tackle this though. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From scialexlight at gmail.com Tue Jul 20 22:13:07 2010 From: scialexlight at gmail.com (Alex Light) Date: Tue, 20 Jul 2010 16:13:07 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> <4C45F5B8.6080606@mrabarnett.plus.com> Message-ID: MRAB wrote: > > Why use 'as'? Why not: > I would use 'as' because this whole where clause acts very similarly to a context manager, in that it sets a variable to a value for a small block:
> c = a + b with:
>     get_a(), get_b() as a, b
> is equivalent to
> with get_a(), get_b() as a, b:
>     c = a + b
> assuming get_a() and get_b() are context managers that return the values of a and b and then delete them at the end > the thing with this, though, is that get_a() and get_b() do not need to be context managers; the interpreter will create one, so this will be valid:
> c = a**2 with:
>     4 as a
> i.e.
it creates a context manager whose __enter__ method returns 4 and whose exit method deletes a from the current namespace > Sturla Molden wrote: >That will not work, the parser would think like this:
> >c = sqrt(a*a + b*b) with:
> > get_a(), (get_b() as a), b
Sorry, I haven't used 'as' in a while and I forgot that you could not use it like you can everything else in compound assignments. (BTW, why isn't a, b as c, d allowed? It makes sense, especially since a, b = c, d is allowed; it should be changed so it does.)
>>c = sqrt(a*a + b*b) with:
>> get_a() as a
>> get_b() as b
> > >I think it tastes too much like functional programming. Specifically: > >It forces you to think backwards: it does not evaluate the lines in the order they read. I think you miss the point, which is understandable considering such a short example, but consider if there are many many steps used to arrive at the variable. You can give it a simple, descriptive name and nobody needs to see how you got it unless they want to. Or a situation where you have many many variables in the with statement and it is possible to understand from their names what they represent. i.e.
sea = water() with (get_temperature(sea) as temp,
    get_depth(sea) as depth,
    get_purity(sea) as purity,
    get_salinity(sea) as salinity,
    get_size(sea) as size,
    #etc for a few more lines
    get_density(sea) as density):
    sea_num = temp**depth + purity - salinity * size - density # one line only

is much harder to understand than simply

sea_num = temp**depth + purity - salinity * size - density with: # one line only
    get_temperature(sea) as temp,
    get_depth(sea) as depth,
    get_purity(sea) as purity,
    get_salinity(sea) as salinity,
    get_size(sea) as size,
    #etc for a few more lines
    get_density(sea) as density

The meaning of all the words is obvious (assuming you know that sea_num has to do with the sea and water), and since it is you do not really need to have all that clutter up above and can just put it below. > On Tue, Jul 20, 2010 at 3:15 PM, MRAB wrote: > Alex Light wrote: > >> I would suggest overloading the 'with', 'as' combo >> >>> >> c = sqrt(a*a + b*b) with: >> >>> get_a(), get_b() as a, b >> >>> >> or >> >>> >> >> c = sqrt(a*a + b*b) with: >> >>> get_a() as a >> >>> get_b() as b >> >> Why use 'as'? Why not: > >> > > c = sqrt(a*a + b*b) with: > >> a = get_a() > >> b = get_b() > >> > which is like: > >> > def _(): > >> a = get_a() > >> b = get_b() > >> > return sqrt(a*a + b*b) > >> > c = _() > >> del _ > >> > >> it reads well and plus it follows since this statement acts similarly to a >> regular with and as statement with the behavior of the context manager being >> (in pseudocode): set up: store original value of variable if any and set >> variable to new value.
>> >>> tear down: set value back to original or delete from local namespace if >> it never had one >> >>> >> additionally we do not need to introduce any new keywords >> >>> >> any way that this is implemented though it would be quite useful >> >>> >> On Tue, Jul 20, 2010 at 2:35 PM, George Sakkis > george.sakkis at gmail.com>> wrote: >> >>> >> On Tue, Jul 20, 2010 at 8:29 PM, Daniel Stutzbach >> >>> > >>> > wrote: >> >>> >> > On Tue, Jul 20, 2010 at 1:16 PM, Guido van Rossum >> >>> > wrote: >> >>> >> >> >>> >> Given the large number of Python users familiar with SQL compared >> to >> >>> >> those familiar with Haskell, I think we'd do wise to pick a >> >>> different >> >>> >> keyword instead of 'where'. I can't think of one right now though. >> >>> > >> >>> > Taking a cue from mathematics, how about "given"? >> >>> > >> >>> > c = sqrt(a*a + b*b) given: >> >>> > a = retrieve_a() >> >>> > b = retrieve_b() >> >>> >> Or if we'd rather overload an existing keyword than add a new one, >> >>> "with" reads well too. >> >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Tue Jul 20 22:15:29 2010 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jul 2010 22:15:29 +0200 Subject: [Python-ideas] fancy indexing In-Reply-To: References: Message-ID: <19309b1fb49ecc37281b810bbf7af682.squirrel@webmail.uio.no> > [changing the subject; was: 'where' statement in Python?] > a[[x]] = y > y = a[[x]] > > which call __setitems__ and __getitems__ respectively. This makes it clear > that something different is going on and eliminates the ambiguity for > dicts. Or use the * operator used to expand tuples for function calls:

a[*x] = y
y = a[*x]

analogous to foobar(*x). The intent would be the same. S.
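The list-oriented semantics being tossed around in this subthread can already be sketched in pure Python, with no new syntax, by overriding the subscript hooks; `FancyList` below is a hypothetical name used only for illustration, and the "list of indices" convention is just one possible spelling:

```python
# A minimal sketch of "fancy indexing" for plain lists.
# FancyList is a hypothetical illustration, not a real proposal.
class FancyList(list):
    def __getitem__(self, index):
        # A list of indices selects several items, in any order,
        # possibly repeated.
        if isinstance(index, list):
            return FancyList(list.__getitem__(self, i) for i in index)
        return list.__getitem__(self, index)

    def __setitem__(self, index, value):
        # A list of indices assigns element-wise; the length of
        # the sequence never changes.
        if isinstance(index, list):
            for i, v in zip(index, value):
                list.__setitem__(self, i, v)
        else:
            list.__setitem__(self, index, value)

a = FancyList([10, 20, 30, 40])
print(a[[1, 2, 1]])    # [20, 30, 20]
a[[0, 3]] = ['x', 'y']
print(a)               # ['x', 20, 30, 'y']
```

This sidesteps the ambiguity for dicts and sets mentioned above, since only list subscripts are given the new meaning, at the cost of changing the meaning of an (admittedly rare) existing construct.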
From merwok at netwok.org Tue Jul 20 23:07:06 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Tue, 20 Jul 2010 23:07:06 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4C460FFA.7000303@netwok.org> > Or allow nested list comprehensions? > > [x for (x,y) in [x,f(x) for x in another_list] if y < 5] This already works, with a syntax correction:

[x for (x, y) in [(x, f(x)) for x in another_list] if y < 5]

Regards From ncoghlan at gmail.com Tue Jul 20 23:07:57 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Jul 2010 07:07:57 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Jul 21, 2010 at 12:57 AM, Andrey Popp <8mayday at gmail.com> wrote: > Hello, > > PEP-3150 is awesome, just a small addition - why not to allow > one-liners `where`s: > >     a = (b, b) where b = 43 > > And that also make sense for generator/list/set/dict comprehensions: > >     mylist = [y for y in another_list if y < 5 where y = f(x)] As with any other suite, a one-liner would be allowed on the same line as the colon:

a = (b, b) where: b = call_once_only()

mylist = [y for y in another_list if y < x] where: x = call_once_only()

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Tue Jul 20 23:59:07 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 20 Jul 2010 17:59:07 -0400 Subject: [Python-ideas] first() and last() tests in for x in y loops In-Reply-To: <4c45d6e1.5429e70a.5067.ffffd4b5@mx.google.com> References: <4c4365f1.5429e70a.4ba3.ffff8601@mx.google.com> <4c45d6e1.5429e70a.5067.ffffd4b5@mx.google.com> Message-ID: On 7/20/2010 1:03 PM, et gmail wrote: > I could see the code looking something like this
> for item in List:
> if __first__:
> print "we are in the first loop"
> doSomething()
> if __last__ is False:
> print ","
> > Sorry if the formatting is a little off. Do not use tabs when posting code. > Does something like this already exist? I am posting an extended answer to "How to treat the first or last item differently" on python-list (gmane.comp.python.general) so that others can see and find it. -- Terry Jan Reedy From ncoghlan at gmail.com Wed Jul 21 00:13:29 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Jul 2010 08:13:29 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> <4C45F5B8.6080606@mrabarnett.plus.com> Message-ID: On Wed, Jul 21, 2010 at 6:13 AM, Alex Light wrote: > > > > MRAB wrote: > >> Why use 'as'? Why not: > > I would use as because this whole where clause acts very similarly to a > context manager in that it sets a variable to a value for a small block No, the idea is for the indented suite to be a perfectly normal suite of Python code. We want to be able to define functions, classes, etc in there. Inventing a new mini-language specifically for these clauses would be a bad idea (and make them unnecessarily hard to understand). For the record, I've updated the PEP* based on the discussion in this thread (I switched to "given" as the draft keyword due to the Haskell/SQL semantic confusion for the "where" keyword - we've had that discussion before, I just forgot about it last night when putting the PEP together) *Diff here: http://svn.python.org/view/peps/trunk/pep-3150.txt?r1=82992&r2=83002 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Wed Jul 21 00:28:03 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 Jul 2010 00:28:03 +0200 Subject: [Python-ideas] 'where' statement in Python?
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20100721002803.0d10def2@pitrou.net> On Tue, 20 Jul 2010 23:27:49 +1000 Nick Coghlan wrote: > > For the record, I am personally +1 on the idea (otherwise I wouldn't > have put so much thought into it over the years). It's just a *lot* > harder to define complete and consistent semantics for the concept > than people often realise. > > However, having the question come up twice within the last month > finally inspired me to write the current status of the topic down in a > deferred PEP: http://www.python.org/dev/peps/pep-3150/ I am worried that this complexifies Python syntax without any obvious benefit in terms of expressive power, new abstractions, or concision. There is a benefit (learning curve, readability of foreign code) to a simple syntax. I am somewhere between -0.5 and -1. Regards Antoine. From pyideas at rebertia.com Wed Jul 21 00:30:43 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Tue, 20 Jul 2010 15:30:43 -0700 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> <4C45F5B8.6080606@mrabarnett.plus.com> Message-ID: On Tue, Jul 20, 2010 at 3:13 PM, Nick Coghlan wrote: > For the record, I've updated the PEP* based on the discussion in this > thread (I switched to "given" as the draft keyword due to the > Haskell/SQL semantic confusion for the "where" keyword - we've had > that discussion before, I just forgot about it last night when putting > the PEP together) > > *Diff here: http://svn.python.org/view/peps/trunk/pep-3150.txt?r1=82992&r2=83002 Nitpicking: Could the example code snippets please be made PEP 8 compliant (in particular, use 4-space indents)? They currently wobble between 2, 3, and 4-space indents, and the readability of Alex Light's example in particular is diminished by the use of only 2-space indents.
Cheers, Chris From ncoghlan at gmail.com Wed Jul 21 00:38:07 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Jul 2010 08:38:07 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C45DA81.5020009@molden.no> <4C45F5B8.6080606@mrabarnett.plus.com> Message-ID: On Wed, Jul 21, 2010 at 8:30 AM, Chris Rebert wrote: > On Tue, Jul 20, 2010 at 3:13 PM, Nick Coghlan wrote: > >> For the record, I've updated the PEP* based on the discussion in this >> thread (I switched to "given" as the draft keyword due to the >> Haskell/SQL semantic confusion for the "where" keyword - we've had >> that discussion before, I just forgot about it last night when putting >> the PEP together) >> >> *Diff here: http://svn.python.org/view/peps/trunk/pep-3150.txt?r1=82992&r2=83002 > > Nitpicking: > Could the example code snippets please be made PEP 8 compliant (in > particular, use 4-space indents)? Done. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Wed Jul 21 00:48:06 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 Jul 2010 00:48:06 +0200 Subject: [Python-ideas] 'where' statement in Python? References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> Message-ID: <20100721004806.48858225@pitrou.net> On Wed, 21 Jul 2010 00:28:03 +0200 Antoine Pitrou wrote: > > I am worried that this complexifies Python syntax without any obvious > benefit in terms of expressive power, new abstractions, or concision. > There is a benefit (learning curve, readability of foreign code) to a > simple syntax.
I'll add another issue:

- currently, lexical blocks (indentation following a colon) are used for control flow statements; this proposal blurs the line and makes visual inspection less reliable

I also disagree with the rationale which states that the motivation is similar to that for decorators or list comprehensions. Decorators and list comprehensions add value by making certain constructs more concise and more readable (by allowing to express the construct at a higher level through the use of detail-hiding syntax); as for decorators, they also eliminate the need for repeating oneself. Both have the double benefit of allowing shorter and higher-level code. The "given" syntax (I don't know what to call it: statement? postfix? appendage?), however, brings none of these benefits: it is almost pure syntactic sugar, and one which doesn't bring any lexical compression since it actually increases code size, rather than decrease it. Regards Antoine. From grosser.meister.morti at gmx.net Wed Jul 21 00:51:23 2010 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Wed, 21 Jul 2010 00:51:23 +0200 Subject: [Python-ideas] fancy indexing In-Reply-To: References: Message-ID: <4C46286B.3030005@gmx.net> I'm not sure what this is about but do you mean something like this?

>>> l=[1,2,3,4]
>>> l[1:2] = ['a','b']
>>> l
[1, 'a', 'b', 3, 4]

On 07/20/2010 09:17 PM, Bruce Leban wrote: > [changing the subject; was: 'where' statement in Python?] > > I think this is an interesting idea (whether worth adding is a different question). I think it would > be confusing that > a[x] = (y,z) > does something entirely different when x is 1 or (1,2). If python *were* to add something like this, > I think perhaps a different syntax should be considered: > > a[[x]] = y > y = a[[x]] > > which call __setitems__ and __getitems__ respectively. This makes it clear that something different > is going on and eliminates the ambiguity for dicts.
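For contrast with the slice-assignment example above: slice assignment rewrites one contiguous range and may change the list's length, while the element-wise behaviour under discussion never changes the length. A small sketch in plain Python — `fancy_set` is a hypothetical helper, not a proposed API:

```python
# Slice assignment: replaces a *contiguous* range and may change length.
l = [1, 2, 3, 4]
l[1:2] = ['a', 'b']          # one slot replaced by two items
print(l)                     # [1, 'a', 'b', 3, 4]

# "Fancy" assignment: arbitrary index positions, element-wise,
# length unchanged. Emulated here with a helper loop.
def fancy_set(seq, indices, values):
    for i, v in zip(indices, values):
        seq[i] = v

a = [1, 2, 3, 4]
fancy_set(a, [0, 2], ['a', 'b'])
print(a)                     # ['a', 2, 'b', 4]
```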
From guido at python.org Wed Jul 21 01:08:18 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Jul 2010 00:08:18 +0100 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: <20100721004806.48858225@pitrou.net> References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Tue, Jul 20, 2010 at 11:48 PM, Antoine Pitrou wrote: > On Wed, 21 Jul 2010 00:28:03 +0200 > Antoine Pitrou wrote: >> >> I am worried that this complexifies Python syntax without any obvious >> benefit in terms of expressive power, new abstractions, or concision. >> There is a benefit (learning curve, readibility of foreign code) to a >> simple syntax. > > I'll add another issue: > > - currently, lexical blocks (indentation following a colon) are used > ?for control flow statements; this proposal blurs the line and makes > ?visual inspection less reliable This is indeed a bit of a downside; if you see blah blah blah: x = blah y = blah you will have to look more carefully at the end of the first blah blah blah line to know whether the indented block is executed first or last. For all other intended blocks, the *beginning* of the indented block is your clue (class, def, if, try, etc.). > I also disagree with the rationale which states that the motivation > is similar to that for decorators or list comprehensions. Decorators > and list comprehensions add value by making certain constructs more > concise and more readable (by allowing to express the construct at a > higher level through the use of detail-hiding syntax); as for > decorators, they also eliminate the need for repeating oneself. Both > have the double benefit of allowing shorter and higher-level code. I see a similar possibility as for decorators, actually. A decorator is very simple syntactic sugar too, but it allows one to emphasize the decoration by putting it up front rather than hiding it after the (possibly very long) function. 
The 'given' block has a similar effect of changing the order in which the (human) reader encounters things: it lets you see the important part first, e.g. c = sqrt(a*a + b*b) and put the definitions for a and b off till later. This can both help the author "stay in the flow" and emphasize the most important parts for the reader, similar to top-down programming using forward references between functions or methods. I personally use top-down just about evenly with bottom-up (though in different circumstances), and I think it would be useful to have more support for top-down coding at the statement level. That's why ABC had refinements, too. I dropped them from Python because I had to cut down the design as much as possible to make it possible to implement in a reasonable time as a skunkworks project. But I always did like them in ABC. Whether this is enough to compensate for the larger grammar is an open question. > The "given" syntax (I don't know how to call it: statement? postfix? > appendage?), however, brings none of these benefits: it is almost pure > syntactic sugar, and one which doesn't bring any lexical compression > since it actually increases code size, rather than decrease it. But it decreases exposure by limiting the scope of the variables defined in the 'given' block, just like generator expressions and (in Python 3) list comprehensions do. In larger functions it is easy to accidentally reuse variables and this occasionally introduces bugs (the most common case probably being the loop control variable for a small (often inner) loop overwriting a variable that is set far above the loop in the same scope but is used far below it). -- --Guido van Rossum (python.org/~guido) From cmjohnson.mailinglist at gmail.com Wed Jul 21 01:38:41 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. 
Johnson) Date: Tue, 20 Jul 2010 13:38:41 -1000 Subject: [Python-ideas] fancy indexing In-Reply-To: <4C46286B.3030005@gmx.net> References: <4C46286B.3030005@gmx.net> Message-ID: Does this need new syntax? Couldn't it just be a method? Perhaps .where()? ;-) From bruce at leapyear.org Wed Jul 21 01:43:13 2010 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 20 Jul 2010 16:43:13 -0700 Subject: [Python-ideas] fancy indexing In-Reply-To: <4C46286B.3030005@gmx.net> References: <4C46286B.3030005@gmx.net> Message-ID:

    x = a[[y]]
would be approximately equivalent to
    x = [a[i] for i in y]
and
    a[[x]] = y
would be approximately equivalent to
    for (i,j) in zip(x,y):
        a[i] = j

except that zip throws away excess values in the longer sequence and I think [[..]] would throw an exception. --- Bruce http://www.vroospeak.com http://google-gruyere.appspot.com On Tue, Jul 20, 2010 at 3:51 PM, Mathias Panzenböck < grosser.meister.morti at gmx.net> wrote: > I'm not sure what this is about but do you mean something like this? > >>> l=[1,2,3,4] > >>> l[1:2] = ['a','b'] > >>> l > [1, 'a', 'b', 3, 4] > > > On 07/20/2010 09:17 PM, Bruce Leban wrote: > >> [changing the subject; was: 'where' statement in Python?] >> >> I think this is an interesting idea (whether worth adding is a different >> question). I think it would >> be confusing that >> a[x] = (y,z) >> does something entirely different when x is 1 or (1,2). If python *were* >> to add something like this, >> I think perhaps a different syntax should be considered: >> >> a[[x]] = y >> y = a[[x]] >> >> which call __setitems__ and __getitems__ respectively. This makes it clear >> that something different >> is going on and eliminates the ambiguity for dicts. >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sturla at molden.no Wed Jul 21 02:03:01 2010 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jul 2010 02:03:01 +0200 Subject: [Python-ideas] fancy indexing In-Reply-To: <4C46286B.3030005@gmx.net> References: <4C46286B.3030005@gmx.net> Message-ID: <4C463935.80709@molden.no> Den 21.07.2010 00:51, skrev Mathias Panzenböck: > I'm not sure what this is about but do you mean something like this? > >>> l=[1,2,3,4] > >>> l[1:2] = ['a','b'] > >>> l > [1, 'a', 'b', 3, 4] No, that is slicing. A fancy index is a more flexible slice, as it has no regular structure. It's just a list, tuple or array of indexes, in arbitrary order, possibly repeated. It would e.g. work like this:

>>> alist = [1,2,3,4]
>>> alist[(1,2,1,1,3)]
[2, 3, 2, 2, 4]

If you know SQL, it means that you can do with indexing what SQL can do with WHERE and JOIN. You can e.g. search a list in O(N) for indexes where a certain condition evaluates to True (cf. SQL WHERE), and then apply these indexes to any list (cf. SQL JOIN). It is not just for queries, but also for things like sorting. It is what lets NumPy have an "argsort" function. It does not return a sorted array, but an array of indices, which, when applied to the array, will return a sorted instance. These indices can in turn be applied to other arrays as well. Think about what happens when you sort each row in an Excel spreadsheet by the values in a certain column. One column is sorted, the other columns are reordered synchronously. That is the kind of thing that fancy indexing allows us to do rather easily. Yes, there are other ways of doing this in Python now, but not as elegant, I think. And it is not a syntax change to Python (NumPy can do it), it is just a library issue. This is at least present in NumPy, MATLAB, C# and LINQ, SQL, Fortran 95 (in two ways), Scilab, Octave, and C++ (e.g. Blitz++). The word "fancy indexing" is the name used for it in NumPy.
Sturla From cmjohnson.mailinglist at gmail.com Wed Jul 21 02:09:16 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Tue, 20 Jul 2010 14:09:16 -1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: Questions: 1.) It looks like a lot of the complexity of PEP 3150 is based on wanting things like this to work:

x[index] = 42 given:
    index = complicated_formula

To make that work, you need to figure out if index is a nonlocal or a global or what in order to emit the right bytecode. What happens if we just give up that use case and say that anything on the assignment side of the initial = gets looked up in the original namespace? In other words, make 3150 more similar to the sugar:

def _():
    index = complicated_formula

x[index] = _() # Probably a NameError

Would the complexity of PEP 3150 be significantly lessened by that? Or are there other major sources of complexity in the local/nonlocal/global issue? 2.) What happens in this case:

x = y given:
    return "???"

Do we just disallow return inside a given? If so, how would the parser know to allow you to do a def inside a given? -- Carl Johnson From sturla at molden.no Wed Jul 21 02:15:15 2010 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jul 2010 02:15:15 +0200 Subject: [Python-ideas] fancy indexing In-Reply-To: References: <4C46286B.3030005@gmx.net> Message-ID: <4C463C13.2000107@molden.no> Den 21.07.2010 01:38, skrev Carl M. Johnson: > Does this need new syntax? Couldn't it just be a method? Perhaps .where()? ;-) > It is just a library issue. And adding it would not break anything, because lists and tuples don't accept iterables as indexers now. The problem is the dict and the set, which can take tuples as index. A .where() method would work, if it e.g. took a predicate as argument. But we would still need to pass the return value (e.g.
a tuple) to the [] operator. That is all legal syntax today (which is why NumPy can do this), but lists are implemented to only accept integers to __setitem__ and __getitem__. Sturla From pyideas at rebertia.com Wed Jul 21 02:24:41 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Tue, 20 Jul 2010 17:24:41 -0700 Subject: [Python-ideas] fancy indexing In-Reply-To: References: <4C46286B.3030005@gmx.net> Message-ID: On Tue, Jul 20, 2010 at 4:43 PM, Bruce Leban wrote: >     x = a[[y]] > would be approximately equivalent to >     x = [a[i] for i in y] You realize that syntax /already/ has a valid meaning in Python, right? Namely, using a single-element list as a subscript:

>>> class Foo(object):
...     def __getitem__(self, index):
...         print "Subscript:", index
...
>>> a = Foo()
>>> y = 42
>>> x = a[[y]]
Subscript: [42]
>>> # hey, whaddya know!

Making this syntax do something else would lead to some surprising inconsistencies to say the least; albeit I don't know how common it is to use lists as subscripts. Cheers, Chris -- http://blog.rebertia.com From sturla at molden.no Wed Jul 21 02:51:53 2010 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jul 2010 02:51:53 +0200 Subject: [Python-ideas] Add faster locks to the threading module? Message-ID: <4C4644A9.3040502@molden.no> Thread synchronization with threading.Lock can be expensive. But consider this: Why should the active thread need to acquire a mutex, when it already holds one? That would be the GIL. Instead of acquiring a lock (and possibly inducing thread switching etc.), it could just deny the other threads access to the GIL for a while. The cost of that synchronization method would be completely amortized, as check intervals happen anyway.
Here is what a very naïve implementation would look like in ctypes (real code could use C instead, or perhaps should not attempt this at all...):

from contextlib import contextmanager
import ctypes

_Py_Ticker = ctypes.c_int.in_dll(ctypes.pythonapi,"_Py_Ticker")

@contextmanager
def threadsafe():
    tmp = _Py_Ticker.value
    _Py_Ticker.value = 0x7fffffff
    yield
    _Py_Ticker.value = tmp

Now we can do this:

with threadsafe():
    # the GIL is mine,
    # for as long as I want
    pass

The use case for this "gillock" is about the same as for a spinlock in C. We want synchronization for a brief period of time, but don't want the overhead of acquiring a mutex. In Python this gillock has one big advantage over a spinlock: We don't have to wait, so we don't risk a thread switch on __enter__/acquire. But there can be only one instance of this lock, as there is only one GIL. That is the drawback compared to a spinlock. Therefore I think both a spinlock and a gillock should be added to the threading module. These are synchronization methods that should be available. P.S. A gillock working like the ctypes code above is of course very dangerous. If a C extension releases the GIL while _Py_Ticker is astronomic, we have a very bad situation... But real code could try to safeguard against this. E.g. Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS could be defined to do nothing if _Py_Ticker is above some ridiculous threshold. P.P.S. Yes I know about the newgil. But I have not thought about how to achieve similar effect with that. Sturla -------------- next part -------------- An HTML attachment was scrubbed... URL: From jnoller at gmail.com Wed Jul 21 02:57:54 2010 From: jnoller at gmail.com (Jesse Noller) Date: Tue, 20 Jul 2010 20:57:54 -0400 Subject: [Python-ideas] Add faster locks to the threading module? In-Reply-To: <4C4644A9.3040502@molden.no> References: <4C4644A9.3040502@molden.no> Message-ID: On Tue, Jul 20, 2010 at 8:51 PM, Sturla Molden wrote: > [...snip...] > P.P.S. Yes I know about the newgil.
But I have not thought about how to achieve similar effect with that. > > Sturla Regardless of the rest of the proposal (which I'm not keen on; I don't want to rely on the GIL "being there") you need to factor the newgil code into this equation as a change like this would only land in the 3.x branch. With 2.7 out the door - 3.x is, for all effective purposes, trunk. jesse From sturla at molden.no Wed Jul 21 02:59:58 2010 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jul 2010 02:59:58 +0200 Subject: [Python-ideas] Add faster locks to the threading module? In-Reply-To: <4C4644A9.3040502@molden.no> References: <4C4644A9.3040502@molden.no> Message-ID: <4C46468E.603@molden.no> Den 21.07.2010 02:51, skrev Sturla Molden: > > Therefore I think both a spinlock and a gillock should be added to the > threading module. These are synchronization methods that should be > available. > Actually, a spinlock would probably not even be feasible in Python: The GIL is a mutex, and we would have to give it up before we could spin on the lock, and reacquire afterwards. So the cost of a kernel mutex object is still there. The gillock is possibly the only way of getting a fast lock object in Python. Sturla -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Wed Jul 21 03:08:26 2010 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jul 2010 03:08:26 +0200 Subject: [Python-ideas] Add faster locks to the threading module? In-Reply-To: References: <4C4644A9.3040502@molden.no> Message-ID: <4C46488A.7010304@molden.no> Den 21.07.2010 02:57, skrev Jesse Noller: > > Regardless of the rest of the proposal (which I'm not keen on; I don't > want to rely on the GIL "being there") you need to factor the newgil > code into this equation as a change like this would only land in the > 3.x branch. With 2.7 out the door - 3.x is, for all effective
> > Yes, but the principle of denying other threads access to the GIL for fast synchronization still applies to 3.x. Java has "synchronized" blocks too. This is not very different. But monopolizing the GIL is much faster than a lock if the purpose is just to guard a tiny piece of code. Sturla From sturla at molden.no Wed Jul 21 03:40:28 2010 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jul 2010 03:40:28 +0200 Subject: [Python-ideas] Add faster locks to the threading module? In-Reply-To: References: <4C4644A9.3040502@molden.no> Message-ID: <4C46500C.5010101@molden.no> Den 21.07.2010 02:57, skrev Jesse Noller: > Regardless of the rest of the proposal (which I'm not keen on; I don't > want to rely on the GIL "being there") you need to factor the newgil > code into this equation as a change like this would only land in the > 3.x branch. With 2.7 out the door - 3.x is, for all effective > purposes, trunk. > > I have looked briefly at it now. It seems to be just as easy with newgil, and possibly much safer. The functions drop_gil and take_gil in ceval_gil.h could e.g. be modified to just return and do nothing if a global deny flag is set. But I have to look more carefully at this later on. Sturla From jnoller at gmail.com Wed Jul 21 03:58:42 2010 From: jnoller at gmail.com (Jesse Noller) Date: Tue, 20 Jul 2010 21:58:42 -0400 Subject: [Python-ideas] Add faster locks to the threading module? In-Reply-To: <4C46500C.5010101@molden.no> References: <4C4644A9.3040502@molden.no> <4C46500C.5010101@molden.no> Message-ID: On Tue, Jul 20, 2010 at 9:40 PM, Sturla Molden wrote: > Den 21.07.2010 02:57, skrev Jesse Noller: >> >> Regardless of the rest of the proposal (which I'm not keen on; I don't >> want to rely on the GIL "being there") you need to factor the newgil >> code into this equation as a change like this would only land in the >> 3.x branch. With 2.7 out the door - 3.x is, for all effective >> purposes, trunk. >> >> > > I have looked briefly at it now. 
It seems to be just as easy with newgil, > and possibly much safer. The functions drop_gil and take_gil ?in ceval_gil.h > could e.g. be modified to just return and do nothing if a global deny flag > is set. But I have to look more carefully at this later on. > > Sturla > That's all I was trying to say :) Your original email said "Yes I know about the newgil. But I have not thought about how to achieve similar effect with that." I see what you're trying to do (impersonating the synchronized keyword java has) but I'm a little skeeved out by adding anything like this which is directly reliant on the GIL's existence. From scialexlight at gmail.com Wed Jul 21 04:50:29 2010 From: scialexlight at gmail.com (Alex Light) Date: Tue, 20 Jul 2010 22:50:29 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: Carl M. johnson wrote: >2.) What happens in this case: > >x = y given: > return "???" > >Do we just disallow return inside a given? If so, how would the parser >know to allow you to do a def inside a given? i think so because unless i am misunderstanding something the only allowed expressions in a 'given' block would be of the type: a_variable = an_expression however one way to think of it is that ans = do_something(a, b, c) given: a = get_a() b = get_b() c = get_c() is the same as saying a = get_a() b = get_b() c = get_c() ans = do_something(a, b, c) del a, b, c which would seem to indicate that a = do_something() given: return "???" goes to: return "???" a = do_something() would be valid (if extremely bad style) in the end i think its use in that way would be like the use of a GOTO statement in many languages, technically there is no reason it should not be allowed but still prohibited for stylistic reasons. anyway why would you ever want to do this? 
it would be like writing def f(): return return it is nonsensical and obviously an error On Tue, Jul 20, 2010 at 8:09 PM, Carl M. Johnson < cmjohnson.mailinglist at gmail.com> wrote: > Questions: > > 1.) It looks like a lot of the complexity of PEP 3150 is based on > wanting things like this to work: > > x[index] = 42 given: > index = complicated_formula > > To make that work, you need to figure out if index is a nonlocal or a > global or what in order to emit the right bytecode. What happens if we > just give up that use case and say that anything on the assignment > side of the initial = gets looked up in the original namespace? In > other words, make 3150 more similar to the sugar: > > def _(): > index = complicated_formula > > x[index] = _() #Probably a NameError > > Would the complexity of PEP 3150 be significantly lessened by that? Or > are there other major sources of complexity in the > local/nonlocal/global issue? > > 2.) What happens in this case: > > x = y given: > return "???" > > Do we just disallow return inside a given? If so, how would the parser > know to allow you to do a def inside a given? > > -- Carl Johnson > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas >
If so, how would the parser >>know to allow you to do a def inside a given? > i think so because unless i am misunderstanding something the only allowed > expressions > in a 'given' block would be of the type: > a_variable = an_expression Incorrect. Yes, you are misunderstanding: On Tue, Jul 20, 2010 at 3:13 PM, Nick Coghlan wrote: > On Wed, Jul 21, 2010 at 6:13 AM, Alex Light wrote: >> i would use as because this whole where clause acts very similarly to a >> context manager in that it sets a variable to a value for a small block > > No, the idea is for the indented suite to be a perfectly normal suite > of Python code. We want to be able to define functions, classes, etc > in there. Inventing a new mini-language specifically for these clauses > would be a bad idea (and make them unnecessarily hard to understand) > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia Did you not read Nick's reply yet when you wrote this, or...? Cheers, Chris -- http://blog.rebertia.com From sturla at molden.no Wed Jul 21 05:59:11 2010 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jul 2010 05:59:11 +0200 Subject: [Python-ideas] Add faster locks to the threading module? In-Reply-To: References: <4C4644A9.3040502@molden.no> <4C46500C.5010101@molden.no> Message-ID: <4C46708F.30809@molden.no> Den 21.07.2010 03:58, skrev Jesse Noller: > I see what you're trying to do (impersonating the synchronized keyword > java has) but I'm a little skeeved out by adding anything like this > which is directly reliant on the GIL's existence. > > It is not reliant on the GIL. Sorry if you got that impression. In a GIL free world, a global spinlock would serve the same purpose (cf. Java). But as there is a GIL in Python, we cannot use a spinlock to avoid the overhead of a mutex. But we can temporarily hold on to the GIL instead, and achieve the same effect. This is very close to Java's synchronized keyword, yes.
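[On a current Python 3 (3.2+, with the newgil work), the effect Sturla describes can be roughly approximated from pure Python by raising the interpreter's thread switch interval. A sketch only -- this is not the C-level "gillock" being proposed, and it gives no protection across calls that block or release the GIL in C code:]

```python
import sys
from contextlib import contextmanager

@contextmanager
def gil_held(seconds=1000.0):
    # Approximate "keep the GIL for a little while": with a huge switch
    # interval, the eval loop will not voluntarily hand the GIL to
    # another thread while the block runs pure-Python code.
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        sys.setswitchinterval(old)

counter = 0
with gil_held():
    counter += 1  # tiny critical section, as recommended
```

[Note the same caveat Cameron raises below: any I/O or C extension call inside the block can still release the GIL, so only very small pure-Python sections are safe.]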
The main reason is that most synchronizations in multi-threaded apps are used to protect very small pieces of code. A mutex is overkill for that. Instead of: 1. release gil. 2. acquire lock. 3. re-acquire gil. 4. release lock. we could just: 1. keep the gil for a little while. Also note that in the single-threaded case, the overhead from this "synchronized" block would be close to zero. It would do nothing, except write to the address of a volatile int. Sturla From cs at zip.com.au Wed Jul 21 08:22:27 2010 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 21 Jul 2010 16:22:27 +1000 Subject: [Python-ideas] Add faster locks to the threading module? In-Reply-To: <4C4644A9.3040502@molden.no> References: <4C4644A9.3040502@molden.no> Message-ID: <20100721062226.GA12223@cskk.homeip.net> On 21Jul2010 02:51, Sturla Molden wrote: | Thread synchronization with threading.Lock can be expensive. But | consider this: Why should the active thread need to acquire a mutex, | when it already holds one? That would be the GIL. | | Instead of acquiring a lock (and possibly inducing thread switching | etc.), it could just deny the other threads access to the GIL for a | while. The cost of that synchronization method would be completely | amortized, as check intervals happen anyway. [...] | The use case for this "gillock" is about the same as for a spinlock in | C. We want synchronization for a brief period of time, but don't | want the overhead of acquiring a mutex. But a spinlock _does_ acquire a mutex, unless I misremember. It is just that instead of using a more expensive mutex mechanism that deschedules the caller until available it sits there banging its head against a very lightweight (not descheduling) mutex until it gets it; the efficiency is entirely reliant on _all_ users of the spinlock doing very little inside the locked period. And therein lies one of my two misgivings about this idea: users of this must be very careful at all times to do Very Little inside the lock.
Just like a spinlock. This seems very easy to misuse. The other misgiving is one you've already mentioned in passing: P.S. A gillock working like the ctypes code above is of course very dangerous. If a C extension releases the GIL while _Py_Ticker is astronomic, we have a very bad situation... But real code could try to safeguard against this. E.g. Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS could be defined to do nothing if _Py_Ticker is above some ridiculous threshold. Suppose someone goes: with threadsafe(): x = fp.read(1) and the read blocks? It looks like trivial code but I think the I/O stuff will release the GIL over a read. Now you're not safe any more. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ The ZZR-1100 is not the bike for me, but the day they invent "nerf" roads and ban radars I'll be the first in line......AMCN From 8mayday at gmail.com Wed Jul 21 08:23:40 2010 From: 8mayday at gmail.com (Andrey Popp) Date: Wed, 21 Jul 2010 10:23:40 +0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: <20100721004806.48858225@pitrou.net> References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: Hello, On Wed, Jul 21, 2010 at 2:48 AM, Antoine Pitrou wrote: > I'll add another issue: > > - currently, lexical blocks (indentation following a colon) are used > for control flow statements; this proposal blurs the line and makes > visual inspection less reliable Do class definitions or with-statements represent control flow structures? I think, no (with-statement maybe). > I also disagree with the rationale which states that the motivation > is similar to that for decorators or list comprehensions.
Decorators > and list comprehensions add value by making certain constructs more > concise and more readable (by allowing one to express the construct at a > higher level through the use of detail-hiding syntax); as for > decorators, they also eliminate the need for repeating oneself. Both > have the double benefit of allowing shorter and higher-level code. Consider the following: ... value = a*x*x + b*x + c given: a = compute_a() b = compute_b() c = compute_c() ... which is roughly equivalent to ... a = compute_a() b = compute_b() c = compute_c() value = a*x*x + b*x + c ... with two differences: - It emphasizes that `value` is a target of this computation and `a`, `b` and `c` are just auxiliary. - It states that `a`, `b` and `c` are only used in the statement before the `given` keyword, which would help future refactorings. Due to the second point, it can't be considered as syntactic sugar. Is it more readable? I think yes. -- Andrey Popp phone: +7 911 740 24 91 e-mail: 8mayday at gmail.com From stephen at xemacs.org Wed Jul 21 11:31:13 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 21 Jul 2010 18:31:13 +0900 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: <87sk3dxjtq.fsf@uwakimon.sk.tsukuba.ac.jp> Andrey Popp writes: > Do class definitions or with-statements represent control flow > structures? I think, no (with-statement maybe). In both cases, yes. From the point of view of the programmer writing the controlled suite, they're very simple control structures (serial execution). True, the important thing that a class definition does is set up a special configuration of namespaces in which the suite is executed (eg, instead of def registering a function in the global namespace, it registers it in the class).
Nevertheless, it does determine control flow, and the statements in a class "declaration" are executed at runtime, not at compile time. The with statement is a nontrivial control flow structure: it ensures that certain code is executed at certain times, although the code that it provides guarantees for is not in the explicit suite it controls. Antoine's point here is not that "given" isn't a control flow *structure*. It is that it is not a *statement*, but rather a fragment that can be added to a wide variety of statements. That is quite a major departure for Python, though I think it a natural one in this context. Antoine might find execute: value = a*x*x + b*x + c given: a = compute_a() b = compute_b() c = compute_c() less objectionable on those grounds. Ie, it is now a proper control statement. I hasten to add that I think this syntax is quite horrible for *other* reasons, not least needing to find two keywords. Worst, "given" advocates want the computation of a, b, and c subordinated lexically to computation of value, but here they are on the same level. > Consider the following: > > ... > value = a*x*x + b*x + c given: > a = compute_a() > b = compute_b() > c = compute_c() > ... > > which is roughly equivalent to > > ... > a = compute_a() > b = compute_b() > c = compute_c() > value = a*x*x + b*x + c > ... > > with two differences: > > - It emphasizes that `value` is a target of this computation and `a`, > `b` and `c` are just auxiliary. > - It states that `a`, `b` and `c` are only used in statement, before > the `given` keyword, that would help future refactorings. > > Due to the second point, it can't be considered as syntactic sugar. But the second point doesn't prove you can't get the same semantics, only that a naive implementation fails. 
If you want to ensure that `a`, `b` and `c` are only used in a limited scope, there's always def compute_quadratic(x): a = compute_a() b = compute_b() c = compute_c() return a*x*x + b*x + c value = compute_quadratic(x) del compute_quadratic Now *that* is quite ugly (and arguably unreadable). But it shows that there are other ways of obtaining the same semantics in Python, and thus "given" is syntactic sugar (at least for this use case). From stefan_ml at behnel.de Wed Jul 21 11:48:33 2010 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 21 Jul 2010 11:48:33 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Nick Coghlan, 20.07.2010 15:27: > having the question come up twice within the last month > finally inspired me to write the current status of the topic down in a > deferred PEP: http://www.python.org/dev/peps/pep-3150/ Thanks for writing that up. I like the idea in general. As for input from the "major Python implementations", we currently do similar things internally in Cython for optimisation purposes, so a syntactic 'where' clause with expression-local scope would be trivial to implement on our side as most of the infrastructure is there anyway. Stefan From solipsis at pitrou.net Wed Jul 21 12:04:46 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 Jul 2010 12:04:46 +0200 Subject: [Python-ideas] Add faster locks to the threading module? References: <4C4644A9.3040502@molden.no> Message-ID: <20100721120446.045eaae2@pitrou.net> On Wed, 21 Jul 2010 02:51:53 +0200 Sturla Molden wrote: > > Thread synchronization with threading.Lock can be expensive. But > consider this: Why should the active thread need to acquire a mutex, when > it already holds one? That would be the GIL. Do you have any data about the supposed cost of threading.Lock (or RLock)? There is no point in trying to optimize a perceived bottleneck if the bottleneck doesn't exist.
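[The kind of data being asked for here is easy to gather with the timeit module; a minimal sketch -- absolute numbers will of course vary by machine and Python version:]

```python
import timeit
import threading

lock = threading.Lock()

def acquire_release():
    # The operation under discussion: one uncontended lock round-trip.
    lock.acquire()
    lock.release()

def plain_call():
    # Baseline: the cost of a plain Python function call.
    pass

N = 100000
lock_cost = timeit.timeit(acquire_release, number=N) / N
call_cost = timeit.timeit(plain_call, number=N) / N

print("uncontended acquire+release: %.0f ns" % (lock_cost * 1e9))
print("plain function call:         %.0f ns" % (call_cost * 1e9))
```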
From solipsis at pitrou.net Wed Jul 21 12:18:55 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 Jul 2010 12:18:55 +0200 Subject: [Python-ideas] lock performance References: <4C4644A9.3040502@molden.no> Message-ID: <20100721121855.58da8bda@pitrou.net> On Wed, 21 Jul 2010 02:51:53 +0200 Sturla Molden wrote: > > Thread synchronization with threading.Lock can be expensive. But > consider this: Why should the active thread need to aquire a mutex, when > it already holds one? That would be the GIL. Ok, here is the cost of acquiring and releasing an uncontended lock under Linux, with Python 3.2: $ ./python -m timeit \ -s "from threading import Lock; l=Lock(); a=l.acquire; r=l.release" \ "a(); r()" 10000000 loops, best of 3: 0.127 usec per loop And here is the cost of calling a dummy Python function: $ ./python -m timeit -s "def a(): pass" "a(); a()" 1000000 loops, best of 3: 0.221 usec per loop And here is the cost of calling a trivial C function (which returns the False singleton): $ ./python -m timeit -s "a=bool" "a(); a()" 10000000 loops, best of 3: 0.164 usec per loop Also, note that using the lock as a context manager is actually slower, not faster as you might imagine: $ ./python -m timeit -s "from threading import Lock; l=Lock()" \ "with l: pass" 1000000 loops, best of 3: 0.242 usec per loop At least under Linux, there doesn't seem to be a lot of room for improvement in lock performance, to say the least. 
PS: RLock is now as fast as Lock: $ ./python -m timeit \ -s "from threading import RLock; l=RLock(); a=l.acquire; r=l.release" \ "a(); r()" 10000000 loops, best of 3: 0.114 usec per loop From solipsis at pitrou.net Wed Jul 21 12:26:41 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 Jul 2010 12:26:41 +0200 Subject: [Python-ideas] spinlocks "vs" mutexes References: <4C4644A9.3040502@molden.no> <20100721062226.GA12223@cskk.homeip.net> Message-ID: <20100721122641.202885d8@pitrou.net> On Wed, 21 Jul 2010 16:22:27 +1000 Cameron Simpson wrote: > On 21Jul2010 02:51, Sturla Molden wrote: > | Thread synchronization with threading.Lock can be expensive. But > | consider this: Why should the active thread need to acquire a mutex, > | when it already holds one? That would be the GIL. > | > | Instead of acquiring a lock (and possibly inducing thread switching > | etc.), it could just deny the other threads access to the GIL for a > | while. The cost of that synchronization method would be completely > | amortized, as check intervals happen anyway. > [...] > | The use case for this "gillock" is about the same as for a spinlock in > | C. We want synchronization for a brief period of time, but don't > | want the overhead of acquiring a mutex. > > But a spinlock _does_ acquire a mutex, unless I misremember. > > It is just that instead of using a more expensive mutex mechanism that > deschedules the caller until available it sits there banging its head > against a very lightweight (not descheduling) mutex until it gets it; > the efficiency is entirely reliant on _all_ users of the spinlock doing > very little inside the locked period. Not only that, but optimized OSes and libc's might do the same as an optimization for regular mutexes. For example, here's what the pthreads(7) man page says here: "thread synchronization primitives (mutexes, thread joining, etc.) are implemented using the Linux futex(2) system call."
I don't know exactly how futex(2) works, but it looks like a kind of low-level API allowing to build spinlocks and other fast userspace primitives. Regards Antoine. From scialexlight at gmail.com Wed Jul 21 12:35:42 2010 From: scialexlight at gmail.com (Alex Light) Date: Wed, 21 Jul 2010 06:35:42 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: nick coughlan wrote: > No, the idea is for the indented suite to be a perfectly normal suite > of Python code. We want to be able to define functions, classes, etc > in there. @Chris Robert sorry what i meant in saying that " a_variable = an_expression" is that, it seems to me, at least, the only allowed statements are ones where a variable is set to a value, which includes "class" and "def" (and some control flow, if, else etc.) also in the first post: Sergio Davis wrote: >I'm considering the following extension to Python's grammar: adding the 'where' keyword, which would work as follows: > >where_expr : expr 'where' NAME '=' expr On Tue, Jul 20, 2010 at 10:56 PM, Chris Rebert wrote: > On Tue, Jul 20, 2010 at 7:50 PM, Alex Light > wrote: > > Carl M. johnson wrote: > >>2.) What happens in this case: > >> > >>x = y given: > >> return "???" > >> > >>Do we just disallow return inside a given? If so, how would the parser > >>know to allow you to do a def inside a given? > > i think so because unless i am misunderstanding something the only > allowed > > expressions > > in a 'given' block would be of the type: > > a_variable = an_expression > > Incorrect. 
Yes, you are misunderstanding: > > On Tue, Jul 20, 2010 at 3:13 PM, Nick Coghlan wrote: > > On Wed, Jul 21, 2010 at 6:13 AM, Alex Light > wrote: > >> i would use as because this whole where clause acts very similarly to a > >> context manager in that it sets a variable to a value for a small block > > > > No, the idea is for the indented suite to be a perfectly normal suite > > of Python code. We want to be able to define functions, classes, etc > > in there. Inventing a new mini-language specifically for these clauses > > would be a bad idea (and make them unnecessarily hard to understand) > > > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > Did you not read Nick's reply yet when you wrote this, or...? > > Cheers, > Chris > -- > http://blog.rebertia.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Wed Jul 21 12:39:22 2010 From: masklinn at masklinn.net (Masklinn) Date: Wed, 21 Jul 2010 12:39:22 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On 2010-07-21, at 02:09 , Carl M. Johnson wrote: > Questions: > > 1.) It looks like a lot of the complexity of PEP 3150 is based on > wanting thing like this to work: > > x[index] = 42 given: > index = complicated_formula > > To make that work, you need to figure out if index is a nonlocal or a > global or what in order to emit the right bytecode. What happens if we > just give up that use case and say that anything on the assignment > side of the initial = gets looked up in the original namespace? In > other words, make 3150 more similar to the sugar: I quite agree with that, the where/given block/scope should only apply to the expression directly to the left of it. So only the RHS should be concerned, and LHS is out of that scope. 
And that expression would be written as: operator.setitem(x, index, 42) given: index = complicated_formula I think the first Torture Test block is misguided, and I'd be -0.5 on such a complex feature. From guido at python.org Wed Jul 21 12:53:19 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 21 Jul 2010 11:53:19 +0100 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Wed, Jul 21, 2010 at 11:39 AM, Masklinn wrote: > On 2010-07-21, at 02:09 , Carl M. Johnson wrote: >> Questions: >> >> 1.) It looks like a lot of the complexity of PEP 3150 is based on >> wanting things like this to work: >> >> x[index] = 42 given: >> index = complicated_formula >> >> To make that work, you need to figure out if index is a nonlocal or a >> global or what in order to emit the right bytecode. What happens if we >> just give up that use case and say that anything on the assignment >> side of the initial = gets looked up in the original namespace? In >> other words, make 3150 more similar to the sugar: Why do you think it's more complicated to do it for the LHS than for the RHS? > I quite agree with that, the where/given block/scope should only apply to the expression directly to the left of it. So only the RHS should be concerned, and LHS is out of that scope. > > And that expression would be written as: > > operator.setitem(x, index, 42) given: > index = complicated_formula Bah. > I think the first Torture Test block is misguided, and I'd be -0.5 on such a complex feature. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Wed Jul 21 13:07:39 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Jul 2010 21:07:39 +1000 Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Wed, Jul 21, 2010 at 8:53 PM, Guido van Rossum wrote: > On Wed, Jul 21, 2010 at 11:39 AM, Masklinn wrote: >> On 2010-07-21, at 02:09 , Carl M. Johnson wrote: >>> Questions: >>> >>> 1.) It looks like a lot of the complexity of PEP 3150 is based on >>> wanting things like this to work: >>> >>> x[index] = 42 given: >>> index = complicated_formula >>> >>> To make that work, you need to figure out if index is a nonlocal or a >>> global or what in order to emit the right bytecode. What happens if we >>> just give up that use case and say that anything on the assignment >>> side of the initial = gets looked up in the original namespace? In >>> other words, make 3150 more similar to the sugar: > > Why do you think it's more complicated to do it for the LHS than for the RHS? It's allowing more than simple names on the LHS that makes a naive return-based approach to name binding not work. However, it's class scopes that really make this complicated, and making it "just work" is probably easier than trying to explain which kinds of assignment will and won't work. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Wed Jul 21 13:16:37 2010 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 21 Jul 2010 13:16:37 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Terry Reedy, 20.07.2010 21:49: > I did not comment then because I thought the idea of cluttering python > with augmented local namespace blocks, with no functional gain, was > rejected and dead, and hence unnecessary of comment. > -10 > For me, the idea would come close to destroying (what remains of) the > simplicity that makes Python relatively easy to learn.
It seems to be > associated with the (to me, cracked) idea that names are pollution. Actually, it's about *giving* names to subexpressions, that's quite the opposite. > I agree with Jack Diederich: > >I think the "trick" to making it readable > > is putting the assignment first. > > > par_pos = decl.find('(') > > vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() > > > versus: > > > vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() where > > par_pos=decl.find('(') > > The real horror would come with multiple assignments with multiple and > nested where or whatever clauses. I agree that the placement *behind* the expression itself *can* be suboptimal, but then, we also have conditional expressions, where it's good to know when to use them and when they get too long to be readable. The same applies here. However, I take your point that this is nothing that really makes anything simpler or that potentially opens new use cases (like the 'with' statement did, for example). It's a plain convenience syntax and as such not really worth defending over potential draw-backs. Stefan From ncoghlan at gmail.com Wed Jul 21 13:20:22 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Jul 2010 21:20:22 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Wed, Jul 21, 2010 at 12:56 PM, Chris Rebert wrote: > On Tue, Jul 20, 2010 at 3:13 PM, Nick Coghlan wrote: >> On Wed, Jul 21, 2010 at 6:13 AM, Alex Light wrote: >>> i would use as because this whole where clause acts very similarly to a >>> context manager in that it sets a variable to a value for a small block >> >> No, the idea is for the indented suite to be a perfectly normal suite >> of Python code. We want to be able to define functions, classes, etc >> in there. 
Inventing a new mini-language specifically for these clauses >> would be a bad idea (and make them unnecessarily hard to understand) > > Did you not read Nick's reply yet when you wrote this, or...? Alex actually has a reasonable point here: break, continue, yield and return actually don't make sense in the top-level of the given clause (since it is conceptually all one statement). For break and continue, they will naturally give a SyntaxError with the proposed implementation (for "'break' outside loop" or "'continue' not properly in loop", just to be randomly inconsistent). yield and return (at the level of the given clause itself) will need to be disallowed explicitly by the compiler (similar to the "'return' outside function" and "'yield' outside function" errors you get if you attempt to use these keywords in a class or module scope). There are also some subtleties as to whether the given clause is compiled as a closure or not (my current thoughts are that it should be compiled as a closure when defined in a function scope, but like a class scope when defined in a class or module scope). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Wed Jul 21 13:24:55 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Jul 2010 21:24:55 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Jul 21, 2010 at 9:16 PM, Stefan Behnel wrote: > Terry Reedy, 20.07.2010 21:49: >> >> I did not comment then because I thought the idea of cluttering python >> with augmented local namespace blocks, with no functional gain, was >> rejected and dead, and hence unnecessary of comment. >> -10 >> For me, the idea would come close to destroying (what remains of) the >> simplicity that makes Python relatively easy to learn. It seems to be >> associated with the (to me, cracked) idea that names are pollution. 
> Actually, it's about *giving* names to subexpressions, that's quite the opposite. I think Terry's point was that you can already give names to subexpressions by assigning them to variables in the current scope, but some people object to that approach due to "namespace pollution". I agree with him that avoiding namespace pollution isn't a particularly strong argument though (unless you have really long scripts and functions), which is why I've tried to emphasise the intended readability benefits. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Jul 21 13:37:57 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 Jul 2010 21:37:57 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: Message-ID: On Tue, Jul 20, 2010 at 11:52 AM, Jack Diederich wrote: > I think the "trick" to making it readable is putting the assignment first. > > par_pos = decl.find('(') > vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() > > versus: > > vtype = decl[par_pos+1:FindMatching(par_pos, decl)].strip() where > par_pos=decl.find('(') Note that with a "given" clause, I would recommend writing something along these lines: vtype = decl[open_paren_pos+1:close_paren_pos] given: open_paren_pos = decl.find('(') close_paren_pos = FindMatching(open_paren_pos, decl) The positions of the open and closing parentheses are only relevant in the assignment statement and you can understand what the code does based just on the names of the subexpressions without necessarily worrying about how they are determined. The question here is whether this offers *enough* benefit over just writing open_paren_pos = decl.find('(') close_paren_pos = FindMatching(open_paren_pos, decl) vtype = decl[open_paren_pos+1:close_paren_pos] to be worth the significant additional complexity it introduces.
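[The FindMatching helper in the snippets above is assumed rather than shown; a plausible pure-Python version, keeping Jack's (pos, decl) argument order, might be:]

```python
def find_matching(pos, decl):
    """Return the index of the ')' matching the '(' at decl[pos]."""
    depth = 0
    for i in range(pos, len(decl)):
        if decl[i] == '(':
            depth += 1
        elif decl[i] == ')':
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses in %r" % decl)

# Worked example with a made-up C-ish declaration:
decl = "void frobnicate(unsigned int x)"
open_paren_pos = decl.find('(')
close_paren_pos = find_matching(open_paren_pos, decl)
vtype = decl[open_paren_pos + 1:close_paren_pos].strip()
print(vtype)  # -> unsigned int x
```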
Currently I'd say the scales are leaning heavily towards "not worth the hassle", but I'd be interested to see what people can make of the PEP 359 use cases and judicious use of the locals() function in the context of PEP 3150 (assuming the given clause semantics are exactly as described by the implementation sketch in the PEP) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From george.sakkis at gmail.com Wed Jul 21 17:06:53 2010 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 21 Jul 2010 17:06:53 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Wed, Jul 21, 2010 at 1:07 PM, Nick Coghlan wrote: > > However, it's class scopes that really make this complicated, and > making it "just work" is probably easier than trying to explain which > kinds of assignment will and won't work. Is support for class scopes an absolute must-have? While it might be nice to have for consistency's sake, practically speaking module and function scopes should cover all but the most obscure and perverse use cases. I'm on -0.1 at the moment because of the "Two Ways To Do It" objection but I don't think the complication of making it work for class scopes should be grounds for rejection. George From bruce at leapyear.org Wed Jul 21 17:51:26 2010 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 21 Jul 2010 08:51:26 -0700 Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: I'm unconvinced of the value at this point but notwithstanding that let me toss in an alternative syntax: given: suite do: suite This executes the two suites in order with any variable bindings created by the first suite being local to the scope of the two suites. I think this is more readable than the trailing clause and is more flexible (you can put multiple statements in the second suite) and avoids the issue with anyone wanting the where clause added to arbitrary expressions. FWIW, in math it's more common to list givens at the top. --- Bruce (via android) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jackdied at gmail.com Wed Jul 21 18:02:06 2010 From: jackdied at gmail.com (Jack Diederich) Date: Wed, 21 Jul 2010 12:02:06 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Wed, Jul 21, 2010 at 7:20 AM, Nick Coghlan wrote: > On Wed, Jul 21, 2010 at 12:56 PM, Chris Rebert wrote: >> On Tue, Jul 20, 2010 at 3:13 PM, Nick Coghlan wrote: >>> On Wed, Jul 21, 2010 at 6:13 AM, Alex Light wrote: >>>> i would use as because this whole where clause acts very similarly to a >>>> context manager in that it sets a variable to a value for a small block >>> >>> No, the idea is for the indented suite to be a perfectly normal suite >>> of Python code. We want to be able to define functions, classes, etc >>> in there. Inventing a new mini-language specifically for these clauses >>> would be a bad idea (and make them unnecessarily hard to understand) >> >> Did you not read Nick's reply yet when you wrote this, or...? 
> > Alex actually has a reasonable point here: break, continue, yield and
> > return actually don't make sense in the top-level of the given clause
> > (since it is conceptually all one statement).
> >
> > For break and continue, they will naturally give a SyntaxError with
> > the proposed implementation (for "'break' outside loop" or "'continue'
> > not properly in loop", just to be randomly inconsistent).
> >
> > yield and return (at the level of the given clause itself) will need
> > to be disallowed explicitly by the compiler (similar to the "'return'
> > outside function" and "'yield' outside function" errors you get if you
> > attempt to use these keywords in a class or module scope).

I'm -sys.maxint on the PEP for many reasons.

1) I don't want to have to explain this to people. "It's just like
regular python but you can't read it top-to-bottom and you can't
include control flow statements."
2a) No control flow statements in the block means if you need to
augment the code to do a return/break/continue/yield you then have to
refactor so everything in the "given:" block gets moved to the top and
a 1-line change becomes a 10 line diff.
2b) Allowing control flow statements in the block would be even more
confusing.
2c) Is this legal?
    x = b given:
      b = 0
      for item in range(100):
        b += item
        if b > 10:
          break
3) I really don't want to have to explain to people why that is, or
isn't valid.
4) decorators and "with" blocks read top-to-bottom even if they change
the emphasis.  This doesn't.
5) There are no compelling use cases.  The two examples in the PEP are toys.

-Jack

From scialexlight at gmail.com  Wed Jul 21 19:19:20 2010
From: scialexlight at gmail.com (Alex Light)
Date: Wed, 21 Jul 2010 13:19:20 -0400
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To:
References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20100721002803.0d10def2@pitrou.net>
 <20100721004806.48858225@pitrou.net>
Message-ID:

On Wed, Jul 21, 2010 at 12:02 PM, Jack Diederich wrote:
> 2a) No control flow statements in the block means if you need to
> augment the code to do a return/break/continue/yield you then have to
> refactor so everything in the "given:" block gets moved to the top and
> a 1-line change becomes a 10 line diff.
> 2b) Allowing control flow statements in the block would be even more
> confusing.
> 2c) Is this legal?
>     x = b given:
>       b = 0
>       for item in range(100):
>         b += item
>         if b > 10:
>           break

2a) you are missing the point of the given clause. you use it to
assign values to variables if, and only if, the only possible results
of the computation are
1) an exception is raised or
2) a value is returned which is set to the variable and used in the
expression no matter its value.
if there is the slightest chance that what you describe might be
necessary, you would not put it in a "given" but do something like this:
(assumes that a given is applied to the previous statement in its
current block)

if some_bool_func(a):
    ans = some_func1(a, b, c)
else:
    ans = some_func2(a, b, c)
given: # note: given applied to the block starting with the "if" statement
    a = get_a()
    b = get_b()
    c = get_c()

2b) agreed, but there is no reason for them to be disallowed except
readability; they should be discouraged
2c) see it might be okay, depending on what people think of your second
question. in my opinion it should be illegal stylistically and be
required to be changed to

def summation(start, end):
    i = start
    while i < end:
        start += i
        i += 1
    return start

def a_func():
    # do stuff
    x = b given:
        b = summation(0, 100)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From solipsis at pitrou.net Wed Jul 21 21:34:12 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 Jul 2010 21:34:12 +0200 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy Message-ID: <1279740852.3222.38.camel@localhost.localdomain> Hello, I would like to propose the following PEP for feedback and review. Permanent link to up-to-date version with proper HTML formatting: http://www.python.org/dev/peps/pep-3151/ Thank you, Antoine. PEP: 3151 Title: Reworking the OS and IO exception hierarchy Version: $Revision: 83042 $ Last-Modified: $Date: 2010-07-21 21:16:49 +0200 (mer. 21 juil. 2010) $ Author: Antoine Pitrou Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2010-07-21 Python-Version: 3.2 or 3.3 Post-History: Resolution: TBD Abstract ======== The standard exception hierarchy is an important part of the Python language. It has two defining qualities: it is both generic and selective. Generic in that the same exception type can be raised - and handled - regardless of the context (for example, whether you are trying to add something to an integer, to call a string method, or to write an object on a socket, a TypeError will be raised for bad argument types). Selective in that it allows the user to easily handle (silence, examine, process, store or encapsulate...) specific kinds of error conditions while letting other errors bubble up to higher calling contexts. For example, you can choose to catch ZeroDivisionErrors without affecting the default handling of other ArithmeticErrors (such as OverflowErrors). This PEP proposes changes to a part of the exception hierarchy in order to better embody the qualities mentioned above: the errors related to operating system calls (OSError, IOError, select.error, and all their subclasses). 
Rationale ========= Confusing set of OS-related exceptions -------------------------------------- OS-related (or system call-related) exceptions are currently a diversity of classes, arranged in the following subhierarchies:: +-- EnvironmentError +-- IOError +-- io.BlockingIOError +-- io.UnsupportedOperation (also inherits from ValueError) +-- socket.error +-- OSError +-- WindowsError +-- mmap.error +-- select.error While some of these distinctions can be explained by implementation considerations, they are often not very logical at a higher level. The line separating OSError and IOError, for example, is often blurry. Consider the following:: >>> os.remove("fff") Traceback (most recent call last): File "", line 1, in OSError: [Errno 2] No such file or directory: 'fff' >>> open("fff") Traceback (most recent call last): File "", line 1, in IOError: [Errno 2] No such file or directory: 'fff' The same error condition (a non-existing file) gets cast as two different exceptions depending on which library function was called. The reason for this is that the ``os`` module exclusively raises OSError (or its subclass WindowsError) while the ``io`` module mostly raises IOError. However, the user is interested in the nature of the error, not in which part of the interpreter it comes from (since the latter is obvious from reading the traceback message or application source code). In fact, it is hard to think of any situation where OSError should be caught but not IOError, or the reverse. A further proof of the ambiguity of this segmentation is that the standard library itself sometimes has problems deciding. For example, in the ``select`` module, similar failures will raise either ``select.error``, ``OSError`` or ``IOError`` depending on whether you are using select(), a poll object, a kqueue object, or an epoll object. 
This makes user code uselessly complicated since it has to be prepared to catch various exception types, depending on which exact implementation of a single primitive it chooses to use at runtime. As for WindowsError, it seems to be a pointless distinction. First, it only exists on Windows systems, which requires tedious compatibility code in cross-platform applications (such code can be found in ``Lib/shutil.py``). Second, it inherits from OSError and is raised for similar errors as OSError is raised for on other systems. Third, the user wanting access to low-level exception specifics has to examine the ``errno`` or ``winerror`` attribute anyway. Lack of fine-grained exceptions ------------------------------- The current variety of OS-related exceptions doesn't allow the user to filter easily for the desired kinds of failures. As an example, consider the task of deleting a file if it exists. The Look Before You Leap (LBYL) idiom suffers from an obvious race condition:: if os.path.exists(filename): os.remove(filename) If a file named as ``filename`` is created by another thread or process between the calls to ``os.path.exists`` and ``os.remove``, it won't be deleted. This can produce bugs in the application, or even security issues. Therefore, the solution is to try to remove the file, and ignore the error if the file doesn't exist (an idiom known as Easier to Ask Forgiveness than to get Permission, or EAFP). Careful code will read like the following (which works under both POSIX and Windows systems):: try: os.remove(filename) except OSError as e: if e.errno != errno.ENOENT: raise or even:: try: os.remove(filename) except EnvironmentError as e: if e.errno != errno.ENOENT: raise This is a lot more to type, and also forces the user to remember the various cryptic mnemonics from the ``errno`` module. It imposes an additional cognitive burden and gets tiresome rather quickly. 
Consequently, many programmers will instead write the following code,
which silences exceptions too broadly::

    try:
        os.remove(filename)
    except OSError:
        pass

``os.remove`` can raise an OSError not only when the file doesn't exist,
but in other possible situations (for example, the filename points to a
directory, or the current process doesn't have permission to remove the
file), which all indicate bugs in the application logic and therefore
shouldn't be silenced.  What the programmer would like to write instead
is something such as::

    try:
        os.remove(filename)
    except FileNotFound:
        pass


Compatibility strategy
======================

Reworking the exception hierarchy will obviously change the exact
semantics of at least some existing code.  While it is not possible
to improve on the current situation without changing exact semantics,
it is possible to define a narrower type of compatibility, which we will
call **useful compatibility**, and define as follows:

* *useful compatibility* doesn't make exception catching any narrower,
  but it can be broader for *naïve* exception-catching code.  Given the
  following kind of snippet, all exceptions caught before this PEP will
  also be caught after this PEP, but the reverse may be false::

    try:
        os.remove(filename)
    except OSError:
        pass

* *useful compatibility* doesn't alter the behaviour of *careful*
  exception-catching code.  Given the following kind of snippet, the
  same errors should be silenced or reraised, regardless of whether
  this PEP has been implemented or not::

    try:
        os.remove(filename)
    except OSError as e:
        if e.errno != errno.ENOENT:
            raise

The rationale for this compromise is that careless (or "naïve") code
can't really be helped, but at least code which "works" won't suddenly
raise errors and crash.  This is important since such code is likely to
be present in scripts used as cron tasks or automated system
administration programs.

Careful code should not be penalized.
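For completeness, the careful idiom above can already be factored into a reusable helper with today's tools. The ``ignore_errno`` name below is purely illustrative, not an existing API:

```python
import errno
import os
from contextlib import contextmanager

@contextmanager
def ignore_errno(*errnos):
    # Suppress EnvironmentError (OSError/IOError) only for the listed
    # errno values; anything else still propagates.
    try:
        yield
    except EnvironmentError as e:
        if e.errno not in errnos:
            raise

# The EAFP removal, without silencing unrelated failures:
with ignore_errno(errno.ENOENT):
    os.remove("hopefully-nonexistent-file")
```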
Step 1: coalesce exception types ================================ The first step of the resolution is to coalesce existing exception types. The extent of this step is not yet fully determined. A number of possible changes are listed hereafter: * alias both socket.error and select.error to IOError * alias mmap.error to OSError * alias IOError to OSError * alias WindowsError to OSError Each of these changes doesn't preserve exact compatibility, but it does preserve *useful compatibility* (see "compatibility" section above). Not only does this first step present the user a simpler landscape, but it also allows for a better and more complete resolution of step 2 (see "Prerequisite" below). Deprecation of names -------------------- It is not yet decided whether the old names will be deprecated (then removed) or all alternative names will continue living in the root namespace. Deprecation of names from the root namespace presents some implementation challenges, especially where performance is important. Step 2: define additional subclasses ==================================== The second step of the resolution is to extend the hierarchy by defining subclasses which will be raised, rather than their parent, for specific errno values. Which errno values is subject to discussion, but a survey of existing exception matching practices (see Appendix A) helps us propose a reasonable subset of all values. Trying to map all errno mnemonics, indeed, seems foolish, pointless, and would pollute the root namespace. Furthermore, in a couple of cases, different errno values could raise the same exception subclass. For example, EAGAIN, EALREADY, EWOULDBLOCK and EINPROGRESS are all used to signal that an operation on a non-blocking socket would block (and therefore needs trying again later). They could therefore all raise an identical subclass and let the user examine the ``errno`` attribute if (s)he so desires (see below "exception attributes"). 
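On Python 3.3 and later (where this coalescing step was eventually adopted), the aliasing is directly observable on a running interpreter:

```python
import mmap
import select
import socket

# All the old names are now plain aliases of OSError:
print(IOError is OSError)           # True
print(EnvironmentError is OSError)  # True
print(socket.error is OSError)      # True
print(select.error is OSError)      # True
print(mmap.error is OSError)        # True
```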
Prerequisite
------------

Step 1 is a loose prerequisite for this.

Prerequisite, because some errnos can currently be attached to different
exception classes: for example, EBADF can be attached to both OSError
and IOError, depending on the context.  If we don't want to break
*useful compatibility*, we can't make an ``except OSError`` (or IOError)
fail to match an exception where it would succeed today.

Loose, because we could decide for a partial resolution of step 2 if
existing exception classes are not coalesced: for example, EBADF could
raise a hypothetical BadFileDescriptor where an IOError was previously
raised, but continue to raise OSError otherwise.

The dependency on step 1 could be totally removed if the new subclasses
used multiple inheritance to match with all of the existing
superclasses (or, at least, OSError and IOError, which are arguably the
most prevalent ones).  It would, however, make the hierarchy more
complicated and therefore harder to grasp for the user.

New exception classes
---------------------

The following tentative list of subclasses, along with a description and
the list of errnos mapped to them, is submitted to discussion:

* ``FileAlreadyExists``: trying to create a file or directory which
  already exists (EEXIST)

* ``FileNotFound``: for all circumstances where a file or directory is
  requested but doesn't exist (ENOENT)

* ``IsADirectory``: file-level operation (open(), os.remove()...)
  requested on a directory (EISDIR)

* ``NotADirectory``: directory-level operation requested on something
  else (ENOTDIR)

* ``PermissionDenied``: trying to run an operation without the adequate
  access rights - for example filesystem permissions (EACCES,
  optionally EPERM)

* ``BlockingIOError``: an operation would block on an object (e.g.
  socket) set for non-blocking operation (EAGAIN, EALREADY,
  EWOULDBLOCK, EINPROGRESS); this is the existing ``io.BlockingIOError``
  with an extended role

* ``BadFileDescriptor``: operation on an invalid file descriptor
  (EBADF); the default error message could point out that most causes
  are that an existing file descriptor has been closed

* ``ConnectionAborted``: connection attempt aborted by peer
  (ECONNABORTED)

* ``ConnectionRefused``: connection refused by peer (ECONNREFUSED)

* ``ConnectionReset``: connection reset by peer (ECONNRESET)

* ``TimeoutError``: connection timed out (ETIMEDOUT); this could be
  re-cast as a generic timeout exception, useful for other types of
  timeout (for example in Lock.acquire())

This list assumes step 1 is accepted in full; the exception classes
described above would all derive from the now unified exception type
OSError.  It will need reworking if a partial version of step 1 is
accepted instead (again, see appendix A for the current distribution of
errnos and exception types).

Exception attributes
--------------------

In order to preserve *useful compatibility*, these subclasses should
still set adequate values for the various exception attributes defined
on the superclass (for example ``errno``, ``filename``, and optionally
``winerror``).

Implementation
--------------

Since it is proposed that the subclasses are raised based purely on the
value of ``errno``, little or no change should be required in extension
modules (either standard or third-party).  As long as they use the
``PyErr_SetFromErrno()`` family of functions (or the
``PyErr_SetFromWindowsErr()`` family of functions under Windows), they
should automatically benefit from the new, finer-grained exception
classes.
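A pure-Python sketch of that dispatch, using a few of the tentative class names above. The mapping table and the ``error_from_errno`` helper name are illustrative only:

```python
import errno

class FileNotFound(OSError): pass
class FileAlreadyExists(OSError): pass
class PermissionDenied(OSError): pass

# Partial errno -> subclass table; unmapped values fall back to OSError.
_ERRNO_MAP = {
    errno.ENOENT: FileNotFound,
    errno.EEXIST: FileAlreadyExists,
    errno.EACCES: PermissionDenied,
    errno.EPERM: PermissionDenied,
}

def error_from_errno(err, strerror, filename=None):
    # Choose the exception type purely from the errno value, keeping
    # the usual OSError attributes (errno, strerror, filename) intact.
    cls = _ERRNO_MAP.get(err, OSError)
    if filename is None:
        return cls(err, strerror)
    return cls(err, strerror, filename)

e = error_from_errno(errno.ENOENT, "No such file or directory", "fff")
# e is a FileNotFound, and an "except OSError" clause would still catch it
```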
Library modules written in Python, though, will have to be adapted where
they currently use the following idiom (seen in ``Lib/tempfile.py``)::

    raise IOError(_errno.EEXIST, "No usable temporary file name found")

Fortunately, such Python code is quite rare since raising OSError or
IOError with an errno value normally happens when interfacing with
system calls, which is usually done in C extensions.

If there is popular demand, the subroutine choosing an exception type
based on the errno value could be exposed for use in pure Python.


Possible objections
===================

Namespace pollution
-------------------

Making the exception hierarchy finer-grained makes the root (or builtins)
namespace larger.  This is to be moderated, however, as:

* only a handful of additional classes are proposed;

* while standard exception types live in the root namespace, they are
  visually distinguished by the fact that they use the CamelCase
  convention, while almost all other builtins use lowercase naming
  (except True, False, None, Ellipsis and NotImplemented)

An alternative would be to provide a separate module containing the
finer-grained exceptions, but that would defeat the purpose of
encouraging careful code over careless code, since the user would first
have to import the new module instead of using names already accessible.


Earlier discussion
==================

While this is the first time such a formal proposal is made, the idea
has received informal support in the past [1]_; both the introduction
of finer-grained exception classes and the coalescing of OSError and
IOError.

The removal of WindowsError alone has been discussed and rejected
as part of another PEP [2]_, but there seemed to be a consensus that
the distinction with OSError wasn't meaningful.  This supports at least
its aliasing with OSError.


Moratorium
==========

The moratorium in effect on language builtins means this PEP has little
chance to be accepted for Python 3.2.
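The *useful compatibility* property defined earlier can be demonstrated in miniature with a stand-in subclass (all names here are illustrative, not a real hierarchy):

```python
# A stand-in for one of the proposed subclasses:
class FileNotFound(OSError):
    pass

def fake_remove(path):
    # Simulates os.remove() on a missing file under the new hierarchy.
    raise FileNotFound(2, "No such file or directory")

# Naive pre-PEP code keeps working, since the subclass IS-A OSError...
try:
    fake_remove("fff")
except OSError:
    naive_still_works = True

# ...while new code can match just the specific condition:
try:
    fake_remove("fff")
except FileNotFound:
    precise_match = True
```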
Possible alternative ==================== Pattern matching ---------------- Another possibility would be to introduce an advanced pattern matching syntax when catching exceptions. For example:: try: os.remove(filename) except OSError as e if e.errno == errno.ENOENT: pass Several problems with this proposal: * it introduces new syntax, which is perceived by the author to be a heavier change compared to reworking the exception hierarchy * it doesn't decrease typing effort significantly * it doesn't relieve the programmer from the burden of having to remember errno mnemonics Exceptions ignored by this PEP ============================== This PEP ignores ``EOFError``, which signals a truncated input stream in various protocol and file format implementations (for example ``GzipFile``). ``EOFError`` is not OS- or IO-related, it is a logical error raised at a higher level. This PEP also ignores ``SSLError``, which is raised by the ``ssl`` module in order to propagate errors signalled by the ``OpenSSL`` library. Ideally, ``SSLError`` would benefit from a similar but separate treatment since it defines its own constants for error types (``ssl.SSL_ERROR_WANT_READ``, etc.). Appendix A: Survey of common errnos =================================== This is a quick recension of the various errno mnemonics checked for in the standard library and its tests, as part of ``except`` clauses. Common errnos with OSError -------------------------- * ``EBADF``: bad file descriptor (usually means the file descriptor was closed) * ``EEXIST``: file or directory exists * ``EINTR``: interrupted function call * ``EISDIR``: is a directory * ``ENOTDIR``: not a directory * ``ENOENT``: no such file or directory * ``EOPNOTSUPP``: operation not supported on socket (possible confusion with the existing io.UnsupportedOperation) * ``EPERM``: operation not permitted (when using e.g. 
os.setuid()) Common errnos with IOError -------------------------- * ``EACCES``: permission denied (for filesystem operations) * ``EBADF``: bad file descriptor (with select.epoll); read operation on a write-only GzipFile, or vice-versa * ``EBUSY``: device or resource busy * ``EISDIR``: is a directory (when trying to open()) * ``ENODEV``: no such device * ``ENOENT``: no such file or directory (when trying to open()) * ``ETIMEDOUT``: connection timed out Common errnos with socket.error ------------------------------- All these errors may also be associated with a plain IOError, for example when calling read() on a socket's file descriptor. * ``EAGAIN``: resource temporarily unavailable (during a non-blocking socket call except connect()) * ``EALREADY``: connection already in progress (during a non-blocking connect()) * ``EINPROGRESS``: operation in progress (during a non-blocking connect()) * ``EINTR``: interrupted function call * ``EISCONN``: the socket is connected * ``ECONNABORTED``: connection aborted by peer (during an accept() call) * ``ECONNREFUSED``: connection refused by peer * ``ECONNRESET``: connection reset by peer * ``ENOTCONN``: socket not connected * ``ESHUTDOWN``: cannot send after transport endpoint shutdown * ``EWOULDBLOCK``: same reasons as ``EAGAIN`` Common errnos with select.error ------------------------------- * ``EINTR``: interrupted function call Appendix B: Survey of raised OS and IO errors ============================================= Interpreter core ---------------- Handling of PYTHONSTARTUP raises IOError (but the error gets discarded):: $ PYTHONSTARTUP=foox ./python Python 3.2a0 (py3k:82920M, Jul 16 2010, 22:53:23) [GCC 4.4.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
Could not open PYTHONSTARTUP IOError: [Errno 2] No such file or directory: 'foox' ``PyObject_Print()`` raises IOError when ferror() signals an error on the `FILE *` parameter (which, in the source tree, is always either stdout or stderr). Unicode encoding and decoding using the ``mbcs`` encoding can raise WindowsError for some error conditions. Standard library ---------------- bz2 ''' Raises IOError throughout (OSError is unused):: >>> bz2.BZ2File("foox", "rb") Traceback (most recent call last): File "", line 1, in IOError: [Errno 2] No such file or directory >>> bz2.BZ2File("LICENSE", "rb").read() Traceback (most recent call last): File "", line 1, in IOError: invalid data stream >>> bz2.BZ2File("/tmp/zzz.bz2", "wb").read() Traceback (most recent call last): File "", line 1, in IOError: file is not ready for reading curses '''''' Not examined. dbm.gnu, dbm.ndbm ''''''''''''''''' _dbm.error and _gdbm.error inherit from IOError:: >>> dbm.gnu.open("foox") Traceback (most recent call last): File "", line 1, in _gdbm.error: [Errno 2] No such file or directory fcntl ''''' Raises IOError throughout (OSError is unused). 
imp module '''''''''' Raises IOError for bad file descriptors:: >>> imp.load_source("foo", "foo", 123) Traceback (most recent call last): File "", line 1, in IOError: [Errno 9] Bad file descriptor io module ''''''''' Raises IOError when trying to open a directory under Unix:: >>> open("Python/", "r") Traceback (most recent call last): File "", line 1, in IOError: [Errno 21] Is a directory: 'Python/' Raises IOError or io.UnsupportedOperation (which inherits from the former) for unsupported operations:: >>> open("LICENSE").write("bar") Traceback (most recent call last): File "", line 1, in IOError: not writable >>> io.StringIO().fileno() Traceback (most recent call last): File "", line 1, in io.UnsupportedOperation: fileno >>> open("LICENSE").seek(1, 1) Traceback (most recent call last): File "", line 1, in IOError: can't do nonzero cur-relative seeks Raises either IOError or TypeError when the inferior I/O layer misbehaves (i.e. violates the API it is expected to implement). Raises IOError when the underlying OS resource becomes invalid:: >>> f = open("LICENSE") >>> os.close(f.fileno()) >>> f.read() Traceback (most recent call last): File "", line 1, in IOError: [Errno 9] Bad file descriptor ...or for implementation-specific optimizations:: >>> f = open("LICENSE") >>> next(f) 'A. HISTORY OF THE SOFTWARE\n' >>> f.tell() Traceback (most recent call last): File "", line 1, in IOError: telling position disabled by next() call Raises BlockingIOError (inheriting from IOError) when a call on a non-blocking object would block. 
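On a modern interpreter (3.3 and later), the ``io`` exception relationships described above can be checked directly:

```python
import io

# BlockingIOError and UnsupportedOperation both sit under IOError,
# and UnsupportedOperation additionally inherits from ValueError:
print(issubclass(io.BlockingIOError, IOError))          # True
print(issubclass(io.UnsupportedOperation, IOError))     # True
print(issubclass(io.UnsupportedOperation, ValueError))  # True
```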
mmap
''''

Under Unix, raises its own ``mmap.error`` (inheriting from
EnvironmentError) throughout::

    >>> mmap.mmap(123, 10)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    mmap.error: [Errno 9] Bad file descriptor
    >>> mmap.mmap(os.open("/tmp", os.O_RDONLY), 10)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    mmap.error: [Errno 13] Permission denied

Under Windows, however, it mostly raises WindowsError (the source code
also shows a few occurrences of ``mmap.error``)::

    >>> fd = os.open("LICENSE", os.O_RDONLY)
    >>> m = mmap.mmap(fd, 16384)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 5] Accès refusé
    >>> sys.last_value.errno
    13
    >>> errno.errorcode[13]
    'EACCES'

    >>> m = mmap.mmap(-1, 4096)
    >>> m.resize(16384)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    WindowsError: [Error 87] Paramètre incorrect
    >>> sys.last_value.errno
    22
    >>> errno.errorcode[22]
    'EINVAL'

multiprocessing
'''''''''''''''

Not examined.

os / posix
''''''''''

The ``os`` (or ``posix``) module raises OSError throughout, except under
Windows where WindowsError can be raised instead.
ossaudiodev ''''''''''' Raises IOError throughout (OSError is unused):: >>> ossaudiodev.open("foo", "r") Traceback (most recent call last): File "", line 1, in IOError: [Errno 2] No such file or directory: 'foo' readline '''''''' Raises IOError in various file-handling functions:: >>> readline.read_history_file("foo") Traceback (most recent call last): File "", line 1, in IOError: [Errno 2] No such file or directory >>> readline.read_init_file("foo") Traceback (most recent call last): File "", line 1, in IOError: [Errno 2] No such file or directory >>> readline.write_history_file("/dev/nonexistent") Traceback (most recent call last): File "", line 1, in IOError: [Errno 13] Permission denied select '''''' * select() and poll objects raise ``select.error``, which doesn't inherit from anything (but poll.modify() raises IOError); * epoll objects raise IOError; * kqueue objects raise both OSError and IOError. signal '''''' ``signal.ItimerError`` inherits from IOError. socket '''''' ``socket.error`` inherits from IOError. sys ''' ``sys.getwindowsversion()`` raises WindowsError with a bogus error number if the ``GetVersionEx()`` call fails. time '''' Raises IOError for internal errors in time.time() and time.sleep(). zipimport ''''''''' zipimporter.get_data() can raise IOError. References ========== .. [1] "IO module precisions and exception hierarchy" http://mail.python.org/pipermail/python-dev/2009-September/092130.html .. [2] Discussion of "Removing WindowsError" in PEP 348 http://www.python.org/dev/peps/pep-0348/#removing-windowserror Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From nathan at cmu.edu Wed Jul 21 21:55:45 2010 From: nathan at cmu.edu (Nathan Schneider) Date: Wed, 21 Jul 2010 15:55:45 -0400 Subject: [Python-ideas] 'where' statement in Python? 
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: > if some_bool_func(a): > ans = some_func1(a, b, c) > else: > ans = some_func2(a, b, c) > given: # note: given applied to the block starting with the "if" statement > a = get_a() > b = get_b() > c = get_c() It seems to me that the postfix construct would confuse the reader in a scenario such as the above, rather than reducing complexity. (Worse, suppose 'a', 'b', and 'c' had been assigned prior to the 'if' statement!) Moreover, our intuitions seem to be fuzzy when it comes to the desirability/behavior of control flow statements in the 'given' block. So I'm -1 on this: I think there is a marked increase in the opportunity for confusion among authors and readers, without a clear set of common patterns that would be improved (even if readability is slightly better on occasion). I think a better alternative to allow for safe localization of variables to a block would be to adapt the 'with' statement to behave as a 'let' (similar to suggestions earlier in the thread). For instance: with fname = sys.argv[1], open(fname) as f, contents = f.read(): do_stuff1(fname, contents) do_stuff2(contents) do_stuff3(fname) # error: out of scope This makes it clear to the reader that the assignments to 'fname' and 'contents', like 'f', only pertain to the contents of the 'with' block. It allows the reader to focus their eye on the 'important' part?the part inside the block?even though it doesn't come first. It helps avoid bugs that might arise if 'fname' were used later on. And it leaves no question as to where control flow statements are permitted/desirable. I'm +0.5 on this alternative: my hesitation is because we'd need to explain to newcomers why 'f = open(fname)' would be legal but bad, owing to the subtleties of context managers. 
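A rough approximation of this "let"-style scoping is possible today with a helper context manager, though it cannot actually delete caller locals; the ``let`` helper below is purely illustrative:

```python
from contextlib import contextmanager
from types import SimpleNamespace

@contextmanager
def let(**bindings):
    # Bundle the temporary names into one namespace object instead of
    # injecting real locals; nothing new leaks into the caller's scope.
    yield SimpleNamespace(**bindings)

with let(fname="setup.py", greeting="hello") as v:
    message = "%s from %s" % (v.greeting, v.fname)

print(message)  # -> hello from setup.py
```

This keeps the reader's eye on the block while signalling that ``v.fname`` and ``v.greeting`` are scoped to it by convention, even though Python itself enforces nothing.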
(I worry Alex's proposal for the interpreter to create context managers on the fly would only add confusion.)

Nathan

On Wed, Jul 21, 2010 at 1:19 PM, Alex Light wrote:
>
> On Wed, Jul 21, 2010 at 12:02 PM, Jack Diederich wrote:
>>
>> 2a) No control flow statements in the block means if you need to
>> augment the code to do a return/break/continue/yield you then have to
>> refactor so everything in the "given:" block gets moved to the top and
>> a 1-line change becomes a 10 line diff.
>> 2b) Allowing control flow statements in the block would be even more
>> confusing.
>> 2c) Is this legal?
>>     x = b given:
>>         b = 0
>>         for item in range(100):
>>             b += item
>>             if b > 10:
>>                 break
>
> 2a) you are missing the point of the given clause. you use it to assign
> values to variables if, and only if, the only possible results of the
> computation are
> 1) an exception is raised or
> 2) a value is returned which is set to the variable and used in the
> expression no matter its value.
> if there is the slightest chance that what you describe might
> be necessary you would not put it in a "given" but do something like this:
> (assumes that a given is applied to the previous statement in its current
> block)
> if some_bool_func(a):
>     ans = some_func1(a, b, c)
> else:
>     ans = some_func2(a, b, c)
> given: # note: given applied to the block starting with the "if" statement
>     a = get_a()
>     b = get_b()
>     c = get_c()
> 2b) agreed, but there is no reason for them to be disallowed except
> readability, though they should be discouraged
> 2c) see it might be okay, depending on what people think of your second
> question. in my opinion it should be illegal stylistically and
> be required to be changed to
> def summation(start, end):
>     i = start
>     while i < end:
>         start += i
>         i += 1
>     return start
> def a_func():
>     # do stuff
>     x = b given:
>         b = summation(0, 100)
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From ncoghlan at gmail.com Wed Jul 21 23:20:36 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 22 Jul 2010 07:20:36 +1000
Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy
In-Reply-To: <1279740852.3222.38.camel@localhost.localdomain>
References: <1279740852.3222.38.camel@localhost.localdomain>
Message-ID:

I read this while you were working on it in the sandbox - +1 in principle, but the devil is obviously going to be in the details.

On Thu, Jul 22, 2010 at 5:34 AM, Antoine Pitrou wrote:
> Step 1: coalesce exception types
> ================================
>
> The first step of the resolution is to coalesce existing exception types.
> The extent of this step is not yet fully determined. A number of possible
> changes are listed hereafter:
>
> * alias both socket.error and select.error to IOError
> * alias mmap.error to OSError
> * alias IOError to OSError
> * alias WindowsError to OSError
>
> Each of these changes doesn't preserve exact compatibility, but it does
> preserve *useful compatibility* (see "compatibility" section above).
>
> Not only does this first step present the user a simpler landscape, but
> it also allows for a better and more complete resolution of step 2
> (see "Prerequisite" below).

Another idea along these lines would be to coalesce the builtin exceptions at the EnvironmentError level. That is, the top of the revised hierarchy would look like:

+-- IOError
     +-- io.BlockingIOError
     +-- io.UnsupportedOperation (also inherits from ValueError)

IOError would be aliased as EnvironmentError, OSError, WindowsError, socket.error, mmap.error and select.error

Coalescing WindowsError like that would mean the "winerr" attribute would be present on all platforms, just set to "None" if the platform isn't Windows.
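Concretely, "aliasing" here just means binding the old names to one class object, so existing except clauses keep matching; a toy sketch of the idea (the names are made up, not the PEP's actual code):

```python
# One unified type; the old names become aliases of it, and the
# Windows-specific attribute is always present, defaulting to None.
class UnifiedIOError(Exception):
    def __init__(self, errno=None, strerror=None, filename=None, winerr=None):
        super().__init__(errno, strerror)
        self.errno = errno
        self.strerror = strerror
        self.filename = filename
        self.winerr = winerr  # None on non-Windows platforms

# Hypothetical aliases in the spirit of the proposal:
UnifiedOSError = UnifiedIOError
UnifiedEnvironmentError = UnifiedIOError

try:
    raise UnifiedOSError(2, "No such file or directory", "spam.txt")
except UnifiedIOError as e:  # a handler written against the other name
    caught = e

assert caught.errno == 2 and caught.winerr is None
assert UnifiedOSError is UnifiedIOError
```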
(errno, filename and strerror can all already be None, as will often be the case when IOError is raised directly by Python code). select.error (now just an alias for IOError) would also grow the common IOError attributes. I'm suggesting IOError as the name based on your survey of what standard libraries currently raise (i.e. the vast majority of them use IOError rather than one of the other names). EnvironmentError would probably be more accurate, but IOError is more common and easier to type (and from the interpreter's point of view, any manipulation of the underlying OS can be viewed as a form of I/O, even if it involves accessing the process table or the environment variables or the registry rather than the filesystem or network). Also, there should be a helper function (probably in the os module) that given an errno value will create the appropriate IOError subclass. Regards, Nick. P.S. I want to let the idea kick around in my brain for a while before offering suggestions for possible useful IOError subclasses. Note that we don't need to create subclasses for *everything* - errors without a specific subclass can fall back to the basic IOError. Still, I expect many bikesheds will be painted a wide variety of colours before this discussion is done ;) -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Wed Jul 21 23:58:15 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 Jul 2010 07:58:15 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Thu, Jul 22, 2010 at 1:51 AM, Bruce Leban wrote: > I'm unconvinced of the value at this point but notwithstanding that let me > toss in an alternative syntax: > > ??? given: > ??????? suite > ??? do: > ??????? 
suite > > This executes the two suites in order with any variable bindings created by > the first suite being local to the scope of the two suites. I think this is > more readable than the trailing clause and is more flexible (you can put > multiple statements in the second suite) and avoids the issue with anyone > wanting the where clause added to arbitrary expressions. > > FWIW, in math it's more common to list givens at the top. However, writing it that way has even less to offer over ordinary local variables than the postfix given clause. I updated the draft PEP again, pointing out that if a decision had to be made today, the PEP would almost certainly be rejected due to a lack of compelling use cases. The bar for adding a new syntactic construct is pretty high and PEP 3150 currently isn't even close to reaching it (see PEP 343 for the kind of use cases that got the with statement over that bar). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Jul 22 00:05:53 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 Jul 2010 08:05:53 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Thu, Jul 22, 2010 at 5:55 AM, Nathan Schneider wrote: > I think a better alternative to allow for safe localization of > variables to a block would be to adapt the 'with' statement to behave > as a 'let' (similar to suggestions earlier in the thread). For > instance: > > with fname = sys.argv[1], open(fname) as f, contents = f.read(): > ? ?do_stuff1(fname, contents) > ? ?do_stuff2(contents) > do_stuff3(fname) ?# error: out of scope > > This makes it clear to the reader that the assignments to 'fname' and > 'contents', like 'f', only pertain to the contents of the 'with' > block. 
> It allows the reader to focus their eye on the 'important'
> part, the part inside the block, even though it doesn't come first. It
> helps avoid bugs that might arise if 'fname' were used later on. And
> it leaves no question as to where control flow statements are
> permitted/desirable.
>
> I'm +0.5 on this alternative: my hesitation is because we'd need to
> explain to newcomers why 'f = open(fname)' would be legal but bad,
> owing to the subtleties of context managers.

Hmm, an intriguing idea. I agree that the subtleties of "=" vs "as" could lead to problems though. There's also the fact that existing semantics mean that 'f' has to remain bound after the block, so it would be surprising if 'fname' and 'contents' were unavailable.

It also suffers the same issue as any of the other in-order proposals: without out-of-order execution, the only gain is a reduction in the chance for namespace collisions, and that's usually only a problem for accidental collisions with loop variable names in long functions or scripts.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From bruce at leapyear.org Thu Jul 22 00:36:14 2010
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 21 Jul 2010 15:36:14 -0700
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net>
Message-ID:

On Wed, Jul 21, 2010 at 2:58 PM, Nick Coghlan wrote:
> On Thu, Jul 22, 2010 at 1:51 AM, Bruce Leban wrote:
> > let me toss in an alternative syntax:
> > given:
> >     suite
> > do:
> >     suite
>
> However, writing it that way has even less to offer over ordinary
> local variables than the postfix given clause.

Perhaps. I do think it looks more like Python. The PEP says it's "some form of statement local namespace". If that's the advantage, this alternative offers it.
If the advantage is that it introduces a different execution order, then the PEP should make the case for that. --- Bruce http://www.vroospeak.com http://google-gruyere.appspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From dag.odenhall at gmail.com Thu Jul 22 00:37:05 2010 From: dag.odenhall at gmail.com (Dag Odenhall) Date: Thu, 22 Jul 2010 00:37:05 +0200 Subject: [Python-ideas] Infix application of binary functions Message-ID: <1279751825.4507.16.camel@gumri> It could help readability if binary (arity of 2) functions could be applied infix with some syntax. For example, borrowing from Haskell, the backtick could be reintroduced for this purpose. Good examples for this are isinstance and hasattr: if some_object `isinstance` Iterable: ... elif some_object `hasattr` '__iter__': ... It is already possible[1] to make infix functions, but the solution is a hack and requires functions to be marked as infix. (The use of backticks is just an example borrowing from Haskell and might not be optimal, although a benefit is that it isn't very noisy.) [1] http://code.activestate.com/recipes/384122-infix-operators/ From pyideas at rebertia.com Thu Jul 22 01:17:06 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 21 Jul 2010 16:17:06 -0700 Subject: [Python-ideas] Infix application of binary functions In-Reply-To: <1279751825.4507.16.camel@gumri> References: <1279751825.4507.16.camel@gumri> Message-ID: On Wed, Jul 21, 2010 at 3:37 PM, Dag Odenhall wrote: > It could help readability if binary (arity of 2) functions could be > applied infix with some syntax. For example, borrowing from Haskell, the > backtick could be reintroduced for this purpose. > > Good examples for this are isinstance and hasattr: > > ? ?if some_object `isinstance` Iterable: > ? ? ? ?... > ? 
>     elif some_object `hasattr` '__iter__':

Already proposed (by me) and rejected by the BDFL:

http://mail.python.org/pipermail/python-ideas/2007-January/000054.html

Cheers,
Chris
--
http://blog.rebertia.com

From grosser.meister.morti at gmx.net Thu Jul 22 01:33:38 2010
From: grosser.meister.morti at gmx.net (=?UTF-8?B?TWF0aGlhcyBQYW56ZW5iw7Zjaw==?=)
Date: Thu, 22 Jul 2010 01:33:38 +0200
Subject: [Python-ideas] Infix application of binary functions
In-Reply-To: References: <1279751825.4507.16.camel@gumri>
Message-ID: <4C4783D2.5060304@gmx.net>

Then what about:
    obj $isinstance Iterable
or
    obj $isinstance$ Iterable
or
    obj *isinstance Iterable
or
    obj isinstance? Iterable

These don't use the backtick character (which on some setups even is a unicode char not from 7bit ascii).

-panzi

On 07/22/2010 01:17 AM, Chris Rebert wrote:
> On Wed, Jul 21, 2010 at 3:37 PM, Dag Odenhall wrote:
>> It could help readability if binary (arity of 2) functions could be
>> applied infix with some syntax. For example, borrowing from Haskell, the
>> backtick could be reintroduced for this purpose.
>>
>> Good examples for this are isinstance and hasattr:
>>
>>     if some_object `isinstance` Iterable:
>>         ...
>>     elif some_object `hasattr` '__iter__':
>
> Already proposed (by me) and rejected by the BDFL:
>
> http://mail.python.org/pipermail/python-ideas/2007-January/000054.html
>
> Cheers,
> Chris

From dag.odenhall at gmail.com Thu Jul 22 01:34:51 2010
From: dag.odenhall at gmail.com (Dag Odenhall)
Date: Thu, 22 Jul 2010 01:34:51 +0200
Subject: [Python-ideas] Infix application of binary functions
In-Reply-To: References: <1279751825.4507.16.camel@gumri>
Message-ID: <1279755291.4507.17.camel@gumri>

> Already proposed (by me) and rejected by the BDFL:
>
> http://mail.python.org/pipermail/python-ideas/2007-January/000054.html

Ah, thanks. He only rejects the backtick however, not infix application itself.
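For reference, the hack in question boils down to operator overloading; a minimal sketch in the spirit of the linked recipe, using | so that x |op| y parses as (x | op) | y:

```python
class Infix:
    def __init__(self, func):
        self.func = func
    def __ror__(self, left):
        # x | op: capture the left operand in a fresh partial wrapper
        return Infix(lambda right: self.func(left, right))
    def __or__(self, right):
        # partial | y: make the actual two-argument call
        return self.func(right)

# Marking the functions as infix, as the recipe requires:
is_a = Infix(isinstance)
has = Infix(hasattr)

assert 3 |is_a| int
assert [] |has| '__iter__'
```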
From pyideas at rebertia.com Thu Jul 22 01:41:11 2010
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 21 Jul 2010 16:41:11 -0700
Subject: [Python-ideas] Infix application of binary functions
In-Reply-To: <4C4783D2.5060304@gmx.net>
References: <1279751825.4507.16.camel@gumri> <4C4783D2.5060304@gmx.net>
Message-ID:

> On 07/22/2010 01:17 AM, Chris Rebert wrote:
>> On Wed, Jul 21, 2010 at 3:37 PM, Dag Odenhall
>> wrote:
>>> It could help readability if binary (arity of 2) functions could be
>>> applied infix with some syntax. For example, borrowing from Haskell, the
>>> backtick could be reintroduced for this purpose.
>>>
>>> Good examples for this are isinstance and hasattr:
>>>
>>>     if some_object `isinstance` Iterable:
>>>         ...
>>>     elif some_object `hasattr` '__iter__':
>>
>> Already proposed (by me) and rejected by the BDFL:
>>
>> http://mail.python.org/pipermail/python-ideas/2007-January/000054.html

On Wed, Jul 21, 2010 at 4:33 PM, Mathias Panzenböck wrote:
> Then what about:
>         obj *isinstance Iterable

How would the parser distinguish that from multiplication?

> or
>         obj isinstance? Iterable

That would look odd for non-interrogative binary functions:

z = x cartesianProduct? y

> These don't use the backtick character (which on some setups even is a
> unicode char not from 7bit ascii).

Not using backtick definitely makes the proposal more viable. (Personally I <3 backtick though.)
Cheers, Chris From dag.odenhall at gmail.com Thu Jul 22 01:43:08 2010 From: dag.odenhall at gmail.com (Dag Odenhall) Date: Thu, 22 Jul 2010 01:43:08 +0200 Subject: [Python-ideas] Infix application of binary functions In-Reply-To: <4C4783D2.5060304@gmx.net> References: <1279751825.4507.16.camel@gumri> <4C4783D2.5060304@gmx.net> Message-ID: <1279755788.4507.25.camel@gumri> tor 2010-07-22 klockan 01:33 +0200 skrev Mathias Panzenb?ck: > Then what about: > obj $isinstance Iterable > or > obj $isinstance$ Iterable > or > obj *isinstance Iterable > or > obj isinstance? Iterable > > These don't use the backtick charackter (wich on some setups even is a unicode char not from 7bit > ascii). I like the question mark, although it is only useful for predicates. I haven't considered if infix is useful for anything other than predicates, though. Another possibility is a keyword, maybe "of": obj isinstance of Iterable or obj hasattr of '__iter__' But better then would be a keyword that already exists and makes sense for this use. A character such as the question mark is probably best, just noting the possibility of a keyword for completeness sake. An example of a non-predicate infix might be str.format: 'Hello {}' str.format? 'World' Here, the question mark makes less sense. From solipsis at pitrou.net Thu Jul 22 01:52:15 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 22 Jul 2010 01:52:15 +0200 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy References: <1279740852.3222.38.camel@localhost.localdomain> Message-ID: <20100722015215.75af6544@pitrou.net> On Thu, 22 Jul 2010 07:20:36 +1000 Nick Coghlan wrote: > > Another idea along these lines would be to coalesce the builtin > exceptions at the EnvironmentError level. Agreed, I will add it to the PEP and process the rest of your input. 
> Still, I expect > many bikesheds will be painted a wide variety of colours before this > discussion is done ;) Yes, this PEP offers a lot of opportunities for discussion :) Thanks, Antoine. From tjreedy at udel.edu Thu Jul 22 03:21:33 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 21 Jul 2010 21:21:33 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 7/21/2010 7:24 AM, Nick Coghlan wrote: > On Wed, Jul 21, 2010 at 9:16 PM, Stefan Behnel wrote: >> Terry Reedy, 20.07.2010 21:49: >>> >>> I did not comment then because I thought the idea of cluttering python >>> with augmented local namespace blocks, with no functional gain, was >>> rejected and dead, and hence unnecessary of comment. >>> -10 >>> For me, the idea would come close to destroying (what remains of) the >>> simplicity that makes Python relatively easy to learn. It seems to be >>> associated with the (to me, cracked) idea that names are pollution. >> >> Actually, it's about *giving* names to subexpressions, that's quite the >> opposite. > > I think Terry's point was that you can already give names to > subexpressions by assigning them to variables in the current scope, > but some people object to that approach due to "namespace pollution". Right. > I agree with him that avoiding namespace pollution isn't a particular > strong argument though (unless you have really long scripts and Okay, we can leave that issue aside. > functions), which is why I've tried to emphasize the intended > readability benefits. whereas I am trying to emphasize the reading horror for people whose brains are wired differently from yours. The backwards conditional expressions are nearly impossible for me to read, which is to say, painful. 
To some, something like

e = fe(a,b,c, p1) where:
  c = fc(a, d, p2 where:
    d = fd(a, p1) where:
      a = fa(p1, p2)
  b = fb(a,p2)

where p1,p2 are input parameters, looks about as bad (and it was a real effort to write). I would rather something like that were in a branch dialect, Ypthon with its own extension (.yp).

Algorithm book authors usually want their books read by lots of people. When they invent a pseudocode language, they usually invent something lots of people can read. (Knuth's MIX was something of an exception.) It is often so close to (a subset of) Python that it is ridiculous that they do not just use (a subset of) Python so it is not 'pseudo'. I cannot remember seeing anything like the above. I believe the reason is because it would be, on average, less readable and harder to understand.

--
Terry Jan Reedy

From tjreedy at udel.edu Thu Jul 22 04:04:02 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 21 Jul 2010 22:04:02 -0400
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net>
Message-ID:

On 7/20/2010 7:08 PM, Guido van Rossum wrote:
> I see a similar possibility as for decorators, actually. A decorator
> is very simple syntactic sugar too, but it allows one to emphasize the
> decoration by putting it up front rather than hiding it after the
> (possibly very long) function.

Because you chose to put multiple decorators in 'nested' order, I can consistently read decorated definitions as multi-statement 'expressions' like so:

@f1(arg) #(
@f2 # (
def f(): pass # ))

with the function name propagated out. Reading expressions inside out is nothing new. (The opposite decorator order would have made this 'trick' impossible and decorators harder for me to read.) I do not particularly see this syntax as 'emphasizing' the decoration any more than f(g(2*x+y**z)) 'emphasizes' f or g over the operator expression.
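The nested reading matches what the decorator sugar expands to, which is easy to check:

```python
def shout(func):
    # Decorator: uppercase the wrapped function's result.
    def wrapper():
        return func().upper()
    return wrapper

def exclaim(func):
    # Decorator: append "!" to the wrapped function's result.
    def wrapper():
        return func() + "!"
    return wrapper

@shout
@exclaim
def greet():
    return "hello"

# The sugar-free spelling applies the innermost decorator first:
def greet2():
    return "hello"
greet2 = shout(exclaim(greet2))

assert greet() == greet2() == "HELLO!"
```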
[changing the order] > This is indeed a bit of a downside; if you see > > blah blah blah: > x = blah > y = blah > > you will have to look more carefully at the end of the first blah blah > blah line to know whether the indented block is executed first or > last. For all other intended blocks, the *beginning* of the indented > block is your clue (class, def, if, try, etc.). Yes! This is a great feature of Python. For decorators, the *initial* '@' is the clue to shift reading mode. If givens were accepted, I would strongly prefer there to be a similar initial clue. On function definition order: I generally prefer to write and read definitions before they are used. (Mathematicians usually present and prove lemmas before a main theorem.) The major exception is the __init__ method of a class, which really is special as it defines instance attributes. I can imagine a module with one main and several helper functions starting with the main function, but there should first be a doc string listing everything with at least a short phrase of explanation. -- Terry Jan Reedy From tjreedy at udel.edu Thu Jul 22 04:12:49 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 21 Jul 2010 22:12:49 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On 7/21/2010 7:20 AM, Nick Coghlan wrote: > yield and return (at the level of the given clause itself) will need > to be disallowed explicitly by the compiler Why introduce an inconsistency? 
If

a = e1
b = f(a)

can be flipped to

b = f(a) given:
    a = e1

I would expect

a = e1
return f(a)

to be flippable to

return f(a) given:
    a = e1

--
Terry Jan Reedy

From dag.odenhall at gmail.com Thu Jul 22 04:26:48 2010
From: dag.odenhall at gmail.com (Dag Odenhall)
Date: Thu, 22 Jul 2010 04:26:48 +0200
Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy
In-Reply-To: <1279740852.3222.38.camel@localhost.localdomain>
References: <1279740852.3222.38.camel@localhost.localdomain>
Message-ID: <1279765609.4507.51.camel@gumri>

+1 on the general idea; it always seemed awkward to me that these operations all raise the same exception. I didn't even know about the errno comparison method, though I've never looked for it. Point is that it is cryptic and as such not very pythonic.

From pyideas at rebertia.com Thu Jul 22 05:30:17 2010
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 21 Jul 2010 20:30:17 -0700
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net>
Message-ID:

On Wed, Jul 21, 2010 at 7:12 PM, Terry Reedy wrote:
> On 7/21/2010 7:20 AM, Nick Coghlan wrote:
>
>> yield and return (at the level of the given clause itself) will need
>> to be disallowed explicitly by the compiler
>
> Why introduce an inconsistency?
>
> If
>
> a = e1
> b = f(a)
>
> can be flipped to
>
> b = f(a) given:
>     a = e1
>
> I would expect
>
> a = e1
> return f(a)
>
> to be flippable to
>
> return f(a) given:
>     a = e1

I believe Nick meant returns/yields *within* the `given` suite (he just phrased it awkwardly), e.g.

a = b given:
    b = 42
    return c  # WTF

The PEP's Syntax Change section explicitly changes the grammar to allow the sort of `given`s you're talking about.

Cheers,
Chris

From cmjohnson.mailinglist at gmail.com Thu Jul 22 05:34:12 2010
From: cmjohnson.mailinglist at gmail.com (Carl M.
Johnson)
Date: Wed, 21 Jul 2010 17:34:12 -1000
Subject: [Python-ideas] Infix application of binary functions
In-Reply-To: <1279755788.4507.25.camel@gumri>
References: <1279751825.4507.16.camel@gumri> <4C4783D2.5060304@gmx.net> <1279755788.4507.25.camel@gumri>
Message-ID:

This can be done in Python today:

>>> class Infix(object):
...     def __init__(self, func):
...         self.func = func
...         self.arg1 = self.arg2 = self.not_set = object()
...
...     def __radd__(self, arg1):
...         self.arg1 = arg1
...         if self.arg2 is self.not_set:
...             return self
...         else:
...             return self.func(self.arg1, self.arg2)
...
...     def __add__(self, arg2):
...         self.arg2 = arg2
...         if self.arg1 is self.not_set:
...             return self
...         else:
...             return self.func(self.arg1, self.arg2)
...
>>> @Infix
... def add(one, two):
...     return one + two
...
>>> @Infix
... def mul(one, two):
...     return one * two
...
>>> @Infix
... def power(one, two):
...     return one ** two
...
>>> 1 + add + 1
2
>>> 2 + mul + 2
4
>>> 3 + power + 3
27

Enjoy.

-- Carl Johnson

From cmjohnson.mailinglist at gmail.com Thu Jul 22 05:49:16 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Wed, 21 Jul 2010 17:49:16 -1000
Subject: [Python-ideas] Infix application of binary functions
In-Reply-To: References: <1279751825.4507.16.camel@gumri> <4C4783D2.5060304@gmx.net> <1279755788.4507.25.camel@gumri>
Message-ID:

Thought about it some more. Here's a more general formula:

class InfixArity(object):
    def __init__(self, arity):
        self.arity = arity
        self.args = []

    def __call__(self, func):
        self.func = func
        return self

    def __add__(self, arg):
        self.args.append(arg)
        if len(self.args) < self.arity:
            return self
        else:
            return self.func(*self.args)

    __radd__ = __add__

Infix = lambda func: InfixArity(2)(func)

And of course, one can use __mul__ or __div__ or whatever to taste. "1 // add // 2" doesn't make me instantly vomit in my mouth.
;-)

-- Carl Johnson

From cmjohnson.mailinglist at gmail.com Thu Jul 22 06:07:19 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Wed, 21 Jul 2010 18:07:19 -1000
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net>
Message-ID:

There have been questions about whether there are any cases of the given/where/let/whatever solving problems that would otherwise be cumbersome to solve. I think it could help get around certain for-loop gotchas:

>>> funcs = []
>>> for i in range(5):
...     def f():
...         print("#", i)
...     funcs.append(f)
...
>>> [func() for func in funcs]
# 4
# 4
# 4
# 4
# 4
[None, None, None, None, None]

D'oh! (This can be a real world problem if you have a list of methods you want to decorate inside a class.)

One current workaround:

>>> funcs = []
>>> for i in range(5):
...     def _():
...         n = i
...         def f():
...             print("#", n)
...         funcs.append(f)
...     _()
...
>>> [func() for func in funcs]
# 0
# 1
# 2
# 3
# 4
[None, None, None, None, None]

Not pretty, but it works.

In let format (I'm leaning toward the format "let [VAR = | return | yield] EXPRESSION where: BLOCK"):

funcs = []
for i in range(5):
    let funcs.append(f) where:
        n = i
        def f():
            print("#", n)

[func() for func in funcs]

Still a little awkward, but not as bad, IMHO.

-- Carl Johnson

From cmjohnson.mailinglist at gmail.com Thu Jul 22 06:22:58 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Wed, 21 Jul 2010 18:22:58 -1000
Subject: [Python-ideas] Infix application of binary functions
In-Reply-To: References: <1279751825.4507.16.camel@gumri> <4C4783D2.5060304@gmx.net> <1279755788.4507.25.camel@gumri>
Message-ID:

Last time, I swear! I caught a bug in the last version. Since I mutated my instances (not very Haskell-like!!), you couldn't use the same function more than once.
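The failure is easy to reproduce with the earlier two-argument wrapper (trimmed): the operand slots live on the shared instance, so leftovers from the first use corrupt the second.

```python
class Infix(object):
    def __init__(self, func):
        self.func = func
        self.arg1 = self.arg2 = self.not_set = object()

    def __radd__(self, arg1):
        self.arg1 = arg1
        if self.arg2 is self.not_set:
            return self
        return self.func(self.arg1, self.arg2)

    def __add__(self, arg2):
        self.arg2 = arg2
        if self.arg1 is self.not_set:
            return self
        return self.func(self.arg1, self.arg2)

add = Infix(lambda a, b: a + b)

assert 1 + add + 1 == 2   # first use works
result = 2 + add + 2      # second use sees the stale arg2 from before
assert result == 5        # add(2, 1) == 3, then 3 + 2 == 5, not 4
```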
Here's a new version that lets you use the same function again:

class InfixArity(object):
    def __init__(self, arity):
        self.arity = arity

    def __call__(self, func):
        self.func = func
        return self

    def __add__(self, arg):
        return InfixHelper(self.func, self.arity, arg)

    __radd__ = __add__
    __floordiv__ = __rfloordiv__ = __add__
    __truediv__ = __rtruediv__ = __add__

class InfixHelper(object):
    def __init__(self, func, arity, firstarg):
        self.func = func
        self.arity = arity
        self.args = [firstarg]

    def __add__(self, arg):
        self.args.append(arg)
        if len(self.args) < self.arity:
            return self
        else:
            return self.func(*self.args)

    __radd__ = __add__
    __floordiv__ = __rfloordiv__ = __add__
    __truediv__ = __rtruediv__ = __add__

Infix = lambda func: InfixArity(2)(func)

I imagine it would be possible to make an n-arity class that could work like "average // 1 // 2 // 3 // 4 // done" or maybe one that has you use a different operator for the last argument, but I leave that as an exercise for the reader.

-- Carl Johnson

On Wed, Jul 21, 2010 at 5:49 PM, Carl M. Johnson wrote:
> Thought about it some more. Here's a more general formula:
>
> class InfixArity(object):
>     def __init__(self, arity):
>         self.arity = arity
>         self.args = []
>
>     def __call__(self, func):
>         self.func = func
>         return self
>
>     def __add__(self, arg):
>         self.args.append(arg)
>         if len(self.args) < self.arity:
>             return self
>         else:
>             return self.func(*self.args)
>
>     __radd__ = __add__
>
> Infix = lambda func: InfixArity(2)(func)
>
> And of course, one can use __mul__ or __div__ or whatever to taste. "1
> // add // 2" doesn't make me instantly vomit in my mouth.
;-) > > -- Carl Johnson > From dag.odenhall at gmail.com Thu Jul 22 06:33:01 2010 From: dag.odenhall at gmail.com (Dag Odenhall) Date: Thu, 22 Jul 2010 06:33:01 +0200 Subject: [Python-ideas] Infix application of binary functions In-Reply-To: References: <1279751825.4507.16.camel@gumri> <4C4783D2.5060304@gmx.net> <1279755788.4507.25.camel@gumri> Message-ID: <1279773181.4507.53.camel@gumri> > This can be done in Python today: Quoting myself from the original post: It is already possible[1] to make infix functions, but the solution is a hack and requires functions to be marked as infix. [1] http://code.activestate.com/recipes/384122-infix-operators/ From pyideas at rebertia.com Thu Jul 22 06:34:02 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 21 Jul 2010 21:34:02 -0700 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Wed, Jul 21, 2010 at 9:07 PM, Carl M. Johnson wrote: > There have been questions about whether there are any cases of the > given/where/let/whatever solving problems that would otherwise be > cumbersome to solve. I think it could help get around certain for-loop > gotchas: > >>>> funcs = [] >>>> for i in range(5): > ... ? ? def f(): > ... ? ? ? ? print("#", i) > ... ? ? funcs.append(f) > ... >>>> [func() for func in funcs] > # 4 > # 4 > # 4 > # 4 > # 4 > [None, None, None, None, None] > > D?oh! (This can be a real world problem if you have a list of methods > you want to decorate inside a class.) > > One current workaround: > >>>> funcs = [] >>>> for i in range(5): > ... ? ? def _(): > ... ? ? ? ? n = i > ... ? ? ? ? def f(): > ... ? ? ? ? ? ? print("#", n) > ... ? ? ? ? funcs.append(f) > ... ? ? _() > ... >>>> [func() for func in funcs] > # 0 > # 1 > # 2 > # 3 > # 4 > [None, None, None, None, None] > > Not pretty, but it works. 
>
> In let format (I'm leaning toward the format "let [VAR = | return |
> yield] EXPRESSION where: BLOCK"):
>
> funcs = []
> for i in range(5):
>     let funcs.append(f) where:
>         n = i
>         def f():
>             print("#", n)
>
> [func() for func in funcs]
>
> Still a little awkward, but not as bad, IMHO.

Neither of those seem (imo) better than the other current workaround:

funcs = []
for i in range(5):
    def f(i=i):
        print("#", i)
    funcs.append(f)

They're all non-obvious idioms, but at least this one is short and easily recognized.

Cheers,
Chris

From ben+python at benfinney.id.au Thu Jul 22 07:59:53 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 22 Jul 2010 15:59:53 +1000
Subject: [Python-ideas] Infix application of binary functions
References: <1279751825.4507.16.camel@gumri> <4C4783D2.5060304@gmx.net> <1279755788.4507.25.camel@gumri> <1279773181.4507.53.camel@gumri>
Message-ID: <878w546opy.fsf@benfinney.id.au>

Dag Odenhall writes:

> > This can be done in Python today:
>
> Quoting myself from the original post:
>
> It is already possible[1] to make infix functions, but the solution is
> a hack and requires functions to be marked as infix.

I don't see how "add a new punctuation character to the syntax in every place where this is to be used" is less of a hack.

For reference you might want to read over the debates that preceded the introduction of '@' to the language. There is *very strong* resistance to adding syntax that uses arbitrary punctuation characters.

IMO that resistance is for good reason: punctuation beyond what Python already supports today rarely improves readability, and usually worsens it.

--
 \      "All television is educational television. The question is: |
  `\    what is it teaching?" --Nicholas Johnson |
_o__)   |
Ben Finney

From cmjohnson.mailinglist at gmail.com Thu Jul 22 09:48:14 2010
From: cmjohnson.mailinglist at gmail.com (Carl M.
Johnson)
Date: Wed, 21 Jul 2010 21:48:14 -1000
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net>
Message-ID:

On Wed, Jul 21, 2010 at 6:34 PM, Chris Rebert wrote:
> funcs = []
> for i in range(5):
>     def f(i=i):
>         print("#", i)
>     funcs.append(f)
>
> They're all non-obvious idioms, but at least this one is short and
> easily recognized.

I actually had to read that twice before I recognized what you had done, and I knew to look for something out of the ordinary.

That said, it *is* the solution GvR recommended as the best solution the last time this came up. I just never understood why. To me, if you set something as a default value for an argument, it should be because it's a default value, i.e. a value that is usually one thing but can also be set to something else at the caller's discretion. I'm just not comfortable with using the default value to mean "here's a value you should never change" or "pretty please, don't pass in an argument, because that will screw everything up" or even "I guess you could pass in an argument if you wanted to, but that's not a case I'm really very busy thinking about". :-/

But maybe I'm in the minority on this one.

-- CJ

From debatem1 at gmail.com Thu Jul 22 10:38:57 2010
From: debatem1 at gmail.com (geremy condra)
Date: Thu, 22 Jul 2010 04:38:57 -0400
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Jul 21, 2010 at 9:21 PM, Terry Reedy wrote: > On 7/21/2010 7:24 AM, Nick Coghlan wrote: >> >> On Wed, Jul 21, 2010 at 9:16 PM, Stefan Behnel >> wrote: >>> >>> Terry Reedy, 20.07.2010 21:49: >>>> >>>> I did not comment then because I thought the idea of cluttering python >>>> with augmented local namespace blocks, with no functional gain, was >>>> rejected and dead, and hence unnecessary of comment. >>>> -10 >>>> For me, the idea would come close to destroying (what remains of) the >>>> simplicity that makes Python relatively easy to learn. It seems to be >>>> associated with the (to me, cracked) idea that names are pollution. >>> >>> Actually, it's about *giving* names to subexpressions, that's quite the >>> opposite. >> >> I think Terry's point was that you can already give names to >> subexpressions by assigning them to variables in the current scope, >> but some people object to that approach due to "namespace pollution". > > Right. > >> I agree with him that avoiding namespace pollution isn't a particularly >> strong argument though (unless you have really long scripts and > > Okay, we can leave that issue aside. > >> functions), which is why I've tried to emphasize the intended >> readability benefits. > > whereas I am trying to emphasize the reading horror for people whose brains > are wired differently from yours. The backwards conditional expressions are > nearly impossible for me to read, which is to say, painful. To some, > something like Ok, so, your brain is wired differently. That's fine- but mine says that there are cases where the up-front syntax is more readable. I particularly like that it clearly marks what variables will not be used elsewhere without forcing me to jump around in the code to find out how they're being computed. > e = fe(a,b,c, p1) where: >  c = fc(a, d, p2) where: >    d = fd(a, p1) where: >      
a = fa(p1, p2) >  b = fb(a,p2) > > where p1,p2 are input parameters; > > looks about as bad (and it was a real effort to write). I would rather > something like that were in a branch dialect, Ypthon with its own extension > (.yp). Of course it looks bad. You can make anything look bad. 37 levels of nested 'for' statements look bad. That isn't an argument against this, it's an argument against bad code, period. > Algorithm book authors usually want their books read by lots of people. When > they invent a pseudocode language, they usually invent something lots of > people can read. (Knuth's MIX was something of an exception.) It is often so > close to (a subset of) Python that it is ridiculous that they do not just > use (a subset of) Python so it is not 'pseudo'. I cannot remember seeing > anything like the above. I believe the reason is that it would be, on > average, less readable and harder to understand. I would use this, and would welcome it in a textbook. Telling people what you're doing before you do it motivates the material and helps people to learn and understand with minimal effort IMO/E. Geremy Condra From ncoghlan at gmail.com Thu Jul 22 14:49:44 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 Jul 2010 22:49:44 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Thu, Jul 22, 2010 at 5:48 PM, Carl M. Johnson wrote: > On Wed, Jul 21, 2010 at 6:34 PM, Chris Rebert wrote: > > >> funcs = [] >> for i in range(5): >>     def f(i=i): >>         print("#", i) >>     funcs.append(f) >> >> They're all non-obvious idioms, but at least this one is short and >> easily recognized. > > I actually had to read that twice before I recognized what you had > done, and I knew to look for something out of the ordinary. 
That said, > it *is* the solution GvR recommended as the best solution the last > time this came up. I just never understood why. To me, if you set > something as a default value for an argument, it should be because > it's a default value, i.e. a value that is usually one thing but can > also be set to something else at the caller's discretion. I'm just not > comfortable with using the default value to mean "here's a value you > should never change" or "pretty please, don't pass in an argument, > because that will screw everything up" or even "I guess you could pass > in an argument if you wanted to, but that's not a case I'm really very > busy thinking about". :-/ But maybe I'm in the minority on this one. There's a reason that particular trick is called the "default argument hack" :) The trick is much easier to follow when you *don't* reuse the variable name in the inner function: funcs = [] for i in range(5): def f(early_bound_i=i): print("#", early_bound_i) funcs.append(f) Python doesn't really have a good way to request early binding semantics at this time - the default argument hack is about it. The given clause (as currently specified in the PEP) forces early binding semantics on any functions it contains, so it allows you to write a function definition loop that "does the right thing" with the following fairly straightforward code: funcs = [] for i in range(5): funcs.append(f) given: def f(): print("#", i) That's: a) kinda cool b) veering dangerously close to DWIM* territory (which is not necessarily a good thing) Still, the early binding semantics angle is one I had thought about before - that *is* a genuinely new feature of this proposal. Perhaps not a hugely compelling one though - I think I've only needed to use the default argument hack once in the whole time I've been programming Python. Cheers, Nick. *Don't Worry It's Magic -- Nick Coghlan   |   ncoghlan at gmail.com   |   
Brisbane, Australia From ncoghlan at gmail.com Thu Jul 22 14:54:04 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 Jul 2010 22:54:04 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Thu, Jul 22, 2010 at 1:30 PM, Chris Rebert wrote: > I believe Nick meant returns/yields *within* the `given` suite (he > just phrased it awkwardly), e.g. > > a = b given: >     b = 42 >     return c  # WTF Yep, that's what I meant by the top level code in the given clause (i.e. in the same sense as I would refer to the top level code in a function body). It didn't occur to me that phrasing could be legitimately confused with the header line of the clause until I read Terry's message. Cheers, Nick. -- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia From ncoghlan at gmail.com Thu Jul 22 14:56:01 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 Jul 2010 22:56:01 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Thu, Jul 22, 2010 at 10:49 PM, Nick Coghlan wrote: > Still, the early binding semantics angle is one I had thought about > before - that *is* a genuinely new feature of this proposal. Perhaps > not a hugely compelling one though - I think I've only needed to use > the default argument hack once in the whole time I've been programming > Python. Uhh, "... had*n't* thought about ...". 3 little missing characters that completely reverse the intended meaning of a sentence :P Cheers, Nick. -- Nick Coghlan   |   ncoghlan at gmail.com   |   
Brisbane, Australia From alexander.belopolsky at gmail.com Thu Jul 22 21:25:23 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 22 Jul 2010 15:25:23 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Thu, Jul 22, 2010 at 8:49 AM, Nick Coghlan wrote: > I think I've only needed to use > the default argument hack once in the whole time I've been programming > Python. Here is your second: http://bugs.python.org/issue7989#msg109662 or scroll to the end of http://mail.python.org/pipermail/python-dev/2010-July/101900.html From debatem1 at gmail.com Thu Jul 22 22:21:54 2010 From: debatem1 at gmail.com (geremy condra) Date: Thu, 22 Jul 2010 16:21:54 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: <4C48A232.8080701@udel.edu> References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C48A232.8080701@udel.edu> Message-ID: On Thu, Jul 22, 2010 at 3:55 PM, Terry Reedy wrote: > > > On 7/22/2010 4:38 AM, geremy condra wrote: >> >> On Wed, Jul 21, 2010 at 9:21 PM, Terry Reedy wrote: > >>> e = fe(a,b,c, p1) where: >>>  c = fc(a, d, p2) where: >>>    d = fd(a, p1) where: >>>      a = fa(p1, p2) >>>  b = fb(a,p2) >>> >>> where p1,p2 are input parameters; >>> >>> looks about as bad (and it was a real effort to write). I would rather >>> something like that were in a branch dialect, Ypthon with its own >>> extension >>> (.yp). >> >> Of course it looks bad. You can make anything look bad. 37 levels of >> nested >> 'for' statements look bad. That isn't an argument against this, it's an >> argument >> against bad code, period. > > I think you make an unfair comparison: 3 nested where statements, which you > agree look bad, against 37 nested for statements, which I would agree would > be bad, as in unreadable. (I believe auto code generators have done such > things.) 
The fair comparison would be 3 nested where statements, which you > agree is bad, against 3 nested for statements, which is routine and not bad. > So I think you made my point ;-). Hmm. I'm pretty sure you know that I don't think 37 levels of for statements is the minimum required number for ugliness, which makes me wonder why you chose to structure an argument around that idea unless you just wanted to score a few easy points on an otherwise valid claim. I would also argue that the more valid comparison would be nested functions or classes- both perfectly pretty constructs on their own- which would cause me to gnaw on otherwise unoffending office furniture if I encountered them nested 3 deep. > My differently wired brain would tolerate the new construct much better, and > might even use it, if nested where/given constructs were not allowed. Seems easier to work with to me, but I don't see the point in increasing the implementation difficulty just to stop bad programmers from writing bad code. Geremy Condra From george.sakkis at gmail.com Thu Jul 22 22:49:29 2010 From: george.sakkis at gmail.com (George Sakkis) Date: Thu, 22 Jul 2010 22:49:29 +0200 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C48A232.8080701@udel.edu> Message-ID: On Thu, Jul 22, 2010 at 10:21 PM, geremy condra wrote: > I would also argue that the more valid comparison would be nested functions > or classes- both perfectly pretty constructs on their own- which would cause > me to gnaw on otherwise unoffending office furniture if I encountered them > nested 3 deep. I guess you're not much fond of decorators then; it's common to define them exactly as 3-level deep nested functions: def decofactory(deco_arg):     def decorator(func):         def wrapper(*args, **kwargs):             if deco_arg:                 return func(*args, **kwargs)         return wrapper     return decorator @decofactory(n) def func(x, y):     ... 
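As a side note for readers of the archive: the sketch above also drops the wrapped call's return value whenever deco_arg is falsy (wrapper falls off the end and returns None). A runnable version of the same three-level pattern, with functools.wraps added and that branch fixed, might look like this (the enabled flag and add function are illustrative, not from the thread):

```python
import functools

def decofactory(enabled):
    """Three-level nesting: factory -> decorator -> wrapper."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if enabled:
                print("calling", func.__name__)
            return func(*args, **kwargs)  # always forward the result
        return wrapper
    return decorator

@decofactory(False)
def add(x, y):
    return x + y

assert add(2, 3) == 5
assert add.__name__ == "add"  # wraps preserves the metadata
```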
George From ncoghlan at gmail.com Fri Jul 23 00:09:33 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 23 Jul 2010 08:09:33 +1000 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C48A232.8080701@udel.edu> Message-ID: On Fri, Jul 23, 2010 at 6:49 AM, George Sakkis wrote: > On Thu, Jul 22, 2010 at 10:21 PM, geremy condra wrote: > >> I would also argue that the more valid comparison would be nested functions >> or classes- both perfectly pretty constructs on their own- which would cause >> me to gnaw on otherwise unoffending office furniture if I encountered them >> nested 3 deep. > > I guess you're not much fond of decorators then; it's common to define > them exactly as 3-level deep nested functions: > > def decofactory(deco_arg): >     def decorator(func): >         def wrapper(*args, **kwargs): >             if deco_arg: >                 return func(*args, **kwargs) >         return wrapper >     return decorator > > @decofactory(n) > def func(x, y): >     ... Actually, I think that's the main reason why parameterised decorators can be such a pain to understand - keeping the 3 scopes straight in your head is genuinely difficult. There's a reason the recently added contextlib.ContextDecorator is implemented as a class with a __call__ method rather than using nested functions. A given clause would let you reorder this code, and I think doing so is genuinely clearer: def decofactory(deco_arg): return decorator given: def decorator(func): if not deco_arg: return func return wrapper given: @wraps(func) def wrapper(*args, **kwargs): return func(*args, **kwargs) Reversing the order of some aspects of the execution allows the text flow to match the way you would describe the operation: we have a decorator factory that returns a decorator that returns the function unmodified if deco_arg evaluates to False and returns a wrapper around the decorated function otherwise. Cheers, Nick. 
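For comparison, the class-based approach Nick alludes to (the route contextlib.ContextDecorator takes to avoid triply nested functions) can be sketched as follows; the names here are illustrative, not from the thread:

```python
import functools

class decofactory:
    """Parameterised decorator written as a class: the factory argument
    becomes instance state, so only one nested function remains."""
    def __init__(self, deco_arg):
        self.deco_arg = deco_arg

    def __call__(self, func):
        if not self.deco_arg:
            return func  # pass the function through unmodified
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        return wrapper

@decofactory(True)
def add(x, y):
    return x + y

assert add(2, 3) == 5
assert add.__name__ == "add"
```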
-- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia From tjreedy at udel.edu Fri Jul 23 00:35:24 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 22 Jul 2010 18:35:24 -0400 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <4C48A232.8080701@udel.edu> Message-ID: On 7/22/2010 4:21 PM, geremy condra wrote: > Hmm. I'm pretty sure you know that I don't think 37 levels of for statements is > the minimum required number for ugliness, Of course not, but that is the only number you gave ;-) I would agree to something smaller than 10, but larger than 4. > I would also argue that the more valid comparison would be nested functions > or classes- both perfectly pretty constructs on their own- which would cause > me to gnaw on otherwise unoffending office furniture if I encountered them > nested 3 deep. For classes I agree. For functions I would allow 3. But in all cases, the number that is ok is usefully larger than 1. -- Terry Jan Reedy From ncoghlan at gmail.com Fri Jul 23 12:39:27 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 23 Jul 2010 20:39:27 +1000 Subject: [Python-ideas] [Python-Ideas] Set the namespace free! In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl> Message-ID: (moving to python-ideas, where this discussion belongs) On Fri, Jul 23, 2010 at 7:01 AM, wrote: > > >> Date: Thu, 22 Jul 2010 14:49:17 -0400 >> Subject: Re: [Python-Dev] Set the namespace free! >> From: alexander.belopolsky at gmail.com >> To: gregory.smith3 at sympatico.ca >> CC: python-dev at python.org >> >> On Thu, Jul 22, 2010 at 12:53 PM, wrote: >> .. >> > So, ::name or &name or |name or whatever. >> > >> > I'm very amused by all the jokes about turning python into perl, but >> > there's >> > a good idea here that doesn't actually require that... >> >> No, there isn't. And both '&' and '|' are valid python operators that >> cannot be used this way. >> > > Um, of course. 
Serious brain freeze today, using too many languages at once. > Yeah, that's it. > > Despite my knuckleheadedness, I say there's still a hole here that can be > easily plugged. It's clumsy that you can't call, e.g. > > GenerateURL( reqtype='basic', class='local') > > other than by > > GenerateURL( **{'reqtype': 'basic', 'class': 'local'}) > > ... just because 'class' is a keyword. That's letting a parser issue > degrade the value of a really good feature. Likewise for attributes; python > allows you to have > named parameters or attributes called 'class' and 'import' if you like; it > just doesn't let you write them directly; this restriction doesn't seem to > be necessary except for the parse issue, which is fixable. I.e. nothing > would break by allowing GenerateURL(::class = 'local') or > Request.::class. Or, rather than making a major syntactic change that affects not just all Python implementations, but also every syntax highlighter that understands the set of reserved words, we instead encourage external interface developers to implement three simple rules: 1. If a name coming from an external resource clashes with a Python keyword, append a single underscore 2. If a name coming from an external resource ends with an underscore, append an additional underscore 3. If a name being written to or otherwise used to access an external resource ends with an underscore, remove it The above example would then be written as: GenerateURL( reqtype='basic', class_='local') Why should the entire language toolset be burdened with additional syntax in order to deal with an issue that could be handled perfectly well by the adoption of some simple API conventions? Cheers, Nick. -- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia From songofacandy at gmail.com Fri Jul 23 14:10:20 2010 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 23 Jul 2010 21:10:20 +0900 Subject: [Python-ideas] String interpolation again. Message-ID: Hi, all. 
Below code is a syntax error now: foo = 123 x = 'foo' foo 'bar' I wonder if that code is interpreted as: x = 'foo' + str(foo) + 'bar' or x = 'foo%sbar' % (foo,) I think this syntax sugar doesn't introduce any compatibility problem and is simple enough for Python. What do you think about it? -- INADA Naoki From 8mayday at gmail.com Fri Jul 23 14:26:09 2010 From: 8mayday at gmail.com (Andrey Popp) Date: Fri, 23 Jul 2010 16:26:09 +0400 Subject: [Python-ideas] String interpolation again. In-Reply-To: References: Message-ID: I think it's a form of weak typing from languages like PHP, that allows programmers to make huge amounts of mistakes that are sometimes difficult to spot. On Fri, Jul 23, 2010 at 4:10 PM, INADA Naoki wrote: > Hi, all. > > Below code is a syntax error now: > > foo = 123 > x = 'foo' foo 'bar' > > I wonder if that code is interpreted as: > > x = 'foo' + str(foo) + 'bar' > or > x = 'foo%sbar' % (foo,) > > I think this syntax sugar doesn't introduce any compatibility problem > and is simple enough for Python. > > What do you think about it? > > -- > INADA Naoki > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Andrey Popp phone: +7 911 740 24 91 e-mail: 8mayday at gmail.com From masklinn at masklinn.net Fri Jul 23 14:33:03 2010 From: masklinn at masklinn.net (Masklinn) Date: Fri, 23 Jul 2010 14:33:03 +0200 Subject: [Python-ideas] String interpolation again. In-Reply-To: References: Message-ID: <121A5374-B9EB-461E-ABCE-4A9485928F1D@masklinn.net> On 2010-07-23, at 14:26 , Andrey Popp wrote: > I think it's a form of weak typing from languages like PHP, that > allows programmers to make huge amounts of mistakes that are sometimes > difficult to spot. For what it's worth, even PHP doesn't allow this. One either has to concatenate "foo" . $foo . 
"bar" or interpolate "foo${foo}bar" The only language I am aware of which *might* let users do something along those lines (please note that I'm not even sure it's possible) would be Tcl, in which pretty much everything seems to be a string. From scialexlight at gmail.com Fri Jul 23 14:35:40 2010 From: scialexlight at gmail.com (Alex Light) Date: Fri, 23 Jul 2010 08:35:40 -0400 Subject: [Python-ideas] String interpolation again. In-Reply-To: References: Message-ID: On Fri, Jul 23, 2010 at 8:26 AM, Andrey Popp <8mayday at gmail.com> wrote: > I think it's a form of weak typing from languages like PHP, that > allows programmer to make huge amounts of mistakes, that are sometimes > difficult to spot. I agree. but i wouldn't mind if python would start automatically calling str on objects in string sequences if they are not strings. i.e. turn this not_a_string = 123 a_string = 'this is a string' + not_a_string #currently throws syntax error to this: a_string = 'this is a string ' + str(not_a_string) they do this in java and it works quite well Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Fri Jul 23 14:41:21 2010 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 23 Jul 2010 21:41:21 +0900 Subject: [Python-ideas] String interpolation again. In-Reply-To: References: Message-ID: > I agree. but i wouldn't mind if python would start automatically calling str > on objects in string sequences if they are not strings. > i.e. turn this > not_a_string = 123 > a_string = 'this is a string' + not_a_string > #currently throws syntax error Currently, this raises TypeError. I think allowing implicit conversion break backward compatibility and allowing new syntax that is now syntax error doesn't cause a compatibility problem. -- INADA Naoki? 
From mwm-keyword-python.b4bdba at mired.org Fri Jul 23 15:05:04 2010 From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer) Date: Fri, 23 Jul 2010 09:05:04 -0400 Subject: [Python-ideas] String interpolation again. In-Reply-To: References: Message-ID: <20100723090504.33c58120@bhuda.mired.org> On Fri, 23 Jul 2010 08:35:40 -0400 Alex Light wrote: > On Fri, Jul 23, 2010 at 8:26 AM, Andrey Popp <8mayday at gmail.com> wrote: > > > I think it's a form of weak typing from languages like PHP, that > > allows programmers to make huge amounts of mistakes that are sometimes > > difficult to spot. It also violates TOOWTDI. > I agree, but I wouldn't mind if Python would start automatically calling str > on objects in string sequences if they are not strings, > i.e. turn this: > > not_a_string = 123 > > a_string = 'this is a string' + not_a_string > #currently throws syntax error > > to this: > a_string = 'this is a string ' + str(not_a_string) > > They do this in Java and it works quite well. I believe the initial quote applies to this form as well. It might work well in Java, but Java isn't Python; there are lots of other differences that make this a lot more tolerable in Java. The first problem with this kind of thing is that there's no obvious reason why 12 + '34' should be '1234' instead of 46. Java variables have declared types. This means the above situation can be detected at compile time, and the implicit conversion added then. In Python, you have to do the tests at run time, which will slow everything down. Further, Java's typed variables mean that if you've made a mistake in the type of one of the values, the assignment will be flagged as an error at compile time. Python won't do that, so implicitly fixing the mistake here means you get even further from the error before something happens that reveals it. 
Finally, the % operator does that implicit conversion for you if you use %s: a_string = 'this is a string %s' % not_a_string Works just fine as things are now (though the syntax needs tweaking if not_a_string is a tuple). I think that brings it to -4. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From masklinn at masklinn.net Fri Jul 23 15:25:42 2010 From: masklinn at masklinn.net (Masklinn) Date: Fri, 23 Jul 2010 15:25:42 +0200 Subject: [Python-ideas] String interpolation again. In-Reply-To: <20100723090504.33c58120@bhuda.mired.org> References: <20100723090504.33c58120@bhuda.mired.org> Message-ID: <2033D5EF-B71B-4927-8A3B-19E45C49197F@masklinn.net> On 2010-07-23, at 15:05 , Mike Meyer wrote: > The first problem with this kind of thing is that there's no obvious > reason why 12 + '34' should be '1234' instead of 46. > > Java variables have declared types. This means the above situation can > be detected at compile time, and the implicit conversion added > then. In Python, you have to do the tests at run time, which will slow > everything down. Actually, it's much simpler than that for Java: the `+` operator is specially overloaded any time a string is involved to become a "convert and concatenate" operator similar to PHP's "." rather than the usual "add" operator. In Python, the equivalent behavior would be to special case the addition operator so that it checks if either operand is a string and if it is convert and concatenate the other one, otherwise apply normal resolution. From mwm-keyword-python.b4bdba at mired.org Fri Jul 23 15:37:05 2010 From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer) Date: Fri, 23 Jul 2010 09:37:05 -0400 Subject: [Python-ideas] String interpolation again. 
In-Reply-To: <2033D5EF-B71B-4927-8A3B-19E45C49197F@masklinn.net> References: <20100723090504.33c58120@bhuda.mired.org> <2033D5EF-B71B-4927-8A3B-19E45C49197F@masklinn.net> Message-ID: <20100723093705.34581afe@bhuda.mired.org> On Fri, 23 Jul 2010 15:25:42 +0200 Masklinn wrote: > On 2010-07-23, at 15:05 , Mike Meyer wrote: > > The first problem with this kind of thing is that there's no obvious > > reason why 12 + '34' should be '1234' instead of 46. > > > > Java variables have declared types. This means the above situation can > > be detected at compile time, and the implicit conversion added > > then. In Python, you have to do the tests at run time, which will slow > > everything down. > Actually, it's much simpler than that for Java: the `+` operator is specially overloaded any time a string is involved to become a "convert and concatenate" operator similar to PHP's "." rather than the usual "add" operator. > > In Python, the equivalent behavior would be to special case the addition operator so that it checks if either operand is a string and if it is convert and concatenate the other one, otherwise apply normal resolution. I would hope the Java version isn't as convoluted as you say (but given Java, it may be): all this really requires is that the string version of + include the conversion. In python, that would be making str.__add__ (and friends) do the conversion. Given that in Python, str(a_string) is a_string, and doing the type check on a string (via either type or isinstance) is about the same speed as calling str on it (just under .2 usecs/loop), you might as well just always do the conversion. Or maybe this breaks unicode... Assuming, of course, you actually think this is a good idea. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From masklinn at masklinn.net Fri Jul 23 15:50:26 2010 From: masklinn at masklinn.net (Masklinn) Date: Fri, 23 Jul 2010 15:50:26 +0200 Subject: [Python-ideas] String interpolation again. In-Reply-To: <20100723093705.34581afe@bhuda.mired.org> References: <20100723090504.33c58120@bhuda.mired.org> <2033D5EF-B71B-4927-8A3B-19E45C49197F@masklinn.net> <20100723093705.34581afe@bhuda.mired.org> Message-ID: <3ED23A89-8A6E-41A6-915C-E43A50FE54F1@masklinn.net> On 2010-07-23, at 15:37 , Mike Meyer wrote: > On Fri, 23 Jul 2010 15:25:42 +0200 > Masklinn wrote: > >> On 2010-07-23, at 15:05 , Mike Meyer wrote: >>> The first problem with this kind of thing is that there's no obvious >>> reason why 12 + '34' should be '1234' instead of 46. >>> >>> Java variables have declared types. This means the above situation can >>> be detected at compile time, and the implicit conversion added >>> then. In Python, you have to do the tests at run time, which will slow >>> everything down. >> Actually, it's much simpler than that for Java: the `+` operator is specially overloaded any time a string is involved to become a "convert and concatenate" operator similar to PHP's "." rather than the usual "add" operator. >> >> In Python, the equivalent behavior would be to special case the addition operator so that it checks if either operand is a string and if it is convert and concatenate the other one, otherwise apply normal resolution. > > I would hope the Java version isn't as convoluted as you say (but > given Java, it may be): all this really requires is that the string > version of + include the conversion. There isn't really such a thing as "the string version of +" as Java forbids userland operator overloading. String's + is a special case in the language as it is, indeed, one of the very few (if not the only) overloaded operator. 
To understand how far this goes, Java's BigInteger (their equivalent to Python's long) doesn't have *any* operator overloading. If a is a BigInteger and b is a BigInteger, you add them by writing `a.add(b)`. Likewise for substraction, multiplication, division, negation or bit-twiddling[0] And if either is *not* a BigInteger, then you convert it to a BigInteger first. If it's a primitive integer type, you cannot even use a constructor, you have to use `BigInteger.valueOf(long)` > In python, that would be making > str.__add__ (and friends) do the conversion. You'd run into the issues of writing `a + "foo"` with `a` defining a custom `__add__`, which would not perform string concatenation, as per Python's operator resolution order. [0] http://download.oracle.com/docs/cd/E17409_01/javase/6/docs/api/java/math/BigInteger.html From mwm-keyword-python.b4bdba at mired.org Fri Jul 23 16:09:21 2010 From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer) Date: Fri, 23 Jul 2010 10:09:21 -0400 Subject: [Python-ideas] String interpolation again. In-Reply-To: <3ED23A89-8A6E-41A6-915C-E43A50FE54F1@masklinn.net> References: <20100723090504.33c58120@bhuda.mired.org> <2033D5EF-B71B-4927-8A3B-19E45C49197F@masklinn.net> <20100723093705.34581afe@bhuda.mired.org> <3ED23A89-8A6E-41A6-915C-E43A50FE54F1@masklinn.net> Message-ID: <20100723100921.613b75f0@bhuda.mired.org> On Fri, 23 Jul 2010 15:50:26 +0200 Masklinn wrote: > On 2010-07-23, at 15:37 , Mike Meyer wrote: > > On Fri, 23 Jul 2010 15:25:42 +0200 > > Masklinn wrote: > > > >> On 2010-07-23, at 15:05 , Mike Meyer wrote: > >>> The first problem with this kind of thing is that there's no obvious > >>> reason why 12 + '34' should be '1234' instead of 46. > >>> > >>> Java variables have declared types. This means the above situation can > >>> be detected at compile time, and the implicit conversion added > >>> then. In Python, you have to do the tests at run time, which will slow > >>> everything down. 
> >> Actually, it's much simpler than that for Java: the `+` operator is specially overloaded any time a string is involved to become a "convert and concatenate" operator similar to PHP's "." rather than the usual "add" operator. > >> > >> In Python, the equivalent behavior would be to special case the addition operator so that it checks if either operand is a string and if it is convert and concatenate the other one, otherwise apply normal resolution. > > > > I would hope the Java version isn't as convoluted as you say (but > > given Java, it may be): all this really requires is that the string > > version of + include the conversion. > There isn't really such a thing as "the string version of +" as Java forbids userland operator overloading. String's + is a special case in the language as it is, indeed, one of the very few (if not the only) overloaded operator. > > To understand how far this goes, Java's BigInteger (their equivalent to Python's long) doesn't have *any* operator overloading. If a is a BigInteger and b is a BigInteger, you add them by writing `a.add(b)`. Likewise for substraction, multiplication, division, negation or bit-twiddling[0] > > And if either is *not* a BigInteger, then you convert it to a BigInteger first. If it's a primitive integer type, you cannot even use a constructor, you have to use `BigInteger.valueOf(long)` > > > In python, that would be making > > str.__add__ (and friends) do the conversion. > You'd run into the issues of writing `a + "foo"` with `a` defining a custom `__add__`, which would not perform string concatenation, as per Python's operator resolution order. That's what the "and friends" is for. str.__radd__ is one of the friends. If the type of a refused to do the add, then str.__radd__ would get it, and could do the conversion and concatenation. Of course, if the type of a did the add in some way *other* than via the conversion and concatenation, then that's what would happen. 
Which is one of the reasons this type of implicit conversion isn't right for Python. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From dickinsm at gmail.com Fri Jul 23 16:11:37 2010 From: dickinsm at gmail.com (Mark Dickinson) Date: Fri, 23 Jul 2010 15:11:37 +0100 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Jul 20, 2010 at 2:27 PM, Nick Coghlan wrote: > For the record, I am personally +1 on the idea (otherwise I wouldn't > have put so much thought into it over the years). It's just a *lot* > harder to define complete and consistent semantics for the concept > than people often realise. > > However, having the question come up twice within the last month > finally inspired me to write the current status of the topic down in a > deferred PEP: http://www.python.org/dev/peps/pep-3150/ Is Python's grammar still LL(1) under this proposal? Mark From masklinn at masklinn.net Fri Jul 23 16:14:59 2010 From: masklinn at masklinn.net (Masklinn) Date: Fri, 23 Jul 2010 16:14:59 +0200 Subject: [Python-ideas] String interpolation again. In-Reply-To: <20100723100921.613b75f0@bhuda.mired.org> References: <20100723090504.33c58120@bhuda.mired.org> <2033D5EF-B71B-4927-8A3B-19E45C49197F@masklinn.net> <20100723093705.34581afe@bhuda.mired.org> <3ED23A89-8A6E-41A6-915C-E43A50FE54F1@masklinn.net> <20100723100921.613b75f0@bhuda.mired.org> Message-ID: <47B1F237-EFBE-443E-A4E1-60D1AAD0AEFE@masklinn.net> On 2010-07-23, at 16:09 , Mike Meyer wrote: > >>> In python, that would be making >>> str.__add__ (and friends) do the conversion. >> You'd run into the issues of writing `a + "foo"` with `a` defining a custom `__add__`, which would not perform string concatenation, as per Python's operator resolution order. > > That's what the "and friends" is for. 
> str.__radd__ is one of the friends. If the type of a refused to do the
> add, then str.__radd__ would get it, and could do the conversion and
> concatenation.
>
> Of course, if the type of a did the add in some way *other* than via
> the conversion and concatenation, then that's what would happen. Which
> is one of the reasons this type of implicit conversion isn't right for
> Python.

Right. Because otherwise it should not be very hard (which doesn't mean it would be smart) to remove the type error from str.__add__ and str.__radd__ and just str() the argument if it isn't already a string.

From songofacandy at gmail.com Fri Jul 23 16:16:00 2010
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 23 Jul 2010 23:16:00 +0900
Subject: [Python-ideas] String interpolation again.
In-Reply-To: References: Message-ID:

Basic problem is Python doesn't provide a way to print values of expression into str like print prints to file. 'foo{bar}baz'.format(bar=bar) is a bit messy. 'foo{bar}baz'.format(**vars()) or other technique is a bit tricky and messy.

If

s = 'foo' bar 'baz'

is too dirty, other Pythonic ways I can think of are:

* s = str('foo', bar, 'baz')
  This is not compatible with current Python because str()'s second argument is encoding and third argument is error handler. I think Python4 should not accept str(a_bytes, 'utf-8') and use a_bytes.decode('utf-8')

* s = print('foo', bar, 'baz', sep='', file=None)
* s = print('foo', bar, 'baz', sep='', file=str)
  Extend print() function to return str instead of print to file.

* s = str.print('foo', bar, 'baz', sep='')
  Add staticmethod to str.

-- INADA Naoki

From andre.roberge at gmail.com Fri Jul 23 18:03:05 2010
From: andre.roberge at gmail.com (Andre Roberge)
Date: Fri, 23 Jul 2010 13:03:05 -0300
Subject: [Python-ideas] String interpolation again.
In-Reply-To: References: Message-ID:

On Fri, Jul 23, 2010 at 11:16 AM, INADA Naoki wrote:
> Basic problem is Python doesn't provide a way to print values of expression
> into str like print prints to file.
> 'foo{bar}baz'.format(bar=bar) is a bit messy.
> 'foo{bar}baz'.format(**vars()) or other technique is a bit tricky and
> messy.
>
> If
> s = 'foo' bar 'baz'
> is too dirty, other Pythonic ways I can think of are:
>
> [snip]

What's wrong with

s = 'foo' + str(bar) + 'baz'

If you want something Pythonic:

import this
...
Explicit is better than implicit

---
André

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From masklinn at masklinn.net Fri Jul 23 19:06:52 2010
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 23 Jul 2010 19:06:52 +0200
Subject: [Python-ideas] String interpolation again.
In-Reply-To: References: Message-ID: <92DC6EE0-84E9-4286-A4FD-E41798EC7483@masklinn.net>

On 2010-07-23, at 16:16 , INADA Naoki wrote:
> Basic problem is Python doesn't provide a way to print values of expression
> into str like print prints to file.
> 'foo{bar}baz'.format(bar=bar) is a bit messy.

I don't understand what you find messy about either this or `'foo{}baz'.format(bar)`.

> * s = print('foo', bar, 'baz', sep='', file=None)
> * s = print('foo', bar, 'baz', sep='', file=str)
> Extend print() function to return str instead of print to file.
>
> * s = str.print('foo', bar, 'baz', sep='')
> Add staticmethod to str.

These forms can trivially be implemented as utility functions. And I don't see them as improvements enough to propose them for inclusion in the stdlib, but YMMV.

From songofacandy at gmail.com Fri Jul 23 19:35:29 2010
From: songofacandy at gmail.com (INADA Naoki)
Date: Sat, 24 Jul 2010 02:35:29 +0900
Subject: [Python-ideas] String interpolation again.
In-Reply-To: References: Message-ID:

> What's wrong with
> s = 'foo' + str(bar) + 'baz'

OK, I agree that your code is very pythonic.
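[Editor's note: to make the implicit-conversion semantics debated earlier in this thread concrete, here is a minimal sketch. `AutoStr` is an invented name, not anything proposed on the list; it prototypes a string type whose `+` converts-and-concatenates (Java/PHP style) via the `__add__`/`__radd__` hooks discussed above.]

```python
class AutoStr(str):
    """Sketch of the behavior under debate: `+` never raises TypeError
    for this string type; it str()s the other operand and concatenates."""

    def __add__(self, other):
        # str() the argument if it isn't already a string, then concatenate.
        return AutoStr(str(self) + str(other))

    def __radd__(self, other):
        # Reached when the left operand's own __add__ returned NotImplemented.
        return AutoStr(str(other) + str(self))


print(AutoStr("foo") + 12)   # foo12
print(12 + AutoStr("34"))    # 1234, not 46
```

As Mike's reply points out, a left operand whose own `__add__` handles the mix would still win under Python's operator resolution, which is exactly why this kind of implicit conversion sits poorly with Python.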
I've seen and written some PHP and Ruby code today, so I've forgotten what's pythonic. Many people coming from PHP/Ruby/Perl look at string interpolation, find the .format(**vars()) trick, and feel it's messy. But if there is a clean and pythonic way, adding new syntax is not pythonic.

> If you want something Pythonic:
>
> import this
> ...
> Explicit is better than implicit
>
> ---
> André

Thank you.

-- INADA Naoki

From songofacandy at gmail.com Fri Jul 23 19:52:18 2010
From: songofacandy at gmail.com (INADA Naoki)
Date: Sat, 24 Jul 2010 02:52:18 +0900
Subject: [Python-ideas] String interpolation again.
In-Reply-To: References: <92DC6EE0-84E9-4286-A4FD-E41798EC7483@masklinn.net> Message-ID:

On Sat, Jul 24, 2010 at 2:06 AM, Masklinn wrote:
> On 2010-07-23, at 16:16 , INADA Naoki wrote:
>> Basic problem is Python doesn't provide a way to print values of expression
>> into str like print prints to file.
>> 'foo{bar}baz'.format(bar=bar) is a bit messy.
>
> I don't understand what you find messy about either this or
> `'foo{}baz'.format(bar)`.

In my code, '{bar}' and 'bar=bar' seem too verbose. Your code is simple when the literal is short, but it is difficult to check which variable is inserted where when the string is long.

>> * s = print('foo', bar, 'baz', sep='', file=None)
>> * s = print('foo', bar, 'baz', sep='', file=str)
>> Extend print() function to return str instead of print to file.
>>
>> * s = str.print('foo', bar, 'baz', sep='')
>> Add staticmethod to str.
>
> These forms can trivially be implemented as utility functions. And I don't
> see them as improvements enough to propose them for inclusion in the
> stdlib, but YMMV.

I agree with you.

-- INADA Naoki

From dag.odenhall at gmail.com Fri Jul 23 21:26:45 2010
From: dag.odenhall at gmail.com (Dag Odenhall)
Date: Fri, 23 Jul 2010 21:26:45 +0200
Subject: [Python-ideas] String interpolation again.
In-Reply-To: References: <92DC6EE0-84E9-4286-A4FD-E41798EC7483@masklinn.net>
Message-ID: <1279913205.3423.5.camel@gumri>

> In my code, '{bar}' and 'bar=bar' seem too verbose. Your code is simple
> when the literal is short, but it is difficult to check which variable
> is inserted where when the string is long.

format = '...{short_descriptive_name}...'
text = format.format(
    short_descriptive_name=long_cumbersome_variable_name,
    other_name=other_name,
    ...)

Don't be afraid of more lines of code if it helps readability.

You could also build a context dict:

context = {}
context['name'] = complicated_code_to_get_name()
...
text = '...{name}...'.format(**context)

Of course personally I want a where/given clause, but that's a different thread. ;)

From eric at trueblade.com Fri Jul 23 21:38:56 2010
From: eric at trueblade.com (Eric Smith)
Date: Fri, 23 Jul 2010 15:38:56 -0400
Subject: [Python-ideas] String interpolation again.
In-Reply-To: <1279913205.3423.5.camel@gumri>
References: <92DC6EE0-84E9-4286-A4FD-E41798EC7483@masklinn.net> <1279913205.3423.5.camel@gumri>
Message-ID: <4C49EFD0.5030300@trueblade.com>

On 7/23/10 3:26 PM, Dag Odenhall wrote:
>> In my code, '{bar}' and 'bar=bar' seem too verbose. Your code is simple
>> when the literal is short, but it is difficult to check which variable
>> is inserted where when the string is long.
>
> format = '...{short_descriptive_name}...'
> text = format.format(
>     short_descriptive_name=long_cumbersome_variable_name,
>     other_name=other_name,
>     ...)
>
> Don't be afraid of more lines of code if it helps readability.
>
> You could also build a context dict:
>
> context = {}
> context['name'] = complicated_code_to_get_name()
> ...
> text = '...{name}...'.format(**context)

Or use:

text = '...{context.name}...'.format(context=context)

This is a great trick when "context" is "self".

-- Eric.
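[Editor's note: Eric's `{context.name}` trick, spelled out as a runnable snippet. The `Greeter` class and names are invented for illustration only.]

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        # Attribute access inside the replacement field: only `self`
        # needs to be passed to format().
        return 'Hello, {self.name}!'.format(self=self)


# The same pattern with an arbitrary "context" object:
context = Greeter('world')
text = 'Goodbye, {context.name}!'.format(context=context)

print(Greeter('Eric').greet())  # Hello, Eric!
print(text)                     # Goodbye, world!
```

Replacement fields can dot into any object handed to format(), so a single keyword argument can feed many fields.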
From raymond.hettinger at gmail.com Sat Jul 24 01:38:33 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 23 Jul 2010 16:38:33 -0700 Subject: [Python-ideas] 'where' statement in Python? In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp> <20100721002803.0d10def2@pitrou.net> <20100721004806.48858225@pitrou.net> Message-ID: On Jul 21, 2010, at 2:58 PM, Nick Coghlan wrote: > On Thu, Jul 22, 2010 at 1:51 AM, Bruce Leban wrote: >> I'm unconvinced of the value at this point but notwithstanding that let me >> toss in an alternative syntax: >> >> given: >> suite >> do: >> suite >> >> This executes the two suites in order with any variable bindings created by >> the first suite being local to the scope of the two suites. I think this is >> more readable than the trailing clause and is more flexible (you can put >> multiple statements in the second suite) and avoids the issue with anyone >> wanting the where clause added to arbitrary expressions. >> >> FWIW, in math it's more common to list givens at the top. Sounds like a LISP letrec. > > However, writing it that way has even less to offer over ordinary > local variables than the postfix given clause. > > I updated the draft PEP again, pointing out that if a decision had to > be made today, the PEP would almost certainly be rejected due to a > lack of compelling use cases. The bar for adding a new syntactic > construct is pretty high and PEP 3150 currently isn't even close to > reaching it (see PEP 343 for the kind of use cases that got the with > statement over that bar). > I concur. Raymond From greg at krypto.org Sat Jul 24 03:31:55 2010 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 23 Jul 2010 18:31:55 -0700 Subject: [Python-ideas] [Python-Dev] Set the namespace free! In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl> Message-ID: On Thu, Jul 22, 2010 at 9:24 AM, wrote: > I agree with the idea, but a far less radical change is needed to get the > desired result. 
> The basic idea is this: it should be possible to use any name as an
> identifier in the syntax, including names like 'while' and 'import'.
> But there is no need to mess up the entire language to allow this
> (either by quoting all the identifiers, perl-style, or by marking the
> keywords).

Yuck. Anyone who feels they need a variable named the same as a reserved word simply feels wrong and needs reeducation. New words are introduced very rarely and we do care about the ramifications when we do it.

What next? An optional way to support case insensitive names using a unicode character prefix?

-gps

> All that is needed is something like this:
>
> foo = 7
> :foo = 7              # exactly like foo=7
> :while = 3            # assigns 3 to variable 'while'
> globals()['while']=3  # current way to do this
>
> print element.:for    # from example below
> #
> # keyword parameters to a function call:
> #
> BuildUrlQuery( lang='en', item='monsoon', :class='normal') # ->
> "?lang=en&query=monsoon&class=normal"
>
> The generic keyword function call is a really nice language feature, but
> it's rather impaired by the need to avoid those names which happen to be
> keywords.
>
> The lack of this is most painful when you are auto-generating python code
> which forms a bridge to another language with its own namespace (as in the
> XML example). It's a pain when some of the names you might generate could
> conflict with python keywords. So, you end up using dicts and getattrs for
> everything and the code gets much less readable. With a simple escape like
> :name available, it's worthwhile to do everything with identifiers and
> generate the escape only as needed for these.
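[Editor's note: the "dicts" half of the workaround mentioned above is already runnable in today's Python — a keyword-colliding keyword argument can be smuggled in with dict unpacking. `build_url_query` is a hypothetical stand-in for the `BuildUrlQuery` in the quoted example, not a real API.]

```python
from urllib.parse import urlencode

def build_url_query(**params):
    # Collects arbitrary keyword arguments into a dict and renders them
    # as a URL query string.
    return '?' + urlencode(params)

# `class` cannot appear as a bare keyword argument (SyntaxError),
# but it passes through **-unpacking just fine:
q = build_url_query(lang='en', item='monsoon', **{'class': 'normal'})
print(q)  # ?lang=en&item=monsoon&class=normal
```

This is exactly the verbosity the `:class='normal'` escape is meant to remove.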
> One of the great strengths of python is the ability to form clean and
> comprehensive bridges to other languages and environments (thus, in many
> cases, avoiding the need to write programs in those other environments
> :-) ). This feature would fill a gap there.
>
> The python tcl/tk interface is a bit messed up since tcl/tk uses some
> names for options which conflict with python keywords, and so you need to
> add '_' to those names.
>
> There is a feature like this in VHDL: \name\ and \while\ are identifiers;
> the backslashes are not part of the name, but just quote it. In VHDL you
> can write identifiers like \22\, and \!This?is=Strange\ as well; since
> VHDL creates modules that have named ports, and those modules can
> interface to things generated by other environments, they needed a way to
> assign any name to a port.
>
> For python, I'm not sure it makes sense to allow identifiers that don't
> follow the basic rule "[A-Za-z_][A-Za-z_0-9]*" -- that could break some
> debugging tools which expect variable names to be well-formed -- but it
> would be useful to be able to use any well-formed name as an identifier,
> including those which happen to be keywords.
>
> I've suggested :name, which doesn't break old code, and doesn't require
> using any new punctuation. Syntax would not change, just the lexical
> definition of 'identifier'. If the intent is to allow arbitrary names
> (not just well-formed ones), then n'name' would work better (and is
> consistent with existing stuff).
>
> > Date: Thu, 22 Jul 2010 10:41:39 -0400
> > From: jnoller at gmail.com
> > To: bartosz-tarnowski at zlotniki.pl
> > CC: python-dev at python.org
> > Subject: Re: [Python-Dev] Set the namespace free!
> >
> > On Thu, Jul 22, 2010 at 10:04 AM, Bartosz Tarnowski wrote:
> > > Hello, guys.
> > >
> > > Python has more and more reserved words over time. It becomes quite
> > > annoying, since you cannot use variables and attributes of such names.
> > > Suppose I want to make an XML parser that reads a document and returns an
> > > object with attributes corresponding to XML element attributes:
> > >
> > >> elem = parse_xml("")
> > >> print elem.param
> > > boo
> > >
> > > What should I do then, when the attribute is a reserved word? I could use
> > > trailing underscore, but this is quite ugly and introduces ambiguity.
> > >
> > >> elem = parse_xml("")
> > >> print elem.for_ #?????
> > >> elem = parse_xml("")
> > >> print elem.for__ #?????
> > >
> > > My proposal: let's make a syntax change.
> > >
> > > Let all reserved words be preceded with some symbol, i.e. "!" (exclamation
> > > mark). This goes also for standard library global identifiers.
> > >
> > > !for boo in foo:
> > >     !if boo is !None:
> > >         !print(hoo)
> > >     !else:
> > >         !return !sorted(woo)
> > >
> > > This would allow the user to declare any identifier with any name:
> > >
> > > for = with(return) + try
> > >
> > > What do you think of it? It is a major change, but I think Python needs it.
> > >
> > > --
> > > haael
> >
> > I'm not a fan of this - I'd much prefer[1] that we use the exclamation
> > point to determine scope:
> >
> > foobar - local
> > !foobar - one up
> > !!foobar - higher than the last one
> > !!!foobar - even higher in scope
> >
> > We could do the inverse as well; if you append ! you can push variables
> > down in scope.
> >
> > Jesse
> >
> > [1] I am not serious.
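[Editor's note: the "dicts and getattrs" approach criticized earlier in this subthread does already cover the elem.for use case, at the cost of readability. `Elem` here is a hypothetical stand-in for whatever `parse_xml` would return.]

```python
class Elem:
    """Hypothetical parsed-XML node standing in for parse_xml()'s result."""
    pass

elem = Elem()
# Reserved words are fine as attribute *strings* -- only the dotted
# syntax rejects them:
setattr(elem, 'for', 'boo')
print(getattr(elem, 'for'))  # boo
```

So the proposals above are about syntax convenience, not about capability the object model lacks.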
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > http://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe: http://mail.python.org/mailman/options/python-dev/gsmith%40alumni.uwaterloo.ca
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/greg%40krypto.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cmjohnson.mailinglist at gmail.com Sat Jul 24 05:22:54 2010
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Fri, 23 Jul 2010 17:22:54 -1000
Subject: [Python-ideas] [Python-Dev] Set the namespace free!
In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl>
Message-ID:

On Fri, Jul 23, 2010 at 3:31 PM, Gregory P. Smith wrote:

> What next? An optional way to support case insensitive names using a
> unicode character prefix?

This suggests an innovative solution. Using the advanced phishing techniques of today's bleeding edge Mafioso hackers, we can use unicode lookalikes to stand in for keywords. So, for example, the motivating use case is the desire to write elem.for. Well, all we need to do is use the Greek omicron and write elem.fοr. See the difference? No? Exactly!

This is a good first step as a workaround in older versions of Python, but I think we can do better than this in future versions. Since it is so clearly useful and convenient with a wide variety of use cases to be able to use current keywords as variable names, I propose that we phase out the current set of keywords and replace them with vowel-shifted lookalikes: Cyrillic а for a, Cyrillic у for y, omicron for o, upside-down exclamation point for i, and the EU's estimated sign ℮ for e.
So, the current keywords:

and      elif     import   return
as       else     in       try
assert   except   is       while
break    finally  lambda   with
class    for      not      yield
continue from     or
def      global   pass
del      if       raise

would be replaced with these new keywords:

аnd      ℮l¡f     ¡mpοrt   r℮turn
аs       ℮ls℮     ¡n       trу
аss℮rt   ℮xc℮pt   ¡s       wh¡l℮
br℮аk    f¡nаllу  lаmbdа   w¡th
clаss    fοr      nοt      у¡℮ld
cοnt¡nu℮ frοm     οr
d℮f      glοbаl   pаss
d℮l      ¡f       rа¡s℮

Since this change is visually undetectable, I don't see why we can't just make it mandatory in 3.2 instead of going through the usual multi-release deprecation cycle. (I will admit that the transition might be quicker if someone could modify 2to3 and make 3_1to3_2 to help people convert their legacy scripts.)

Can we get fast-track BDFL approval on this one? Seems like a slam dunk to me.

Internationally yrs,

-- ?J

From bruce at leapyear.org Sat Jul 24 05:32:39 2010
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 23 Jul 2010 20:32:39 -0700
Subject: [Python-ideas] [Python-Dev] Set the namespace free!
In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl>
Message-ID:

> Yuck. Anyone who feels they need a variable named the same as a reserved
> word simply feels wrong and needs reeducation.

I hope you meant to apply that yuck to :for or !try as variable names and if so I agree. On the other hand a convention that class_="x" puts "&class=x" in the URL seems quite reasonable.

--- Bruce (via android)

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vano at mail.mipt.ru Sat Jul 24 09:16:45 2010
From: vano at mail.mipt.ru (Ivan Pozdeev)
Date: Sat, 24 Jul 2010 11:16:45 +0400
Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy
In-Reply-To: <1279740852.3222.38.camel@localhost.localdomain>
References: <1279740852.3222.38.camel@localhost.localdomain>
Message-ID: <636668799.20100724111645@mail.mipt.ru>

As a user you are talking about in the PEP, I'm quite satisfied with the current implementation.
The only change I see worth considering is merging IOError and OSError, possibly together with EnvironmentError. It's worth giving EnvironmentError a better name too.

I strongly oppose introducing 'fine-grained' exceptions. As an exception user, I care about:

1) a brief, easy to remember exception hierarchy with clearly distinguished domains
2) intuitive, natural-looking handling
3) a simple way to know which exceptions are thrown by a piece of code

IMO, the current implementation covers 2) excellently; 1) and 3) only have IOError/OSError ambiguities.

1)) OSError and IOError do intersect from the OS's point of view. The only reason why they're separate is because they 'feel' different. Historically, I/O and OSes came about in different ways: OSes are almost purely software, and I/O technologies are primarily hardware-based. So 'OSError' looks like something that is meaningful only to software logic, and 'IOError' like something that has a physical incarnation. OSError can be thought of as part of IOError and vice versa, so neither is likely to meet consensus to be made a subclass of the other. So we either:

1) declare the above 'feelings' retrograde and fuse the types. In this case, EnvironmentError will become redundant, so we'll have to fuse it in too;
2) just use EnvironmentError for all ambiguous cases and give it a better name. The current one is just wa-a-a-y t-o-o-o lo-o-o-ong for ubiquitous usage.

2)) The 'neat' handling

except OSError, e:
    if e.errno == EEXIST:
        act()
    else:
        raise

looks like the most natural solution to me, as all OSErrors are perceived as errors of the same type (errors caused by external factors and received as error codes from the OS standard API). Adding errno-specific subexceptions

1) makes some errnos privileged at the expense of others
2) introduces a list that isn't bound to any objective characteristic and is just an arbitrary "favorites" list
3) adds types that characterize single errors rather than error classes, which is an unnecessary level of complexity.
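[Editor's note: the 'neat' handling pattern just shown uses Python 2 except syntax; here it is as a self-contained, runnable Python 3 sketch. `ensure_dir` is an invented helper name, not from the thread.]

```python
import errno
import os
import tempfile

def ensure_dir(path):
    """Create `path`, treating 'already exists' as success --
    the errno-based idiom the message defends."""
    try:
        os.mkdir(path)
    except OSError as e:
        if e.errno == errno.EEXIST:
            pass          # the act(): condition we expected, swallow it
        else:
            raise         # anything else is a real error

base = tempfile.mkdtemp()
ensure_dir(base)                        # already exists: EEXIST swallowed
ensure_dir(os.path.join(base, 'sub'))   # actually created
print(os.path.isdir(os.path.join(base, 'sub')))  # True
```

Under the fine-grained proposal this same logic would instead catch a dedicated subclass, without consulting e.errno at all.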
3)) Builtin I/O operations throw IOError, OS function wrappers throw OSError, package functions throw package-specific ones - quite obvious where to expect which. There's EnvironmentError for any ambiguities; the only argument against it is the long and unusual name.

From dickinsm at gmail.com Sat Jul 24 10:33:36 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Sat, 24 Jul 2010 09:33:36 +0100
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: Message-ID:

Richard Jones (who isn't subscribed to this list, else he'd be sending this message himself) would like it pointed out that the "withhacks" module[1] does some things very similar to this (by using bytecode hackery, IIUC). See especially the xkwargs and xargs functions.

>>> from withhacks import xkwargs
>>> print(xkwargs.__doc__)
WithHack calling a function with extra keyword arguments.

    This WithHack captures any local variables created during execution
    of the block, then calls the given function using them as extra
    keyword arguments.

>>> def calculate(a,b):
...     return a * b
...
>>> with xkwargs(calculate,b=2) as result:
...     a = 5
...
>>> print result
10

-- Mark

From ncoghlan at gmail.com Sat Jul 24 11:58:23 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 Jul 2010 19:58:23 +1000
Subject: [Python-ideas] 'where' statement in Python?
In-Reply-To: References: <878w56zppe.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Sat, Jul 24, 2010 at 12:11 AM, Mark Dickinson wrote:
> On Tue, Jul 20, 2010 at 2:27 PM, Nick Coghlan wrote:
>> For the record, I am personally +1 on the idea (otherwise I wouldn't
>> have put so much thought into it over the years). It's just a *lot*
>> harder to define complete and consistent semantics for the concept
>> than people often realise.
>>
>> However, having the question come up twice within the last month
>> finally inspired me to write the current status of the topic down in a
>> deferred PEP: http://www.python.org/dev/peps/pep-3150/
>
> Is Python's grammar still LL(1) under this proposal?

It would need to stay that way, as the LL(1) restriction is a rule Guido has stated he is very reluctant to break. I *think* my second grammar change suggestion conforms with that requirement, but I haven't actually tried it out.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Jul 24 12:12:31 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 Jul 2010 20:12:31 +1000
Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy
In-Reply-To: <636668799.20100724111645@mail.mipt.ru>
References: <1279740852.3222.38.camel@localhost.localdomain> <636668799.20100724111645@mail.mipt.ru>
Message-ID:

On Sat, Jul 24, 2010 at 5:16 PM, Ivan Pozdeev wrote:
> Adding errno-specific subexceptions
> 1) makes some errnos privileged at the expense of others

Why is that a problem? Some errnos *are* more important than others - they're the ones that regularly appear on the right hand side of "errno == " checks.

> 2) introduces a list that isn't bound to any objective characteristic and is just an
> arbitrary "favorites" list

Why would you consider new classes that would be based on a survey of the errnos that developers actually check for in published code to be "arbitrary"?

> 3) adds types that characterize single errors rather than error
> classes which is an unnecessary level of complexity.

Any new IOError subclasses would likely still characterise classes of errors rather than single errno values.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From python at mrabarnett.plus.com Sat Jul 24 17:12:39 2010
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 24 Jul 2010 16:12:39 +0100
Subject: [Python-ideas] [Python-Dev] Set the namespace free!
In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl>
Message-ID: <4C4B02E7.2050806@mrabarnett.plus.com>

Carl M. Johnson wrote:
> On Fri, Jul 23, 2010 at 3:31 PM, Gregory P. Smith wrote:
>
>> What next? An optional way to support case insensitive names using a
>> unicode character prefix?
>
> This suggests an innovative solution. Using the advanced phishing
> techniques of today's bleeding edge Mafioso hackers, we can use
> unicode lookalikes to stand in for keywords. So, for example, the
> motivating use case is the desire to write elem.for. Well, all we need
> to do is use the Greek omicron and write elem.fοr. See the difference?
> No? Exactly!
>
> This is a good first step as a workaround in older versions of Python,
> but I think we can do better than this in future versions. Since it is
> so clearly useful and convenient with a wide variety of use cases to
> be able to use current keywords as variable names, I propose that we
> phase out the current set of keywords and replace them with
> vowel-shifted lookalikes: Cyrillic а for a, Cyrillic у for y, omicron
> for o, upside-down exclamation point for i, and the EU's estimated
> sign ℮ for e. So, the current keywords:
>
> and elif import return
> as else in try
> assert except is while
> break finally lambda with
> class for not yield
> continue from or
> def global pass
> del if raise
>
> would be replaced with these new keywords:
>
> аnd ℮l¡f ¡mpοrt r℮turn
> аs ℮ls℮ ¡n trу
> аss℮rt ℮xc℮pt ¡s wh¡l℮
> br℮аk f¡nаllу lаmbdа w¡th
> clаss fοr nοt у¡℮ld
> cοnt¡nu℮ frοm οr
> d℮f glοbаl pаss
> d℮l ¡f rа¡s℮
>
Instead of "¡" use "і" ("\u0456") and instead of "℮" use "е" ("\u0435").
> Since this change is visually undetectable, I don?t see why we can?t > just make it mandatory in 3.2 instead of going through the usual > multi-release deprecation cycle. (I will admit that the transition > might be quicker if someone could modify 2to3 and make 3_1to3_2 to > help people convert their legacy scripts.) > > Can we get fast-track BDFL approval on this one? Seems like a slam dunk to me. > > Internationally yrs, > From mwm-keyword-python.b4bdba at mired.org Sat Jul 24 17:22:52 2010 From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer) Date: Sat, 24 Jul 2010 11:22:52 -0400 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: <636668799.20100724111645@mail.mipt.ru> References: <1279740852.3222.38.camel@localhost.localdomain> <636668799.20100724111645@mail.mipt.ru> Message-ID: <20100724112252.6d15e3f4@bhuda.mired.org> On Sat, 24 Jul 2010 11:16:45 +0400 Ivan Pozdeev wrote: > As a user you are talking about in the PEP, i'm quite satisfied with the current implementation. As a user, I'm also quite satisfied with the current implementation, but think the proposal would be a major improvement in all areas. > I strongly oppose introducing 'fine-grained' exceptions. I think we see this in two different ways. The current version of the PEP seems to be somewhere between the two views. But in particular: > Adding errno-specific subexceptions I didn't see the PEP as calling for that. I saw it as dividing up the new combined IOError/OSError/EnvironmentError into finer-grained groups that make logical sense together. Yes, that division would depend on errno, and some of the groups on some platforms may well have only one errno - indeed, the first pass had a lot of those - but that's an implementation detail. > 1) makes some errnos privileged at the expense of others Well, some are privileged, in that they now belong to a finer grouping. 
I don't see how it's at the "expense" of the others: you can still catch the upper-level one and sort on errno, just like you do now. > 2) introduces a list that isn't bound to any objective characteristic and is just an > arbitrary "favorites" list That I would object to. I expect the PEP to include objective rules for each subgroup that can be used to determine if an errno belongs in that subgroup. > 3) adds types that characterize single errors rather than error > classes which is an unnecessary level of complexity. That actually sounds like the way most packages do things. > 3)) > Builtin I/O operations throw IOError, OS function wrappers throw > OSError, package functions throw package-specific ones - > quite obvious where to expect which. If only practice matched theory. Package functions throw package-specific exceptions, but they also call functions that can throw OS or IO errors. So in practice, I find I often have to deal with those as well as the package-specific ones. I can't even say the package authors are wrong not to catch and map those errors into package-specific errors. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From greg at krypto.org Sat Jul 24 18:00:57 2010 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 24 Jul 2010 09:00:57 -0700 Subject: [Python-ideas] [Python-Dev] Set the namespace free! In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl> Message-ID: On Fri, Jul 23, 2010 at 8:22 PM, Carl M. Johnson < cmjohnson.mailinglist at gmail.com> wrote: > On Fri, Jul 23, 2010 at 3:31 PM, Gregory P. Smith wrote: > > > What next? An optional way to support case insensitive names using a > > unicode character prefix? > > This suggests an innovative solution. Using the advanced phishing > techniques of today?s bleeding edge Mafioso hackers, we can use > unicode lookalikes to stand in for keywords. 
So, for example, the > motivating use case is the desire to write elem.for. Well, all we need > to do is use the Greek omicron and write elem.f?r. See the difference? > No? Exactly! > > This is a good first step as a workaround in older versions of Python, > but I think we can do better than this in future version. Since it is > so clearly useful and convenient with a wide variety of use cases to > be able to use current keywords as variable names, I propose that we > phase out the current set of keywords and replace them with > vowel-shifted lookalikes: Cyrillic ? for a, Cyrillic ? for y, omicron > for o, upside-down exclamation point for i, and the EU?s estimated > sign ? for e. So, the current keywords: > > and elif import return > as else in try > assert except is while > break finally lambda with > class for not yield > continue from or > def global pass > del if raise > > would be replaced with these new keywords: > > ?nd ?l?f ?mp?rt r?turn > ?s ?ls? ?n tr? > ?ss?rt ?xc?pt ?s wh?l? > br??k f?n?ll? l?mbd? w?th > cl?ss f?r n?t ???ld > c?nt?nu? fr?m ?r > d?f gl?b?l p?ss > d?l ?f r??s? > > Since this change is visually undetectable, I don?t see why we can?t > just make it mandatory in 3.2 instead of going through the usual > multi-release deprecation cycle. (I will admit that the transition > might be quicker if someone could modify 2to3 and make 3_1to3_2 to > help people convert their legacy scripts.) > > Can we get fast-track BDFL approval on this one? Seems like a slam dunk to > me. > > Internationally yrs, > > -- ?J > > Where's the craigslist "best of" button for python-ideas posts when we need it? ;) -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From lvh at laurensvh.be Sat Jul 24 18:35:25 2010 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Sat, 24 Jul 2010 18:35:25 +0200 Subject: [Python-ideas] Mont-E technical details, PEP 380, Monocle Message-ID: Hi! 
The people who attended EuroPython might remember my lightning talk about Mont-E. I wanted to go into more technical detail, but as I said in the talk it was very early days, and I was unable to get a reliable enough internet connection near the talk to really know what was done and what wasn't in detail (and I didn't want to lie about features that did and did not work). For those who haven't: it's a set of patches on top of Py3k that tries to introduce some of the ideas of E (a programming language) into Python. I just read Guido's python-dev post and I was listening in on the discussion after Raymond's talk, and I agree with pretty much everything that was said in both cases. The most important part is PEP 380 expressing a great idea; which is pretty much done and ready for real use, I think the best proof of that is that people have already tried to solve it and the solutions people come up with are quite similar (except that one is specifically tuned towards subgenerators and the other isn't): Twisted's returnValue and Mont-E's return-from-a-generator. The latter looks like this:

    def port_scan(hostname):
        ip = defer gethostbyname(hostname)
        ports = set()
        for port in range(65536):
            if (defer is_port_connectable(ip, port)):
                ports.add(port)
        return ports

(Except that uses E-style promises + resolvers, instead of Twisted and Monocle style deferreds.) I'm not suggesting Mont-E's syntax makes it in favor of anything else, like I mentioned in the talk we really just wanted an excuse to mess with the grammar (the official excuse is "we wanted to see what we could do given the power to do anything including mangling half the stdlib and grammar" ;-)). port_scan(spam) would return a promise, not a generator. Like @inlineCallbacks, the goal of yield/defer is to basically say "oh, okay, you can do other stuff now, just call me back as soon as this thing is done". The value of the promise is the returned value.
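[Editor's note: the yield/defer shape described above can be approximated in plain Python with a small trampoline. This is purely an illustrative sketch, not Mont-E, Twisted, or stdlib code — `Promise`, `coroutine`, and the canned network functions below are all made up for the example, and the "promises" resolve synchronously so the sketch stays runnable.]

```python
class Promise:
    """Toy stand-in for a Mont-E promise / Twisted Deferred.

    Here the value is available immediately; in real async code it
    would be resolved later by an event loop.
    """
    def __init__(self, value):
        self.value = value


def coroutine(func):
    """Drive a generator, resolving each yielded Promise before resuming it.

    The generator's return value (carried by StopIteration, as in PEP 380)
    becomes the overall result.
    """
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        result = None
        try:
            while True:
                promise = gen.send(result)   # generator "defers" on a Promise
                result = promise.value       # "wait" for it, then resume
        except StopIteration as stop:
            return stop.value                # the generator's return value
    return wrapper


# Canned answers so the sketch runs without any real networking.
def gethostbyname(hostname):
    return Promise("192.0.2.1")

def is_port_connectable(ip, port):
    return Promise(port in (22, 80))


@coroutine
def port_scan(hostname):
    ip = yield gethostbyname(hostname)
    ports = set()
    for port in (22, 80, 8080):              # a small range, for the demo
        if (yield is_port_connectable(ip, port)):
            ports.add(port)
    return ports
```

Calling `port_scan("example.com")` returns the final set directly; with a real event loop the wrapper would instead return a promise that fires with that value.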
I think in its core this is analogous to PEP 380, except that this takes it a step further and applies it to callbacks/deferreds (well, callbacks/promises+resolvers, technically, but I say deferreds because it's close enough and people are familiar with them). I think stuff like Monocle is a great idea because it introduces portability for async code. I like Twisted and I will continue writing Twisted stuff regardless, but I would much rather write asyncOAuth which everyone can use than txOAuth which only Twisted users could use. I think Monocle shows how these two things are related. How do people feel about a portable-async-code story for the stdlib? Mont-E tries to do this in ways that will be... ahem... let's say "hard" to get into the stdlib: by introducing promises, resolvers, event loops, interfaces... I also agree that CSP is a good model; I prefer the actor model for most of the stuff I end up writing; but there are definitely cases (for my stuff this tends to be about numerical computation) where CSP rocks. But the stdlib already has multiprocessing, maybe it's time we started looking at other stuff :-) thanks in advance for your input Laurens From vano at mail.mipt.ru Sat Jul 24 18:37:42 2010 From: vano at mail.mipt.ru (Ivan Pozdeev) Date: Sat, 24 Jul 2010 20:37:42 +0400 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: References: <1279740852.3222.38.camel@localhost.localdomain> <636668799.20100724111645@mail.mipt.ru> Message-ID: <91977653.20100724203742@mail.mipt.ru> Hello, Nick. On 24 July 2010, at 14:12:31, you wrote: > Why is that a problem? Some errnos *are* more important than others - > they're the ones that regularly appear on the right hand side of "errno > == " checks. > Why would you consider new classes that would be based on a survey of > the errnos that developers actually check for in published code to be > "arbitrary"?
Since the list would be the sole opinion of some people who take part in the survey, you'll be constantly faced with demands from other people who want to have "shortcuts" for something else too. And you won't be able to explain why your choice is preferable to theirs. > Any new IOError subclasses would likely still characterise classes of > errors rather than single errno values. The ones I see in the PEP correspond to either one or a few errnos. If the problem is that you don't like the 'cryptic' errno mnemonics, that's a reason to change them instead. The current ones are just the standard POSIX names the errors are long and widely known under. -- Regards, Ivan mailto:vano at mail.mipt.ru From eric at trueblade.com Sat Jul 24 18:49:26 2010 From: eric at trueblade.com (Eric Smith) Date: Sat, 24 Jul 2010 12:49:26 -0400 Subject: [Python-ideas] [Python-Dev] Set the namespace free! In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl> Message-ID: <4C4B1996.5010306@trueblade.com> On 7/23/10 11:22 PM, Carl M. Johnson wrote: > On Fri, Jul 23, 2010 at 3:31 PM, Gregory P. Smith wrote: > >> What next? An optional way to support case insensitive names using a >> unicode character prefix? > > This suggests an innovative solution. Using the advanced phishing > techniques of today's bleeding edge Mafioso hackers, we can use > unicode lookalikes to stand in for keywords. So, for example, the > motivating use case is the desire to write elem.for. Well, all we need > to do is use the Greek omicron and write elem.fοr. See the difference? > No? Exactly! > > This is a good first step as a workaround in older versions of Python, > but I think we can do better than this in future versions. Since it is > so clearly useful and convenient with a wide variety of use cases to > be able to use current keywords as variable names, I propose that we > phase out the current set of keywords and replace them with > vowel-shifted lookalikes: Cyrillic а for a, Cyrillic у
for y, omicron > for o, upside-down exclamation point for i, and the EU's estimated > sign ℮ for e. So, the current keywords:
>
> and       elif      import    return
> as        else      in        try
> assert    except    is        while
> break     finally   lambda    with
> class     for       not       yield
> continue  from      or
> def       global    pass
> del       if        raise
>
> would be replaced with these new keywords:
>
> аnd       ℮l¡f      ¡mpοrt    r℮turn
> аs        ℮ls℮      ¡n        trу
> аss℮rt    ℮xc℮pt    ¡s        wh¡l℮
> br℮аk     f¡nаllу   lаmbdа    w¡th
> clаss     fοr       nοt       у¡℮ld
> cοnt¡nu℮  frοm      οr
> d℮f       glοbаl    pаss
> d℮l       ¡f        rа¡s℮
>
> Since this change is visually undetectable, I don't see why we can't > just make it mandatory in 3.2 instead of going through the usual > multi-release deprecation cycle. (I will admit that the transition > might be quicker if someone could modify 2to3 and make 3_1to3_2 to > help people convert their legacy scripts.) > > Can we get fast-track BDFL approval on this one? Seems like a slam dunk to me. > > Internationally yrs, > > -- ?J CJ for the win! Bravo. -- Eric. From greg at krypto.org Sat Jul 24 23:31:42 2010 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 24 Jul 2010 14:31:42 -0700 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: <1279740852.3222.38.camel@localhost.localdomain> References: <1279740852.3222.38.camel@localhost.localdomain> Message-ID: On Wed, Jul 21, 2010 at 12:34 PM, Antoine Pitrou wrote: > > Hello, > > I would like to propose the following PEP for feedback and review. > Permanent link to up-to-date version with proper HTML formatting: > http://www.python.org/dev/peps/pep-3151/ > > Thank you, > > Antoine. > > ... ...
>
> New exception classes
> ---------------------
>
> The following tentative list of subclasses, along with a description and the list of errnos mapped to them, is submitted to discussion:
>
> * ``FileAlreadyExists``: trying to create a file or directory which already exists (EEXIST)
>
> * ``FileNotFound``: for all circumstances where a file or directory is requested but doesn't exist (ENOENT)
>
> * ``IsADirectory``: file-level operation (open(), os.remove()...) requested on a directory (EISDIR)
>
> * ``NotADirectory``: directory-level operation requested on something else (ENOTDIR)
>
> * ``PermissionDenied``: trying to run an operation without the adequate access rights - for example filesystem permissions (EACCES, optionally EPERM)
>
> * ``BlockingIOError``: an operation would block on an object (e.g. socket) set for non-blocking operation (EAGAIN, EALREADY, EWOULDBLOCK, EINPROGRESS); this is the existing ``io.BlockingIOError`` with an extended role
>
> * ``BadFileDescriptor``: operation on an invalid file descriptor (EBADF); the default error message could point out that most causes are that an existing file descriptor has been closed
>
> * ``ConnectionAborted``: connection attempt aborted by peer (ECONNABORTED)
>
> * ``ConnectionRefused``: connection attempt refused by peer (ECONNREFUSED)
>
> * ``ConnectionReset``: connection reset by peer (ECONNRESET)
>
> * ``TimeoutError``: connection timed out (ETIMEDOUT); this could be re-cast as a generic timeout exception, useful for other types of timeout (for example in Lock.acquire())
>
> This list assumes step 1 is accepted in full; the exception classes described above would all derive from the now unified exception type OSError. It will need reworking if a partial version of step 1 is accepted instead (again, see appendix A for the current distribution of errnos and exception types).
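[Editor's note: the errno-to-subclass dispatch sketched in the quoted list above is easy to prototype. The class names below mirror the PEP's tentative list but are purely illustrative — they are not existing builtins — and the mapping covers only a handful of the errnos under discussion.]

```python
import errno

# Hypothetical finer-grained subclasses, per the PEP's tentative list.
class FileAlreadyExists(OSError): pass
class FileNotFound(OSError): pass
class IsADirectory(OSError): pass
class NotADirectory(OSError): pass
class PermissionDenied(OSError): pass

# errno value -> proposed subclass; anything unmapped stays a plain OSError.
_ERRNO_MAP = {
    errno.EEXIST: FileAlreadyExists,
    errno.ENOENT: FileNotFound,
    errno.EISDIR: IsADirectory,
    errno.ENOTDIR: NotADirectory,
    errno.EACCES: PermissionDenied,
    errno.EPERM: PermissionDenied,
}

def error_from_errno(err, message):
    """Pick the finer-grained class for an errno, falling back to OSError."""
    cls = _ERRNO_MAP.get(err, OSError)
    return cls(err, message)  # two-arg form sets .errno and .strerror
```

This is the "subroutine choosing an exception type based on the errno value" that the PEP suggests could be exposed for pure-Python callers; C extensions would get the same behavior through `PyErr_SetFromErrno()`.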
> > > Exception attributes > -------------------- > > In order to preserve *useful compatibility*, these subclasses should still > set adequate values for the various exception attributes defined on the > superclass (for example ``errno``, ``filename``, and optionally > ``winerror``). > > Implementation > -------------- > > Since it is proposed that the subclasses are raised based purely on the > value of ``errno``, little or no changes should be required in extension > modules (either standard or third-party). As long as they use the > ``PyErr_SetFromErrno()`` family of functions (or the > ``PyErr_SetFromWindowsErr()`` family of functions under Windows), they > should automatically benefit from the new, finer-grained exception classes. > > Library modules written in Python, though, will have to be adapted where > they currently use the following idiom (seen in ``Lib/tempfile.py``):: > > raise IOError(_errno.EEXIST, "No usable temporary file name found") > > Fortunately, such Python code is quite rare since raising OSError or > IOError > with an errno value normally happens when interfacing with system calls, > which is usually done in C extensions. > > If there is popular demand, the subroutine choosing an exception type based > on the errno value could be exposed for use in pure Python. > > > Possible objections > =================== > > Namespace pollution > ------------------- > > Making the exception hierarchy finer-grained makes the root (or builtins) > namespace larger. 
This is to be moderated, however, as: > > * only a handful of additional classes are proposed; > > * while standard exception types live in the root namespace, they are > visually distinguished by the fact that they use the CamelCase convention, > while almost all other builtins use lowercase naming (except True, False, > None, Ellipsis and NotImplemented) > > An alternative would be to provide a separate module containing the > finer-grained exceptions, but that would defeat the purpose of > encouraging careful code over careless code, since the user would first > have to import the new module instead of using names already accessible. > +1 on this whole PEP! The EnvironmentError hierarchy and common errno test code have bothered me for a while. While I think the namespace pollution concern is valid, I would suggest adding "Error" to the end of all of the names (your initial proposal only says "Error" on the end of one of them) as that is consistent with the bulk of the existing standard exceptions and warnings. They are unlikely to conflict with anything other than exceptions people have already defined themselves in any existing code (which could likely be refactored out after we officially define these). > > Earlier discussion > ================== > > While this is the first time such a formal proposal has been made, the idea > has received informal support in the past [1]_; both the introduction > of finer-grained exception classes and the coalescing of OSError and > IOError. > > The removal of WindowsError alone has been discussed and rejected > as part of another PEP [2]_, but there seemed to be a consensus that the > distinction with OSError wasn't meaningful. This supports at least its > aliasing with OSError. > > > Moratorium > ========== > > The moratorium in effect on language builtins means this PEP has little > chance to be accepted for Python 3.2.
>
> Possible alternative
> ====================
>
> Pattern matching
> ----------------
>
> Another possibility would be to introduce an advanced pattern matching syntax when catching exceptions. For example::
>
>     try:
>         os.remove(filename)
>     except OSError as e if e.errno == errno.ENOENT:
>         pass
>
> Several problems with this proposal:
>
> * it introduces new syntax, which is perceived by the author to be a heavier change compared to reworking the exception hierarchy
> * it doesn't decrease typing effort significantly
> * it doesn't relieve the programmer from the burden of having to remember errno mnemonics

ugh. no. :) That only works well for single exceptions and encourages less explicit exception types. Exceptions are a class hierarchy; we should encourage its use rather than encouraging magic type-specific attributes with conditionals. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Jul 24 23:47:21 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 24 Jul 2010 17:47:21 -0400 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: <91977653.20100724203742@mail.mipt.ru> References: <1279740852.3222.38.camel@localhost.localdomain> <636668799.20100724111645@mail.mipt.ru> <91977653.20100724203742@mail.mipt.ru> Message-ID: 2010/7/24 Ivan Pozdeev : .. >> Why would you consider new classes that would be based on a survey of >> the errnos that developers actually check for in published code to be >> "arbitrary"? > > Since the list would be a sole opinion of some people who take part in > the survey, you'll be constantly faced with demands of other people > who want to have "shortcuts" for something else too. I think you misunderstood the survey methodology. It was not a survey of developers; instead, large bodies of code were examined. There is nothing arbitrary or subjective in this approach. FWIW, I am +1 on the PEP.
From greg.ewing at canterbury.ac.nz Sun Jul 25 03:08:50 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 Jul 2010 13:08:50 +1200 Subject: [Python-ideas] [Python-Dev] Set the namespace free! In-Reply-To: <4C4B02E7.2050806@mrabarnett.plus.com> References: <4C484FD0.2080803@zlotniki.pl> <4C4B02E7.2050806@mrabarnett.plus.com> Message-ID: <4C4B8EA2.1090407@canterbury.ac.nz> > Carl M. Johnson wrote:
>> аnd       ℮l¡f      ¡mpοrt    r℮turn
>> аs        ℮ls℮      ¡n        trу
>> аss℮rt    ℮xc℮pt    ¡s        wh¡l℮
>> br℮аk     f¡nаllу   lаmbdа    w¡th
>> clаss     fοr       nοt       у¡℮ld
>> cοnt¡nu℮  frοm      οr
>> d℮f       glοbаl    pаss
>> d℮l       ¡f        rа¡s℮
>>
>> Since this change is visually undetectable,
If you saw the funky way those all appear to me in Thunderbird, you wouldn't have written that sentence. :-) -- Greg From greg.ewing at canterbury.ac.nz Sun Jul 25 03:20:34 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 25 Jul 2010 13:20:34 +1200 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: <20100724112252.6d15e3f4@bhuda.mired.org> References: <1279740852.3222.38.camel@localhost.localdomain> <636668799.20100724111645@mail.mipt.ru> <20100724112252.6d15e3f4@bhuda.mired.org> Message-ID: <4C4B9162.4030305@canterbury.ac.nz> Mike Meyer wrote: > I can't even say the > package authors are wrong not to catch and map those errors into > package-specific errors. I'd say they're not wrong at all. The exception hierarchy should be based on the semantics of the exceptions, not which package they happen to originate from or pass through. -- Greg From ncoghlan at gmail.com Sun Jul 25 04:58:45 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Jul 2010 12:58:45 +1000 Subject: [Python-ideas] [Python-Dev] Set the namespace free! In-Reply-To: <4C4B8EA2.1090407@canterbury.ac.nz> References: <4C484FD0.2080803@zlotniki.pl> <4C4B02E7.2050806@mrabarnett.plus.com> <4C4B8EA2.1090407@canterbury.ac.nz> Message-ID: On Sun, Jul 25, 2010 at 11:08 AM, Greg Ewing wrote: >> Carl M.
Johnson wrote:
>
>>> аnd       ℮l¡f      ¡mpοrt    r℮turn
>>> аs        ℮ls℮      ¡n        trу
>>> аss℮rt    ℮xc℮pt    ¡s        wh¡l℮
>>> br℮аk     f¡nаllу   lаmbdа    w¡th
>>> clаss     fοr       nοt       у¡℮ld
>>> cοnt¡nu℮  frοm      οr
>>> d℮f       glοbаl    pаss
>>> d℮l       ¡f        rа¡s℮
>>>
>>> Since this change is visually undetectable,
>
> If you saw the funky way those all appear to me in Thunderbird,
> you wouldn't have written that sentence. :-)
That's the case for me with Gmail-in-Firefox, but with MRAB's substitutions it becomes genuinely indistinguishable:

?nd       ?l?f      ?mp?rt    r?turn
?s        ?ls?      ?n        tr?
?ss?rt    ?xc?pt    ?s        wh?l?
br??k     f?n?ll?   l?mbd?    w?th
cl?ss     f?r       n?t       ???ld
c?nt?nu?  fr?m      ?r
d?f       gl?b?l    p?ss
d?l       ?f        r??s?

Fonts with incomplete glyph sets will still look wrong, of course. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Jul 25 05:12:24 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 25 Jul 2010 13:12:24 +1000 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: <91977653.20100724203742@mail.mipt.ru> References: <1279740852.3222.38.camel@localhost.localdomain> <636668799.20100724111645@mail.mipt.ru> <91977653.20100724203742@mail.mipt.ru> Message-ID: 2010/7/25 Ivan Pozdeev : > Hello, Nick. > > On 24 July 2010, at 14:12:31, you wrote: > >> Why is that a problem? Some errnos *are* more important than others - >> they're the ones that regularly appear on the right hand side of "errno >> == " checks. > >> Why would you consider new classes that would be based on a survey of >> the errnos that developers actually check for in published code to be >> "arbitrary"?
> > Since the list would be a sole opinion of some people who take part in > the survey, you'll be constantly faced with demands of other people > who want to have "shortcuts" for something else too. > And you won't be able to explain why your choice is more preferable than theirs. As Alexander pointed out, the word survey has multiple meanings. One of those is the subjective approach you're objecting to (ask a bunch of people what they think), another is the more objective approach actually documented in the PEP (go and look at what is out there, as in the sense of "land survey"). Think "code survey" rather than "developer survey". (A scripted tool to gather statistics on exception handling in this space from Google code search results and direct scans of local Python code bases would actually be helpful, even if it wasn't 100% accurate) There is still a subjective step in whittling the code survey results down into a revised class heirarchy, but that's: - why it's a separate step in the PEP, independent of the consolidation step - why the PEP doesn't include a concrete proposal as yet - one of the main goals of discussion of the PEP here and across the wider Python community Language design is inherently a matter of judgment. Based on the way it has played out in practice (frequently requiring explicit errno checks and catching of multiple exception types in order to write correct code), we now think the previous judgment in relation to the EnvironmentError exception hierarchy is demonstrably flawed. That doesn't mean we throw our hands up in the air and give up - it means we knuckle down and try to come up with something better, based on what we can learn from what has gone before. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From alex.gaynor at gmail.com Sun Jul 25 20:15:21 2010 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sun, 25 Jul 2010 13:15:21 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ Message-ID: Recently I've been wondering why __contains__ casts all of its return values to be boolean. Specifically I'd like to propose that __contains__'s return values be passed directly back as the result of the `in` operation. As a result I'd further propose the introduction of __not_contains__, which would implement the `not in` operator. The primary use case for this is something like an expression recorder. For example in SQLAlchemy one can do: User.id == 3, but not User.id in SQLList([1, 2, 3]), because `in` always returns a bool. __not_contains__ is needed to be the analog of this, as it cannot merely be a negation of __contains__ when it's returning a non-bool result. There should be no backwards compatibility issues with making __contains__ return non-bools, unless there is code like:

    x = y in foo
    assert type(x) is bool

However, for __not_contains__ I'd propose the default implementation be:

    def __not_contains__(self, val):
        x = val in self
        if type(x) is not bool:
            raise TypeError("%s returned a non-boolean value from "
                            "__contains__ and didn't provide an "
                            "implementation of __not_contains__"
                            % type(self).__name__)
        return not x

This is not perfect (and it's at odds with the fact that __ne__ doesn't return not self == other), but it seems to allow both the desired flexibility and backwards compatibility. I'm not sure if this is something that'd be covered by the language moratorium, but if not I can try putting together a patch for this. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law."
-- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me From raymond.hettinger at gmail.com Sun Jul 25 20:48:23 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 25 Jul 2010 11:48:23 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: Message-ID: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> On Jul 25, 2010, at 11:15 AM, Alex Gaynor wrote: > Recently I've been wondering why __contains__ casts all of its > return values to be boolean. Specifically I'd like to propose that > __contains__'s return values be passed directly back as the result of > the `in` operation. x = y in z # where x is a non boolean. Yuck. One of the beautiful aspects of __contains__ is that its simple signature allows it to be used polymorphically throughout the whole language. It would be a shame to throw away this virtue so that you can have an operator version of something that should really be a method (like find() for example). -1 on the proposal because it makes the language harder to grok while conferring only a dubious benefit (replacing well-named methods with a non-descriptive use of an operator). There is no "natural" interpretation of an in-operator returning a non-boolean. If the above snippet assigns "foo" to x, what does that mean? If it assigns -10, what does that mean? Language design is about associating meanings (semantics) with syntax. ISTM, this would be poor design. Raymond From bruce at leapyear.org Sun Jul 25 23:20:19 2010 From: bruce at leapyear.org (Bruce Leban) Date: Sun, 25 Jul 2010 14:20:19 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: Message-ID: Let me see if I understand this:

    False in [False]

Returns True now but would return False with your change. --- Bruce (via android) On Jul 25, 2010 11:15 AM, "Alex Gaynor" wrote: Recently I've been wondering why __contains__ casts all of its return values to be boolean.
Specifically I'd like to propose that __contains__'s return values be passed directly back as the result of the `in` operation. As a result I'd further propose the introduction of __not_contains__, which is the `not in` operator. The primary usecase for this is something like an expression recorder. For example in SQLAlchemy one can do: User.id == 3, but not User.id in SQLList([1, 2, 3]), because it returns a bool always. __not_contains__ is needed to be the analog of this, as it cannot be merely be a negation of __contains__ when it's returning a non-bool result. There should be no backwards compatibility issues to making __contains__ return non-bools, unless there is code like: x = y in foo assert type(x) is bool However, for __not_contains__ I'd propose the default implementation be: def __not_contains__(self, val): x = val in self if type(x) is not bool: raise TypeError("%s returned a non-boolean value from __contains__ and didn't provide an implementation of __not_contains__") return not x This is not perfect (and it's at odds with the fact that __ne__ doesn't return not self == other), but it seems to allow both the desired flexibility and backwards compatibility. I'm not sure if this is something that'd be covered by the language moratorium, but if not I can try putting together a patch for this. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." -- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me _______________________________________________ Python-ideas mailing list Python-ideas at python.org http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Jul 25 23:28:19 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 26 Jul 2010 07:28:19 +1000 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: Message-ID: On Mon, Jul 26, 2010 at 7:20 AM, Bruce Leban wrote: > Let me see if I understand this: > > False in [False] > > Returns True now but would return False with your change. No, that would be unaffected, since builtin containers would retain their current semantics. Raymond's objections apply though - using syntax rather than a method makes the language noticeably more complicated for minimal gain. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From jackdied at gmail.com Sun Jul 25 23:32:21 2010 From: jackdied at gmail.com (Jack Diederich) Date: Sun, 25 Jul 2010 17:32:21 -0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: Message-ID: On Sun, Jul 25, 2010 at 5:20 PM, Bruce Leban wrote: > Let me see if I understand this: > > False in [False] > > Returns True now but would return False with your change. Bigtime. Official side-effects are neat for hacks but bad for maintainable code. You don't know pain until another developer complains that you refactored user.is_admin() to no longer return the user's object (for the record that happened in perl, but it could in python too). Boolean test operations should return bools for the same reason that in-place operations should return None. 
-Jack From fuzzyman at voidspace.org.uk Sun Jul 25 23:32:59 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 25 Jul 2010 22:32:59 +0100 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> Message-ID: On 25 July 2010 19:48, Raymond Hettinger wrote: > > On Jul 25, 2010, at 11:15 AM, Alex Gaynor wrote: > > > Recently I've been wondering why __contains__ casts all of it's > > returns to be boolean values. Specifically I'd like to propose that > > __contains__'s return values be passed directly back as the result of > > the `in` operation. > > x = y in z # where x is a non boolean. > > Yuck. > > How is it any worse than: x = y > z # where x is a non boolean And all the other operators that already do this? Michael > One of the beautiful aspects of __contains__ is that its simply signature > allows it to be used polymorphically throughout the whole language. > It would be ashamed to throw-away this virtue so that you can > have a operator version of something that should really be a method > (like find() for example). > > -1 on the proposal because it makes the language harder to grok > while conferring only a dubious benefit (replacing well named > methods with a non-descriptive use of an operator). > > There is no "natural" interpretation of an in-operator returning > a non-boolean. If the above snippet assigns "foo" to x, what > does that mean? If it assigns -10, what does that mean? > Language design is about associating meanings (semantics) > with syntax. ISTM, this would be poor design. > > > Raymond > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fuzzyman at voidspace.org.uk Sun Jul 25 23:36:34 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 25 Jul 2010 22:36:34 +0100 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: Message-ID: On 25 July 2010 22:32, Jack Diederich wrote: > On Sun, Jul 25, 2010 at 5:20 PM, Bruce Leban wrote: > > Let me see if I understand this: > > > > False in [False] > > > > Returns True now but would return False with your change. > > Bigtime. Official side-effects are neat for hacks but bad for > maintainable code. You don't know pain until another developer > complains that you refactored user.is_admin() to no longer return the > user's object (for the record that happened in perl, but it could in > python too). Boolean test operations should return bools Most of them don't and this can be useful. Why should __contains__ enforce it when other boolean operations don't? Inconsistency is also bad. > for the same > reason that in-place operations should return None. > What do you mean by "in-place operations should return None"? For mutable objects __iadd__ and friends should return self. For immutable ones they return the new value. Probably I misunderstand what you mean. Michael > > -Jack > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phd.pp.ru Sun Jul 25 23:59:12 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Mon, 26 Jul 2010 01:59:12 +0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: Message-ID: <20100725215912.GA23112@phd.pp.ru> On Sun, Jul 25, 2010 at 10:36:34PM +0100, Michael Foord wrote: > > in-place operations should return None. > > What do you mean by "in-place operations should return None"? 
list.sort() sorts in place and returns None. Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From fuzzyman at voidspace.org.uk Mon Jul 26 00:34:23 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 25 Jul 2010 23:34:23 +0100 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <20100725215912.GA23112@phd.pp.ru> References: <20100725215912.GA23112@phd.pp.ru> Message-ID: On 25 July 2010 22:59, Oleg Broytman wrote: > On Sun, Jul 25, 2010 at 10:36:34PM +0100, Michael Foord wrote: > > > in-place operations should return None. > > > > What do you mean by "in-place operations should return None"? > > list.sort() sorts in place and returns None. > > Ah thanks, although I don't think this is analogous at all. Michael > Oleg. > -- > Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru > Programmers don't die, they just GOSUB without RETURN. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Jul 26 02:01:11 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 25 Jul 2010 17:01:11 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> Message-ID: <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> On Jul 25, 2010, at 2:32 PM, Michael Foord wrote: > > > x = y in z # where x is a non boolean. > > Yuck. > > > > How is it any worse than: > > > x = y > z # where x is a non boolean > > And all the other operators that already do this? Terrible sales technique: "how is this any worse than ..." 
;-) Other operations such as rich comparisons have complicated our lives but had sufficient offsetting benefits than made them more bearable. Rich comparisons cause no end of trouble but at least they allow the numeric folks to implement some well studied behaviors than have proven benefits in their domain. In contrast, this proposal offers almost zero benefit to offset the pain it will cause. The OP didn't even offer a compelling use case or a single piece of code that wouldn't be better written with a normal method. No existing code expects "in" to return a non-boolean. A lot of code for containers or that uses containers implicitly expects simple invariants to hold: for x in container: assert x in container Raymond P.S. With rich comparisons, we've lost basics assumptions like equality operations being reflexsive, symmetric, and transitive. We should be cautioned by that experience and only go down that path again if there is a darned good reason. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.gaynor at gmail.com Mon Jul 26 04:20:59 2010 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sun, 25 Jul 2010 21:20:59 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> Message-ID: On Sun, Jul 25, 2010 at 7:01 PM, Raymond Hettinger wrote: > > On Jul 25, 2010, at 2:32 PM, Michael Foord wrote: >> >> >> x = y in z ? ? ? ? ?# where x is a non boolean. >> >> Yuck. >> > > > How is it any worse than: > > > ?x = y > z # where x is a non boolean > > And all the other operators that already do this? > > > Terrible sales technique: ?"how is this any worse than ..." ;-) > Other operations such as rich comparisons have > complicated our lives but had sufficient offsetting benefits > than made them more bearable. 
?Rich comparisons cause > no end of trouble but at least they allow the numeric folks > to implement some well studied behaviors than have proven > benefits in their domain. > In contrast, this proposal offers almost zero benefit to offset > the pain it will cause. ?The OP didn't even offer a compelling > use case or a single piece of code that wouldn't be better > written with a normal method. > No existing code expects "in" to return a non-boolean. > A lot of code for containers or that uses containers implicitly > expects simple invariants to hold: > ?? for x in container: > ?? ? ? assert x in container > > > Raymond > > P.S. ?With rich comparisons, we've lost basics assumptions > like equality operations being reflexsive, symmetric, and > transitive. ? We should be cautioned by that experience > and only go down that path again if there is a darned good reason. Fundamentally the argument in favor of it is the same as for the other comparison operators: you want to do symbolic manipulation using the "normal" syntax, as a DSL. My example is that of a SQL expression builder: SQLAlchemy uses User.id == 3 to create a clause where the ID is 3, but for "id in [1, 2, 3]" it has: User.id.in_([1, 2, 3]), which is rather unseamly IMO (at least as much as having User.id.eq(3) would be). Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." 
-- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me From ianb at colorstudy.com Mon Jul 26 05:01:32 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Sun, 25 Jul 2010 22:01:32 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> Message-ID: On Sun, Jul 25, 2010 at 7:01 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > P.S. With rich comparisons, we've lost basics assumptions > like equality operations being reflexsive, symmetric, and > transitive. We should be cautioned by that experience > and only go down that path again if there is a darned good reason. > Well, it's done, and this would simply complete that. I don't see how holding the line at __contains__ help anything; we've gone most of the way down the path, and this just gets us a little further along (and, or, and not still being an issue). Also I don't think this affects reflexive/symmetric/transitive aspects of equality, since any overloading allows that to happen, right? Rich comparisons affect other things. I assume this would fall under the moratorium though? -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Jul 26 05:07:22 2010 From: guido at python.org (Guido van Rossum) Date: Sun, 25 Jul 2010 20:07:22 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> Message-ID: On Sun, Jul 25, 2010 at 5:01 PM, Raymond Hettinger wrote: > > On Jul 25, 2010, at 2:32 PM, Michael Foord wrote: >> >> >> x = y in z ? ? ? ? ?# where x is a non boolean. >> >> Yuck. 
>> > > > How is it any worse than: > > > ?x = y > z # where x is a non boolean > > And all the other operators that already do this? > > > Terrible sales technique: ?"how is this any worse than ..." ;-) Whoa. Reformulate as a consistency argument and I totally buy it. > Other operations such as rich comparisons have > complicated our lives but had sufficient offsetting benefits > than made them more bearable. ?Rich comparisons cause > no end of trouble but at least they allow the numeric folks > to implement some well studied behaviors than have proven > benefits in their domain. True. The argument for rich comparisons is that "A = B <= C" where B and C are matrices of the same shape could return a matrix of bools of the same shape, like a generalization for "A = [b <= c for b, c in zip(B, C)]". > In contrast, this proposal offers almost zero benefit to offset > the pain it will cause. ?The OP didn't even offer a compelling > use case or a single piece of code that wouldn't be better > written with a normal method. OTOH, there is a similar use case: "A = B in C" could be defined by the author of a matrix type as (the similar generalization of) "A = [b in c for b, c in zip(B, C)]". This is still somewhat less compelling than for rich comparisons because the elements of C corresponding to those in would have to be sequences. But it is not totally uncompelling. BTW Alex *did* mention a use case: expression recoding like SQLAlchemy. > No existing code expects "in" to return a non-boolean. Most existing code also doesn't care, and all predefined implementations of __contains__ will still return bools. It's only folks like NumPy who would care. We should ask them though -- they had a chance to ask for this when rich comparisons were introduced, but apparently didn't. > A lot of code for containers or that uses containers implicitly > expects simple invariants to hold: > ?? for x in container: > ?? ? ? 
assert x in container Yeah, a lot of code using comparisons also breaks when comparisons don't return bools. It's a specialized use, but I don't see it as anathema. OTOH the real solution would be something like LINQ in C# (http://msdn.microsoft.com/en-us/netframework/aa904594.aspx, http://en.wikipedia.org/wiki/Language_Integrated_Query). > Raymond > > P.S. ?With rich comparisons, we've lost basics assumptions > like equality operations being reflexive, symmetric, and > transitive. ? We should be cautioned by that experience > and only go down that path again if there is a darned good reason. So where's the pain? I don't recall ever seeing a report from someone who was bitten by this. -- --Guido van Rossum (python.org/~guido) From guido at python.org Mon Jul 26 05:08:20 2010 From: guido at python.org (Guido van Rossum) Date: Sun, 25 Jul 2010 20:08:20 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> Message-ID: On Sun, Jul 25, 2010 at 8:01 PM, Ian Bicking wrote: > I assume this would fall under the moratorium though? Since it's not adding new syntax, that's debatable. -- --Guido van Rossum (python.org/~guido) From masklinn at masklinn.net Mon Jul 26 07:50:07 2010 From: masklinn at masklinn.net (Masklinn) Date: Mon, 26 Jul 2010 07:50:07 +0200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> Message-ID: <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> On 2010-07-26, at 05:07 , Guido van Rossum wrote: > >> A lot of code for containers or that uses containers implicitly >> expects simple invariants to hold: >> for x in container: >> assert x in container > > Yeah, a lot of code using comparisons also breaks when comparisons > don't return bools. 
It's a specialized use, but I don't see it as > anathema. > > OTOH the real solution would be something like LINQ in C# > (http://msdn.microsoft.com/en-us/netframework/aa904594.aspx, > http://en.wikipedia.org/wiki/Language_Integrated_Query). Most of LINQ itself (the LINQ library, as opposed to the query syntaxes which are solely syntactic sugar and statically compiled into LINQ method calls) can already be implemented in Python. The things that might be missing are *some* LINQ-supporting features. Likely expression trees[0], maybe (but probably not) less limited and terser lambdas. [0] http://msdn.microsoft.com/en-us/library/bb397951.aspx From guido at python.org Mon Jul 26 16:28:30 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 26 Jul 2010 07:28:30 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> Message-ID: On Sun, Jul 25, 2010 at 10:50 PM, Masklinn wrote: > On 2010-07-26, at 05:07 , Guido van Rossum wrote: >> >>> A lot of code for containers or that uses containers implicitly >>> expects simple invariants to hold: >>> ? ?for x in container: >>> ? ? ? ?assert x in container >> >> Yeah, a lot of code using comparisons also breaks when comparisons >> don't return bools. It's a specialized use, but I don't see it as >> anathema. >> >> OTOH the real solution would be something like LINQ in C# >> (http://msdn.microsoft.com/en-us/netframework/aa904594.aspx, >> http://en.wikipedia.org/wiki/Language_Integrated_Query). > > Most of LINQ itself (the LINQ library, as opposed to the query syntaxes which are solely syntactic sugar and statically compiled into LINQ method calls) can already be implemented in Python. 
Well, the point of allowing more general __contains__ overloading is exactly to improve upon the query syntax -- you may call it syntactic sugar (often a derogatory term), but you currently cannot translate an 'in' operator into a parse tree like you can for '<' or '+'. (The other odd ducks are 'and' and 'or', though in a pinch one can use '&' and '|' for those. I forget in which camp 'not' falls. > The things that might be missing are *some* LINQ-supporting features. Likely expression trees[0], maybe (but probably not) less limited and terser lambdas. > > [0] http://msdn.microsoft.com/en-us/library/bb397951.aspx That's exactly the point I am driving at here. :-) -- --Guido van Rossum (python.org/~guido) From masklinn at masklinn.net Mon Jul 26 16:48:45 2010 From: masklinn at masklinn.net (Masklinn) Date: Mon, 26 Jul 2010 16:48:45 +0200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> Message-ID: <85C1FAEB-99C2-4991-9CE7-06B034492CB2@masklinn.net> On 2010-07-26, at 16:28 , Guido van Rossum wrote: > On Sun, Jul 25, 2010 at 10:50 PM, Masklinn wrote: >> On 2010-07-26, at 05:07 , Guido van Rossum wrote: >>> >>>> A lot of code for containers or that uses containers implicitly >>>> expects simple invariants to hold: >>>> for x in container: >>>> assert x in container >>> >>> Yeah, a lot of code using comparisons also breaks when comparisons >>> don't return bools. It's a specialized use, but I don't see it as >>> anathema. >>> >>> OTOH the real solution would be something like LINQ in C# >>> (http://msdn.microsoft.com/en-us/netframework/aa904594.aspx, >>> http://en.wikipedia.org/wiki/Language_Integrated_Query). 
>> >> Most of LINQ itself (the LINQ library, as opposed to the query syntaxes which are solely syntactic sugar and statically compiled into LINQ method calls) can already be implemented in Python. > > Well, the point of allowing more general __contains__ overloading is > exactly to improve upon the query syntax -- you may call it syntactic > sugar (often a derogatory term) I didn't intend it as such, I just meant that there is nothing the LINQ query syntax allows which isn't available (usually more clearly as far as I'm concerned) via the library part of the same. > but you currently cannot translate an 'in' operator into a parse tree like you can for '<' or '+'. Why not? How would it be different from + or bool no? >> The things that might be missing are *some* LINQ-supporting features. Likely expression trees[0], maybe (but probably not) less limited and terser lambdas. >> >> [0] http://msdn.microsoft.com/en-us/library/bb397951.aspx > > That's exactly the point I am driving at here. :-) Oh well, I probably missed the hints then :( From phd at phd.pp.ru Mon Jul 26 16:53:31 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Mon, 26 Jul 2010 18:53:31 +0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> Message-ID: <20100726145331.GA20020@phd.pp.ru> On Mon, Jul 26, 2010 at 07:28:30AM -0700, Guido van Rossum wrote: > other odd ducks are 'and' and 'or', though in a pinch one can use '&' > and '|' for those. I forget in which camp 'not' falls. 'not' is like 'and' and 'or'. '~' is like '&' and '|'; its magic method is __invert__. I maintain a similar all-magic-methods-overridden class, just for a different ORM (SQLObject), so I am greatly interested in the discussion. Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
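To make the expression-builder discussion above concrete, here is a minimal sketch of the kind of all-magic-methods-overridden class Oleg describes — the names `Field` and `Clause` are invented for illustration and are not taken from SQLObject or SQLAlchemy. The overloaded `==`, `&`, `|`, and `~` (`__invert__`) return clause objects rather than booleans:

```python
class Field:
    """A column reference whose comparisons build SQL text instead of bools."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return Clause("%s = %r" % (self.name, other))


class Clause:
    """A fragment of SQL; combined with the bitwise operators."""
    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        return Clause("(%s AND %s)" % (self.sql, other.sql))

    def __or__(self, other):
        return Clause("(%s OR %s)" % (self.sql, other.sql))

    def __invert__(self):  # '~' maps to __invert__, as Oleg notes
        return Clause("(NOT %s)" % self.sql)


name, age = Field("name"), Field("age")
print(((name == "Bob") & ~(age == 42)).sql)
# (name = 'Bob' AND (NOT age = 42))
```

Note the explicit parentheses around each comparison: `&` and `|` bind more tightly than `==`, so without them `name == "Bob" & ~(age == 42)` would try to combine the string `"Bob"` with a clause first and raise a TypeError.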
From alexander.belopolsky at gmail.com Mon Jul 26 16:54:24 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 26 Jul 2010 10:54:24 -0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> Message-ID: On Mon, Jul 26, 2010 at 10:28 AM, Guido van Rossum wrote: .. > Well, the point of allowing more general __contains__ overloading is > exactly to improve upon the query syntax -- you may call it syntactic > sugar (often a derogatory term), but you currently cannot translate an > 'in' operator into a parse tree like you can for '<' or '+'. FWIW, I am +0 on a more general __contains__ and query or expression building are the areas where I could use it. Since we are at it, can we have a decision on whether mp_length and sq_length will change their signatures to return PyObject* rather than Py_ssize_t. In other words, are we going to allow virtual sequences like py3k range to have length greater than sys.maxsize? There is an old open issue that turns on this decision: http://bugs.python.org/issue2690 Here I am -1 on making the change but +1 on the range patch in the issue. From ianb at colorstudy.com Mon Jul 26 18:09:18 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 26 Jul 2010 11:09:18 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <20100726145331.GA20020@phd.pp.ru> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <20100726145331.GA20020@phd.pp.ru> Message-ID: On Mon, Jul 26, 2010 at 9:53 AM, Oleg Broytman wrote: > On Mon, Jul 26, 2010 at 07:28:30AM -0700, Guido van Rossum wrote: > > other odd ducks are 'and' and 'or', though in a pinch one can use '&' > > and '|' for those. 
I forget in which camp 'not' falls. > > 'not' is like 'and' and 'or'. '~' is like '&' and '|'; its magic method > is __invert__. > ~ actually works okay (except for it reading poorly -- it's a rather obscure operator), but & and | have precedence that makes them very error-prone in my experience, e.g., (query.column == 'x' | query.column == 'y') becomes the SQL "(column == ('x' OR column)) == 'y'". I know overriding "and" and "or" has been discussed some time ago, though I'm not sure what the exact reason is that it never went anywhere. I remember it as one of those epic-and-boring comp.lang.python threads ;) One obvious complication is that they are short-circuit operations. That is, "a and b" must evaluate a, figure out if it is false, and if so then it returns a. But "a or b" must evaluate a, figure out if it is TRUE, and if so then returns a. So if there was anything like __and__ and __or__ it would have to simply disable any short-circuiting. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Jul 26 18:23:21 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 26 Jul 2010 12:23:21 -0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <20100726145331.GA20020@phd.pp.ru> Message-ID: On Mon, Jul 26, 2010 at 12:09 PM, Ian Bicking wrote: .. > I know overriding "and" and "or" has been discussed some time ago, though > I'm not sure what the exact reason is that it never went anywhere. Indeed. See PEP 335. http://www.python.org/dev/peps/pep-0335/. > I remember it as one of those epic-and-boring comp.lang.python threads ;)? One > obvious complication is that they are short-circuit operations.? 
That is, "a > and b" must evaluate a, figure out if it is false, and if so then it returns > a.? But "a or b" must evaluate a, figure out if it is TRUE, and if so then > returns a.? So if there was anything like __and__ and __or__ it would have > to simply disable any short-circuiting. The PEP deals with short-circuiting AFAIK. From ronaldoussoren at mac.com Mon Jul 26 17:34:12 2010 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 26 Jul 2010 17:34:12 +0200 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: References: <1279740852.3222.38.camel@localhost.localdomain> <636668799.20100724111645@mail.mipt.ru> <91977653.20100724203742@mail.mipt.ru> Message-ID: <43479E63-E3C5-4EA2-AE91-EBC95E28ED4A@mac.com> On 24 Jul, 2010, at 23:47, Alexander Belopolsky wrote: > 2010/7/24 Ivan Pozdeev : > .. >>> Why would you consider new classes that would be based on a survey of >>> the errnos that developers actually check for in published code to be >>> "arbitrary"? >> >> Since the list would be a sole opinion of some people who take part in >> the survey, you'll be constantly faced with demands of other people >> who want to have "shortcuts" for something else too. > > I think you misunderstood the survey methodology. It was not a survey > of developers, instead large bodies of code were examined. There is > nothing arbitrary or subjective in this approach. > > FWIW, am +1 on the PEP. Same here, I'm +1 as well. The PEP is clear and solves a definite problem with a well though-out methodology. Ronald -------------- next part -------------- A non-text attachment was scrubbed... 
From alexandre at peadrop.com Mon Jul 26 20:42:07 2010 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Mon, 26 Jul 2010 11:42:07 -0700 Subject: [Python-ideas] [Python-Dev] Readability of hex strings (Was: Use of coding cookie in 3.x stdlib) In-Reply-To: References: Message-ID: [+Python-ideas -Python-Dev]

import binascii

def h(s):
    return binascii.unhexlify("".join(s.split()))

h("DE AD BE EF CA FE BA BE")

-- Alexandre On Mon, Jul 26, 2010 at 11:29 AM, anatoly techtonik wrote: > I find "\xXX\xXX\xXX\xXX..." notation for binary data totally > unreadable. Everybody who uses and analyses binary data is more > familiar with plain hex dumps in the form of "XX XX XX XX...". > > I wonder if it is possible to introduce an effective binary string > type that will be represented as h"XX XX XX" in language syntax? It > will be much easier to analyze printed binary data and copy/paste such > data as-is from hex editors/views. > > On Mon, Jul 19, 2010 at 9:45 AM, Guido van Rossum wrote: >> Sounds like a good idea to try to remove redundant cookies *and* to >> remove most occasional use of non-ASCII characters outside comments >> (except for unittests specifically trying to test Unicode features). >> Personally I would use \xXX escapes instead of spelling out the >> characters in shlex.py, for example. >> >> Both with or without the coding cookies, many ways of displaying text >> files garble characters outside the ASCII range, so it's better to >> stick to ASCII as much as possible. >> >> --Guido >> >> On Mon, Jul 19, 2010 at 1:21 AM, Alexander Belopolsky >> wrote: >>> I was looking at the inspect module and noticed that its source >>> starts with "# -*- coding: iso-8859-1 -*-". I have checked and there >>> are no non-ascii characters in the file. There are several other
There are several other >>> modules that still use the cookie: >>> >>> Lib/ast.py:# -*- coding: utf-8 -*- >>> Lib/getopt.py:# -*- coding: utf-8 -*- >>> Lib/inspect.py:# -*- coding: iso-8859-1 -*- >>> Lib/pydoc.py:# -*- coding: latin-1 -*- >>> Lib/shlex.py:# -*- coding: iso-8859-1 -*- >>> Lib/encodings/punycode.py:# -*- coding: utf-8 -*- >>> Lib/msilib/__init__.py:# -*- coding: utf-8 -*- >>> Lib/sqlite3/__init__.py:#-*- coding: ISO-8859-1 -*- >>> Lib/sqlite3/dbapi2.py:#-*- coding: ISO-8859-1 -*- >>> Lib/test/bad_coding.py:# -*- coding: uft-8 -*- >>> Lib/test/badsyntax_3131.py:# -*- coding: utf-8 -*- >>> >>> I understand that coding: utf-8 is strictly redundant in 3.x. ?There >>> are cases such as Lib/shlex.py where using encoding other than utf-8 >>> is justified. ?(See >>> http://svn.python.org/view?view=rev&revision=82560). ?What are the >>> guidelines for other cases? ?Should redundant cookies be removed? >>> Since not all editors respect the ?-*- cookie, I think the answer >>> should be "yes" particularly when the cookie is setting encoding other >>> than utf-8. >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> http://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org >>> >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/techtonik%40gmail.com >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/alexandre%40peadrop.com > From stephen at xemacs.org Tue Jul 27 03:43:22 2010 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Tue, 27 Jul 2010 10:43:22 +0900 Subject: [Python-ideas] [Python-Dev] Set the namespace free! In-Reply-To: References: <4C484FD0.2080803@zlotniki.pl> Message-ID: <87zkxdvgw5.fsf@uwakimon.sk.tsukuba.ac.jp> Georg Brandl writes: > Am 26.07.2010 10:59, schrieb Anders Sandvig: > > On Sat, Jul 24, 2010 at 3:31 AM, Gregory P. Smith wrote: > >> Yuck. Anyone who feels they need a variable named the same a reserved word > >> simply feels wrong and needs reeducation. [...] > > > > While I agree with you in principle, I have been finding it > > frustrating trying to calculate yield in my financial applications > > lately... ;) > > In the spirit of optimistic programming, why not assume a large one > and call it Yield? ;) That's certainly more workable than the obvious near-synonym, "return". From greg.ewing at canterbury.ac.nz Tue Jul 27 08:33:28 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Jul 2010 18:33:28 +1200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> Message-ID: <4C4E7DB8.2040407@canterbury.ac.nz> Guido van Rossum wrote: > The other odd ducks are 'and' and 'or', Well, I tried to do something about that with the Overloaded Boolean Operators proposal, but it seems to have met with a not-very-enthusiastic response. Have you had any more thoughts about it? Do you think it's a problem worth solving, or are '&' and '|' good enough? Or would you rather see some completely different mechanism introduced for getting parse trees from expressions, a la LINQ? > I forget in which camp 'not' falls. If I remember correctly, it's not currently overridable independently from __bool__, but there would be no difficulty in making it so, because there is no control flow involved. 
-- Greg From greg.ewing at canterbury.ac.nz Tue Jul 27 09:28:35 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 27 Jul 2010 19:28:35 +1200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <20100726145331.GA20020@phd.pp.ru> Message-ID: <4C4E8AA3.8090201@canterbury.ac.nz> Ian Bicking wrote: > So if > there was anything like __and__ and __or__ it would have to simply > disable any short-circuiting. Not true -- in PEP 335 I showed how it can be done without giving up any short-circuiting possibilities. -- Greg From guido at python.org Tue Jul 27 16:02:27 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 27 Jul 2010 07:02:27 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <4C4E7DB8.2040407@canterbury.ac.nz> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: On Mon, Jul 26, 2010 at 11:33 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> >> The other odd ducks are 'and' and 'or', > > Well, I tried to do something about that with the Overloaded > Boolean Operators proposal, but it seems to have met with > a not-very-enthusiastic response. > > Have you had any more thoughts about it? Do you think it's > a problem worth solving, or are '&' and '|' good enough? > Or would you rather see some completely different mechanism > introduced for getting parse trees from expressions, a la > LINQ? I think that the approach of overloading tons of operators will always cause a somewhat cramped style, even if we fix some individual operators. There just are too many things that can go wrong, and they will be hard to debug for the *users* of the libraries that provide this stuff. 
(One reason is that the easy generalization from 1-2 simple examples often fails for non-obvious reasons.) Therefore I think the LINQ approach, which (IIUC) converts an expression into a parse tree when certain syntax is encountered, and calls a built-in method with that parse tree, would be a fresh breath of air. No need deriding it just because Microsoft came up with it first. -- --Guido van Rossum (python.org/~guido) From grosser.meister.morti at gmx.net Tue Jul 27 16:29:56 2010 From: grosser.meister.morti at gmx.net (Mathias Panzenböck) Date: Tue, 27 Jul 2010 16:29:56 +0200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> Message-ID: <4C4EED64.90806@gmx.net> On 07/26/2010 04:20 AM, Alex Gaynor wrote: > > Fundamentally the argument in favor of it is the same as for the other > comparison operators: you want to do symbolic manipulation using the > "normal" syntax, as a DSL. My example is that of a SQL expression > builder: SQLAlchemy uses User.id == 3 to create a clause where the ID > is 3, but for "id in [1, 2, 3]" it has: User.id.in_([1, 2, 3]), which > is rather unseemly IMO (at least as much as having User.id.eq(3) would > be). > This is a bad example for your wish because this code: >>> id in [1, 2, 3] translates into: >>> [1, 2, 3].__contains__(id) So it doesn't help that 'in' may return something else than a bool because the method is called on the wrong object for your purposes. 
-panzi From guido at python.org Tue Jul 27 17:59:33 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 27 Jul 2010 16:59:33 +0100 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <4C4EED64.90806@gmx.net> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> Message-ID: On Tue, Jul 27, 2010 at 3:29 PM, Mathias Panzenböck wrote: > On 07/26/2010 04:20 AM, Alex Gaynor wrote: >> >> Fundamentally the argument in favor of it is the same as for the other >> comparison operators: you want to do symbolic manipulation using the >> "normal" syntax, as a DSL. My example is that of a SQL expression >> builder: SQLAlchemy uses User.id == 3 to create a clause where the ID >> is 3, but for "id in [1, 2, 3]" it has: User.id.in_([1, 2, 3]), which >> is rather unseemly IMO (at least as much as having User.id.eq(3) would >> be). >> > > This is a bad example for your wish because this code: >>>> id in [1, 2, 3] > > translates into: >>>> [1, 2, 3].__contains__(id) > > So it doesn't help that 'in' may return something else than a bool > because the method is called on the wrong object for your purposes. Well that pretty much kills the proposal. I can't believe nobody (including myself) figured this out earlier in the thread. 
:-( -- --Guido van Rossum (python.org/~guido) From alex.gaynor at gmail.com Tue Jul 27 18:02:20 2010 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Tue, 27 Jul 2010 11:02:20 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> Message-ID: On Tue, Jul 27, 2010 at 10:59 AM, Guido van Rossum wrote: > On Tue, Jul 27, 2010 at 3:29 PM, Mathias Panzenb?ck > wrote: >> On 07/26/2010 04:20 AM, Alex Gaynor wrote: >>> >>> Fundamentally the argument in favor of it is the same as for the other >>> comparison operators: you want to do symbolic manipulation using the >>> "normal" syntax, as a DSL. ?My example is that of a SQL expression >>> builder: SQLAlchemy uses User.id == 3 to create a clause where the ID >>> is 3, but for "id in [1, 2, 3]" it has: User.id.in_([1, 2, 3]), which >>> is rather unseamly IMO (at least as much as having User.id.eq(3) would >>> be). >>> >> >> This is a bad example for your wish because this code: >>>>> id in [1, 2, 3] >> >> translates into: >>>>> [1, 2, 3].__contains__(id) >> >> So it doesn't help that 'in' may return something else than a bool >> because the method is called on the wrong object for your purposes. > > Well that pretty much kills the proposal. I can't believe nobody > (including myself) figured this out earlier in the thread. :-( > > -- > --Guido van Rossum (python.org/~guido) > Well, in my original example I wrapped the list with a SQLList() container class. I thought of the issue before, but it hardly seems like a blocker, the numpy stuff is unaffected for example: they're not using a builtin container, and for myself I'm willing to wrap my lists to get the pretty syntax. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." 
-- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me From guido at python.org Tue Jul 27 18:07:20 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 27 Jul 2010 09:07:20 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> Message-ID: On Tue, Jul 27, 2010 at 9:02 AM, Alex Gaynor wrote: > On Tue, Jul 27, 2010 at 10:59 AM, Guido van Rossum wrote: >> On Tue, Jul 27, 2010 at 3:29 PM, Mathias Panzenb?ck >> wrote: >>> On 07/26/2010 04:20 AM, Alex Gaynor wrote: >>>> >>>> Fundamentally the argument in favor of it is the same as for the other >>>> comparison operators: you want to do symbolic manipulation using the >>>> "normal" syntax, as a DSL. ?My example is that of a SQL expression >>>> builder: SQLAlchemy uses User.id == 3 to create a clause where the ID >>>> is 3, but for "id in [1, 2, 3]" it has: User.id.in_([1, 2, 3]), which >>>> is rather unseamly IMO (at least as much as having User.id.eq(3) would >>>> be). >>>> >>> >>> This is a bad example for your wish because this code: >>>>>> id in [1, 2, 3] >>> >>> translates into: >>>>>> [1, 2, 3].__contains__(id) >>> >>> So it doesn't help that 'in' may return something else than a bool >>> because the method is called on the wrong object for your purposes. >> >> Well that pretty much kills the proposal. I can't believe nobody >> (including myself) figured this out earlier in the thread. :-( >> >> -- >> --Guido van Rossum (python.org/~guido) >> > > Well, in my original example I wrapped the list with a SQLList() > container class. ?I thought of the issue before, but it hardly seems > like a blocker, the numpy stuff is unaffected for example: they're not > using a builtin container, and for myself I'm willing to wrap my lists > to get the pretty syntax. 
Well, writing "x in wrapper(y)" is hardly prettier than "contains(y, x)", if you compare it to "x in y". And it is certainly another thing that can go wrong in a non-obvious way. -- --Guido van Rossum (python.org/~guido) From robert.kern at gmail.com Tue Jul 27 18:25:11 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 Jul 2010 11:25:11 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: On 7/27/10 9:02 AM, Guido van Rossum wrote: > Therefore I think the LINQ approach, which (IIUC) converts an > expression into a parse tree when certain syntax is encountered, and > calls a built-in method with that parse tree, would be a fresh breath > of air. No need deriding it just because Microsoft came up with it > first. I've occasionally wished that we could repurpose backticks for expression literals: expr = `x + y*z` assert isinstance(expr, ast.Expression) And triple backticks for blocks of statements: block = ``` try: frobnicate() except FrobError: print("Not on my watch!") ``` assert isinstance(block, ast.Module) Too bad backticks look like grit on Tim's monitor! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alexander.belopolsky at gmail.com Tue Jul 27 18:42:02 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 27 Jul 2010 12:42:02 -0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> Message-ID: On Tue, Jul 27, 2010 at 11:59 AM, Guido van Rossum wrote: .. 
>> So it doesn't help that 'in' may return something else than a bool >> because the method is called on the wrong object for your purposes. > > Well that pretty much kills the proposal. I can't believe nobody > (including myself) figured this out earlier in the thread. :-( It may kill a use case or two, but not the proposal. In the libraries like numpy where all python containers get replaced, this is not an issue. Also this problem invites __rcontains__ solution, but the proposal is not very attractive to begin with. IMO, operators that are not symbols such as +, - or &, but words such as 'in', 'not' or 'and' don't offer much advantage over function calls. From masklinn at masklinn.net Tue Jul 27 18:42:32 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 27 Jul 2010 18:42:32 +0200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: On 2010-07-27, at 18:25 , Robert Kern wrote: > On 7/27/10 9:02 AM, Guido van Rossum wrote: > >> Therefore I think the LINQ approach, which (IIUC) converts an >> expression into a parse tree when certain syntax is encountered, and >> calls a built-in method with that parse tree, would be a fresh breath >> of air. No need deriding it just because Microsoft came up with it >> first. > > I've occasionally wished that we could repurpose backticks for expression literals: > > expr = `x + y*z` > assert isinstance(expr, ast.Expression) > > And triple backticks for blocks of statements: > > block = ``` > try: > frobnicate() > except FrobError: > print("Not on my watch!") > ``` > assert isinstance(block, ast.Module) > > Too bad backticks look like grit on Tim's monitor! What about french quotes

expr = «x + y * z»

block = «««
    try:
        frobnicate()
    except FrobError:
        print("Oh no you di'n't")
»»»
?
Or maybe some question marks? expr = ?x + y * z? From fuzzyman at gmail.com Tue Jul 27 18:49:12 2010 From: fuzzyman at gmail.com (Michael Foord) Date: Tue, 27 Jul 2010 17:49:12 +0100 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> Message-ID: On 27 July 2010 17:42, Alexander Belopolsky wrote: > On Tue, Jul 27, 2010 at 11:59 AM, Guido van Rossum > wrote: > .. > >> So it doesn't help that 'in' may return something else than a bool > >> because the method is called on the wrong object for your purposes. > > > > Well that pretty much kills the proposal. I can't believe nobody > > (including myself) figured this out earlier in the thread. :-( > > It may kill a use case or two, but not the proposal. In the > libraries like numpy where all python containers get replaced, this is > not an issue. Also this problem invites __rcontains__ solution, Wasn't the lack of an __rcontains__ a problem for the web-sig guys trying to work out the bytes / strings issue? For what it's worth I think that guido is correct that a better solution for the expression -> query problem is to introduce an expression tree, as is done for LINQ (which has been enormously popular amongst .NET developers). All the best, Michael Foord > but > the proposal is not very attractive to begin with. IMO, operators > that are not symbols such as +, - or &, but words such as 'in', 'not' > or 'and' don't offer much advantage over function calls. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Tue Jul 27 19:05:54 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 27 Jul 2010 19:05:54 +0200 Subject: [Python-ideas] Non-boolean return from __contains__ References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: <20100727190554.55129671@pitrou.net> On Tue, 27 Jul 2010 18:42:32 +0200 Masklinn wrote: > > > > Too bad backticks look like grit on Tim's monitor! > > What about french quotes > > expr = «x + y * z» You should require non-breaking spaces (U+00A0 or U+202F) between them and the enclosed expression: expr = « x + y * z » (with U+00A0) expr = « x + y * z » (with U+202F) (I hope my editor doesn't fool me here) > block = «««
> try:
>     frobnicate()
> except FrobError:
>     print("Oh no you di'n't")
> »»»
This would be quite a derogatory use of French quotes. From 2010 at jmunch.dk Tue Jul 27 19:18:30 2010 From: 2010 at jmunch.dk (Anders J. Munch) Date: Tue, 27 Jul 2010 19:18:30 +0200 Subject: [Python-ideas] PEP 380 alternative: A yielding function Message-ID: <4C4F14E6.1060102@jmunch.dk> Looking at PEP 380 (http://www.python.org/dev/peps/pep-0380/), the need for yield forwarding syntax comes from the inability to delegate yielding functionality in the usual way. For example, if you have a recurring pattern including yields, like this (this is a toy example, please don't take it for more than that):

    if a:
        yield a
    if b:
        yield b

you cannot do the extract function refactoring in the usual way:

    def yield_if_true(x):
        if x:
            yield x
    yield_if_true(a)
    yield_if_true(b)

because yield_if_true would become a generator. PEP 380 addresses this by making the workaround - "for x in yield_if_true(a): yield x" - easier to write. But suppose you could address the source instead? Suppose you could write yield_if_true in such a way that it did not become a generator despite yielding?
Syntactically this could be done with a yielding *function* in addition to the yield statement/expression, either as a builtin or a function in the sys module. Let's call it 'yield_', for lack of a better name. The function would yield the nearest generator on the call stack. Now the example would work with a slight modification:

    def yield_if_true(x):
        if x:
            yield_(x)
    yield_if_true(a)
    yield_if_true(b)

The real benefits from a yield_ function come with recursive functions. A recursive tree traversal that yields from the leaf nodes currently suffers from a performance penalty: Every yield is repeated as many times as the depth of the tree, turning a O(n) traversal algorithm into an O(n lg(n)) algorithm. PEP 380 does not change that. But a yield_ function could be O(1), regardless of the forwarding depth. To be fair, a clever implementation might be able to short-circuit a 'yield from' chain and achieve the same thing. Two main drawbacks of yield_:
- Difficulty of implementation. Generators would need to keep an entire stack branch alive, instead of merely a single frame, and if that somehow affects the efficiency of simple generators, that would be bad.
- 'the nearest generator on the call stack' is sort of a dynamic scoping thing, which has its problems. For example, if you forget to make the relevant function a generator (the "if 0: yield None" trick might have been needed but was forgotten), then the yield would trickle up to some random generator higher up, with confusing results. And if you use yield_ in a callback, well, let's just say that's an interesting case too.
All the same, if a yield_ function is practical, I think it is a better way to address the problems that motivate PEP 380. I'm guessing you could implement 'yield from' as a pure-Python function using yield_, making yield_ strictly more powerful, although I couldn't say for sure as I haven't studied the enhanced generator protocol.
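[A runnable toy, reusing Anders' names, of the re-yield chain he is describing under current Python: the extracted helper must be looped over at every call site, and in the recursive case a leaf at depth d is forwarded d times on its way out.]

```python
def yield_if_true(x):
    if x:
        yield x

def produce(a, b):
    # The extracted helper is itself a generator, so each call site must
    # loop and re-yield -- the workaround PEP 380 streamlines.
    for v in yield_if_true(a):
        yield v
    for v in yield_if_true(b):
        yield v

assert list(produce(0, "b")) == ["b"]

def walk(tree):
    # Recursive traversal: each leaf is re-yielded once per enclosing
    # level, which is the depth-proportional cost mentioned above.
    if isinstance(tree, list):
        for child in tree:
            for leaf in walk(child):
                yield leaf
    else:
        yield tree

assert list(walk([1, [2, [3, 4]], 5])) == [1, 2, 3, 4, 5]
```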
regards, Anders From bruce at leapyear.org Tue Jul 27 19:25:05 2010 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 27 Jul 2010 10:25:05 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: The idea of LINQ is that you write the expression directly in the language and it translates into a query expression. It's going to be operating on an expression parse tree, right? Rather than trying to change the allowable expressions maybe the question is to figure out how to translate what we have and find what we can't express with what we have (and that's an orthogonal question and has nothing to do with __xxx__ functions). On Tue, Jul 27, 2010 at 9:42 AM, Masklinn wrote: > > What about french quotes > > expr = «x + y * z» > > Isn't there already a syntax for this? expr = lambda: x + y * z Maybe you want some conversion of that lambda into a different form: expr = @ast lambda: x + y + z --- Bruce http://www.vroospeak.com http://google-gruyere.appspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Jul 27 20:01:12 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 27 Jul 2010 19:01:12 +0100 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: <4C4F14E6.1060102@jmunch.dk> References: <4C4F14E6.1060102@jmunch.dk> Message-ID: Have you really thought through the implementation in CPython? When a non-generator function is called, there's usually a C stack frame in the way which prevents yielding. --Guido On Tue, Jul 27, 2010 at 6:18 PM, Anders J.
Munch <2010 at jmunch.dk> wrote: > Looking at PEP 380 (http://www.python.org/dev/peps/pep-0380/), the > need for yield forwarding syntax comes from the inability to delegate > yielding functionality in the usual way. For example, if you have a > recurring pattern including yields, like this (this is a toy example, > please don't take it for more than that): > > ?if a: > ? ? ?yield a > ?if b: > ? ? ?yield b > > you cannot do the extract function refactoring in the usual way: > > ? ? def yield_if_true(x): > ? ? ?if x: > ? ? ? ? ?yield x > ?yield_if_true(a) > ?yield_if_true(b) > > because yield_if_true would become a generator. > > PEP 380 addresses this by making the workaround - "for x in > yield_if_true(a): yield x" - easier to write. > > But suppose you could address the source instead? ?Suppose you could > write yield_if_true in such a way that it did not become a generator > despite yielding? > > Syntactically this could be done with a yielding *function* in > addition to the yield statement/expression, either as a builtin or a > function in the sys module. ?Let's call it 'yield_' , for lack of a > better name. ?The function would yield the nearest generator on the > call stack. > > Now the example would work with a slight modifiction: > > ?def yield_if_true(x): > ? ? ?if x: > ? ? ? ? ?yield_(x) > ?yield_if_true(a) > ?yield_if_true(b) > > The real benefits from a yield_ function come with recursive > functions. ?A recursive tree traversal that yields from the leaf nodes > currently suffers from a performance penalty: Every yield is repeated > as many times as the depth of the tree, turning a O(n) traversal > algorithm into an O(n lg(n)) algorithm. ?PEP 380 does not change that. > > But a yield_ function could be O(1), regardless of the forwarding > depth. > > To be fair, a clever implementation might be able to short-circuit a > 'yield from' chain and achieve the same thing. > > Two main drawbacks of yield_: > - Difficulty of implementation. 
Generators would need to keep an > ?entire stack branch alive, instead of merely a single frame, and if > ?that somehow affects the efficiency of simple generators, that would > ?be bad. > - 'the nearest generator on the call stack' is sort of a dynamic > ?scoping thing, which has its problems. ?For example, if you forget > ?to make the relevant function a generator (the "if 0: yield None" > ?trick might have been needed but was forgotten), then the yield > ?would trickle up to some random generator higher up, with confusing > ?results. ?And if you use yield_ in a callback, well, let's just say > ?that an interesting case too. > > All the same, if a yield_ function is practical, I think it is a > better way to address the problems that motivate PEP 380. ?I'm guessing you > could implement 'yield from' as a pure-Python > function using yield_, making yield_ strictly more powerful, although > I couldn't say for sure as I haven't studied the enhanced generator > protocol. > > regards, Anders > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From guido at python.org Tue Jul 27 20:04:20 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 27 Jul 2010 19:04:20 +0100 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: On Tue, Jul 27, 2010 at 5:25 PM, Robert Kern wrote: > I've occasionally wished that we could repurpose backticks for expression > literals: > > ?expr = `x + y*z` > ?assert isinstance(expr, ast.Expression) Maybe you could just as well make it a plain string literal and call a function that parses it into a parse tree: expr = parse("x + y*z") assert isinstance(expr, 
ast.Expression) The advantage of this approach is that you can define a different language too... -- --Guido van Rossum (python.org/~guido) From alex.gaynor at gmail.com Tue Jul 27 20:14:20 2010 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Tue, 27 Jul 2010 13:14:20 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: On Tue, Jul 27, 2010 at 1:04 PM, Guido van Rossum wrote: > On Tue, Jul 27, 2010 at 5:25 PM, Robert Kern wrote: >> I've occasionally wished that we could repurpose backticks for expression >> literals: >> >> ?expr = `x + y*z` >> ?assert isinstance(expr, ast.Expression) > > Maybe you could just as well make it a plain string literal and call a > function that parses it into a parse tree: > > expr = parse("x + y*z") > assert isinstance(expr, ast.Expression) > > The advantage of this approach is that you can define a different > language too... > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > As an interesting (but perhaps not relevant) data point, the documentation is rather nebulous as to whether the bool cast exists (it says things like "should return true if", but never explicitly that it takes the boolean value of the return from __contains__), further it doesn't seem to be tested at all (to the point where I only noticed today that PyPy's behavior is different, since this apparently breaks no tests). Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." 
-- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me From raymond.hettinger at gmail.com Tue Jul 27 20:38:15 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 27 Jul 2010 11:38:15 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: <4AC1C4FF-7B82-4C35-B221-BE27A0C43587@gmail.com> On Jul 27, 2010, at 11:04 AM, Guido van Rossum wrote: > On Tue, Jul 27, 2010 at 5:25 PM, Robert Kern wrote: >> I've occasionally wished that we could repurpose backticks for expression >> literals: >> >> expr = `x + y*z` >> assert isinstance(expr, ast.Expression) > > Maybe you could just as well make it a plain string literal and call a > function that parses it into a parse tree: > > expr = parse("x + y*z") > assert isinstance(expr, ast.Expression) > > The advantage of this approach is that you can define a different > language too... Starting with string literals and a parse function seems like a great design decision. It already works and doesn't require new syntax. And, as Guido pointed out, it future proofs the design by freeing the domain specific language from the constraints of Python itself. 
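[For the Python-syntax case specifically, the stdlib ast module already provides a parse function of roughly the shape Guido sketches; a different DSL would of course still need its own parser.]

```python
import ast

# Parse a Python expression into a tree, as in the parse() sketch above.
expr = ast.parse("x + y * z", mode="eval")
assert isinstance(expr, ast.Expression)
assert isinstance(expr.body, ast.BinOp)  # the `+` node

# The tree can also be compiled and evaluated against names bound later.
code = compile(expr, "<expr>", "eval")
assert eval(code, {"x": 1, "y": 2, "z": 3}) == 7
```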
Raymond From masklinn at masklinn.net Tue Jul 27 21:00:19 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 27 Jul 2010 21:00:19 +0200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: <3EEB340D-0BF3-48EE-AC02-5E6A7DF1495C@masklinn.net> On 2010-07-27, at 20:04 , Guido van Rossum wrote: > On Tue, Jul 27, 2010 at 5:25 PM, Robert Kern wrote: >> I've occasionally wished that we could repurpose backticks for expression >> literals: >> >> expr = `x + y*z` >> assert isinstance(expr, ast.Expression) > > Maybe you could just as well make it a plain string literal and call a > function that parses it into a parse tree: > > expr = parse("x + y*z") > assert isinstance(expr, ast.Expression) > > The advantage of this approach is that you can define a different > language too? The nice thing about having it be special-sauce syntax is that it can be parsed along with the rest of the script, failing early, and it can be stored in bytecode. Whereas the string itself will only be parsed when the function is actually executed. From alexander.belopolsky at gmail.com Tue Jul 27 21:05:10 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 27 Jul 2010 15:05:10 -0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <3EEB340D-0BF3-48EE-AC02-5E6A7DF1495C@masklinn.net> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> <3EEB340D-0BF3-48EE-AC02-5E6A7DF1495C@masklinn.net> Message-ID: On Tue, Jul 27, 2010 at 3:00 PM, Masklinn wrote: .. 
> The nice thing about having it be special-sauce syntax is that it can be parsed along with the rest of the > script, failing early, and it can be stored in bytecode. Whereas the string itself will only be parsed when > the function is actually executed. Not enough to justify new syntax IMO. Just create your parse trees at the module or class level. chances are that's where they belong anyways. I would be very interested to see the parse() function. It does not exist (yet), right? From solipsis at pitrou.net Tue Jul 27 21:06:42 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 27 Jul 2010 21:06:42 +0200 Subject: [Python-ideas] Non-boolean return from __contains__ References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> <4AC1C4FF-7B82-4C35-B221-BE27A0C43587@gmail.com> Message-ID: <20100727210642.548d5d34@pitrou.net> On Tue, 27 Jul 2010 11:38:15 -0700 Raymond Hettinger wrote: > > On Jul 27, 2010, at 11:04 AM, Guido van Rossum wrote: > > > On Tue, Jul 27, 2010 at 5:25 PM, Robert Kern wrote: > >> I've occasionally wished that we could repurpose backticks for expression > >> literals: > >> > >> expr = `x + y*z` > >> assert isinstance(expr, ast.Expression) > > > > Maybe you could just as well make it a plain string literal and call a > > function that parses it into a parse tree: > > > > expr = parse("x + y*z") > > assert isinstance(expr, ast.Expression) > > > > The advantage of this approach is that you can define a different > > language too... > > Starting with string literals and a parse function seems like > a great design decision. It already works and doesn't require > new syntax. Yes, you only have to write a dedicated full-fledged parser, your code isn't highlighted properly in text editors, and you can't easily access Python objects declared in the enclosing scope. 
I guess it explains that none of the common ORMs seem to have adopted such a «great design decision» :-) Regards Antoine. From phd at phd.pp.ru Tue Jul 27 21:22:14 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Tue, 27 Jul 2010 23:22:14 +0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <20100727210642.548d5d34@pitrou.net> References: <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> <4AC1C4FF-7B82-4C35-B221-BE27A0C43587@gmail.com> <20100727210642.548d5d34@pitrou.net> Message-ID: <20100727192214.GA2597@phd.pp.ru> On Tue, Jul 27, 2010 at 09:06:42PM +0200, Antoine Pitrou wrote: > Yes, you only have to write a dedicated full-fledged parser, your code > isn't highlighted properly in text editors, and you can't easily access > Python objects declared in the enclosing scope. > > I guess it explains that none of the common ORMs seem to have adopted > such a «great design decision» :-) Python ORMs are about mapping between *Python* and SQL - we don't need no stinking DSLs! ;) Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From ianb at colorstudy.com Tue Jul 27 21:24:54 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 27 Jul 2010 14:24:54 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: On Tue, Jul 27, 2010 at 12:25 PM, Bruce Leban wrote: > The idea of LINQ is that you write the expression directly in the language > and it translates into a query expression. It's going to be operating on an > expression parse tree, right?
Rather than trying to change > the allowable expressions maybe the question is to figure out how to > translate what we have and find what we can't express with what we have (and > that's an orthogonal question and has nothing to do with __xxx__ functions). > > On Tue, Jul 27, 2010 at 9:42 AM, Masklinn wrote: > >> What about french quotes >> >> expr = ?x + y * z? >> >> > Isn't there are already a syntax for this? > > expr = lambda: x + y * z > > Maybe you want some conversion of that lambda into a different form: > > expr = @ast lambda: x + y + z > There's also an unexecuted expression in generator expressions, which is prettier than lambda. There's two places where I've seen people doing this in Python (not counting the operator overloading, of which there are many examples). The first is DejaVu ( http://www.aminus.net/dejavu/chrome/common/doc/trunk/managing.html#Querying) which decompiles lambdas. Then it does partial translation to SQL, and I think will actually execute things in Python when they can't be translated (e.g., if you are using a Python function on a database-derived result). But it can easily break between Python versions, and only works with CPython. It also seems to have some fairly complex rules about partial evaluation. The other place is peak.rules (http://pypi.python.org/pypi/PEAK-Rules) which uses a strings for conditions. My understanding is that the string is compiled to an AST and then analyzed, so partial expressions shared by many conditions can be efficiently evaluated together. Also it changes scopes (the expression is defined outside the function, but evaluated in the context of specific function arguments). Maybe it'd be helpful to consider actual examples in the context of SQL... def users_over_age(minimum_age=timedelta(years=18)): return User.select("datetime.now() - user.birth_date > minimum_age") # or... 
return User.select(datetime.now() - User.birth_date > minimum_age) def users_with_addresses(): return User.select("sql_exists(Address.select('address.user_id == user.id'))") # or ... return User.select(sql_exists(Address.select(Address.user_id == User.user_id)) def users_in_list(list_of_users_or_ids): list_of_ids = [item.id if isinstance(item, User) else item for item in list_of_users_or_ids] return User.select("user.id in list_of_ids") # or ... return User.select(sql_in(User.id, list_of_ids)) Well, I'm not seeing any advantage. You could do things like: def valid_email(email): # Obviously write it better... return re.match(r'[a-z0-9]+@[a-z0-9.-]+', email) def users_with_valid_email(): return User.select("valid_email(user.email)") and then have it detect (ala DejaVu) that valid_email() cannot be translated to SQL, so select everything then filter it with that function. This looks clever, but usually this kind of transparency will only bite; as in this example, what looks like it might be a normal kind of query is actually an extremely expensive query that might take a very long time to complete. (My impression is that LINQ is clever like this too, allowing expressions that are evaluated in part in different locations?) I was worried about binding arguments, but potentially it can work nicely. E.g., all these only take a single variable from the outer scope, but imagine something like: def expired_users(time=timedelta(days=30): return User.select("user.last_login < (datetime.now() - time)") if you were clever you could detect that "datetime.now() - time" can be statically computed. If you weren't clever you might send the expression to the database (which actually isn't terrible). But maybe consider a case: def users_with_ip(ip): return User.select("user.last_login_ip == encode_ip(ip)") where encode_ip does something like turn dotted numbers into an integer. 
If the mapper is clever it might tell that there are no SQL expressions in the arguments to encode_ip, and it can evaluate it early. Except... what if the function does something like return a random number? Then you've changed things by evaluating it once instead of for every user. So maybe you can't do that optimization, and so the only way to make this work is to create a local variable to make explicit that you only want to evaluate the argument once. As such, the status quo is better (User.select(User.last_login_ip == encode_ip(ip))) because the way it is evaluated is more obvious, and the constraints are clearer. This is managed because "magic" stuff is very specific (those column objects, which have all the operator overloading), and everything else is plain Python. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ianb at colorstudy.com Tue Jul 27 21:29:19 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 27 Jul 2010 14:29:19 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> Message-ID: On Tue, Jul 27, 2010 at 11:49 AM, Michael Foord wrote: > On 27 July 2010 17:42, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > >> On Tue, Jul 27, 2010 at 11:59 AM, Guido van Rossum >> wrote: >> .. >> >> So it doesn't help that 'in' may return something else than a bool >> >> because the method is called on the wrong object for your purposes. >> > >> > Well that pretty much kills the proposal. I can't believe nobody >> > (including myself) figured this out earlier in the thread. :-( >> >> It may kill a use case or two, but not the proposal. In the >> libraries like numpy where all python containers get replaced, this is >> not an issue. 
Also this problem invites __rcontains__ solution, > > > > Wasn't the lack of an __rcontains__ a problem for the web-sig guys trying > to work out the bytes / strings issue? > I think PJE wanted to implement a string type that was bytes+encoding (as opposed to using Python's native strings). You can overload __add__ etc so everything works, but you couldn't make this work: encodedbytes(b'1234', 'utf8') in '12345' because '12345'.__contains__ would reject the encodedbytes type outright. __rcontains__ would work because here '12345' would know that it didn't understand encodedbytes. It wouldn't work for lists though, as [].__contains__ can handle *any* type, as it just tests for equality across all of its members. So it's not like __radd__ because the original object can't know that it should defer to the other argument. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Tue Jul 27 21:24:49 2010 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 27 Jul 2010 20:24:49 +0100 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: <4C4F3281.9040402@mrabarnett.plus.com> Bruce Leban wrote: > The idea of LINQ is that you write the expression directly in the > language and it translates into a query expression. It's going to be > operating on an expression parse tree, right? Rather than trying to > change the allowable expressions maybe the question is to figure out how > to translate what we have and find what we can't express with what we > have (and that's an orthogonal question and has nothing to do with > __xxx__ functions). > > On Tue, Jul 27, 2010 at 9:42 AM, Masklinn > wrote: > > What about french quotes > > expr = ?x + y * z? 
> > > Isn't there are already a syntax for this? > > expr = lambda: x + y * z > > Maybe you want some conversion of that lambda into a different form: > > expr = @ast lambda: x + y + z > Or: results = db => sql_expression where the parse tree for "sql_expression" is passed to db.__linq__. The parse tree is compiled to SQL and cached for possible future use. From 2010 at jmunch.dk Tue Jul 27 21:40:13 2010 From: 2010 at jmunch.dk (Anders J. Munch) Date: Tue, 27 Jul 2010 21:40:13 +0200 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: References: <4C4F14E6.1060102@jmunch.dk> Message-ID: <4C4F361D.9010903@jmunch.dk> Guido van Rossum wrote: > Have you really thought through the implementation in CPython? Not at all. > When a > non-generator function is called, there's usually a C stack frame in > the way which prevents yielding. Of course. I knew there'd be good reason why you didn't do a more full coroutine implementation originally, but on a beautiful summer day reckless optimism took over and it didn't come readily to mind. On platforms with coroutines available at the C level, I think something could be devised, something like keep a pool of coroutines and switch to a different one at strategic points. But I realise that requiring all platforms to support coroutines is a deal-breaker, so there. - Anders From cs at zip.com.au Tue Jul 27 23:55:20 2010 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 28 Jul 2010 07:55:20 +1000 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> Message-ID: <20100727215519.GA1674@cskk.homeip.net> On 25Jul2010 11:48, Raymond Hettinger wrote: | On Jul 25, 2010, at 11:15 AM, Alex Gaynor wrote: | > Recently I've been wondering why __contains__ casts all of it's | > returns to be boolean values. 
Specifically I'd like to propose that | > __contains__'s return values be passed directly back as the result of | > the `in` operation. | | x = y in z # where x is a non boolean. | | Yuck. | | One of the beautiful aspects of __contains__ is that its simply signature | allows it to be used polymorphically throughout the whole language. Didn't we have the dual of this argument a week or so ago, where rantingrick was complaining that ints could be used as booleans, and that it was simply appalling? That Python should immediately make 0 also behave as True because he didn't feel it was "empty". His argument was widely opposed, and IMHO rightly so. Personally, I'm +0.5 on the proposal: - because Python already allows pretty much anything to be used in a Boolean context, this means that anything can be "used polymorphically throughout the whole language", to use your term above; I do not think it breaks anything - do any of the other comparison methods enforce Booleanness? ==/__eq__ doesn't and I didn't think the others did. All that is required for functionality is sane choice of return value by the implementors. - have you used SQLAlchemy? Its SQL constrction by writing: .select([...columns...], table.c.COL1 == 3) is extremely programmer friendly, and works directly off overloading the column object's .__eq__() method to return something that gets made into a robust SQL query later. I'm going to snip two of your paragraphs here and proceed to: | There is no "natural" interpretation of an in-operator returning | a non-boolean. There is in the SQLAlchemy example above; "in" with an SQLA column object would return a hook to make a "value in (a,b,c,...)" SQL expression. It is all about context, and in Python the .__* methods let objects provide the context for evaluation of expressions - that's what polymorphism does for us. The proposal changes nothing for pre-existing uses. 
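The asymmetry with __eq__ mentioned above is easy to demonstrate — today only the `in` operator coerces its special method's result (a small illustrative snippet):

```python
# Only `in` coerces the special-method result to bool; == hands the
# __eq__ result straight back to the caller.
class Weird:
    def __contains__(self, item):
        return "a truthy string"
    def __eq__(self, other):
        return "a truthy string"

w = Weird()
print(1 in w)    # True -- the interpreter coerces __contains__'s result
print(w == 1)    # a truthy string -- __eq__'s result passes through
```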
It in no way causes:

    False in [False]

to return False, because it doesn't change bool.__contains__. The
proposal is to not coerce the result of __contains__ to bool(),
allowing _new_ objects to return a more nuanced result for
__contains__ for their own purposes. As long as that makes sense in
the use context, I believe this is a plus and not a minus.

We can all write nonsensical code by implementing __eq__ with
gibberish. So what?

| If the above snippet assigns "foo" to x, what
| does that mean? If it assigns -10, what does that mean?

In current Python, it means "true".

| Language design is about associating meanings (semantics)
| with syntax. ISTM, this would be poor design.

We already allow programmers to do that all over the place with the
special methods. This proposal removes an apparently arbitrary
restriction on __contains__ that doesn't seem to be applied to the
other comparators.

+0.5, verging on +1.

Cheers,
--
Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/

It is in those arenas that he previously extinguished himself.
- Chuck Rogers

From cs at zip.com.au Wed Jul 28 00:10:48 2010
From: cs at zip.com.au (Cameron Simpson)
Date: Wed, 28 Jul 2010 08:10:48 +1000
Subject: [Python-ideas] Non-boolean return from __contains__
In-Reply-To:
References:
Message-ID: <20100727221047.GA5223@cskk.homeip.net>

On 27Jul2010 16:59, Guido van Rossum wrote:
| On Tue, Jul 27, 2010 at 3:29 PM, Mathias Panzenböck
| wrote:
| > On 07/26/2010 04:20 AM, Alex Gaynor wrote:
| >>
| >> Fundamentally the argument in favor of it is the same as for the other
| >> comparison operators: you want to do symbolic manipulation using the
| >> "normal" syntax, as a DSL. My example is that of a SQL expression
| >> builder: SQLAlchemy uses User.id == 3 to create a clause where the ID
| >> is 3, but for "id in [1, 2, 3]" it has: User.id.in_([1, 2, 3]), which
| >> is rather unseemly IMO (at least as much as having User.id.eq(3) would
| >> be).
| >> | > | > This is a bad example for your wish because this code: | >>>> id in [1, 2, 3] | > | > translates into: | >>>> [1, 2, 3].__contains__(id) | > | > So it doesn't help that 'in' may return something else than a bool | > because the method is called on the wrong object for your purposes. | | Well that pretty much kills the proposal. I can't believe nobody | (including myself) figured this out earlier in the thread. :-( That's a real shame. ".__rcontains__", anyone? For the record (since I just said +0.5 to +1), I'm down to +0 on the proposal; I think the idea's good and removes an (to my mind) arbitrary constraint on __contains__, but now I haven't got a use case:-( Alex's "id in wrapper([1,2,3])" doesn't seem better than the existing "column.in_([1,2,3])" that already exists, alas. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ "Are we alpinists, or are we tourists" followed by "tourists! tourists!" - Kobus Barnard in rec.climbing, on things he's heard firsthand From dangyogi at gmail.com Wed Jul 28 02:17:44 2010 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Tue, 27 Jul 2010 20:17:44 -0400 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: For the LINQ approach, I'd rather see an open ended hook for allowing any required syntax, rather than only SQL-like syntax. OTOH, re.compile and db_cursor.execute are two examples where no new mechanism is needed. And you'd have to quote the new syntax somehow anyway since the Python parser wouldn't understand it... Which makes me wonder if it really makes sense to try to overload these operators in order to generate some kind of mini-language, vs just using your own syntax like re.compile or db_cursor.execute does. 
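Both precedents just mentioned — re.compile and db_cursor.execute — look like this in practice (sqlite3 standing in here for any DB-API module):

```python
import re
import sqlite3

# The regex mini-language, compiled from a plain string:
pattern = re.compile(r"[a-z0-9.]+@[a-z0-9.-]+")
print(bool(pattern.match("guido@python.org")))   # True

# The SQL mini-language, with ? placeholders for Python values:
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER, email TEXT)")
cur.execute("INSERT INTO users VALUES (?, ?)", (1, "guido@python.org"))
cur.execute("SELECT id FROM users WHERE email = ?", ("guido@python.org",))
print(cur.fetchall())   # [(1,)]
```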
The one catch is that it is often nice to be able to refer to Python variables (at least; or perhaps full expressions) within the mini-language. The db_cursor.execute is an example, and putting the placeholders in the SQL syntax with arguments later works, but gets tedious. To be able to include Python expressions directly within the mini-language, the library implementing the new syntax would have to be able to translate the new syntax into Python and have it spliced into the code where it was used. Something like an intelligent macro expansion. This means that the library's translation code has to be called from the Python compiler. I'm not familiar enough with the compiler to know how crazy this is. Mython tries to do something similar. -Bruce On Tue, Jul 27, 2010 at 10:02 AM, Guido van Rossum wrote: > > Therefore I think the LINQ approach, which (IIUC) converts an > expression into a parse tree when certain syntax is encountered, and > calls a built-in method with that parse tree, would be a fresh breath > of air. No need deriding it just because Microsoft came up with it > first. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Wed Jul 28 03:03:48 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 28 Jul 2010 13:03:48 +1200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> Message-ID: <4C4F81F4.1060708@canterbury.ac.nz> Guido van Rossum wrote: > On Tue, Jul 27, 2010 at 3:29 PM, Mathias Panzenb?ck > >>So it doesn't help that 'in' may return something else than a bool >>because the method is called on the wrong object for your purposes. > > > Well that pretty much kills the proposal. I can't believe nobody > (including myself) figured this out earlier in the thread. 
:-( Alternatively, it could be taken as a sign that there is a special method missing -- there should be an __in__ method that is tried on the first operand before trying __contains__ on the second. (And if we'd thought of this at the beginning, __contains__ would have been called __rin__. :-) -- Greg From raymond.hettinger at gmail.com Wed Jul 28 03:09:17 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 27 Jul 2010 18:09:17 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <4C4F81F4.1060708@canterbury.ac.nz> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <4C4EED64.90806@gmx.net> <4C4F81F4.1060708@canterbury.ac.nz> Message-ID: <3102EC1F-BBFF-4F61-811C-F651BBF91736@gmail.com> On Jul 27, 2010, at 6:03 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> On Tue, Jul 27, 2010 at 3:29 PM, Mathias Panzenb?ck > > >>> So it doesn't help that 'in' may return something else than a bool >>> because the method is called on the wrong object for your purposes. >> >> Well that pretty much kills the proposal. I can't believe nobody >> (including myself) figured this out earlier in the thread. :-( > > Alternatively, it could be taken as a sign that there is > a special method missing -- there should be an __in__ > method that is tried on the first operand before trying > __contains__ on the second. (And if we'd thought of this > at the beginning, __contains__ would have been called > __rin__. :-) Don't forget __not_in__ and __not_rin__ ;-) Raymond From greg.ewing at canterbury.ac.nz Wed Jul 28 03:19:12 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 28 Jul 2010 13:19:12 +1200 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: <4C4F14E6.1060102@jmunch.dk> References: <4C4F14E6.1060102@jmunch.dk> Message-ID: <4C4F8590.60004@canterbury.ac.nz> Anders J. Munch wrote: > But suppose you could address the source instead? 
Suppose you could > write yield_if_true in such a way that it did not become a generator > despite yielding? I don't see how this would work. The problem isn't that yield_if_true becomes a generator -- it's that the function calling yield_if_true *doesn't* become a generator, even though it needs to. > Let's call it 'yield_' , for lack of a > better name. The function would yield the nearest generator on the > call stack. But if none of the calling functions have yields anywhere else, then they're just ordinary functions, and there is *no* generator on the call stack! -- Greg From greg.ewing at canterbury.ac.nz Wed Jul 28 03:27:39 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 28 Jul 2010 13:27:39 +1200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: <4C4F878B.9050501@canterbury.ac.nz> Bruce Leban wrote: > Isn't there are already a syntax for this? > > expr = lambda: x + y * z > > Maybe you want some conversion of that lambda into a different form: > > expr = @ast lambda: x + y + z If you need new syntax for this, then it's a sign that there *isn't* already a syntax for what we want. Given that we need new syntax anyway, there doesn't seem to be any point in bothering with the lambda: expr = @ast: x + y + z or any other suitable syntax. 
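For comparison, the closest thing available without any new syntax is the string-parsing approach Guido raised — parse the expression text by hand with the ast module:

```python
import ast

# Parse the expression text into an AST -- no new syntax needed, but
# the code inside the string is invisible to editors and tools.
expr = ast.parse("x + y * z", mode="eval")
assert isinstance(expr, ast.Expression)
print(type(expr.body).__name__)   # BinOp
# Evaluating it later, with bindings supplied explicitly:
print(eval(compile(expr, "<expr>", "eval"), {"x": 1, "y": 2, "z": 3}))   # 7
```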
-- Greg From greg.ewing at canterbury.ac.nz Wed Jul 28 03:43:37 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 28 Jul 2010 13:43:37 +1200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: <4C4F8B49.7030002@canterbury.ac.nz> Guido van Rossum wrote: > Maybe you could just as well make it a plain string literal and call a > function that parses it into a parse tree: > > expr = parse("x + y*z") > assert isinstance(expr, ast.Expression) > > The advantage of this approach is that you can define a different > language too... This is more or less what we have now when we pass SQL queries as strings to database engines. There are numerous problems with this. One of them is the fact that editors have no idea that the code inside the string is code, so they can't help you out with any syntax highlighting, formatting, etc. Another is that it makes passing parameters to the embedded code very awkward. One of the things I would like to get from a code-as-ast feature is a natural way of embedding sub-expressions that *do* get evaluated according to the normal Python rules. For example, one should be able to write something like cust = "SMITH" date = today() sales = select(transactions, @ast: customer_code == cust and transaction_date == date) and have it possible for the implementation of select() to easily and safely evaluate 'cust' and 'date' in the calling environment. 
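A rough sketch of the evaluation half of that select() idea, using an expression string in place of the proposed @ast syntax (sys._getframe is used here purely for illustration; an AST-based version would resolve Name nodes against the caller's scope the same way):

```python
import sys

def select(rows, condition):
    # Evaluate `condition` once per row, with the *caller's* variables
    # visible alongside the row's columns.
    caller = sys._getframe(1)
    out = []
    for row in rows:
        env = {**caller.f_globals, **caller.f_locals, **row}
        if eval(condition, env):
            out.append(row)
    return out

transactions = [
    {"customer_code": "SMITH", "amount": 10},
    {"customer_code": "JONES", "amount": 20},
]
cust = "SMITH"
sales = select(transactions, "customer_code == cust")
print(sales)   # [{'customer_code': 'SMITH', 'amount': 10}]
```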
-- Greg From greg.ewing at canterbury.ac.nz Wed Jul 28 03:50:15 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 28 Jul 2010 13:50:15 +1200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> <3EEB340D-0BF3-48EE-AC02-5E6A7DF1495C@masklinn.net> Message-ID: <4C4F8CD7.3030409@canterbury.ac.nz> Alexander Belopolsky wrote: > Not enough to justify new syntax IMO. Just create your parse trees at > the module or class level. chances are that's where they belong > anyways. Except when they don't, and would be clearer written in-line at the point of use. -- Greg From pyideas at rebertia.com Wed Jul 28 03:56:14 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Tue, 27 Jul 2010 18:56:14 -0700 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: <4C4F8B49.7030002@canterbury.ac.nz> References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> <4C4F8B49.7030002@canterbury.ac.nz> Message-ID: On Tue, Jul 27, 2010 at 6:43 PM, Greg Ewing wrote: > One of the things I would > like to get from a code-as-ast feature is a natural way > of embedding sub-expressions that *do* get evaluated > according to the normal Python rules. For example, > one should be able to write something like > > ?cust = "SMITH" > ?date = today() > ?sales = select(transactions, > ? ?@ast: customer_code == cust and transaction_date == date) > > and have it possible for the implementation of select() > to easily and safely evaluate 'cust' and 'date' in the > calling environment. In other words, you want (possibly an implicit form of) the comma operator from Scheme's quasiquote.[1] Maybe Paul Graham /was/ onto something. 
Cheers, Chris -- [1] http://www.cs.hut.fi/Studies/T-93.210/schemetutorial/node7.html From greg.ewing at canterbury.ac.nz Wed Jul 28 04:10:46 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 28 Jul 2010 14:10:46 +1200 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> <4C4F8B49.7030002@canterbury.ac.nz> Message-ID: <4C4F91A6.1040205@canterbury.ac.nz> Chris Rebert wrote: > In other words, you want (possibly an implicit form of) the comma > operator from Scheme's quasiquote. I thought about that, but I'd rather avoid having to expicitly mark the sub-expressions if possible. The way I envisage it, each of the AST nodes would have an eval() method which would evaluate it in the calling environment. It would be up to the consumer of the AST to decide when to call it. -- Greg From ianb at colorstudy.com Wed Jul 28 04:42:45 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 27 Jul 2010 21:42:45 -0500 Subject: [Python-ideas] Non-boolean return from __contains__ In-Reply-To: References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> Message-ID: On Tue, Jul 27, 2010 at 7:17 PM, Bruce Frederiksen wrote: > The one catch is that it is often nice to be able to refer to Python > variables (at least; or perhaps full expressions) within the mini-language. > The db_cursor.execute is an example, and putting the placeholders in the SQL > syntax with arguments later works, but gets tedious. 
To be able to include > Python expressions directly within the mini-language, the library > implementing the new syntax would have to be able to translate the new > syntax into Python and have it spliced into the code where it was used. > Something like an intelligent macro expansion. This means that the > library's translation code has to be called from the Python compiler. > > I'm not familiar enough with the compiler to know how crazy this is. > Mython tries to do something similar. > Basically templating languages do this, and a templating language could be used for exactly this sort of purpose (but *not* simply a generic templating language, then you get SQL or whatever-else injection problems). I put together a small example that might work for SQL: http://svn.colorstudy.com/home/ianb/recipes/sqltemplate.py Unfortunately templating languages in Python aren't nearly as easy to implement as they should be, and the results are not as elegant as they could be. I've been using templating snippets a lot more in my code lately, and find it more handy than I would have originally expected. It doesn't seem unreasonable in this case either. Well... unless you want to introspect the expression, which would mean parsing the resulting SQL (and is then hard). -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From funbuggie at gmail.com Wed Jul 28 05:13:58 2010 From: funbuggie at gmail.com (Barend erasmus) Date: Wed, 28 Jul 2010 05:13:58 +0200 Subject: [Python-ideas] Counter Message-ID: Hi made a counter that count till 1 000 000 000.How long will it take to get there when there is no delay. Thx Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmjohnson.mailinglist at gmail.com Wed Jul 28 06:38:02 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. 
Johnson)
Date: Tue, 27 Jul 2010 18:38:02 -1000
Subject: [Python-ideas] PEP 380 alternative: A yielding function
In-Reply-To: <4C4F8590.60004@canterbury.ac.nz>
References: <4C4F14E6.1060102@jmunch.dk> <4C4F8590.60004@canterbury.ac.nz>
Message-ID:

What would this code do:

def yield_if_true(x):
    if x:
        yield_(x)

def maybe_yield(x):
    if calculate_property(x):
        yield_if_true(x)
    else:
        return None

maybe_yield(5)

When maybe_yield(5) is called, different things need to happen
depending on whether it's a function or a generator. If it's a
generator, it shouldn't execute calculate_property(x) yet, because
generators don't execute their contents until someone says
next(maybe_yield(5)) (or maybe_yield(5).next() in Py2.x). On the other
hand, if it's not a generator but a function, then it should run
calculate_property right away. You could try starting out as a
function and then switching to being a generator if anything that you
call has a call to yield_ inside of it, but that strikes me as
extremely clumsy and complicated, and it could lead to unpleasant
surprises if calculate_property has side-effects.

ISTM there's no way to do something like PEP 380 without requiring
that some special keyword is used to indicate that the def is creating
a generator and not creating a function. There are other directions to
take this besides yield from (for example, we could replace def with
gen or some such and give return a special meaning inside a gen
statement), but to try to do it without any kind of keyword at the
site of the caller means violating "In the face of ambiguity, refuse
the temptation to guess."

-- Carl Johnson

On Tue, Jul 27, 2010 at 3:19 PM, Greg Ewing wrote:
> Anders J. Munch wrote:
>
>> But suppose you could address the source instead? Suppose you could
>> write yield_if_true in such a way that it did not become a generator
>> despite yielding?
>
> I don't see how this would work.
The problem isn't that > yield_if_true becomes a generator -- it's that the function > calling yield_if_true *doesn't* become a generator, even > though it needs to. > >> Let's call it 'yield_' , for lack of a >> better name. ?The function would yield the nearest generator on the >> call stack. > > But if none of the calling functions have yields anywhere > else, then they're just ordinary functions, and there is > *no* generator on the call stack! > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ncoghlan at gmail.com Wed Jul 28 06:39:41 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 28 Jul 2010 14:39:41 +1000 Subject: [Python-ideas] Readability of hex strings (Was: Use of coding cookie in 3.x stdlib) In-Reply-To: <201007271012.44460.steve@pearwood.info> References: <201007271012.44460.steve@pearwood.info> Message-ID: (another attempt at getting this discussion over on python-ideas where it belongs) On Tue, Jul 27, 2010 at 10:12 AM, Steven D'Aprano wrote: > Since it only takes a pair of small helper functions to convert hex > dumps in the form "XXXX XXXX ..." to and from byte strings, I don't see > the need for new syntax and would vote -1 on the idea. However, I'd > vote +0 on a matching bytes.tohex() method to partner with the existing > bytes.fromhex(). Having written my own bytes formatting function to do exactly as Anatoly asks (i.e. display a string of binary data as hex characters with spaces between each byte), I can understand the desire to have something like that readily available. 
The following is not particularly intuitive:

>>> " ".join(format(c, "x") for c in b"abcdefABCDEF")
'61 62 63 64 65 66 41 42 43 44 45 46'

The 2.x equivalent is just as bad:

>>> " ".join(format(ord(c), "x") for c in "abcdefABCDEF")
'61 62 63 64 65 66 41 42 43 44 45 46'

However, I'll caveat that support by pointing out that the basic
formatting quickly becomes inadequate for many purposes. Personally, I
quickly replaced it with a fixed-width dump format that provides the
ASCII character dumps over on the right-hand side, the way most hex
editors do.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From andreengels at gmail.com Wed Jul 28 07:38:18 2010
From: andreengels at gmail.com (Andre Engels)
Date: Wed, 28 Jul 2010 07:38:18 +0200
Subject: [Python-ideas] Counter
In-Reply-To:
References:
Message-ID:

On Wed, Jul 28, 2010 at 5:13 AM, Barend erasmus wrote:
> Hi
>
> made a counter that count till 1 000 000 000. How
> long will it take to get there when there is no delay.

That depends - how have you programmed your counter, what do you do
with the result, what computer are you using, what other processes are
running? Rather than asking people who do not know all these things,
you'd better try it out yourself: Let it count to 10 000 000, and
multiply that result by 100. It will not be exact, but it will be a
much better estimate than I or anyone else here can provide.

--
André Engels, andreengels at gmail.com

From pyideas at rebertia.com Wed Jul 28 07:43:03 2010
From: pyideas at rebertia.com (Chris Rebert)
Date: Tue, 27 Jul 2010 22:43:03 -0700
Subject: [Python-ideas] Counter
In-Reply-To:
References:
Message-ID:

On Tue, Jul 27, 2010 at 8:13 PM, Barend erasmus wrote:
> Hi
>
> made a counter that count till 1 000 000 000. How
> long will it take to get there when there is no delay.
>
> Thx
>
> Ben

Your post is completely off-topic. This mailing list (python-ideas) is
for proposing/discussing ideas for improving/modifying the Python
language.
For general discussion and questions about Python, please post to python-list/comp.lang.python instead. It is accessible from either: http://mail.python.org/mailman/listinfo/python-list http://groups.google.com/group/comp.lang.python/topics Regards, Chris From solipsis at pitrou.net Wed Jul 28 14:49:55 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 28 Jul 2010 14:49:55 +0200 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy References: <1279740852.3222.38.camel@localhost.localdomain> Message-ID: <20100728144955.06236bdb@pitrou.net> Hello, On Sat, 24 Jul 2010 14:31:42 -0700 "Gregory P. Smith" wrote: > > The EnvrionmentError hierarchy and common errno test code has bothered me > for a while. While I think the namespace pollution concern is valid I would > suggest adding "Error" to the end of all of the names (your initial proposal > only says "Error" on the end of one of them) as that is consistent with the > bulk of the existing standard exceptions and warnings. They are unlikely to > conflict with anything other than exceptions people have already defined > themselves in any existing code (which could likely be refactored out after > we officially define these). The reason I haven't added "Error" to them is that the names are already quite long, and it's quite obvious that they refer to errors. I'm obviously not religious about it, though :) Regards Antoine. From mal at egenix.com Wed Jul 28 15:06:44 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 28 Jul 2010 15:06:44 +0200 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: <20100728144955.06236bdb@pitrou.net> References: <1279740852.3222.38.camel@localhost.localdomain> <20100728144955.06236bdb@pitrou.net> Message-ID: <4C502B64.6020309@egenix.com> Antoine Pitrou wrote: > > Hello, > > On Sat, 24 Jul 2010 14:31:42 -0700 > "Gregory P. 
Smith" wrote: >> >> The EnvrionmentError hierarchy and common errno test code has bothered me >> for a while. While I think the namespace pollution concern is valid I would >> suggest adding "Error" to the end of all of the names (your initial proposal >> only says "Error" on the end of one of them) as that is consistent with the >> bulk of the existing standard exceptions and warnings. They are unlikely to >> conflict with anything other than exceptions people have already defined >> themselves in any existing code (which could likely be refactored out after >> we officially define these). > > The reason I haven't added "Error" to them is that the names are > already quite long, and it's quite obvious that they refer to errors. > I'm obviously not religious about it, though :) Please keep the "Error" suffix on those exception class names. This is common practice and we wouldn't want to break with it just because the names get a little longer (we have editor type completion to deal with that ;-). Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 28 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From guido at python.org Wed Jul 28 16:26:25 2010
From: guido at python.org (Guido van Rossum)
Date: Wed, 28 Jul 2010 07:26:25 -0700
Subject: [Python-ideas] PEP 380 alternative: A yielding function
In-Reply-To:
References: <4C4F14E6.1060102@jmunch.dk> <4C4F8590.60004@canterbury.ac.nz>
Message-ID:

On Tue, Jul 27, 2010 at 9:38 PM, Carl M. Johnson wrote:
> What would this code do:
>
> def yield_if_true(x):
>     if x:
>         yield_(x)
>
> def maybe_yield(x):
>     if calculate_property(x):
>         yield_if_true(x)
>     else:
>         return None
>
> maybe_yield(5)
>
> When maybe_yield(5) is called, different things need to happen
> depending on whether it's a function or a generator. If it's a
> generator, it shouldn't execute calculate_property(x) yet, because
> generators don't execute their contents until someone says
> next(maybe_yield(5)) (or maybe_yield(5).next() in Py2.x). On the other
> hand, if it's not a generator but a function, then it should run
> calculate_property right away. You could try starting out as a
> function and then switching to being a generator if anything that you
> call has a call to yield_ inside of it, but that strikes me as
> extremely clumsy and complicated, and it could lead to unpleasant
> surprises if calculate_property has side-effects.
>
> ISTM there's no way to do something like PEP 380 without requiring
> that some special keyword is used to indicate that the def is creating
> a generator and not creating a function. There are other directions to
> take this besides yield from (for example, we could replace def with
> gen or some such and give return a special meaning inside a gen
> statement), but to try to do it without any kind of keyword at the
> site of the caller means violating "In the face of ambiguity, refuse
> the temptation to guess."
Well, in a statically typed language the compiler could figure it out based on the type of the things being called -- "generator-ness" could be propagated by the type system just like "throws a certain expression" is in Java. -- --Guido van Rossum (python.org/~guido) From 2010 at jmunch.dk Wed Jul 28 18:31:51 2010 From: 2010 at jmunch.dk (Anders J. Munch) Date: Wed, 28 Jul 2010 18:31:51 +0200 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: References: <4C4F14E6.1060102@jmunch.dk> <4C4F8590.60004@canterbury.ac.nz> Message-ID: <4C505B77.9080106@jmunch.dk> Carl M. Johnson wrote: > What would this code do: > > def yield_if_true(x): > if x: > yield_(x) > > def maybe_yield(x): > if calculate_property(x): > yield_if_true(x) > else: > return None > > maybe_yield(5) maybe_yield is not a generator, because it does not use the yield keyword. Nothing changed about that. As there's no generator here, yield_(x) would raise some exception: "RuntimeError: No generator found for yield_" Regular yield syntax remains the preferred option - with yield_ as a supplement for when delegation is needed. Perhaps a better name for it would be 'yield_outer' or 'nonlocal_yield'. regards, Anders From scott+python-ideas at scottdial.com Wed Jul 28 18:54:42 2010 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Wed, 28 Jul 2010 12:54:42 -0400 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: <4C4F14E6.1060102@jmunch.dk> References: <4C4F14E6.1060102@jmunch.dk> Message-ID: <4C5060D2.2070402@scottdial.com> On 7/27/2010 1:18 PM, Anders J. Munch wrote: > But suppose you could address the source instead? Suppose you could > write yield_if_true in such a way that it did not become a generator > despite yielding? ... > Now the example would work with a slight modification: > > def yield_if_true(x): > if x: > yield_(x) > yield_if_true(a) > yield_if_true(b) > > The real benefits from a yield_ function come with recursive > functions.
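[For readers following along: the recursive-delegation case being debated here can be written with the `yield from` syntax that PEP 380 proposes, which eventually landed in Python 3.3. A minimal sketch — `flatten` is an illustrative example, not code from this thread:]

```python
# A sketch of recursive generator delegation using the `yield from`
# syntax proposed by PEP 380 (available since Python 3.3). Each
# recursive call is itself a generator; `yield from` forwards its
# values (and send()/throw()/close()) to the caller transparently.
def flatten(items):
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to the recursive generator
        else:
            yield item

print(list(flatten([1, [2, [3, 4]], 5])))  # -> [1, 2, 3, 4, 5]
```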
Except now yield_if_true() is not a generator. So, you have two classes of "generators" now (top-level/delegated), which breaks all sorts of things. How would you write izip() with this? You'd have to have a top-level (yield) version and a delegated (yield_()) version to satisfy all of the use-cases. I think your yield_() is actually a detriment to recursive functions. The "yield from" solution is superior in that the delegated generators are still normal generators to other call sites that are indifferent to that use case. I feel that there is no escaping that "for v in g: yield v" and small variations are an amazingly common pattern that is amazingly naive. Although for many use cases it works just fine; the unfortunate side-effect is that anyone building more clever generators on top finds themselves victimized by other authors. Creating a "yield from" syntax gives the other authors a pleasant and explicit shorthand, and provides for the other use cases automatically. > I'm guessing > you could implement 'yield from' as a pure-Python > function using yield_, making yield_ strictly more powerful PEP 380 already provides a pure Python description of "yield from" using existing syntax, therefore the assertion that you could implement it using yield_() provides no measure for how "powerful" your suggestion is. -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From 2010 at jmunch.dk Wed Jul 28 19:58:53 2010 From: 2010 at jmunch.dk (Anders J. Munch) Date: Wed, 28 Jul 2010 19:58:53 +0200 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: <4C5060D2.2070402@scottdial.com> References: <4C4F14E6.1060102@jmunch.dk> <4C5060D2.2070402@scottdial.com> Message-ID: <4C506FDD.3070208@jmunch.dk> Scott Dial wrote: > Except now yield_if_true() is not a generator. So, you have two classes > of "generators" now (top-level/delegated), which breaks all sorts of > things. Right, yield_if_true is a regular function, that's the whole point.
There'd still only be one kind of generator, defined by the presence of the yield keyword. Nothing would break. The function that calls yield_if_true (or some other function up the call chain) would need to be a generator, if necessary made such using the traditional workaround if 0: yield regards, Anders From guido at python.org Wed Jul 28 20:28:51 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 28 Jul 2010 11:28:51 -0700 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: <4C506FDD.3070208@jmunch.dk> References: <4C4F14E6.1060102@jmunch.dk> <4C5060D2.2070402@scottdial.com> <4C506FDD.3070208@jmunch.dk> Message-ID: All, trust me, this idea is going nowhere. Don't waste your time. -- --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Thu Jul 29 02:48:39 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 29 Jul 2010 12:48:39 +1200 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: <4C506FDD.3070208@jmunch.dk> References: <4C4F14E6.1060102@jmunch.dk> <4C5060D2.2070402@scottdial.com> <4C506FDD.3070208@jmunch.dk> Message-ID: <4C50CFE7.9080404@canterbury.ac.nz> Anders J. Munch wrote: > Right, yield_if_true is a regular function, that's the whole point. What if it needs to call yield_() more than once? If it's just a regular function, then it has no ability to be suspended at the point of yield and resumed. -- Greg From mark at qtrac.eu Thu Jul 29 09:20:50 2010 From: mark at qtrac.eu (Mark Summerfield) Date: Thu, 29 Jul 2010 08:20:50 +0100 Subject: [Python-ideas] Distutils setup.py & per user site packages Message-ID: <20100729082050.07074800@dino> Hi, I have downloaded a package from PyPI that uses distutils setup.py. When I run it with -h it shows options for building and installing, but does not appear to have an option for installation in my per user site packages directory (see PEP 370). 
I think it would be useful to add a "--local" option to setup.py that would install into the per site package directory. This would allow people to keep their Linux distros pristine while still being able to install packages that their distros don't have. (Or is there such an option that I've missed?) (In my particular case it wasn't a problem; I just built it and moved it.) -- Mark Summerfield, Qtrac Ltd, www.qtrac.eu C++, Python, Qt, PyQt - training and consultancy "Programming in Python 3" - ISBN 0321680561 http://www.qtrac.eu/py3book.html From flub at devork.be Thu Jul 29 10:15:57 2010 From: flub at devork.be (Floris Bruynooghe) Date: Thu, 29 Jul 2010 09:15:57 +0100 Subject: [Python-ideas] Distutils setup.py & per user site packages In-Reply-To: <20100729082050.07074800@dino> References: <20100729082050.07074800@dino> Message-ID: <20100729081557.GA20903@laurie.devork.be> Hi Mark On Thu, Jul 29, 2010 at 08:20:50AM +0100, Mark Summerfield wrote: > I have downloaded a package from PyPI that uses distutils setup.py. > When I run it with -h it shows options for building and installing, but > does not appear to have an option for installation in my per user site > packages directory (see PEP 370). > > (Or is there such an option that I've missed?) $ python2.6 setup.py --help install Shows you that you can use $ python2.6 setup.py install --user for this. You have to be using python2.6 or higher though. Note that this question doesn't really belong on python-ideas, it should have been posted to distutils-sig or python-list (comp.lang.python). But proposing a --local does make the line blurred ;-). 
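[To make the PEP 370 target concrete: the per-user directories that `setup.py install --user` writes to can be queried from the standard `site` module. A small sketch — the paths shown in the comments are typical Linux defaults, not guaranteed values:]

```python
# Querying the PEP 370 per-user installation directories that
# `setup.py install --user` targets. Exact locations vary by
# platform and Python version.
import site

print(site.getuserbase())          # e.g. /home/you/.local
print(site.getusersitepackages())  # e.g. /home/you/.local/lib/pythonX.Y/site-packages
```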
Regards Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org From alexandre.conrad at gmail.com Thu Jul 29 12:41:35 2010 From: alexandre.conrad at gmail.com (Alexandre Conrad) Date: Thu, 29 Jul 2010 12:41:35 +0200 Subject: [Python-ideas] str.split with empty separator Message-ID: Hello all, What if str.split could take an empty separator? >>> 'banana'.split('') ['b', 'a', 'n', 'a', 'n', 'a'] I know this can be done with: >>> list('banana') ['b', 'a', 'n', 'a', 'n', 'a'] I think that, semantically speaking, it would make sense to split where there are no characters (in between them). Right now you can join from an empty string: ''.join(['b', 'a', 'n', 'a', 'n', 'a']) So why can't we split from an empty string? This wouldn't introduce any backwards incompatible changes as str.split currently can't have an empty separator: >>> 'banana'.split('') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: empty separator I would love to see my banana actually split. :) Regards, -- Alex twitter.com/alexconrad From ziade.tarek at gmail.com Thu Jul 29 13:35:41 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Thu, 29 Jul 2010 13:35:41 +0200 Subject: [Python-ideas] Json object-level serializer Message-ID: Hello, What about adding in the json package the ability for an object to provide a different object to serialize ? This would be useful to translate a class into a structure that can be passed to json.dumps So, if __json__ is provided, it's used for serialization instead of the object itself: >>> import json >>> class MyComplexClass(object): ... def __json__(self): ... return 'json' ... >>> o = MyComplexClass() >>> json.dumps(o) '"json"' Cheers Tarek -- Tarek Ziadé
| http://ziade.org From phd at phd.pp.ru Thu Jul 29 13:40:59 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Thu, 29 Jul 2010 15:40:59 +0400 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: <20100729114059.GA2798@phd.pp.ru> On Thu, Jul 29, 2010 at 01:35:41PM +0200, Tarek Ziad? wrote: > > What about adding in the json package the ability for an object to > provide a different object to serialize ? > This would be useful to translate a class into a structure that can be > passed to json.dumps > > So, it __json__ is provided, its used for serialization instead of the > object itself: Also there must be a deserialization hook. Pickle uses __setstate__, and pickle stores the name of the class to call __setstate__ upon. Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From solipsis at pitrou.net Thu Jul 29 13:41:18 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Jul 2010 13:41:18 +0200 Subject: [Python-ideas] Json object-level serializer References: Message-ID: <20100729134118.206545ec@pitrou.net> On Thu, 29 Jul 2010 13:35:41 +0200 Tarek Ziad? wrote: > Hello, > > What about adding in the json package the ability for an object to > provide a different object to serialize ? > This would be useful to translate a class into a structure that can be > passed to json.dumps How about letting json use the __reduce__ protocol instead? From ziade.tarek at gmail.com Thu Jul 29 13:54:12 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Thu, 29 Jul 2010 13:54:12 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <20100729114059.GA2798@phd.pp.ru> References: <20100729114059.GA2798@phd.pp.ru> Message-ID: On Thu, Jul 29, 2010 at 1:40 PM, Oleg Broytman wrote: > On Thu, Jul 29, 2010 at 01:35:41PM +0200, Tarek Ziad? 
wrote: >> >> What about adding in the json package the ability for an object to >> provide a different object to serialize ? >> This would be useful to translate a class into a structure that can be >> passed to json.dumps >> >> So, it __json__ is provided, its used for serialization instead of the >> object itself: > > ? Also there must be a deserialization hook. Pickle uses __setstate__, and > pickle stores the name of the class to call __setstate__ upon. You cannot do a round trip because once the object is serialized, json don't know which class to instantiate to de-serialize it Which is fine really, since json just serialize simple elements. Cheers Tarek -- Tarek Ziad? | http://ziade.org From mal at egenix.com Thu Jul 29 13:54:28 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 29 Jul 2010 13:54:28 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <20100729134118.206545ec@pitrou.net> References: <20100729134118.206545ec@pitrou.net> Message-ID: <4C516BF4.8030702@egenix.com> Antoine Pitrou wrote: > On Thu, 29 Jul 2010 13:35:41 +0200 > Tarek Ziad? wrote: >> Hello, >> >> What about adding in the json package the ability for an object to >> provide a different object to serialize ? >> This would be useful to translate a class into a structure that can be >> passed to json.dumps > > How about letting json use the __reduce__ protocol instead? How would you then write a class that works with both pickle and json ? IMO, we'd need a separate method to return a JSON version of the object, e.g. .__json__(). I'm not sure how deserialization could be handled, since JSON doesn't support arbitrary object types. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ziade.tarek at gmail.com Thu Jul 29 13:56:48 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Thu, 29 Jul 2010 13:56:48 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <20100729134118.206545ec@pitrou.net> References: <20100729134118.206545ec@pitrou.net> Message-ID: On Thu, Jul 29, 2010 at 1:41 PM, Antoine Pitrou wrote: > On Thu, 29 Jul 2010 13:35:41 +0200 > Tarek Ziad? wrote: >> Hello, >> >> What about adding in the json package the ability for an object to >> provide a different object to serialize ? >> This would be useful to translate a class into a structure that can be >> passed to json.dumps > > How about letting json use the __reduce__ protocol instead? Maybe that's because I've never used it, but I find this protocol is very complex for this simple use case > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Tarek Ziad? | http://ziade.org From ziade.tarek at gmail.com Thu Jul 29 14:02:21 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Thu, 29 Jul 2010 14:02:21 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <4C516BF4.8030702@egenix.com> References: <20100729134118.206545ec@pitrou.net> <4C516BF4.8030702@egenix.com> Message-ID: On Thu, Jul 29, 2010 at 1:54 PM, M.-A. Lemburg wrote: > Antoine Pitrou wrote: >> On Thu, 29 Jul 2010 13:35:41 +0200 >> Tarek Ziad? 
wrote: >>> Hello, >>> >>> What about adding in the json package the ability for an object to >>> provide a different object to serialize ? >>> This would be useful to translate a class into a structure that can be >>> passed to json.dumps >> >> How about letting json use the __reduce__ protocol instead? > > How would you then write a class that works with both pickle > and json ? > > IMO, we'd need a separate method to return a JSON version of > the object, e.g. .__json__(). I'm not sure how deserialization > could be handled, since JSON doesn't support arbitrary object > types. As I told Oleg, I think its OK not to have a round trip like Pickle. The use case I have is to express a structure in Json, but loading it back can be done in a custom, explicit process. It cannot be triggered from the json package itself since it cannot know that a given Json structure was built through a specific class. Cheers Tarek > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source ?(#1, Jul 29 2010) >>>> Python/Zope Consulting and Support ... ? ? ? ?http://www.egenix.com/ >>>> mxODBC.Zope.Database.Adapter ... ? ? ? ? ? ? http://zope.egenix.com/ >>>> mxODBC, mxDateTime, mxTextTools ... ? ? ? ?http://python.egenix.com/ > ________________________________________________________________________ > > ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: > > > ? eGenix.com Software, Skills and Services GmbH ?Pastor-Loeh-Str.48 > ? ?D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg > ? ? ? ? ? Registered at Amtsgericht Duesseldorf: HRB 46611 > ? ? ? ? ? ? ? http://www.egenix.com/company/contact/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Tarek Ziad? | http://ziade.org From 2010 at jmunch.dk Thu Jul 29 14:06:37 2010 From: 2010 at jmunch.dk (Anders J. 
Munch) Date: Thu, 29 Jul 2010 14:06:37 +0200 Subject: [Python-ideas] PEP 380 alternative: A yielding function In-Reply-To: <4C50CFE7.9080404@canterbury.ac.nz> References: <4C4F14E6.1060102@jmunch.dk> <4C5060D2.2070402@scottdial.com> <4C506FDD.3070208@jmunch.dk> <4C50CFE7.9080404@canterbury.ac.nz> Message-ID: <4C516ECD.2070204@jmunch.dk> Greg Ewing wrote: > Anders J. Munch wrote: > >> Right, yield_if_true is a regular function, that's the whole point. > > What if it needs to call yield_() more than once? If it's > just a regular function, then it has no ability to be > suspended at the point of yield and resumed. I meant a regular function from the point of view of the compiler. The implementation would be special, of course. And therein lies the rub: It's unimplementable in CPython, alas. It could work in an implementation with a non-recursive eval loop, but if I'm not much mistaken, CPython recurses the eval loop even for a pure-Python function call. regards, Anders From g.brandl at gmx.net Thu Jul 29 14:22:01 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 29 Jul 2010 14:22:01 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: Am 29.07.2010 13:35, schrieb Tarek Ziad?: > Hello, > > What about adding in the json package the ability for an object to > provide a different object to serialize ? > This would be useful to translate a class into a structure that can be > passed to json.dumps > > So, it __json__ is provided, its used for serialization instead of the > object itself: > >>>> import json >>>> class MyComplexClass(object): > .... def __json__(self): > .... return 'json' > .... >>>> o = MyComplexClass() >>>> json.dumps(o) > '"json"' You can do this with a very short subclass of the JSONEncoder: class MyJSONEncoder(JSONEncoder): def default(self, obj): return obj.__json__() # with a useful failure message I don't think it needs to be built into the default encoder. 
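[A self-contained sketch of the subclassing route described above; `MyJSONEncoder` and `Point` are illustrative names, not code from the thread. `JSONEncoder.default` is only invoked for objects the encoder cannot serialize natively:]

```python
import json

# Sketch of the suggested subclass: `default` is called only for objects
# json cannot serialize on its own, so it can dispatch to a (proposed,
# not built-in) __json__ method and fall back to the base class, which
# raises a TypeError with a useful message.
class MyJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        try:
            return obj.__json__()
        except AttributeError:
            return super().default(obj)

class Point:  # illustrative class
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __json__(self):
        return {"x": self.x, "y": self.y}

print(json.dumps(Point(1, 2), cls=MyJSONEncoder))  # -> {"x": 1, "y": 2}
```

The same hook is also reachable without subclassing, via `json.dumps(obj, default=some_function)`.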
Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From mal at egenix.com Thu Jul 29 14:27:26 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 29 Jul 2010 14:27:26 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729134118.206545ec@pitrou.net> <4C516BF4.8030702@egenix.com> Message-ID: <4C5173AE.4060309@egenix.com> Tarek Ziad? wrote: > On Thu, Jul 29, 2010 at 1:54 PM, M.-A. Lemburg wrote: >> Antoine Pitrou wrote: >>> On Thu, 29 Jul 2010 13:35:41 +0200 >>> Tarek Ziad? wrote: >>>> Hello, >>>> >>>> What about adding in the json package the ability for an object to >>>> provide a different object to serialize ? >>>> This would be useful to translate a class into a structure that can be >>>> passed to json.dumps >>> >>> How about letting json use the __reduce__ protocol instead? >> >> How would you then write a class that works with both pickle >> and json ? >> >> IMO, we'd need a separate method to return a JSON version of >> the object, e.g. .__json__(). I'm not sure how deserialization >> could be handled, since JSON doesn't support arbitrary object >> types. > > As I told Oleg, I think its OK not to have a round trip like Pickle. > > The use case I have is to express a structure in Json, but loading it back > can be done in a custom, explicit process. > > It cannot be triggered from the json package itself since it cannot know > that a given Json structure was built through a specific class. I just wanted to emphasize that a separate new method is needed, rather than trying to reuse a pickle-protocol method. I don't think deserialization support is needed either. 
The application getting the decoded JSON data can do that in an application specific way based on the lists and dictionaries it gets from the JSON decoder. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From mal at egenix.com Thu Jul 29 14:31:11 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 29 Jul 2010 14:31:11 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: <4C51748F.9090006@egenix.com> Georg Brandl wrote: > Am 29.07.2010 13:35, schrieb Tarek Ziad?: >> Hello, >> >> What about adding in the json package the ability for an object to >> provide a different object to serialize ? >> This would be useful to translate a class into a structure that can be >> passed to json.dumps >> >> So, it __json__ is provided, its used for serialization instead of the >> object itself: >> >>>>> import json >>>>> class MyComplexClass(object): >> .... def __json__(self): >> .... return 'json' >> .... >>>>> o = MyComplexClass() >>>>> json.dumps(o) >> '"json"' > > You can do this with a very short subclass of the JSONEncoder: > > class MyJSONEncoder(JSONEncoder): > def default(self, obj): > return obj.__json__() # with a useful failure message Does that also work with the JSON C extension ? > I don't think it needs to be built into the default encoder. 
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ziade.tarek at gmail.com Thu Jul 29 14:39:09 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Thu, 29 Jul 2010 14:39:09 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: On Thu, Jul 29, 2010 at 2:22 PM, Georg Brandl wrote: .. > You can do this with a very short subclass of the JSONEncoder: > > class MyJSONEncoder(JSONEncoder): > ? ?def default(self, obj): > ? ? ? ?return obj.__json__() ?# with a useful failure message > > I don't think it needs to be built into the default encoder. Yes, but you need to customize in that case the encoding process and own it. Having a builtin recognition of __json__ would allow you to pass your objects to be serialized to any third party code that uses a plain json.dumps. For instance, some web kits out there will automatically serialize your objects into json strings when you want to do json responses. e.g. it becomes a builtin adapter Cheers Tarek From fetchinson at googlemail.com Thu Jul 29 14:47:34 2010 From: fetchinson at googlemail.com (Daniel Fetchinson) Date: Thu, 29 Jul 2010 14:47:34 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: > What about adding in the json package the ability for an object to > provide a different object to serialize ? 
> This would be useful to translate a class into a structure that can be > passed to json.dumps > > So, it __json__ is provided, its used for serialization instead of the > object itself: > >>>> import json >>>> class MyComplexClass(object): > ... def __json__(self): > ... return 'json' > ... >>>> o = MyComplexClass() >>>> json.dumps(o) > '"json"' Have a look at turbojson [1], the jsonification package that uses peak.rules [2] and which comes with turbogears [3]. It does exactly what you propose. Cheers, Daniel [1a] http://pypi.python.org/pypi/TurboJson [1b] http://svn.turbogears.org/projects/TurboJson [2a] pypi.python.org/pypi/PEAK-Rules [2b] http://peak.telecommunity.com/DevCenter/RulesReadme [3] http:///www.turbogears.org -- Psss, psss, put it down! - http://www.cafepress.com/putitdown From ncoghlan at gmail.com Thu Jul 29 14:51:11 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 29 Jul 2010 22:51:11 +1000 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: On Thu, Jul 29, 2010 at 10:39 PM, Tarek Ziad? wrote: > On Thu, Jul 29, 2010 at 2:22 PM, Georg Brandl wrote: > .. >> You can do this with a very short subclass of the JSONEncoder: >> >> class MyJSONEncoder(JSONEncoder): >> ? ?def default(self, obj): >> ? ? ? ?return obj.__json__() ?# with a useful failure message >> >> I don't think it needs to be built into the default encoder. > > Yes, but you need to customize in that case the encoding process and own it. > > Having a builtin recognition of __json__ would allow you to pass your objects > to be serialized to any third party code that uses a plain json.dumps. > > For instance, some web kits out there will automatically serialize > your objects into json strings > when you want to do json responses. e.g. 
it becomes a builtin adapter I'll channel PJE here and point out that this kind of magic-method based protocol proliferation is exactly what a general purpose generic-function implementation is designed to avoid (i.e. instead of having json.dumps check for a __json__ magic method, you'd just flag json.dumps as a generic function and let people register their own overloads). Each individual time this question comes up people tend to react with "oh, that's too complicated and overkill, but magic methods are simple, so let's just define another magic method". The sum total of all those magic methods starts to accumulate into a lot of complexity of its own though :P Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From solipsis at pitrou.net Thu Jul 29 14:57:13 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Jul 2010 14:57:13 +0200 Subject: [Python-ideas] Json object-level serializer References: Message-ID: <20100729145713.3bdc4cdd@pitrou.net> On Thu, 29 Jul 2010 14:47:34 +0200 Daniel Fetchinson wrote: > > What about adding in the json package the ability for an object to > > provide a different object to serialize ? > > This would be useful to translate a class into a structure that can be > > passed to json.dumps > > > > So, it __json__ is provided, its used for serialization instead of the > > object itself: > > > >>>> import json > >>>> class MyComplexClass(object): > > ... def __json__(self): > > ... return 'json' > > ... > >>>> o = MyComplexClass() > >>>> json.dumps(o) > > '"json"' > > Have a look at turbojson [1], the jsonification package that uses > peak.rules [2] and which comes with turbogears [3]. It does exactly > what you propose. That it uses PEAK-Rules is probably a good reason to avoid it. Also, AFAIK, TurboGears have stopped using turbojson and relies on [simple]json instead. Regards Antoine. 
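[For readers curious what the generic-function alternative looks like in practice: the standard library later grew `functools.singledispatch` (Python 3.4, after this thread). Callers register per-type overloads on a single conversion function and feed it to `json.dumps` via its `default` hook; a sketch with illustrative names:]

```python
import json
from functools import singledispatch

# Sketch of the generic-function approach: one overloadable conversion
# function instead of a __json__ magic method.
@singledispatch
def to_jsonable(obj):
    raise TypeError("cannot serialize %s" % type(obj).__name__)

class Point:  # illustrative class
    def __init__(self, x, y):
        self.x, self.y = x, y

@to_jsonable.register(Point)
def _(obj):
    return {"x": obj.x, "y": obj.y}

print(json.dumps(Point(1, 2), default=to_jsonable))  # -> {"x": 1, "y": 2}
```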
From ncoghlan at gmail.com Thu Jul 29 14:59:52 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 29 Jul 2010 22:59:52 +1000 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: On Thu, Jul 29, 2010 at 10:47 PM, Daniel Fetchinson wrote: > Have a look at turbojson [1], the jsonification package that uses > peak.rules [2] and which comes with turbogears [3]. It does exactly > what you propose. Speaking of PJE and generic functions* ;) Cheers, Nick. *For those following along at home that may not be familiar with the names of various Python developers, PJE is Phillip J. Eby, the author of peak.rules (amongst many other things). -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mikegraham at gmail.com Thu Jul 29 15:01:27 2010 From: mikegraham at gmail.com (Mike Graham) Date: Thu, 29 Jul 2010 09:01:27 -0400 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: On Thu, Jul 29, 2010 at 7:35 AM, Tarek Ziad? wrote: > Hello, > > What about adding in the json package the ability for an object to > provide a different object to serialize ? > This would be useful to translate a class into a structure that can be > passed to json.dumps > > So, it __json__ is provided, its used for serialization instead of the > object itself: > >>>> import json >>>> class MyComplexClass(object): > ... ? ? def __json__(self): > ... ? ? ? ? return 'json' > ... >>>> o = MyComplexClass() >>>> json.dumps(o) > '"json"' > > > > Cheers > Tarek > > -- > Tarek Ziad? | http://ziade.org Since there isn't really any magic going on, why use a __foo__ name? The majority of __foo__ names are for things you shouldn't reference yourself, but it doesn't seem like this is too personal a method to do that with. This allows inheritance of JSONization. The current custom serialization stuff does not. I'm not certain which is the bug and which is the feature. 
Since you aren't using anything useful from the json module, why involve it at all? Consistent API? One nice thing about the json module is that when using it you always produce valid JSON. Even the hooks for custom serialization keep this property. This is fairly nice to have. Regards, Mike From solipsis at pitrou.net Thu Jul 29 15:11:16 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Jul 2010 15:11:16 +0200 Subject: [Python-ideas] Json object-level serializer References: Message-ID: <20100729151116.219d6765@pitrou.net> On Thu, 29 Jul 2010 22:51:11 +1000 Nick Coghlan wrote: > > Each individual time this question comes up people tend to react with > "oh, that's too complicated and overkill, but magic methods are > simple, so let's just define another magic method". The sum total of > all those magic methods starts to accumulate into a lot of complexity > of its own though :P I don't agree. __json__ only matters to people who do JSON encoding/decoding. Other people can safely ignore it. And I don't see how generic functions bring less cognitive overhead. (they actually bring more of it, since most implementations are more complicated to begin with) From solipsis at pitrou.net Thu Jul 29 15:13:34 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Jul 2010 15:13:34 +0200 Subject: [Python-ideas] Json object-level serializer References: Message-ID: <20100729151334.639738e6@pitrou.net> On Thu, 29 Jul 2010 22:59:52 +1000 Nick Coghlan wrote: > *For those following along at home that may not be familiar with the > names of various Python developers, PJE is Phillip J. Eby, the author > of peak.rules (amongst many other things). Many of which unmaintained or even incompatible with recent Python versions (due to, for example, ugly bytecode hacks). Regards Antoine. 
From g.brandl at gmx.net  Thu Jul 29 15:12:54 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 29 Jul 2010 15:12:54 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <4C51748F.9090006@egenix.com>
References: <4C51748F.9090006@egenix.com>
Message-ID: 

Am 29.07.2010 14:31, schrieb M.-A. Lemburg:
> Georg Brandl wrote:
>> Am 29.07.2010 13:35, schrieb Tarek Ziadé:
>>> Hello,
>>>
>>> What about adding in the json package the ability for an object to
>>> provide a different object to serialize ?
>>> This would be useful to translate a class into a structure that can be
>>> passed to json.dumps
>>>
>>> So, if __json__ is provided, it's used for serialization instead of the
>>> object itself:
>>>
>>>>>> import json
>>>>>> class MyComplexClass(object):
>>> ....     def __json__(self):
>>> ....         return 'json'
>>> ....
>>>>>> o = MyComplexClass()
>>>>>> json.dumps(o)
>>> '"json"'
>>
>> You can do this with a very short subclass of the JSONEncoder:
>>
>> class MyJSONEncoder(JSONEncoder):
>>     def default(self, obj):
>>         return obj.__json__()  # with a useful failure message
>
> Does that also work with the JSON C extension ?

I think so.  The C encoder gets the default function as an argument.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From fetchinson at googlemail.com  Thu Jul 29 15:19:56 2010
From: fetchinson at googlemail.com (Daniel Fetchinson)
Date: Thu, 29 Jul 2010 15:19:56 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <20100729145713.3bdc4cdd@pitrou.net>
References: <20100729145713.3bdc4cdd@pitrou.net>
Message-ID: 

>> > What about adding in the json package the ability for an object to
>> > provide a different object to serialize ?
>> > This would be useful to translate a class into a structure that can be
>> > passed to json.dumps
>> >
>> > So, if __json__ is provided, it's used for serialization instead of the
>> > object itself:
>> >
>> >>>> import json
>> >>>> class MyComplexClass(object):
>> > ...     def __json__(self):
>> > ...         return 'json'
>> > ...
>> >>>> o = MyComplexClass()
>> >>>> json.dumps(o)
>> > '"json"'
>>
>> Have a look at turbojson [1], the jsonification package that uses
>> peak.rules [2] and which comes with turbogears [3]. It does exactly
>> what you propose.
>
> That it uses PEAK-Rules is probably a good reason to avoid it.

Why?

> Also, AFAIK, TurboGears have stopped using turbojson and relies on
> [simple]json instead.

That might be true for turbogears2 but turbogears1 (which is still in
active development) still uses turbojson. Turbogears 1 and 2 diverged
so much that it would be more appropriate to call them by different
names and consider them different projects (I personally use and
prefer tg1).

Cheers,
Daniel

-- 
Psss, psss, put it down! - http://www.cafepress.com/putitdown

From ziade.tarek at gmail.com  Thu Jul 29 15:25:20 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 29 Jul 2010 15:25:20 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jul 29, 2010 at 2:51 PM, Nick Coghlan wrote:
> On Thu, Jul 29, 2010 at 10:39 PM, Tarek Ziadé wrote:
>> On Thu, Jul 29, 2010 at 2:22 PM, Georg Brandl wrote:
>> ..
>>> You can do this with a very short subclass of the JSONEncoder:
>>>
>>> class MyJSONEncoder(JSONEncoder):
>>>     def default(self, obj):
>>>         return obj.__json__()  # with a useful failure message
>>>
>>> I don't think it needs to be built into the default encoder.
>>
>> Yes, but you need to customize in that case the encoding process and own it.
>>
>> Having a builtin recognition of __json__ would allow you to pass your objects
>> to be serialized to any third party code that uses a plain json.dumps.
>>
>> For instance, some web kits out there will automatically serialize
>> your objects into json strings
>> when you want to do json responses. e.g. it becomes a builtin adapter
>
> I'll channel PJE here and point out that this kind of magic-method
> based protocol proliferation is exactly what a general purpose
> generic-function implementation is designed to avoid (i.e. instead of
> having json.dumps check for a __json__ magic method, you'd just flag
> json.dumps as a generic function and let people register their own
> overloads).
>
> Each individual time this question comes up people tend to react with
> "oh, that's too complicated and overkill, but magic methods are
> simple, so let's just define another magic method". The sum total of
> all those magic methods starts to accumulate into a lot of complexity
> of its own though :P

That makes sense. OTOH, if we drop the idea of having a __magical__ method,
we could have a collections ABC instead, called JSONSerializable,
with one method to override.

This is more about declaring the interface rather than adding yet
another __magic__ method.

That's a nice OOP pattern to have imho

Cheers
Tarek

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>

-- 
Tarek Ziadé | http://ziade.org

From solipsis at pitrou.net  Thu Jul 29 15:28:44 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 29 Jul 2010 15:28:44 +0200
Subject: [Python-ideas] Json object-level serializer
References: <20100729145713.3bdc4cdd@pitrou.net>
Message-ID: <20100729152844.115f7087@pitrou.net>

On Thu, 29 Jul 2010 15:19:56 +0200
Daniel Fetchinson wrote:
> >
> > That it uses PEAK-Rules is probably a good reason to avoid it.
>
> Why?
I might be mistaken, but it seems to me that it isn't maintained
anymore (or perhaps that's RuleDispatch, which is from the same
author). It doesn't seem to have had a stable release in years.

Regards

Antoine.

From solipsis at pitrou.net  Thu Jul 29 15:34:29 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 29 Jul 2010 15:34:29 +0200
Subject: [Python-ideas] Json object-level serializer
References: 
Message-ID: <20100729153429.382c0318@pitrou.net>

On Thu, 29 Jul 2010 15:25:20 +0200
Tarek Ziadé wrote:
>
> That makes sense. OTOH, if we drop the idea of having a __magical__ method,
> we could have a collections ABC instead, called JSONSerializable,
> with one method to override.
>
> This is more about declaring the interface rather than adding yet
> another __magic__ method.
>
> That's a nice OOP pattern to have imho

Python is supposed to be duck-typed. It would be strange to add a
couple of random exceptions to that general rule. Moreover, having to
*both* derive an existing class and implement the single method defined
on that class is one complication too many.

And I don't see how `__json__` is more annoying than e.g. `to_json`.

Regards

Antoine.

From ziade.tarek at gmail.com  Thu Jul 29 15:42:17 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 29 Jul 2010 15:42:17 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <20100729153429.382c0318@pitrou.net>
References: <20100729153429.382c0318@pitrou.net>
Message-ID: 

On Thu, Jul 29, 2010 at 3:34 PM, Antoine Pitrou wrote:
> On Thu, 29 Jul 2010 15:25:20 +0200
> Tarek Ziadé wrote:
>>
>> That makes sense. OTOH, if we drop the idea of having a __magical__ method,
>> we could have a collections ABC instead, called JSONSerializable,
>> with one method to override.
>>
>> This is more about declaring the interface rather than adding yet
>> another __magic__ method.
>>
>> That's a nice OOP pattern to have imho
>
> Python is supposed to be duck-typed.
It would be strange to add a
> couple of random exceptions to that general rule. Moreover, having to
> *both* derive an existing class and implement the single method defined
> on that class is one complication too many.

Not sure to follow here, since ABCs are about having an object
supporting a series of methods no matter what the parent classes are.
e.g. this is closer to the concept of "interfaces".

IOW you don't need to derive from a parent class, you just need to provide
a given set of methods, and ABC provides a way to check that an
object has that signature.

see: http://docs.python.org/library/collections.html#abcs-abstract-base-classes

ABCs are the modern duck typing I'd say :)

>
> And I don't see how `__json__` is more annoying than e.g. `to_json`.
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Thu Jul 29 15:47:01 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 29 Jul 2010 15:47:01 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729153429.382c0318@pitrou.net>
Message-ID: 

On Thu, Jul 29, 2010 at 3:42 PM, Tarek Ziadé wrote:
>> And I don't see how `__json__` is more annoying than e.g. `to_json`.

It's easier to override

From solipsis at pitrou.net  Thu Jul 29 15:49:42 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 29 Jul 2010 15:49:42 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729153429.382c0318@pitrou.net>
Message-ID: <1280411382.3175.38.camel@localhost.localdomain>

Le jeudi 29 juillet 2010 à 15:42 +0200, Tarek Ziadé a écrit :
> On Thu, Jul 29, 2010 at 3:34 PM, Antoine Pitrou wrote:
> > On Thu, 29 Jul 2010 15:25:20 +0200
> > Tarek Ziadé wrote:
> >>
> >> That makes sense.
OTOH, if we drop the idea of having a __magical__ method,
> >> we could have a collections ABC instead, called JSONSerializable,
> >> with one method to override.
> >>
> >> This is more about declaring the interface rather than adding yet
> >> another __magic__ method.
> >>
> >> That's a nice OOP pattern to have imho
> >
> > Python is supposed to be duck-typed. It would be strange to add a
> > couple of random exceptions to that general rule. Moreover, having to
> > *both* derive an existing class and implement the single method defined
> > on that class is one complication too many.
>
> Not sure to follow here, since ABCs are about having an object
> supporting a series of methods no matter what the parent classes are.
> e.g. this is closer to the concept of "interfaces".
>
> IOW you don't need to derive from a parent class, you just need to provide
> a given set of methods, and ABC provides a way to check that an
> object has that signature.

Ok, but then how does it avoid having a __magic__ method? You can't use
a normal name such as "to_json" because then an existing class with that
method could be wrongly inferred as implementing your new ABC, and break
existing code.

Besides, defining an ABC for a single, module-specific method sounds
rather overkill. This reminds me of projects plagued by an overuse of
interfaces for every possible concept.

Regards

Antoine.

From solipsis at pitrou.net  Thu Jul 29 15:51:14 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 29 Jul 2010 15:51:14 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729153429.382c0318@pitrou.net>
Message-ID: <1280411474.3175.40.camel@localhost.localdomain>

Le jeudi 29 juillet 2010 à 15:47 +0200, Tarek Ziadé a écrit :
> On Thu, Jul 29, 2010 at 3:42 PM, Tarek Ziadé wrote:
> >> And I don't see how `__json__` is more annoying than e.g. `to_json`.
>
> It's easier to override

Could you expand a little bit?
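[The ABC idea discussed above — and Antoine's worry that a plain `to_json` name could be wrongly inferred — can be sketched concretely. The sketch is purely hypothetical: there is no JSONSerializable ABC in the stdlib, and the class names are made up. Explicit registration is what avoids the accidental-inference problem.]

```python
from abc import ABCMeta, abstractmethod

# Hypothetical sketch: nothing like JSONSerializable exists in the
# stdlib.  Explicit registration (rather than sniffing for a to_json
# method) avoids wrongly inferring that an unrelated class supports it.
class JSONSerializable(metaclass=ABCMeta):
    @abstractmethod
    def to_json(self):
        """Return a structure that json.dumps can already handle."""

class Point:  # no inheritance from the ABC needed...
    def __init__(self, x, y):
        self.x, self.y = x, y
    def to_json(self):
        return {"x": self.x, "y": self.y}

JSONSerializable.register(Point)  # ...only explicit registration

print(isinstance(Point(1, 2), JSONSerializable))  # True

class Other:  # has a to_json method too, but was never registered...
    def to_json(self):
        return {}

print(isinstance(Other(), JSONSerializable))  # False: no accidental inference
```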
From ziade.tarek at gmail.com  Thu Jul 29 16:01:23 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 29 Jul 2010 16:01:23 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <1280411382.3175.38.camel@localhost.localdomain>
References: <20100729153429.382c0318@pitrou.net> <1280411382.3175.38.camel@localhost.localdomain>
Message-ID: 

On Thu, Jul 29, 2010 at 3:49 PM, Antoine Pitrou wrote:
> Le jeudi 29 juillet 2010 à 15:42 +0200, Tarek Ziadé a écrit :
>> On Thu, Jul 29, 2010 at 3:34 PM, Antoine Pitrou wrote:
>> > On Thu, 29 Jul 2010 15:25:20 +0200
>> > Tarek Ziadé wrote:
>> >>
>> >> That makes sense. OTOH, if we drop the idea of having a __magical__ method,
>> >> we could have a collections ABC instead, called JSONSerializable,
>> >> with one method to override.
>> >>
>> >> This is more about declaring the interface rather than adding yet
>> >> another __magic__ method.
>> >>
>> >> That's a nice OOP pattern to have imho
>> >
>> > Python is supposed to be duck-typed. It would be strange to add a
>> > couple of random exceptions to that general rule. Moreover, having to
>> > *both* derive an existing class and implement the single method defined
>> > on that class is one complication too many.
>>
>> Not sure to follow here, since ABCs are about having an object
>> supporting a series of methods no matter what the parent classes are.
>> e.g. this is closer to the concept of "interfaces".
>>
>> IOW you don't need to derive from a parent class, you just need to provide
>> a given set of methods, and ABC provides a way to check that an
>> object has that signature.
>
> Ok, but then how does it avoid having a __magic__ method? You can't use
> a normal name such as "to_json" because then an existing class with that
> method could be wrongly inferred as implementing your new ABC, and break
> existing code.

yes that's a possible side-effect, unless we explicitly register those
classes using ABC's register technique.
>
> Besides, defining an ABC for a single, module-specific method sounds
> rather overkill. This reminds me of projects plagued by an overuse of
> interfaces for every possible concept.

That's what Iterator did in ABC though.

From ziade.tarek at gmail.com  Thu Jul 29 16:03:22 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 29 Jul 2010 16:03:22 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <1280411474.3175.40.camel@localhost.localdomain>
References: <20100729153429.382c0318@pitrou.net> <1280411474.3175.40.camel@localhost.localdomain>
Message-ID: 

On Thu, Jul 29, 2010 at 3:51 PM, Antoine Pitrou wrote:
> Le jeudi 29 juillet 2010 à 15:47 +0200, Tarek Ziadé a écrit :
>> On Thu, Jul 29, 2010 at 3:42 PM, Tarek Ziadé wrote:
>> >> And I don't see how `__json__` is more annoying than e.g. `to_json`.
>>
>> It's easier to override
>
> Could you expand a little bit?

If you want to override to_json in a subclass, to slightly adapt it, it's easier
because the __json__ name is mangled by Python

From solipsis at pitrou.net  Thu Jul 29 16:14:27 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 29 Jul 2010 16:14:27 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729153429.382c0318@pitrou.net> <1280411474.3175.40.camel@localhost.localdomain>
Message-ID: <1280412867.3175.42.camel@localhost.localdomain>

Le jeudi 29 juillet 2010 à 16:03 +0200, Tarek Ziadé a écrit :
> On Thu, Jul 29, 2010 at 3:51 PM, Antoine Pitrou wrote:
> > Le jeudi 29 juillet 2010 à 15:47 +0200, Tarek Ziadé a écrit :
> >> On Thu, Jul 29, 2010 at 3:42 PM, Tarek Ziadé wrote:
> >> >> And I don't see how `__json__` is more annoying than e.g. `to_json`.
> >>
> >> It's easier to override
> >
> > Could you expand a little bit?
>
> If you want to override to_json in a subclass, to slightly adapt it, it's easier
> because the __json__ name is mangled by Python

No, it isn't.

>>> class C:
...     def __json__(self):
...         pass
...
>>> C.__json__
<unbound method C.__json__>
>>> C().__json__
<bound method C.__json__ of <__main__.C instance at 0x...>>

From alexandre.conrad at gmail.com  Thu Jul 29 16:14:39 2010
From: alexandre.conrad at gmail.com (Alexandre Conrad)
Date: Thu, 29 Jul 2010 16:14:39 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729153429.382c0318@pitrou.net> <1280411474.3175.40.camel@localhost.localdomain>
Message-ID: 

2010/7/29 Tarek Ziadé :
> On Thu, Jul 29, 2010 at 3:51 PM, Antoine Pitrou wrote:
>> Le jeudi 29 juillet 2010 à 15:47 +0200, Tarek Ziadé a écrit :
>>> On Thu, Jul 29, 2010 at 3:42 PM, Tarek Ziadé wrote:
>>> >> And I don't see how `__json__` is more annoying than e.g. `to_json`.
>>>
>>> It's easier to override
>>
>> Could you expand a little bit?
>
> If you want to override to_json in a subclass, to slightly adapt it, it's easier
> because the __json__ name is mangled by Python

I believe that mangling will not be performed if the identifier ends
with more than one underscore. So __json__ won't be mangled.

Regards,
-- 
Alex
twitter.com/alexconrad

From solipsis at pitrou.net  Thu Jul 29 16:22:13 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 29 Jul 2010 16:22:13 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729153429.382c0318@pitrou.net> <1280411382.3175.38.camel@localhost.localdomain>
Message-ID: <1280413333.3175.50.camel@localhost.localdomain>

> > Besides, defining an ABC for a single, module-specific method sounds
> > rather overkill. This reminds me of projects plagued by an overuse of
> > interfaces for every possible concept.
>
> That's what Iterator did in ABC though.

Well, it could be argued that testing for an Iterator is useful for a
significant variety of code, while testing for a JSONSerializable
doesn't have a use case outside of the json module itself.

Besides, it remains to be seen if anyone will use the Iterator ABC
instead of directly looking up the __next__ method.
I'm not convinced that all of the ABCs bundled with the stdlib are
really useful, apart from showcasing the potentialities of ABCs.

Regards

Antoine.

From ziade.tarek at gmail.com  Thu Jul 29 16:27:14 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 29 Jul 2010 16:27:14 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729153429.382c0318@pitrou.net> <1280411474.3175.40.camel@localhost.localdomain>
Message-ID: 

On Thu, Jul 29, 2010 at 4:14 PM, Alexandre Conrad wrote:
> 2010/7/29 Tarek Ziadé :
>> On Thu, Jul 29, 2010 at 3:51 PM, Antoine Pitrou wrote:
>>> Le jeudi 29 juillet 2010 à 15:47 +0200, Tarek Ziadé a écrit :
>>>> On Thu, Jul 29, 2010 at 3:42 PM, Tarek Ziadé wrote:
>>>> >> And I don't see how `__json__` is more annoying than e.g. `to_json`.
>>>>
>>>> It's easier to override
>>>
>>> Could you expand a little bit?
>>
>> If you want to override to_json in a subclass, to slightly adapt it, it's easier
>> because the __json__ name is mangled by Python
>
> I believe that mangling will not be performed if the identifier ends
> with more than one underscore. So __json__ won't be mangled.

Ooops I forgot about that :)

From scott+python-ideas at scottdial.com  Thu Jul 29 16:29:34 2010
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Thu, 29 Jul 2010 10:29:34 -0400
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <20100729151334.639738e6@pitrou.net>
References: <20100729151334.639738e6@pitrou.net>
Message-ID: <4C51904E.3030007@scottdial.com>

On 7/29/2010 9:13 AM, Antoine Pitrou wrote:
> On Thu, 29 Jul 2010 22:59:52 +1000 Nick Coghlan wrote:
>> *For those following along at home that may not be familiar with the
>> names of various Python developers, PJE is Phillip J. Eby, the author
>> of peak.rules (amongst many other things).
>
> Many of which unmaintained or even incompatible with recent Python
> versions (due to, for example, ugly bytecode hacks).
On 7/29/2010 9:28 AM, Antoine Pitrou wrote: > I might be mistaken, but it seems to me that it isn't maintained > anymore (or perhaps that's RuleDispatch, which is from the same > author). It doesn't seem to have had a stable release in years. Sounds like you are damning the man and just chucking the concept and his projects in along with him. URL: svn://svn.eby-sarna.com/svnroot/PEAK-Rules Last Changed Date: 2009-07-15 00:30:57 -0400 (Wed, 15 Jul 2009) r2600 | pje | 2009-07-15 00:30:57 -0400 (Wed, 15 Jul 2009) | 2 lines Fix for Python 2.6 DeprecationWarning PEAK-Rules-0.5a1.dev-r2600.tar.gz 29-Jul-2010 04:22 93K It's unclear to me that not having been changed in a year constitutes "unmaintained" especially since PJE seems quite responsive on the PEAK mailing list. So, please restrain yourself unless you have something more to say than FUD about PJE and PEAK-Rules. -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From solipsis at pitrou.net Thu Jul 29 16:46:37 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 29 Jul 2010 16:46:37 +0200 Subject: [Python-ideas] Json object-level serializer References: <20100729151334.639738e6@pitrou.net> <4C51904E.3030007@scottdial.com> Message-ID: <20100729164637.6da2f760@pitrou.net> On Thu, 29 Jul 2010 10:29:34 -0400 Scott Dial wrote: > > It's unclear to me that not having been changed in a year constitutes > "unmaintained" especially since PJE seems quite responsive on the PEAK > mailing list. So, please restrain yourself unless you have something > more to say than FUD about PJE and PEAK-Rules. Well, sorry if I mixed up PEAK-Rules and RuleDispatch (which, again, are similar libraries from the same author). The fact that RuleDispatch has been unmaintained, though, has been a source of problems for some people and projects. Regards Antoine. 
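[As a side note to the Iterator ABC aside a few messages up: for duck-typed iterators the ABC test and the direct attribute lookup agree, because the ABC defines a `__subclasshook__` that checks for `__iter__` and `__next__` (Python 3 names). `Countdown` is a made-up example class.]

```python
import collections.abc

# A plain duck-typed iterator: no registration, no inheritance.
class Countdown:
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return self
    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n

c = Countdown(3)
# The ABC check succeeds via __subclasshook__, without any registration...
print(isinstance(c, collections.abc.Iterator))            # True
# ...and agrees with looking the methods up directly.
print(hasattr(c, '__next__') and hasattr(c, '__iter__'))  # True
print(list(Countdown(3)))                                 # [2, 1, 0]
```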
From ziade.tarek at gmail.com  Thu Jul 29 17:04:04 2010
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 29 Jul 2010 17:04:04 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <4C51904E.3030007@scottdial.com>
References: <20100729151334.639738e6@pitrou.net> <4C51904E.3030007@scottdial.com>
Message-ID: 

On Thu, Jul 29, 2010 at 4:29 PM, Scott Dial wrote:
..
>
> PEAK-Rules-0.5a1.dev-r2600.tar.gz            29-Jul-2010 04:22   93K

I wouldn't trust a release from the Hacker Quarterly

From alexander.belopolsky at gmail.com  Thu Jul 29 17:28:24 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 29 Jul 2010 11:28:24 -0400
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: <4C516BF4.8030702@egenix.com>
References: <20100729134118.206545ec@pitrou.net> <4C516BF4.8030702@egenix.com>
Message-ID: 

On Thu, Jul 29, 2010 at 7:54 AM, M.-A. Lemburg wrote:
> Antoine Pitrou wrote:
>> On Thu, 29 Jul 2010 13:35:41 +0200
>> Tarek Ziadé wrote:
>>> Hello,
>>>
>>> What about adding in the json package the ability for an object to
>>> provide a different object to serialize ?
>>> This would be useful to translate a class into a structure that can be
>>> passed to json.dumps
>>
>> How about letting json use the __reduce__ protocol instead?
>

+1. I think this is a very sensible idea. Note that Tarek's request
was not for a magic method like __repr__ that would return an easy to
parse string. Instead, the request was for a method that would return
an object that can be serialized instead of the original object and
will carry enough data to restore the original object.

> How would you then write a class that works with both pickle
> and json ?
>

Hopefully, for most types json would be able to use an unmodified
__reduce__ method. If this is not enough, the reduce protocol already
has an extension mechanism.
For example, an object may implement obj.__reduce_ex__('json') that
would return a json-friendly tuple instead of the pickle-oriented
obj.__reduce__().

> IMO, we'd need a separate method to return a JSON version of
> the object, e.g. .__json__(). I'm not sure how deserialization
> could be handled, since JSON doesn't support arbitrary object
> types.

I am afraid this was the turning point in this thread after which the
discussion went (IMO) in the wrong direction. Again, the OP's request
was for a method that would return an object that json or another
simple serializer (say yaml) could handle, not for a method that will
return a json string.

From python at mrabarnett.plus.com  Thu Jul 29 18:28:46 2010
From: python at mrabarnett.plus.com (MRAB)
Date: Thu, 29 Jul 2010 17:28:46 +0100
Subject: [Python-ideas] str.split with empty separator
In-Reply-To: 
References: 
Message-ID: <4C51AC3E.8080201@mrabarnett.plus.com>

Alexandre Conrad wrote:
> Hello all,
>
> What if str.split could take an empty separator?
>
>>>> 'banana'.split('')
> ['b', 'a', 'n', 'a', 'n', 'a']
>
> I know this can be done with:
>
>>>> list('banana')
> ['b', 'a', 'n', 'a', 'n', 'a']
>
> I think that, semantically speaking, it would make sense to split where
> there are no characters (in between them). Right now you can join from
> an empty string:
>
> ''.join(['b', 'a', 'n', 'a', 'n', 'a'])
>
> So why can't we split from an empty string?
>
> This wouldn't introduce any backwards incompatible changes as
> str.split currently can't have an empty separator:
>
>>>> 'banana'.split('')
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError: empty separator
>
> I would love to see my banana actually split.
:)
>
Shouldn't it be this:

>>> 'banana'.split('')
['', 'b', 'a', 'n', 'a', 'n', 'a', '']

After all, the separator does exist at the start and end of the string:

>>> 'banana'.startswith('')
True
>>> 'banana'.endswith('')
True

From tjreedy at udel.edu  Thu Jul 29 19:18:42 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 29 Jul 2010 13:18:42 -0400
Subject: [Python-ideas] Non-boolean return from __contains__
In-Reply-To: 
References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz>
Message-ID: 

On 7/27/2010 2:04 PM, Guido van Rossum wrote:
> On Tue, Jul 27, 2010 at 5:25 PM, Robert Kern wrote:
>> I've occasionally wished that we could repurpose backticks for expression
>> literals:
>>
>> expr = `x + y*z`
>> assert isinstance(expr, ast.Expression)
>
> Maybe you could just as well make it a plain string literal and call a
> function that parses it into a parse tree:
>
> expr = parse("x + y*z")
> assert isinstance(expr, ast.Expression)
>
> The advantage of this approach is that you can define a different
> language too...

and that it already exists, and is more visible than backticks:

>>> def expr(s): return ast.parse(s, mode='eval')  # default is 'exec'

>>> e = expr('a+b')
>>> e
<_ast.Expression object at 0x00F8DCF0>

-- 
Terry Jan Reedy

From tjreedy at udel.edu  Thu Jul 29 19:18:54 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 29 Jul 2010 13:18:54 -0400
Subject: [Python-ideas] Non-boolean return from __contains__
In-Reply-To: 
References: <6B58DE7F-F50D-4A55-8C95-F084BFBFA11E@gmail.com> <83CC4121-6D78-4053-B134-3D7BADBA9F82@gmail.com> <6A11FE63-3E1B-4E8C-8A5D-E5A953C98807@masklinn.net> <4C4E7DB8.2040407@canterbury.ac.nz> <3EEB340D-0BF3-48EE-AC02-5E6A7DF1495C@masklinn.net>
Message-ID: 

On 7/27/2010 3:05 PM, Alexander Belopolsky wrote:
> anyways. I would be very interested to see the parse() function.
It
> does not exist (yet), right?

>>> def expr(s): return ast.parse(s, mode='eval')

>>> e = expr('a+b')
>>> e
<_ast.Expression object at 0x00F8DCF0>

-- 
Terry Jan Reedy

From ronaldoussoren at mac.com  Thu Jul 29 21:13:25 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 29 Jul 2010 21:13:25 +0200
Subject: [Python-ideas] Json object-level serializer
In-Reply-To: 
References: <20100729134118.206545ec@pitrou.net> <4C516BF4.8030702@egenix.com>
Message-ID: 

On 29 Jul, 2010, at 17:28, Alexander Belopolsky wrote:
> On Thu, Jul 29, 2010 at 7:54 AM, M.-A. Lemburg wrote:
>> Antoine Pitrou wrote:
>>> On Thu, 29 Jul 2010 13:35:41 +0200
>>> Tarek Ziadé wrote:
>>>> Hello,
>>>>
>>>> What about adding in the json package the ability for an object to
>>>> provide a different object to serialize ?
>>>> This would be useful to translate a class into a structure that can be
>>>> passed to json.dumps
>>>
>>> How about letting json use the __reduce__ protocol instead?
>>
>
> +1. I think this is a very sensible idea. Note that Tarek's request
> was not for a magic method like __repr__ that would return an easy to
> parse string. Instead, the request was for a method that would return
> an object that can be serialized instead of the original object and
> will carry enough data to restore the original object.
Name: smime.p7s Type: application/pkcs7-signature Size: 3567 bytes Desc: not available URL: From mal at egenix.com Thu Jul 29 23:25:09 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 29 Jul 2010 23:25:09 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <4C51748F.9090006@egenix.com> Message-ID: <4C51F1B5.9000608@egenix.com> Georg Brandl wrote: > Am 29.07.2010 14:31, schrieb M.-A. Lemburg: >> Georg Brandl wrote: >>> Am 29.07.2010 13:35, schrieb Tarek Ziad?: >>>> Hello, >>>> >>>> What about adding in the json package the ability for an object to >>>> provide a different object to serialize ? >>>> This would be useful to translate a class into a structure that can be >>>> passed to json.dumps >>>> >>>> So, it __json__ is provided, its used for serialization instead of the >>>> object itself: >>>> >>>>>>> import json >>>>>>> class MyComplexClass(object): >>>> .... def __json__(self): >>>> .... return 'json' >>>> .... >>>>>>> o = MyComplexClass() >>>>>>> json.dumps(o) >>>> '"json"' >>> >>> You can do this with a very short subclass of the JSONEncoder: >>> >>> class MyJSONEncoder(JSONEncoder): >>> def default(self, obj): >>> return obj.__json__() # with a useful failure message >> >> Does that also work with the JSON C extension ? > > I think so. The C encoder gets the default function as an argument. Then that sounds like the right way forward. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jul 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Fri Jul 30 00:12:24 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Jul 2010 08:12:24 +1000 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <20100729151116.219d6765@pitrou.net> References: <20100729151116.219d6765@pitrou.net> Message-ID: On Thu, Jul 29, 2010 at 11:11 PM, Antoine Pitrou wrote: > On Thu, 29 Jul 2010 22:51:11 +1000 > Nick Coghlan wrote: >> >> Each individual time this question comes up people tend to react with >> "oh, that's too complicated and overkill, but magic methods are >> simple, so let's just define another magic method". The sum total of >> all those magic methods starts to accumulate into a lot of complexity >> of its own though :P > > I don't agree. __json__ only matters to people who do JSON > encoding/decoding. Other people can safely ignore it. Which is exactly the attitude I was talking about: for each individual case, people go "oh, I understand magic methods, those are easy". It's the overall process of identifying the need for and gathering consensus on magic methods that is unwieldy (and ultimately fails to scale, leading to non-extensible interfaces by default, with pretty printing being the classic example, and JSON serialisation the latest). > And I don't see how generic functions bring less cognitive overhead. > (they actually bring more of it, since most implementations are more > complicated to begin with) Mostly because the fully fledged generic implementations like PEAK-rules tend to get brought into discussions when they aren't needed. Single-type generic dispatch is actually so common they gave it a name: object-oriented programming. All single-type generic dispatch is about is having a registry for a particular operation that says "to perform this operation, with objects of this type, use this function". 
Instead of having a protocol that says "look up this magic method in the object's own namespace" (which requires a) agreement on the magic name to use and b) that the original author of the type in question both knew and cared about the operation the application developer is interested in) you instead have a protocol that says "here is a standard mechanism for declaring a type registry for a function, so you only have to learn how to register a function once". Is it really harder for people to learn how to write things like: json.dumps.overload(mytype, mytype.to_json) json.dumps.overload(third_party_type, my_third_party_type_serialiser) than it is for them to figure out that implementing a __json__ method will allow them to change how their object is serialised? (Not to mention that a __json__ method can only be used via monkey-patching if the type you want to serialise differently came from a library module rather than your own code). The generic function registration approach is incidentally discoverable via dir(json.dumps) to see that a function provides the relevant generic function registration methods. Magic method protocols can *only* be discovered by reading documentation. Function registration is a solved problem, with much better solutions than the ad hoc YAMM (yet-another-magic-method) approach we currently use. We just keep getting scared away from the right answer by the crazily complex overloading schemes that libraries like PEAK-rules allow. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Fri Jul 30 00:39:07 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 30 Jul 2010 00:39:07 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729151116.219d6765@pitrou.net> Message-ID: <1280443147.3175.70.camel@localhost.localdomain> > > I don't agree. __json__ only matters to people who do JSON > > encoding/decoding. Other people can safely ignore it.
> > Which is exactly the attitude I was talking about: for each individual > case, people go "oh, I understand magic methods, those are easy". It's > the overall process of identifying the need for and gathering > consensus on magic methods that is unwieldy (and ultimately fails to > scale, leading to non-extensible interfaces by default, with pretty > printing being the classic example, and JSON serialisation the > latest). Why do you want to gather consensus? There is a single json serialization module in the stdlib and it's obvious that __json__ can/should be claimed by that module. Actually, your argument could be returned: if you use generic functions (such as @json.dumps.overload), alternative json serializers won't easily be able to make use of the information, while they could access the __json__ method like the standard json module does. > Is it really harder for people to learn how to write things like: > > json.dumps.overload(mytype, mytype.to_json) > json.dumps.overload(third_party_type, my_third_party_type_serialiser) It is certainly more annoying and less natural than: def __json__(self): .... Sure, generic functions as a paradigm appear more powerful, more decoupled, etc. But in practice __magic__ methods are sufficient for most uses. Practicality beats purity. That may be why in all the years that the various generic functions libraries have existed, they don't seem to have been really popular compared to the simpler convention of defining fixed method names. (besides, it wouldn't necessarily be json.dumps that you overload, but some internal function of the json module; making it even less intuitive and easily discoverable) > The generic function registration approach is incidentally > discoverable via dir(json.dumps) to see that a function provides the > relevant generic function registration methods. Magic method protocols > can *only* be discovered by reading documentation. 
If help(json.dumps) includes a small blurb about __json__, it makes the information at least as easily discoverable as invoking dir(json.dumps). Besides, I don't find it shocking if documentation problems have to be solved through documentation. Regards Antoine. From merwok at netwok.org Fri Jul 30 00:46:22 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 30 Jul 2010 00:46:22 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729151116.219d6765@pitrou.net> Message-ID: <4C5204BE.8020805@netwok.org> Thank you for explaining generic functions so clearly. Is there a good module out there implementing them without “crazily complex overloading schemes”? Regards From alex.gaynor at gmail.com Fri Jul 30 00:48:17 2010 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Thu, 29 Jul 2010 17:48:17 -0500 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <1280443147.3175.70.camel@localhost.localdomain> References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> Message-ID: On Thu, Jul 29, 2010 at 5:39 PM, Antoine Pitrou wrote: > >> > I don't agree. __json__ only matters to people who do JSON >> > encoding/decoding. Other people can safely ignore it. >> >> Which is exactly the attitude I was talking about: for each individual >> case, people go "oh, I understand magic methods, those are easy". It's >> the overall process of identifying the need for and gathering >> consensus on magic methods that is unwieldy (and ultimately fails to >> scale, leading to non-extensible interfaces by default, with pretty >> printing being the classic example, and JSON serialisation the >> latest). > > Why do you want to gather consensus? There is a single json > serialization module in the stdlib and it's obvious that __json__ > can/should be claimed by that module.
> > Actually, your argument could be returned: if you use generic functions > (such as @json.dumps.overload), alternative json serializers won't > easily be able to make use of the information, while they could access > the __json__ method like the standard json module does. > >> Is it really harder for people to learn how to write things like: >> >>     json.dumps.overload(mytype, mytype.to_json) >>     json.dumps.overload(third_party_type, my_third_party_type_serialiser) > > It is certainly more annoying and less natural than: >    def __json__(self): .... > > Sure, generic functions as a paradigm appear more powerful, more > decoupled, etc. But in practice __magic__ methods are sufficient for > most uses. Practicality beats purity. > > That may be why in all the years that the various generic functions > libraries have existed, they don't seem to have been really popular > compared to the simpler convention of defining fixed method names. > > (besides, it wouldn't necessarily be json.dumps that you overload, but > some internal function of the json module; making it even less intuitive > and easily discoverable) > >> The generic function registration approach is incidentally >> discoverable via dir(json.dumps) to see that a function provides the >> relevant generic function registration methods. Magic method protocols >> can *only* be discovered by reading documentation. > > If help(json.dumps) includes a small blurb about __json__, it makes the > information at least as easily discoverable as invoking dir(json.dumps). > > Besides, I don't find it shocking if documentation problems have to be > solved through documentation. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > I'd be -1 on an __json__ method.
From the perspective of someone who works on Django, the big issue we have is how do you specify how a model (something from the database) should be serialized to JSON. Often people suggest something like __json__ that the Django serializer (which uses the json module) could pick up on, however this is usually rejected: objects tend to have multiple serializations based on context. Unlike pickle, which is usually used for internal consumption, json is usually intended for the wide world, and generally you want to expose different data to different clients. For example an event's json might include a list of attendees for an authenticated client, but an unauthenticated client should only see a list of titles. For this reason Django has always rejected such an approach, in favor of having a per-serialization specification. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Voltaire "The people's good is the highest law." -- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me From dstanek at dstanek.com Fri Jul 30 01:12:57 2010 From: dstanek at dstanek.com (David Stanek) Date: Thu, 29 Jul 2010 19:12:57 -0400 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: On Thu, Jul 29, 2010 at 7:35 AM, Tarek Ziadé wrote: > Hello, > > What about adding in the json package the ability for an object to > provide a different object to serialize ? > This would be useful to translate a class into a structure that can be > passed to json.dumps > > So, if __json__ is provided, it's used for serialization instead of the > object itself: > >>>> import json >>>> class MyComplexClass(object): > ...     def __json__(self): > ...         return 'json' > ... >>>> o = MyComplexClass() >>>> json.dumps(o) > '"json"' > > > > Cheers > Tarek > In my experience serializing an object is usually not a concern of the object itself.
I do not want to have to touch every object in my system when I need an alternate format. The pattern I currently use is to hint, as a class-level tuple, the fields that should be serialized. django-piston has a good working example of this pattern. It becomes a bit unruly when you have a big object graph, but I typically keep my object models shallow. -- David blog: http://www.traceback.org twitter: http://twitter.com/dstanek From greg.ewing at canterbury.ac.nz Fri Jul 30 02:06:18 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 30 Jul 2010 12:06:18 +1200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: Message-ID: <4C52177A.7080906@canterbury.ac.nz> Mike Graham wrote: > Since there isn't really any magic going on, why use a __foo__ name? > The majority of __foo__ names are for things you shouldn't reference > yourself To my mind, the main reason is to avoid name clashes. Protocol methods often may need to be added to just about any class, and using a __foo__ name greatly reduces the chance of it coinciding with some pre-existing class-specific method. Anyway, you don't call it yourself in this case either -- it's called by the proposed json-serialising framework. -- Greg From greg.ewing at canterbury.ac.nz Fri Jul 30 02:33:40 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 30 Jul 2010 12:33:40 +1200 Subject: [Python-ideas] str.split with empty separator In-Reply-To: References: Message-ID: <4C521DE4.3060704@canterbury.ac.nz> Alexandre Conrad wrote: > What if str.split could take an empty separator? Do you have a use case for this? > Right now you can join from an empty string... > So why can't we split from an empty string? Because splitting on an empty string is ambiguous, and nobody has so far put forward a compelling use case that would show how the ambiguity should best be resolved. 
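Greg's "ambiguous" point can be made concrete: at least two reasonable semantics for an empty separator disagree, and each can be defended by analogy with existing behaviour. The helper names below are hypothetical, purely to illustrate the two readings.

```python
def split_empty_chars(s):
    # Reading 1: the empty string occurs *between* characters,
    # so splitting simply yields the characters.
    return list(s)

def split_empty_edges(s):
    # Reading 2: the empty string also matches at both ends, by analogy
    # with 'xx'.split('x') == ['', '', ''], so empty edge fields appear.
    return [''] + list(s) + ['']

print(split_empty_chars('banana'))  # ['b', 'a', 'n', 'a', 'n', 'a']
print(split_empty_edges('banana'))  # ['', 'b', 'a', 'n', 'a', 'n', 'a', '']
```

Both satisfy the `''.join(...)` round-trip, which is why the round-trip property alone cannot decide between them.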
-- Greg From raymond.hettinger at gmail.com Fri Jul 30 04:16:50 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 29 Jul 2010 19:16:50 -0700 Subject: [Python-ideas] str.split with empty separator In-Reply-To: <4C521DE4.3060704@canterbury.ac.nz> References: <4C521DE4.3060704@canterbury.ac.nz> Message-ID: <134FCBE5-2659-4C68-B958-36F9B660C77F@gmail.com> On Jul 29, 2010, at 5:33 PM, Greg Ewing wrote: > Alexandre Conrad wrote: > >> What if str.split could take an empty separator? I propose that the semantics of str.split() never be changed. It has been around for a long time and has a complex set of behaviors that people have come to rely on. For years, we've answered arcane questions about it and have made multiple revisions to the docs in a never-ending quest to precisely describe exactly what it does without just showing the underlying C code. Accordingly, existing uses depend mainly on what-it-does-as-implemented and less on the various ways it has been documented over the years. Almost any change to str.split() would either complexify the explanation of what it does or would change the behavior in a way that would break somebody's code (perhaps in subtle ways that are hard to detect). In my opinion, str.split() should never be touched again. Instead, it may be worthwhile to develop new splitters with precise semantics aimed at specific use cases. Raymond From python at mrabarnett.plus.com Fri Jul 30 04:41:35 2010 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 30 Jul 2010 03:41:35 +0100 Subject: [Python-ideas] str.split with empty separator In-Reply-To: <134FCBE5-2659-4C68-B958-36F9B660C77F@gmail.com> References: <4C521DE4.3060704@canterbury.ac.nz> <134FCBE5-2659-4C68-B958-36F9B660C77F@gmail.com> Message-ID: <4C523BDF.1070403@mrabarnett.plus.com> Raymond Hettinger wrote: > On Jul 29, 2010, at 5:33 PM, Greg Ewing wrote: > >> Alexandre Conrad wrote: >> >>> What if str.split could take an empty separator?
> > I propose that the semantics of str.split() never be changed. > > It has been around for a long time and has a complex set of behaviors > that people have come to rely on. For years, we've answered arcane > questions about it and have made multiple revisions to the docs in a > never ending quest to precisely describe exactly what it does without > just showing the C underlying code. Accordingly, existing uses depend > mainly on what-it-does-as-implemented and less on the various ways > it has been documented over the years. > > Almost any change to str.split() would either complexify the explanation > of what it does or would change the behavior in a way the would > break somebody's code (perhaps in a subtle ways that are hard to detect). > > In my opinion, str.split() should never be touched again. > Instead, it may be worthwhile to develop new splitters > with precise semantics aimed at specific use cases. > Does it really have a complex set of behaviours? The only (possibly) surprising behaviour for me is when it splits on whitespace (ie, passing it None as the separator). I find it very easy to understand. Or perhaps I'm just smarter than I thought! :-) From funbuggie at gmail.com Fri Jul 30 06:05:34 2010 From: funbuggie at gmail.com (Barend erasmus) Date: Fri, 30 Jul 2010 06:05:34 +0200 Subject: [Python-ideas] Help!!please Message-ID: Can someone help me and tell why this does not work. prys=0 tp=0 pr=0 f = open('C:\Documents and Settings\ZU1TN\Desktop\Nommers\K55.txt', 'r') pr=f.readline() prys =int(prys) tp =int(tp) pr =int(pr) tp=pr-prys f.close tp=str(tp) print tp raw_input() THX
From greg.ewing at canterbury.ac.nz Fri Jul 30 06:27:44 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 30 Jul 2010 16:27:44 +1200 Subject: [Python-ideas] str.split with empty separator In-Reply-To: <4C523BDF.1070403@mrabarnett.plus.com> References: <4C521DE4.3060704@canterbury.ac.nz> <134FCBE5-2659-4C68-B958-36F9B660C77F@gmail.com> <4C523BDF.1070403@mrabarnett.plus.com> Message-ID: <4C5254C0.70702@canterbury.ac.nz> On 30/07/10 14:41, MRAB wrote: > Does it really have a complex set of behaviours? I think Raymond may be referring to the fact that the behaviour of split() with and without a splitting string differs in subtle ways with certain edge cases. It's almost better thought of as two different functions that happen to share a name. -- Greg From raymond.hettinger at gmail.com Fri Jul 30 06:51:24 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 29 Jul 2010 21:51:24 -0700 Subject: [Python-ideas] str.split with empty separator In-Reply-To: <4C523BDF.1070403@mrabarnett.plus.com> References: <4C521DE4.3060704@canterbury.ac.nz> <134FCBE5-2659-4C68-B958-36F9B660C77F@gmail.com> <4C523BDF.1070403@mrabarnett.plus.com> Message-ID: <203D2D80-6B4E-4417-8069-3F4EF00C2D8D@gmail.com> On Jul 29, 2010, at 7:41 PM, MRAB wrote: > Raymond Hettinger wrote: >> On Jul 29, 2010, at 5:33 PM, Greg Ewing wrote: >>> Alexandre Conrad wrote: >>> >>>> What if str.split could take an empty separator? >> I propose that the semantics of str.split() never be changed. >> It has been around for a long time and has a complex set of behaviors that people have come to rely on. For years, we've answered arcane questions about it and have made multiple revisions to the docs in a >> never ending quest to precisely describe exactly what it does without just showing the C underlying code. Accordingly, existing uses depend >> mainly on what-it-does-as-implemented and less on the various ways >> it has been documented over the years.
Almost any change to str.split() would either complexify the explanation >> of what it does or would change the behavior in a way the would >> break somebody's code (perhaps in a subtle ways that are hard to detect). >> In my opinion, str.split() should never be touched again. Instead, it may be worthwhile to develop new splitters with precise semantics aimed at specific use cases. > Does it really have a complex set of behaviours? The only (possibly) > surprising behaviour for me is when it splits on whitespace (ie, passing > it None as the separator). I find it very easy to understand. Or perhaps > I'm just smarter than I thought! :-) Past bug reports and newsgroup discussions covered a variety of misunderstandings: * completely different algorithm when separator is None * behavior when separator is multiple characters (i.e. set of possible splitters vs an aggregate splitter either with or without overlaps). * behavior when maxsplit is zero * behavior when string begins or ends with whitespace * which characters count as whitespace * behavior when a string begins or ends with a split character * when runs of splitters are treated as a single splitter * behavior of a zero-length splitter * conditions under which x.join(s.split(x)) roundtrips * algorithmic difference from re.split() * are there invariants between s.count(x) and len(s.split(x)) so that you can correctly predict the number of fields returned It was common that people thought str.split() was easy to understand until a corner case arose that defied their expectations. When the experts chimed-in, it became clear that almost no one in those discussions had a clear understanding of exactly what the implemented behaviors were and it was common to resort to experiment to disprove various incorrect hypotheses. We revised the docs several times and added a number of examples and now have a pretty good description that took years to get right. 
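Several of the corner cases in Raymond's list are easy to demonstrate against the real implementation:

```python
s = '  one  two  '

# None separator: runs of whitespace collapse, and leading/trailing
# whitespace is dropped entirely.
print(s.split())        # ['one', 'two']

# Explicit separator: every single occurrence splits, and empty fields
# at the edges and between runs are kept.
print(s.split(' '))     # ['', '', 'one', '', 'two', '', '']

# A multi-character separator is one aggregate token, not a set of
# possible split characters.
print('a<>b<c'.split('<>'))    # ['a', 'b<c']

# maxsplit interacts with whitespace mode: leading whitespace is still
# stripped, but the unsplit remainder is returned untouched.
print(' a b '.split(None, 1))  # ['a', 'b ']

# The x.join(s.split(x)) round-trip holds for explicit separators only.
print(' '.join(s.split(' ')) == s)  # True
print(' '.join(s.split()) == s)     # False
```

The two separator modes really do behave like two different functions sharing a name, which is the point Greg makes below about subtle edge-case differences.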
Even now, it might be a good idea to validate the docs by seeing if someone can use the documentation text to write a pure python version of str.split() that behaves exactly like the real thing (including all corner cases). Even if you find all of the above to be easy and intuitive, I still think it wise that we not add to complexity of str.split() with new or altered behaviors. Raymond From pyideas at rebertia.com Fri Jul 30 07:28:48 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 29 Jul 2010 22:28:48 -0700 Subject: [Python-ideas] Help!!please In-Reply-To: References: Message-ID: On Thu, Jul 29, 2010 at 9:05 PM, Barend erasmus wrote: > Can someone help me and tell why this does not work. > > prys=0 > tp=0 > pr=0 > f = open('C:\Documents and Settings\ZU1TN\Desktop\Nommers\K55.txt', 'r') > pr=f.readline() > prys =int(prys) > tp =int(tp) > pr =int(pr) > tp=pr-prys > f.close > tp=str(tp) > print tp > raw_input() > > THX Your post is off-topic for this mailinglist. This mailinglist (python-ideas) is for proposing/discussing ideas for improving/modifying the Python language. For general discussion and questions about using Python, please post to python-list/comp.lang.python instead. It is accessible from either: http://mail.python.org/mailman/listinfo/python-list http://groups.google.com/group/comp.lang.python/topics Further, your question is quite vague in not stating *how* the code isn't working. 
You may wish to read the following guidance before posting to python-list: http://catb.org/esr/faqs/smart-questions.html Finally, here's a less redundant version of your code with two obvious errors fixed: prys = 0 # Windows file paths either use / or \\ or raw string literals f = open('C:/Documents and Settings/ZU1TN/Desktop/Nommers/K55.txt', 'r') pr = f.readline() pr = int(pr) tp = pr - prys # which simplifies to: tp = pr f.close() # you were missing the parens print tp raw_input() Not offering cheers, Chris -- http://blog.rebertia.com From alexandre.conrad at gmail.com Fri Jul 30 10:28:10 2010 From: alexandre.conrad at gmail.com (Alexandre Conrad) Date: Fri, 30 Jul 2010 10:28:10 +0200 Subject: [Python-ideas] str.split with empty separator In-Reply-To: <4C51AC3E.8080201@mrabarnett.plus.com> References: <4C51AC3E.8080201@mrabarnett.plus.com> Message-ID: 2010/7/29 MRAB : > Shouldn't it be this: > >>>> 'banana'.split('') > ['', 'b', 'a', 'n', 'a', 'n', 'a', ''] Humm... I believe that it may be correct. It's not what I was expecting, but it does look accurate. -- Alex twitter.com/alexconrad From ncoghlan at gmail.com Fri Jul 30 12:39:13 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 30 Jul 2010 20:39:13 +1000 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <4C5204BE.8020805@netwok.org> References: <20100729151116.219d6765@pitrou.net> <4C5204BE.8020805@netwok.org> Message-ID: On Fri, Jul 30, 2010 at 8:46 AM, ?ric Araujo wrote: > Thank you for explaining generic functions so clearly. > > Is there a good module out there implementing them without ?crazily > complex overloading schemes?? I'm not sure. Most of my exposure to generic functions is through PJE and he's a big fan of pushing them to their limits (hence RuleDispatch and PEAK-rules). 
There is an extremely bare bones implementation used internally by pkgutil's emulation of the standard import process, but Guido has said we shouldn't document or promote that in any official way without a PEP (cf. the simple proposal in http://bugs.python.org/issue5135 and PJE's previous more comprehensive proposal in PEP 3124). As others have explained more clearly than I did, generic functions work better than magic methods when the same basic operation (e.g. pretty printing, JSON serialisation) is common to many object types, but the details may vary between applications, or even within a single application. By giving the application more control over how different types are handled (through the generic functions' separate type registries) it is much easier to have context dependent behaviour, while still fairly easily sharing code in cases where it makes sense. E.g. to use Alex Gaynor's example of attendee list serialisation and the issue 5135 syntax: @functools.simplegeneric def json_unauthenticated(obj): return json.dumps(obj) # Default to a basic dumps() call @functools.simplegeneric def json_authenticated(obj): return json_unauthenticated(obj) # Default to being the same as unauthenticated info @json_unauthenticated.register(EventAttendees) def attendee_titles(attendees): return json.dumps([attendee.title for attendee in attendees]) @json_authenticated.register(EventAttendees) def attendee_details(attendees): return json.dumps([attendee.full_details() for attendee in attendees]) (Keep in mind that I don't use JSON, so there are likely plenty of details wrong with the above, but it should give a basic idea of what generic functions are designed to support). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From 8mayday at gmail.com Fri Jul 30 12:42:18 2010 From: 8mayday at gmail.com (Andrey Popp) Date: Fri, 30 Jul 2010 14:42:18 +0400 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <4C5204BE.8020805@netwok.org> References: <20100729151116.219d6765@pitrou.net> <4C5204BE.8020805@netwok.org> Message-ID: You can check out my implementation of generic functions and methods in Python [1]. There are no byte code hacks, no frame introspection, support for function and method dispatching by one or more positional arguments. [1]: pypi.python.org/pypi/generic On Fri, Jul 30, 2010 at 2:46 AM, Éric Araujo wrote: > Thank you for explaining generic functions so clearly. > > Is there a good module out there implementing them without “crazily > complex overloading schemes”? > > Regards > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Andrey Popp phone: +7 911 740 24 91 e-mail: 8mayday at gmail.com From merwok at netwok.org Fri Jul 30 14:06:54 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 30 Jul 2010 14:06:54 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729151116.219d6765@pitrou.net> <4C5204BE.8020805@netwok.org> Message-ID: <4C52C05E.5080209@netwok.org> > There is an extremely bare bones implementation used internally by > pkgutil's emulation of the standard import process Ah, I stumbled upon that this week actually, but did not understand how it worked nor why it was useful since there’s only one decorated function and only one registered type. Thanks for pointing it, I may play with it to get a better understanding and see the possibilities.
Regards From merwok at netwok.org Fri Jul 30 14:08:52 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 30 Jul 2010 14:08:52 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729151116.219d6765@pitrou.net> <4C5204BE.8020805@netwok.org> Message-ID: <4C52C0D4.5090801@netwok.org> Thanks Andrey, I’ll play with it when I’ll take time to dive into generic functions. Regards From greg.ewing at canterbury.ac.nz Sat Jul 31 03:17:41 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 31 Jul 2010 13:17:41 +1200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <1280443147.3175.70.camel@localhost.localdomain> References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> Message-ID: <4C5379B5.6030806@canterbury.ac.nz> Antoine Pitrou wrote: > Sure, generic functions as a paradigm appear more powerful, more > decoupled, etc. In this case there's a sense in which using a generic function could be seen as *increasing* coupling. Suppose I write a class Foo, and as a convenience to my users, I want to give it the ability to be json-serialised. If that is done using a generic function, then I need to put a call in my module to register it. But that makes my module dependent on the json-serialising module, even for applications which don't use json at all. The alternative is just to provide the function but don't register it. But using that approach, every application that *does* use json would be responsible for registering all the functions for all the classes that need to be serialised, including those in library modules that it may not be directly aware of. This doesn't seem like a good situation either.
-- Greg From ncoghlan at gmail.com Sat Jul 31 03:31:49 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 31 Jul 2010 11:31:49 +1000 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <4C5379B5.6030806@canterbury.ac.nz> References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> <4C5379B5.6030806@canterbury.ac.nz> Message-ID: On Sat, Jul 31, 2010 at 11:17 AM, Greg Ewing wrote: > Suppose I write a class Foo, and as a convenience to my users, > I want to give it the ability to be json-serialised. If that > is done using a generic function, then I need to put a call > in my module to register it. But that makes my module dependent > on the json-serialising module, even for applications which > don't use json at all. > > The alternative is just to provide the function but don't > register it. But using that approach, every application that > *does* use json would be responsible for registering all the > functions for all the classes that need to be serialised, > including those in library modules that it may not be directly > aware of. This doesn't seem like a good situation either. Hence why most generic function proposals are accompanied by proposals for lazy module import hooks (i.e. delaying the registration until the relevant module is imported). However, the simpler approach is just to recommend that single-dispatch generic functions default to a particular method. "magic method" vs "generic function" isn't actually an either-or decision: it is quite possible to have the latter rely on the former in its default "unrecognised type" implementation, while still providing the type registration infrastructure that allows an application to say "no, I don't want that behaviour in this case, I want to do something different". 
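Nick's point that "magic method vs generic function" is not either-or can be sketched concretely: a registry whose *default* behaviour is to consult `__json__`, so library classes get convenient behaviour while applications can still override it. All names here (`register`, `dumps`, `Point`) are hypothetical illustrations, not a proposed API; the registration half is essentially what `functools.singledispatch` later standardised in Python 3.4.

```python
import json

_registry = {}  # type -> function returning something json.dumps understands

def register(cls, func):
    _registry[cls] = func

def dumps(obj):
    # Application registrations win; the __json__ magic method is
    # merely the default behaviour for unregistered types.
    for cls in type(obj).__mro__:
        if cls in _registry:
            return json.dumps(_registry[cls](obj))
    if hasattr(obj, '__json__'):
        return json.dumps(obj.__json__())
    return json.dumps(obj)

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __json__(self):
        return {'x': self.x, 'y': self.y}

print(dumps(Point(1, 2)))              # falls back to __json__
register(Point, lambda p: [p.x, p.y])  # the application decides otherwise
print(dumps(Point(1, 2)))              # now uses the registration
```

This also answers Greg's coupling concern in part: a library class only needs to define `__json__`, and never has to import or register with the serialising module itself.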
To be honest, there are actually some more features I would want to push for in ABCs (specifically, a public API to view an ABC's type registry, as well as a callback API to be notified of registration changes) before seriously proposing an official generic function implementation in the standard library. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cs at zip.com.au Sat Jul 31 07:59:12 2010 From: cs at zip.com.au (Cameron Simpson) Date: Sat, 31 Jul 2010 15:59:12 +1000 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <4C52177A.7080906@canterbury.ac.nz> References: <4C52177A.7080906@canterbury.ac.nz> Message-ID: <20100731055912.GA10882@cskk.homeip.net> I'm uncomfortable with the __foo__ style proposed. Details and "what I would do" below. On 30Jul2010 12:06, Greg Ewing wrote: | Mike Graham wrote: | >Since there isn't really any magic going on, why use a __foo__ name? | >The majority of __foo__ names are for things you shouldn't reference | >yourself | | To my mind, the main reason is to avoid name clashes. Protocol | methods often may need to be added to just about any class, | and using a __foo__ name greatly reduces the chance of it | coinciding with some pre-existing class-specific method. Might not the adder of a class specific method make the same argument? If they really want a class _specific_ method, ought they not to be using the __foo style, thus avoiding clashes anyway? The __json__ name makes me uncomfortable; to my mind __foo__ names belong to the language in order to implement/override stuff like [], not to a library hook. | Anyway, you don't call it yourself in this case either -- it's | called by the proposed json-serialising framework. I'm curious; what's the special benefit to JSON here? I don't mean JSON is unpopular or horrible, but I can see people going to a __xml__ hook for a proposed XML serialisation framework, and __sql__ for db storage, and ...
I'm doing a little serialisation myself for another purpose. My code gets classes that want serialisation to register themselves with the serialisation module thus: # DB is a NodeDB instance, which can store various objects DB.register_type(class, tobytes, frombytes) where class is the class desiring special serialisation and tobytes and frombytes are callables; tobytes takes an instance of the class and returns the byte serialisation and frombytes does the reverse. No special names needed and no __foo__ special name reservation. Why wouldn't one just extend the json module with a "serialise this" and "unserialise this" type registry? Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ You can listen to what everybody says, but the fact remains that you've got to get out there and do the thing yourself. - Joan Sutherland From ziade.tarek at gmail.com Sat Jul 31 13:50:16 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sat, 31 Jul 2010 13:50:16 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> <4C5379B5.6030806@canterbury.ac.nz> Message-ID: On Sat, Jul 31, 2010 at 3:31 AM, Nick Coghlan wrote: ... > > To be honest, there are actually some more features I would want to > push for in ABCs (specifically, a public API to view an ABC's type > registry, as well as a callback API to be notified of registration > changes) before seriously proposing an official generic function > implementation in the standard library. funny hazard, I was proposing to PEP 3319 authors about having the _abc_registry attribute somehow exposed. do you have an idea on how this could be done without forcing ABC subclasses to have a new public method ? Maybe a separate function ? like >>> from abc import get_registry >>> get_registry(MyAbc) -- Tarek Ziad? 
| http://ziade.org From fuzzyman at voidspace.org.uk Sat Jul 31 14:08:14 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 31 Jul 2010 13:08:14 +0100 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729151116.219d6765@pitrou.net> <4C5204BE.8020805@netwok.org> Message-ID: On 30 July 2010 11:39, Nick Coghlan wrote: > On Fri, Jul 30, 2010 at 8:46 AM, Éric Araujo wrote: > > Thank you for explaining generic functions so clearly. > > > > Is there a good module out there implementing them without "crazily > > complex overloading schemes"? > > I'm not sure. Most of my exposure to generic functions is through PJE > and he's a big fan of pushing them to their limits (hence RuleDispatch > and PEAK-rules). > > There is an extremely bare-bones implementation used internally by > pkgutil's emulation of the standard import process, but Guido has said > we shouldn't document or promote that in any official way without a > PEP (cf. the simple proposal in http://bugs.python.org/issue5135 and > PJE's previous, more comprehensive proposal in PEP 3124). > > As others have explained more clearly than I did, generic functions > work better than magic methods when the same basic operation (e.g. > pretty printing, JSON serialisation) is common to many object types, > but the details may vary between applications, or even within a single > application. By giving the application more control over how different > types are handled (through the generic functions' separate type > registries) it is much easier to have context-dependent behaviour, > while still fairly easily sharing code in cases where it makes sense. > > E.g.
to use Alex Gaynor's example of attendee list serialisation and > the issue 5135 syntax: > > @functools.simplegeneric > def json_unauthenticated(obj): > return json.dumps(obj) # Default to a basic dumps() call > > @functools.simplegeneric > def json_authenticated(obj): > return json_unauthenticated(obj) # Default to being the same as > unauthenticated info > > @json_unauthenticated.register(EventAttendees) > def attendee_titles(attendees): > return json.dumps([attendee.title for attendee in attendees]) > > @json_authenticated.register(EventAttendees) > def attendee_details(attendees): > return json.dumps([attendee.full_details() for attendee in attendees]) > > I really like Alex Gaynor's simple MultiMethod implementation. From: http://alexgaynor.net/2010/jun/26/multimethods-python/ It doesn't have a concept of a default call, but that would be very easy to add. Basic usage is: json_unauthenticated = MultiMethod() @json_unauthenticated.register(EventAttendees) def json_unauthenticated(attendees): return json.dumps([attendee.title for attendee in attendees]) @json_unauthenticated.register(OtherType) def json_unauthenticated(othertypes): return json.dumps(othertypes) And so on. Michael > (Keep in mind that I don't use JSON, so there are likely plenty of > details wrong with the above, but it should give a basic idea of what > generic functions are designed to support). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk -------------- next part -------------- An HTML attachment was scrubbed...
URL: From solipsis at pitrou.net Sat Jul 31 14:10:58 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 31 Jul 2010 14:10:58 +0200 Subject: [Python-ideas] Json object-level serializer References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> <4C5379B5.6030806@canterbury.ac.nz> Message-ID: <20100731141058.311766a0@pitrou.net> On Sat, 31 Jul 2010 11:31:49 +1000 Nick Coghlan wrote: > > However, the simpler approach is just to recommend that > single-dispatch generic functions default to a particular method. > "magic method" vs "generic function" isn't actually an either-or > decision: it is quite possible to have the latter rely on the former > in its default "unrecognised type" implementation, while still > providing the type registration infrastructure that allows an > application to say "no, I don't want that behaviour in this case, I > want to do something different". Ah, right. That sounds much more appealing. > To be honest, there are actually some more features I would want to > push for in ABCs (specifically, a public API to view an ABC's type > registry, as well as a callback API to be notified of registration > changes) before seriously proposing an official generic function > implementation in the standard library. Would be nice, indeed. Regards Antoine. From solipsis at pitrou.net Sat Jul 31 14:12:07 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 31 Jul 2010 14:12:07 +0200 Subject: [Python-ideas] Json object-level serializer References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> <4C5379B5.6030806@canterbury.ac.nz> Message-ID: <20100731141207.242601e7@pitrou.net> On Sat, 31 Jul 2010 13:50:16 +0200 Tarek Ziadé wrote: > On Sat, Jul 31, 2010 at 3:31 AM, Nick Coghlan wrote: > ...
> > > > To be honest, there are actually some more features I would want to > > push for in ABCs (specifically, a public API to view an ABC's type > > registry, as well as a callback API to be notified of registration > > changes) before seriously proposing an official generic function > > implementation in the standard library. > > Funny coincidence: I was proposing to the PEP 3119 authors that the > _abc_registry attribute be > somehow exposed. Rather than exposing the registry object itself (which is an implementation detail), how about exposing lookup operations on this registry? Regards Antoine. From fuzzyman at voidspace.org.uk Sat Jul 31 14:15:13 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 31 Jul 2010 13:15:13 +0100 Subject: [Python-ideas] PEP 3151: Reworking the OS and IO exception hierarchy In-Reply-To: References: <1279740852.3222.38.camel@localhost.localdomain> Message-ID: On 24 July 2010 22:31, Gregory P. Smith wrote: > > On Wed, Jul 21, 2010 at 12:34 PM, Antoine Pitrou wrote: >> >> Hello, >> >> I would like to propose the following PEP for feedback and review. >> Permanent link to up-to-date version with proper HTML formatting: >> http://www.python.org/dev/peps/pep-3151/ >> >> Thank you, >> >> Antoine. > > [...] > +1 on this whole PEP! > > +1 from me too. Michael > The EnvironmentError hierarchy and common errno test code has bothered me > for a while. While I think the namespace pollution concern is valid, I would > suggest adding "Error" to the end of all of the names (your initial proposal > only says "Error" on the end of one of them) as that is consistent with the > bulk of the existing standard exceptions and warnings. They are unlikely to > conflict with anything other than exceptions people have already defined > themselves in any existing code (which could likely be refactored out after > we officially define these).
> > > >> Earlier discussion >> ================== >> >> While this is the first time such a formal proposal is made, the idea >> has received informal support in the past [1]_; both the introduction >> of finer-grained exception classes and the coalescing of OSError and >> IOError. >> >> The removal of WindowsError alone has been discussed and rejected >> as part of another PEP [2]_, but there seemed to be a consensus that the >> distinction with OSError wasn't meaningful. This supports at least its >> aliasing with OSError. >> >> >> Moratorium >> ========== >> >> The moratorium in effect on language builtins means this PEP has little >> chance to be accepted for Python 3.2. >> >> >> Possible alternative >> ==================== >> >> Pattern matching >> ---------------- >> >> Another possibility would be to introduce an advanced pattern matching >> syntax when catching exceptions. For example:: >> >> try: >> os.remove(filename) >> except OSError as e if e.errno == errno.ENOENT: >> pass >> >> Several problems with this proposal: >> >> * it introduces new syntax, which is perceived by the author to be a >> heavier >> change compared to reworking the exception hierarchy >> * it doesn't decrease typing effort significantly >> * it doesn't relieve the programmer from the burden of having to remember >> errno mnemonics >> > > ugh. no. :) That only works well for single exceptions and encourages > less explicit exception types. Exceptions are a class hierarchy; we should > encourage its use rather than encouraging magic type-specific attributes > with conditionals. > > -gps > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- http://www.voidspace.org.uk -------------- next part -------------- An HTML attachment was scrubbed...
URL: From fetchinson at googlemail.com Sat Jul 31 16:24:05 2010 From: fetchinson at googlemail.com (Daniel Fetchinson) Date: Sat, 31 Jul 2010 16:24:05 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: References: <20100729151116.219d6765@pitrou.net> Message-ID: >>> Each individual time this question comes up people tend to react with >>> "oh, that's too complicated and overkill, but magic methods are >>> simple, so let's just define another magic method". The sum total of >>> all those magic methods starts to accumulate into a lot of complexity >>> of its own though :P >> >> I don't agree. __json__ only matters to people who do JSON >> encoding/decoding. Other people can safely ignore it. > > Which is exactly the attitude I was talking about: for each individual > case, people go "oh, I understand magic methods, those are easy". It's > the overall process of identifying the need for and gathering > consensus on magic methods that is unwieldy (and ultimately fails to > scale, leading to non-extensible interfaces by default, with pretty > printing being the classic example, and JSON serialisation the > latest). > >> And I don't see how generic functions bring less cognitive overhead. >> (they actually bring more of it, since most implementations are more >> complicated to begin with) > > Mostly because the fully fledged generic implementations like > PEAK-rules tend to get brought into discussions when they aren't > needed. Single-type generic dispatch is actually so common they gave > it a name: object-oriented programming. All single-type generic > dispatch is about is having a registry for a particular operation that > says "to perform this operation, with objects of this type, use this > function". 
Instead of having a protocol that says "look up this magic > method in the object's own namespace" (which requires a) agreement on > the magic name to use and b) that the original author of the type in > question both knew and cared about the operation the application > developer is interested in) you instead have a protocol that says > "here is a standard mechanism for declaring a type registry for a > function, so you only have to learn how to register a function once". > > Is it really harder for people to learn how to write things like: > > json.dumps.overload(mytype, mytype.to_json) > json.dumps.overload(third_party_type, my_third_party_type_serialiser) > > than it is for them to figure out that implementing a __json__ method > will allow them to change how their object is serialised? (Not to > mention that a __json__ method can only be used via monkey-patching if > the type you want to serialise differently came from a library module > rather than your own code). > > The generic function registration approach is incidentally > discoverable via dir(json.dumps) to see that a function provides the > relevant generic function registration methods. Magic method protocols > can *only* be discovered by reading documentation. > > Function registration is a solved problem, with much better solutions > than the ad hoc YAMM (yet-another-magic-method) approach we currently > use. We just keep getting scared away from the right answer by the > crazily complex overloading schemes that libraries like PEAK-rules > allow. +1 on your entire diagnosis. If peak-rules is too complicated and perhaps unmaintained, then the focus should be on cooking up a better generic function library. The complaints against peak-rules come up frequently enough, and this shows that there is a need for a generic function library because people do use peak-rules.
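(The `overload`-style registration quoted above is easy to prototype. A minimal single-dispatch sketch -- all names here are hypothetical; no such API exists on the real json module:)

```python
def generic(default):
    # Minimal single-dispatch generic function: a closure holding a
    # type -> implementation registry, with the decorated function as
    # the fallback for unregistered types.
    registry = {}

    def wrapper(obj, *args, **kwargs):
        # Walk the MRO so subclasses inherit registrations.
        for cls in type(obj).__mro__:
            if cls in registry:
                return registry[cls](obj, *args, **kwargs)
        return default(obj, *args, **kwargs)

    def overload(cls, func=None):
        # Usable both as wrapper.overload(SomeType, func)
        # and as a decorator: @wrapper.overload(SomeType)
        if func is None:
            def decorate(f):
                registry[cls] = f
                return f
            return decorate
        registry[cls] = func
        return func

    wrapper.overload = overload
    return wrapper

@generic
def dumps(obj):
    return repr(obj)  # fallback for unregistered types

class Attendee:
    def __init__(self, title):
        self.title = title

dumps.overload(Attendee, lambda a: a.title)

print(dumps(Attendee("Dr")))  # -> Dr
print(dumps(42))              # -> 42 (falls back to repr)
```

Note this sketch dispatches on the concrete MRO only, so -- exactly as discussed in this thread -- it would not see ABC registrations.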
The problem is not with the concept but only (perhaps) with this particular implementation (disclaimer: I'm perfectly happy with peak-rules). Cheers, Daniel -- Psss, psss, put it down! - http://www.cafepress.com/putitdown From ncoghlan at gmail.com Sat Jul 31 17:26:20 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 1 Aug 2010 01:26:20 +1000 Subject: [Python-ideas] Exposing the ABC registration graph (was Re: Json object-level serializer) Message-ID: On Sat, Jul 31, 2010 at 10:12 PM, Antoine Pitrou wrote: > On Sat, 31 Jul 2010 13:50:16 +0200 > Tarek Ziadé wrote: >> On Sat, Jul 31, 2010 at 3:31 AM, Nick Coghlan wrote: >> > To be honest, there are actually some more features I would want to >> > push for in ABCs (specifically, a public API to view an ABC's type >> > registry, as well as a callback API to be notified of registration >> > changes) before seriously proposing an official generic function >> > implementation in the standard library. >> >> Funny coincidence: I was proposing to the PEP 3119 authors that the >> _abc_registry attribute be >> somehow exposed. > > Rather than exposing the registry object itself (which is an > implementation detail), how about exposing lookup operations on this > registry? There's a related problem here that ties into one of the complaints I have with pkgutil.simplegeneric: because that decorator relies on MRO traversal in order to obtain a reasonably efficient implementation, it completely ignores any ABC registrations. That's fairly suboptimal, since a comparable chain of "isinstance()" checks *will* respect ABC registrations (it's just horrendously slow and doesn't scale, since the worst-case number of checks increases linearly with the number of branches in the if-elif chain). So I think the idea of query methods in the abc module is a good way to go.
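(The mismatch described here is easy to demonstrate: isinstance() consults the ABC registry, while anything that walks type(obj).__mro__ never sees a virtual base. A short, self-contained illustration, using Python 3 metaclass syntax; the class names are made up:)

```python
import abc

class Serializable(metaclass=abc.ABCMeta):
    pass

class Plain:
    pass

# Virtual subclass registration: no inheritance involved.
Serializable.register(Plain)

p = Plain()
print(isinstance(p, Serializable))      # -> True: honours the registry
print(Serializable in type(p).__mro__)  # -> False: MRO dispatch misses it
```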
It allows the Python implementation freedom in choosing whether to have separate type registries stored on the ABCs themselves, or instead have global registries stored in the abc module. In particular, it allows the interpreter to cache the transitive closure of the ABC graph, such that an application can ask for the set of all objects that implement a given ABC, as well as the set of all ABCs that a given object implements. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ziade.tarek at gmail.com Sat Jul 31 18:17:42 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sat, 31 Jul 2010 18:17:42 +0200 Subject: [Python-ideas] Json object-level serializer In-Reply-To: <20100731141207.242601e7@pitrou.net> References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> <4C5379B5.6030806@canterbury.ac.nz> <20100731141207.242601e7@pitrou.net> Message-ID: On Sat, Jul 31, 2010 at 2:12 PM, Antoine Pitrou wrote: > On Sat, 31 Jul 2010 13:50:16 +0200 > Tarek Ziadé wrote: > >> On Sat, Jul 31, 2010 at 3:31 AM, Nick Coghlan wrote: >> ... >> > >> > To be honest, there are actually some more features I would want to >> > push for in ABCs (specifically, a public API to view an ABC's type >> > registry, as well as a callback API to be notified of registration >> > changes) before seriously proposing an official generic function >> > implementation in the standard library. >> >> Funny coincidence: I was proposing to the PEP 3119 authors that the >> _abc_registry attribute be >> somehow exposed. > > Rather than exposing the registry object itself (which is an > implementation detail), how about exposing lookup operations on this > registry? Sure, but how? Global functions?
From solipsis at pitrou.net Sat Jul 31 19:53:38 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 31 Jul 2010 19:53:38 +0200 Subject: [Python-ideas] Json object-level serializer References: <20100729151116.219d6765@pitrou.net> <1280443147.3175.70.camel@localhost.localdomain> <4C5379B5.6030806@canterbury.ac.nz> <20100731141207.242601e7@pitrou.net> Message-ID: <20100731195338.4dc95c24@pitrou.net> On Sat, 31 Jul 2010 18:17:42 +0200 Tarek Ziadé wrote: > > > > Rather than exposing the registry object itself (which is an > > implementation detail), how about exposing lookup operations on this > > registry? > > Sure, but how? Global functions? Functions of the abc module, yes.
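(One possible shape for such lookup functions, probing a known set of ABCs through the public isinstance() machinery rather than touching any private registry. A sketch only: the helper name is hypothetical, and a real abc-module API could answer this directly from the registry instead of probing. Written with the modern collections.abc location; at the time of this thread these ABCs lived in the collections module itself:)

```python
from collections import abc as cabc

def implemented_abcs(obj, candidates=(cabc.Hashable, cabc.Iterable,
                                      cabc.Sized, cabc.Container)):
    # Hypothetical helper: report which of a known set of ABCs the
    # object satisfies. isinstance() consults each ABC's registry and
    # __subclasshook__, so virtual subclasses are included.
    return {a for a in candidates if isinstance(obj, a)}

print(sorted(a.__name__ for a in implemented_abcs([1, 2, 3])))
# -> ['Container', 'Iterable', 'Sized']  (lists are unhashable)
print(sorted(a.__name__ for a in implemented_abcs(42)))
# -> ['Hashable']
```

The limitation is visible right away: without access to the registry itself, the caller must already know which ABCs to ask about, which is exactly the gap the proposed abc-module functions would close.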