From eliswilson at hushmail.com Wed May 1 01:27:44 2013 From: eliswilson at hushmail.com (eliswilson at hushmail.com) Date: Tue, 30 Apr 2013 19:27:44 -0400 Subject: [Python-ideas] Biggest Fake Conference in Computer Science Message-ID: <20130430232744.8989DE6736@smtp.hushmail.com> Biggest Fake Conference in Computer Science We are researchers from different parts of the world and conducted a study on the world's biggest bogus computer science conference WORLDCOMP http://sites.google.com/site/worlddump1 organized by Prof. Hamid Arabnia from University of Georgia, USA. We submitted a fake paper to WORLDCOMP 2011 and again (the same paper with a modified title) to WORLDCOMP 2012. This paper had numerous fundamental mistakes. Sample statements from that paper include:

(1). Binary logic is fuzzy logic and vice versa
(2). Pascal developed fuzzy logic
(3). Object oriented languages do not exhibit any polymorphism or inheritance
(4). TCP and IP are synonyms and are part of OSI model
(5). Distributed systems deal with only one computer
(6). Laptop is an example for a super computer
(7). Operating system is an example for computer hardware

Also, our paper did not express any conceptual meaning. However, it was accepted both times without any modifications (and without any reviews) and we were invited to submit the final paper and a payment of a $500+ fee to present the paper. We decided to use the fee for better purposes than making Prof. Hamid Arabnia richer. After that, we received a few reminders from WORLDCOMP to pay the fee but we never responded. This fake paper is different from the two fake papers already published in WORLDCOMP (see https://sites.google.com/site/worlddump4 for details). We MUST say that you should look at the above website if you have any thoughts of participating in WORLDCOMP. DBLP and other indexing agencies have stopped indexing WORLDCOMP's proceedings since 2011 due to its fakeness.
See http://www.informatik.uni-trier.de/~ley/db/conf/icai/index.html for one of the conferences of WORLDCOMP and notice that there is no listing after 2010. See Section 2 of http://sites.google.com/site/dumpconf for comments from well-known researchers about WORLDCOMP. The status of your WORLDCOMP papers can be changed from scientific to other (i.e., junk or non-technical) at any time. Better not to have a paper than to have it in WORLDCOMP and spoil your resume and peace of mind forever! Our study revealed that WORLDCOMP is a money-making business for Prof. Hamid Arabnia, run under the mask of the University of Georgia. He is throwing out a small chunk of that money (around 20 dollars per paper published in WORLDCOMP's proceedings) to his puppet (Mr. Ashu Solo or A.M.G. Solo) who publicizes WORLDCOMP and also defends it at various forums, using fake/anonymous names. The puppet uses fake names and defames other conferences to divert traffic to WORLDCOMP. He also makes anonymous phone calls and threatens the critics of WORLDCOMP (see Item 7 of Section 5 of the above website). That is, the puppet does his best to get a maximum number of papers published at WORLDCOMP to get more money into his (and Prof. Hamid Arabnia's) pockets. Prof. Hamid Arabnia plays a lot of tricks. For example, he appeared in a newspaper to fool the public, claiming to be a victim of a cyber-attack (see Item 8 in Section 5 of the above website). Monte Carlo Resort (the venue of WORLDCOMP for more than 10 years, until 2012) has refused to provide the venue for WORLDCOMP'13 because of fears of its image being tarnished by WORLDCOMP's fraudulent activities. That is why WORLDCOMP'13 is taking place at a different resort. WORLDCOMP will not be held after 2013. The draft paper submission deadline is over but there are still no committee members, no reviewers, and no conference Chairman. The only contact detail available on WORLDCOMP's website is an email address! We ask Prof.
Hamid Arabnia to publish all reviews for all the papers (after blocking identifiable details) since the 2000 conference. Reveal the names and affiliations of all the reviewers (for each year) and how many papers each reviewer reviewed on average. We also ask him to look at the Open Challenge (Section 6) at https://sites.google.com/site/moneycomp1 and respond if he has any professional values. Sorry for posting to multiple lists. Spreading the word is the only way to stop this bogus conference. Please forward this message to other mailing lists and people. We are shocked by Prof. Hamid Arabnia and his puppet's activities at http://worldcomp-fake-bogus.blogspot.com Search Google using the keyword worldcomp fake for additional links. From robertc at robertcollins.net Wed May 1 02:10:53 2013 From: robertc at robertcollins.net (Robert Collins) Date: Wed, 1 May 2013 12:10:53 +1200 Subject: [Python-ideas] Biggest Fake Conference in Computer Science In-Reply-To: <20130430232744.8989DE6736@smtp.hushmail.com> References: <20130430232744.8989DE6736@smtp.hushmail.com> Message-ID: Please stop with the spam. We get the message already. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed May 1 02:38:11 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 1 May 2013 10:38:11 +1000 Subject: [Python-ideas] Make traceback messages aware of line continuation In-Reply-To: References: <20130430131216.339fbc98@fsol> Message-ID: On Wed, May 1, 2013 at 12:41 AM, Felipe Cruz wrote: > +1 > > It would be great to have this feature. As Terry explained on the tracker issue, printing the entire expression is hugely problematic, as it means that long definition displays (e.g. for dictionaries or lists) will show a huge amount of noise, rather than the subexpression that actually triggered the exception.
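To make the point concrete, here is a small hypothetical sketch (the dict and names below are invented for illustration, not taken from the tracker issue): when a failure happens inside one long expression, the traceback can only reference the whole line, whereas splitting the expression into named subexpressions lets the traceback point directly at the failing step.

```python
# Hypothetical illustration: a failure buried in one long expression
# versus the same logic split into "redundant" assignment statements.

data = {"numerator": 1, "denominator": 0}

# Version 1: the traceback for the ZeroDivisionError can only reference
# this entire line, noise included.
try:
    result = (data["numerator"] / data["denominator"]) + len(str(data)) - 2
except ZeroDivisionError as exc:
    print("one-liner failed:", exc)

# Version 2: each step gets its own line, so the traceback points
# directly at the division that actually triggered the exception.
try:
    ratio = data["numerator"] / data["denominator"]  # <- traceback points here
    padding = len(str(data))
    result = ratio + padding - 2
except ZeroDivisionError as exc:
    print("split version failed:", exc)
```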
It is far better to break up the code to avoid long lines in the first place, even if that involves creating "redundant" assignment statements for subexpressions. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From haoyi.sg at gmail.com Wed May 1 03:12:55 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Tue, 30 Apr 2013 18:12:55 -0700 Subject: [Python-ideas] A nice __repr__ for the ast.* classes? Message-ID: Wouldn't it be nice if this >>> import ast >>> print repr(ast.parse("(1 + 1)").body[0].value) <_ast.BinOp object at 0x0000000001E94B38> printed something more useful? >>> print repr(ast.parse("(1 + 1)").body[0].value) BinOp(left=Num(n=1), op=Add(), right=Num(n=1)) I've been doing some work on macropy , which uses the ast.* classes extensively, and it's annoying that we have to resort to dirty-tricks like monkey-patching the AST classes (for CPython 2.7) or even monkey-patching __builtin__.repr (to get it working on PyPy) just to get eval(repr(my_ast)) == my_ast to hold true. And a perfectly good solution already exists in the ast.dump() method, too! (It would also be nice if "==" did a structural comparison on the ast.* classes too, but that's a different issue). -Haoyi -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed May 1 03:21:12 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 30 Apr 2013 18:21:12 -0700 Subject: [Python-ideas] A nice __repr__ for the ast.* classes? In-Reply-To: References: Message-ID: But do you really want it to print the entire parse tree even if it represents several pages of code? On Tue, Apr 30, 2013 at 6:12 PM, Haoyi Li wrote: > Wouldn't it be nice if this > >>>> import ast >>>> print repr(ast.parse("(1 + 1)").body[0].value) > <_ast.BinOp object at 0x0000000001E94B38> > > printed something more useful? 
> >>>> print repr(ast.parse("(1 + 1)").body[0].value) > BinOp(left=Num(n=1), op=Add(), right=Num(n=1)) > > I've been doing some work on macropy, which uses the ast.* classes > extensively, and it's annoying that we have to resort to dirty-tricks like > monkey-patching the AST classes (for CPython 2.7) or even monkey-patching > __builtin__.repr (to get it working on PyPy) just to get > > eval(repr(my_ast)) == my_ast > > to hold true. And a perfectly good solution already exists in the ast.dump() > method, too! (It would also be nice if "==" did a structural comparison on > the ast.* classes too, but that's a different issue). > > -Haoyi > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Wed May 1 03:33:10 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 01 May 2013 11:33:10 +1000 Subject: [Python-ideas] A nice __repr__ for the ast.* classes? In-Reply-To: References: Message-ID: <518070D6.5090408@pearwood.info> On 01/05/13 11:21, Guido van Rossum wrote: > But do you really want it to print the entire parse tree even if it > represents several pages of code? Large dicts have the same problem. I can't tell you the number of times I've printed an apparently innocent dict that happened to have builtins in it. I don't think there's any good solution to this. My feeling is that the usefulness of small {dicts | parse trees} having a nice repr outweighs the inconvenience of large ones having a big repr, but I would understand if others had a different opinion. -- Steven From haoyi.sg at gmail.com Wed May 1 03:35:06 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Tue, 30 Apr 2013 18:35:06 -0700 Subject: [Python-ideas] A nice __repr__ for the ast.* classes? In-Reply-To: References: Message-ID: Isn't that the distinction between repr() and str()? 
That repr() is generally (to a greater extent) meant to return eval()-able code, while str() is just something nice to look at. I don't think the # of pages it outputs should really matter, if the thing you are printing really is that big. It won't be much bigger than printing big lists or dicts, and we happily let those cause the terminal to scroll for minutes at a time if we accidentally print them. The default behavior (e.g. when you print() it or string-interpolate it) would still give you something short and nice to look at. Presumably when someone called repr() instead of str(), he was hoping for some sort of eval()-able code snippet. On Tue, Apr 30, 2013 at 6:21 PM, Guido van Rossum wrote: > But do you really want it to print the entire parse tree even if it > represents several pages of code? > > On Tue, Apr 30, 2013 at 6:12 PM, Haoyi Li wrote: > > Wouldn't it be nice if this > > > >>>> import ast > >>>> print repr(ast.parse("(1 + 1)").body[0].value) > > <_ast.BinOp object at 0x0000000001E94B38> > > > > printed something more useful? > > > >>>> print repr(ast.parse("(1 + 1)").body[0].value) > > BinOp(left=Num(n=1), op=Add(), right=Num(n=1)) > > > > I've been doing some work on macropy, which uses the ast.* classes > > extensively, and it's annoying that we have to resort to dirty-tricks > like > > monkey-patching the AST classes (for CPython 2.7) or even monkey-patching > > __builtin__.repr (to get it working on PyPy) just to get > > > > eval(repr(my_ast)) == my_ast > > > > to hold true. And a perfectly good solution already exists in the > ast.dump() > > method, too! (It would also be nice if "==" did a structural comparison > on > > the ast.* classes too, but that's a different issue). 
> > > > -Haoyi > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From graffatcolmingov at gmail.com Wed May 1 04:49:51 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Tue, 30 Apr 2013 22:49:51 -0400 Subject: [Python-ideas] Biggest Fake Conference in Computer Science In-Reply-To: References: <20130430232744.8989DE6736@smtp.hushmail.com> Message-ID: I will block him in a few minutes. On Apr 30, 2013 8:11 PM, "Robert Collins" wrote: > Please stop with the spam. We get the message already. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed May 1 05:51:36 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 1 May 2013 13:51:36 +1000 Subject: [Python-ideas] A nice __repr__ for the ast.* classes?
In-Reply-To: References: Message-ID: On Wed, May 1, 2013 at 11:35 AM, Haoyi Li wrote: > Isn't that the distinction between repr() and str()? That repr() is > generally (to a greater extent) meant to return eval()-able code, while > str() is just something nice to look at. The stronger distinction is that repr() is ideally never ambiguous about the corresponding value, while str() might be. For example:

>>> "1"
'1'
>>> 1
1
>>> str("1")
'1'
>>> str(1)
'1'

That desire for an unambiguous repr for each value is why CPython defaults to the "<module.Class object at 0x...>" form when a type doesn't define anything more specific. Having repr support eval is certainly nice when practical, but it's far from a requirement. Even the desire for a unique repr-per-value isn't a guarantee, since some types *do* impose a size limit. > I don't think the # of pages it outputs should really matter, if the thing > you are printing really is that big. It won't be much bigger than printing > big lists or dicts, and we happily let those cause the terminal to scroll > for minutes at a time if we accidentally print them. The default behavior > (e.g. when you print() it or string-interpolate it) would still give you > something short and nice to look at. Presumably when someone called repr() > instead of str(), he was hoping for some sort of eval()-able code snippet. In this particular case, I'm also inclined to favour the approach of using ast.dump as the repr for AST nodes. Call it +0. Adding support for depth limiting and peer node limiting to ast.dump (with missing nodes replaced with "...") would also be neat. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From graffatcolmingov at gmail.com Wed May 1 05:51:01 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Tue, 30 Apr 2013 23:51:01 -0400 Subject: [Python-ideas] Biggest Fake Conference in Computer Science In-Reply-To: References: <20130430232744.8989DE6736@smtp.hushmail.com> Message-ID: Whoops. Thought this was still the code-quality mailing list. He's banned there at least.
On Apr 30, 2013 10:49 PM, "Ian Cordasco" wrote: > I will block him in a few minutes. > On Apr 30, 2013 8:11 PM, "Robert Collins" > wrote: > >> Please stop with the spam. We get the message already. >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter at norvig.com Wed May 1 10:11:10 2013 From: peter at norvig.com (Peter Norvig) Date: Wed, 1 May 2013 01:11:10 -0700 Subject: [Python-ideas] docstring in decorator? Message-ID: We've all gotten used to the conventions for docstrings laid out in PEP 257. But I got to wondering: if we had had decorators back in the 20th century, would docstrings be in a decorator? I found out that PEP 318 says "Guido felt ... it was quite possible that a 'docstring' decorator could help move the docstring to outside the function body." Is this a good idea? Compare:

Current practice:

    def square(x):
        """Compute the square of the number x.
        For example, square(3) == (3 * 3) == 9.
        """
        return x * x

Docstring in decorator:

    @doc("""Compute the square of the number x.
    For example, square(3) == (3 * 3) == 9.""")
    def square(x): return x * x

Advantages of this approach:

* Matches practice of other languages, such as Java and C++, where (by most conventions) comments precede the first line of the function definition.
* If you have a short function and want to write the header and body on one line, you can, and still have a docstring.
* At least for top-level functions, we don't need the complicated rules (24-line-long 'trim' function from PEP 257) to determine what to do with leading blanks, because there are no leading blanks. (For nested functions we still have the complication.)
* The docstring can be an expression -- it need not be a literal string. For example, we could apply a localization/internationalization function to each string. (I know this could also be done by a tool that hunts through modules and looks for docstrings, but the decorator could do it at definition time without need of an external tool.)
* The docstring could be several comma-separated parts which would be ' '.join-ed together, as in print, making it easier to construct an interesting docstring: @doc("Compute the square", version_str, author_str)
* Alternately, doc could take keyword args and do a setattr for each one: @doc("Compute the square", version="1.1", author="GvR")

Disadvantage:

* If it ain't broke, don't fix it.

Recommendation: I'm not proposing any change to Python; no proposal is necessary because if people like this they can easily implement it themselves. (I am interested in people's reactions.)

Implementation:

    def doc(*docstring_parts, **kwargs):
        """Return a decorator that attaches a docstring."""
        def f(fn):
            docstr = ' '.join(map(str, docstring_parts))
            fn.__doc__ = docstr
            for attr in kwargs:
                setattr(fn, attr, kwargs[attr])
            return fn
        return f

-------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Wed May 1 11:00:47 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 1 May 2013 12:00:47 +0300 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: <517D4FEA.9090605@nedbatchelder.com> References: <517D4FEA.9090605@nedbatchelder.com> Message-ID: On Sun, Apr 28, 2013 at 7:35 PM, Ned Batchelder wrote: > If someone understood the entire discussion well enough to apply filters, > etc, then we'd already have reached an agreement. Perfect summary. Starting from this point, the problem I am trying to solve is to share this line of coming to agreement with the other people. The process of writing summaries or repeating previous arguments to find out if a consensus is reached is very exhausting.
If in real life I can see a person and associate his position with his image, in a mailing list it is just a continuous and contradictory flow of information where you cannot track people's state. The role of a filter or any other tool is thus twofold:

1. track and name aspects of the discussed topic
2. track people's positions related to these aspects to find out if you understand their objections

I am not saying that mailing lists should be replaced, but perhaps somebody knows a parallel tool (or process) to help with this stuff. I once used piratepad in the hope of writing summaries and tracking important/big topics, but copy/paste takes a lot of time, and it is very hard to track multiple aspects in parallel in a single pad or in multiple pads. It might be convenient if used collaboratively, but again - before a proper remote collaboration practice could be developed, an experimental period of testing this process in teamwork is required. Anti Off-Topic: This can be considered an idea for improvement of the PEP process (in addition to the idea of providing a feedback channel on PEP pages). -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed May 1 11:06:36 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 1 May 2013 19:06:36 +1000 Subject: [Python-ideas] docstring in decorator? In-Reply-To: References: Message-ID: On Wed, May 1, 2013 at 6:11 PM, Peter Norvig wrote: > Docstring in decorator: > > @doc("""Compute the square of the number x. > For example, square(3) == (3 * 3) == 9.""") > def square(x): return x * x > Disadvantage: > > * If it ain't broke, don't fix it. * The signature becomes harder to read That said, I think the main advantage of using the decorator is the ability to more easily set the docstring *programmatically* (which you did list as a benefit).
I've been thinking we may want a "@inheritdocs" class decorator, too (which would replace any "__doc__ is None" entries in methods with the docs from the parent methods). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From techtonik at gmail.com Wed May 1 11:15:01 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 1 May 2013 12:15:01 +0300 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: References: Message-ID: On Sun, Apr 28, 2013 at 9:43 PM, Terry Jan Reedy wrote: > On 4/28/2013 11:37 AM, anatoly techtonik wrote: > >> I find it really hard to track proposals, ideas and various deviations >> in mailing lists, which is especially true for lists such as >> python-ideas. I bet other people experience this problem too. The >> typical scenario: >> >> 1. You make a proposal >> 2. The discussion continues >> 3. Part of the discussion is hijacked >> 4. Another part brings a problem you haven't seen >> 5. You don't have time to investigate the problem >> 6. Discussion continues >> 7. Thread quickly gets out of scope of daily emails >> 8. Contact lost >> >> Several weeks later you remember about the proposal: >> >> 9. You open the original proposal to notice a small novel >> 10. You start to reread >> 11. Got confused >> 13. Recall the details >> 14. Find a way out from irrelevant deviation >> 15. Encounter the problem >> 16. Spend what is left to investigate the problem >> 17. Run out of time >> >> The major problem I have is steps 9-15. Sometimes these take most of >> the time. What would help to make all the collaboration here more >> productive are colored view/filters (summaries) for discussions. It >> would work like so: >> >> 00. The discussion is laid out as a single page >> > > This is what the PEP process is about. Anyone can summarize an idea as a > proto-pep either initially or after preliminary discussion. Objections and > unresolved issues are part of a pep.
Revisions and reposting are part of > the process. I thought about writing PEPs, but my CPU cycles and Memory are too limited to support current PEP process. I'd be happy to run a Stackless version of it, which can be paralleled, suspended or resumed on a different humanware. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Wed May 1 11:52:13 2013 From: masklinn at masklinn.net (Masklinn) Date: Wed, 1 May 2013 11:52:13 +0200 Subject: [Python-ideas] docstring in decorator? In-Reply-To: References: Message-ID: On 2013-05-01, at 11:06 , Nick Coghlan wrote: > On Wed, May 1, 2013 at 6:11 PM, Peter Norvig wrote: >> Docstring in decorator: >> >> @doc("""Compute the square of the number x. >> For example, square(3) == (3 * 3) == 9.""") >> def square(x): return x * x >> Disadvantage: >> >> * If it ain't broke, don't fix it. > > * The signature becomes harder to read > > That said, I think the main advantage of using the decorator is the > ability to more easily set the docstring *programmatically* (which you > did list as a benefit). I've been thinking we may want a > "@inheritdocs" class decorator, too (which would replace any "__doc__ > is None" entries in methods with the docs from the parent methods). Could make sense, considering there's already functools.update_wrapper which does that, it could be used for more decorators. As an aside, isn't help() supposed to use __doc__? Because I tried using update_wrapper() on a partial() to give it the docstring of the wrapped partially applied function, but help() still yields partial's documentation and ignores the (correctly set, I checked) __doc__. Which makes partial() much less interesting across package boundaries since library users can be expected to lookup the "online" documentation of exposed or returned objects. help()'s doc does not mention how it generates help pages, pydoc's doc page doesn't either. 
From solipsis at pitrou.net Wed May 1 12:22:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 12:22:52 +0200 Subject: [Python-ideas] docstring in decorator? References: Message-ID: <20130501122252.02aa5fa8@fsol> On Wed, 1 May 2013 19:06:36 +1000 Nick Coghlan wrote: > On Wed, May 1, 2013 at 6:11 PM, Peter Norvig wrote: > > Docstring in decorator: > > > > @doc("""Compute the square of the number x. > > For example, square(3) == (3 * 3) == 9.""") > > def square(x): return x * x > > Disadvantage: > > > > * If it ain't broke, don't fix it. > > * The signature becomes harder to read > > That said, I think the main advantage of using the decorator is the > ability to more easily set the docstring *programmatically* (which you > did list as a benefit). I've been thinking we may want a > "@inheritdocs" class decorator, too (which would replace any "__doc__ > is None" entries in methods with the docs from the parent methods). Or perhaps pydoc should be smarter and walk the __mro__ until it finds a __doc__. Regards Antoine. From techtonik at gmail.com Wed May 1 12:52:53 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 1 May 2013 13:52:53 +0300 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: <87fvyawl86.fsf@uwakimon.sk.tsukuba.ac.jp> References: <517D4FEA.9090605@nedbatchelder.com> <87fvyawl86.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, Apr 28, 2013 at 9:42 PM, Stephen J. Turnbull wrote: > > If he had a specific proposal to adopt an existing workflow tool that > could be just plugged in, that would be on-topic (for lack of an open- > subscription python-cabal list).[1] > Oh no - the day is lost. http://en.wikipedia.org/wiki/Cabal - it's a whole new English wor(l)d. AOT: I'd open a Python-Cabal room in my city to work out on secret plans like online editor for docs.python.org. Footnotes: > [1] open-cabal is an oxymoron, of course. And TINC. Of course. 
;-) > TINCerers are lurking in the depths of strictly secret pydotorg list. =) -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Wed May 1 15:35:09 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 01 May 2013 22:35:09 +0900 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: References: Message-ID: <87vc72u8ky.fsf@uwakimon.sk.tsukuba.ac.jp> anatoly techtonik writes: > I thought about writing PEPs, but my CPU cycles and Memory are too > limited to support current PEP process. I'd be happy to run a > Stackless version of it, which can be paralleled, suspended or > resumed on a different humanware. There are multiple examples of all of the above in the current stack. From techtonik at gmail.com Wed May 1 16:56:28 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 1 May 2013 17:56:28 +0300 Subject: [Python-ideas] chunks implementation Message-ID: 16+ hours work.

print(list( chunks('sadfdfa', 3) ))
print(list( chunks(range(8), 3) ))
print(list( chunks([1,2,3,4,5,7], 3) ))

['sad', 'fdf', 'a']
[[0, 1, 2], [3, 4, 5], [6, 7]]
[[1, 2, 3], [4, 5, 7]]

Is it good? -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: chunks.py Type: application/octet-stream Size: 1610 bytes Desc: not available URL: From solipsis at pitrou.net Wed May 1 17:09:23 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 17:09:23 +0200 Subject: [Python-ideas] chunks implementation References: Message-ID: <20130501170923.58f20705@fsol> On Wed, 1 May 2013 17:56:28 +0300 anatoly techtonik wrote: > 16+ hours work.
> > print(list( chunks('sadfdfa', 3) )) > print(list( chunks(range(8), 3) )) > print(list( chunks([1,2,3,4,5,7], 3) )) As long as you are not willing to sign a contributor's agreement, there's no point posting any code snippets or patches here. If you want advice about your personal work, it is off-topic for python-ideas, please use python-list instead. Thank you, Antoine. From abarnert at yahoo.com Wed May 1 18:03:30 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 1 May 2013 09:03:30 -0700 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: <87vc72u8ky.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87vc72u8ky.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On May 1, 2013, at 6:35, "Stephen J. Turnbull" wrote: > anatoly techtonik writes: > >> I thought about writing PEPs, but my CPU cycles and Memory are too >> limited to support current PEP process. I'd be happy to run a >> Stackless version of it, which can be paralleled, suspended or >> resumed on a different humanware. > > There are multiple examples of all of the above in the current stack. And, since a PEP is basically just a text file with some very simple formatting, if you want to collaborate with one or more other people, almost any collaboration system you want will work fine. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From techtonik at gmail.com Wed May 1 23:17:55 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 2 May 2013 00:17:55 +0300 Subject: [Python-ideas] chunks implementation In-Reply-To: <20130501170923.58f20705@fsol> References: <20130501170923.58f20705@fsol> Message-ID: On Wed, May 1, 2013 at 6:09 PM, Antoine Pitrou wrote: > On Wed, 1 May 2013 17:56:28 +0300 > anatoly techtonik > wrote: > > 16+ hours work. 
> > > > print(list( chunks('sadfdfa', 3) )) > > print(list( chunks(range(8), 3) )) > > print(list( chunks([1,2,3,4,5,7], 3) )) > > As long as you are not willing to sign a contributor's agreement, > there's no point posting any code snippets or patches here. > This code is in public domain. What's wrong with that? I can say what's wrong with Python CLA which basically relicenses code under Apache 2.0 license to "whatever license PSF wants", which is pretty illegal (i.e. doesn't work) as I see it. But that's really offtopic. I've posted a question to the python-legal-sig related to documentation and Wikipedia. I.e. Wikipedia doesn't require you to sign CLA, so let's allow them to sort it out first. You have my confirmation as an author that the code is in public domain. You can use MIT license if there is a problem with public domain works not made by NASA in US. You can ask me about patents and I say that I don't own and don't aware of any patents related to this code. What's wrong with community process here? I am here. I am not dead yet. If PSF is afraid of something - I can answer publicly. Just tell me what's wrong with that. What is this deal with signing these papers? Nobody understands the outcomings, nobody can say in public how it works, but everybody follows the procedure. Is this the modern behavior of Python hackers praised by Paul Graham. To me it looks like a lemming behavior liked by corporations and controlling organizations. May I opt-out from be a lemming in Python community as I see it and still have an ability to contribute? Sorry, my SSD or Vista or some software had just failed and I am a little bit on the edge for losing a partition with experimental code and a lot of valuable notes for more than half of the year, so I am not filtering that I write, and that's also the reason I probably won't be able to support the discussion in upcoming few weeks. 
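The chunks.py attachment was scrubbed from the archive, but the printed results quoted in the thread can be reproduced with a short sketch. This is a guess at the interface, not Anatoly's actual code; the "trick" it assumes is slicing strings directly so chunks stay strings, while other iterables are materialized as lists:

```python
def chunks(iterable, size):
    """Yield successive size-sized pieces of iterable.

    Strings are sliced directly, so each chunk stays a string; any
    other iterable is turned into a list first, so chunks are lists.
    """
    seq = iterable if isinstance(iterable, str) else list(iterable)
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


print(list(chunks('sadfdfa', 3)))           # ['sad', 'fdf', 'a']
print(list(chunks(range(8), 3)))            # [[0, 1, 2], [3, 4, 5], [6, 7]]
print(list(chunks([1, 2, 3, 4, 5, 7], 3)))  # [[1, 2, 3], [4, 5, 7]]
```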
If you want advice about your personal work, it is off-topic for > python-ideas, please use python-list instead. > There is a trick in this code which I believe is crucial to understanding why this code is not in stdlib yet. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed May 1 23:23:59 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 14:23:59 -0700 Subject: [Python-ideas] chunks implementation In-Reply-To: References: <20130501170923.58f20705@fsol> Message-ID: EVERYBODY, STOP THIS DISCUSSION. IT IS NOT PRODUCTIVE. IT IS OFF-TOPICS FOR THIS LIST. Anatoly, the next time you bring this up on any list you WILL be banned. This is your last warning. (But, Antoine, better sit on your hands than provoke him. :-) On Wed, May 1, 2013 at 2:17 PM, anatoly techtonik wrote: > On Wed, May 1, 2013 at 6:09 PM, Antoine Pitrou wrote: >> >> On Wed, 1 May 2013 17:56:28 +0300 >> anatoly techtonik >> wrote: >> > 16+ hours work. >> > >> > print(list( chunks('sadfdfa', 3) )) >> > print(list( chunks(range(8), 3) )) >> > print(list( chunks([1,2,3,4,5,7], 3) )) >> >> As long as you are not willing to sign a contributor's agreement, >> there's no point posting any code snippets or patches here. > > > This code is in public domain. What's wrong with that? I can say what's > wrong with Python CLA which basically relicenses code under Apache 2.0 > license to "whatever license PSF wants", which is pretty illegal (i.e. > doesn't work) as I see it. But that's really offtopic. I've posted a > question to the python-legal-sig related to documentation and Wikipedia. > I.e. Wikipedia doesn't require you to sign CLA, so let's allow them to sort > it out first. > > You have my confirmation as an author that the code is in public domain. You > can use MIT license if there is a problem with public domain works not made > by NASA in US. 
You can ask me about patents and I say that I don't own and > don't aware of any patents related to this code. What's wrong with community > process here? I am here. I am not dead yet. If PSF is afraid of something - > I can answer publicly. Just tell me what's wrong with that. What is this > deal with signing these papers? Nobody understands the outcomings, nobody > can say in public how it works, but everybody follows the procedure. Is this > the modern behavior of Python hackers praised by Paul Graham. To me it looks > like a lemming behavior liked by corporations and controlling organizations. > May I opt-out from be a lemming in Python community as I see it and still > have an ability to contribute? > > Sorry, my SSD or Vista or some software had just failed and I am a little > bit on the edge for losing a partition with experimental code and a lot of > valuable notes for more than half of the year, so I am not filtering that I > write, and that's also the reason I probably won't be able to support the > discussion in upcoming few weeks. > >> If you want advice about your personal work, it is off-topic for >> python-ideas, please use python-list instead. > > > There is a trick in this code which I believe is crucial to understanding > why this code is not in stdlib yet. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From techtonik at gmail.com Thu May 2 09:08:47 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 2 May 2013 10:08:47 +0300 Subject: [Python-ideas] chunks implementation In-Reply-To: References: <20130501170923.58f20705@fsol> Message-ID: On Thu, May 2, 2013 at 12:23 AM, Guido van Rossum wrote: > EVERYBODY, STOP THIS DISCUSSION. IT IS NOT PRODUCTIVE. IT IS > OFF-TOPICS FOR THIS LIST. > > Anatoly, the next time you bring this up on any list you WILL be > banned. 
This is your last warning. (But, Antoine, better sit on your > hands than provoke him. :-) I am an animal. I canot resist when people feed me. Sorry. :( -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Thu May 2 09:13:20 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 2 May 2013 10:13:20 +0300 Subject: [Python-ideas] chunks implementation In-Reply-To: References: <20130501170923.58f20705@fsol> Message-ID: On Thu, May 2, 2013 at 12:23 AM, Guido van Rossum wrote: > EVERYBODY, STOP THIS DISCUSSION. IT IS NOT PRODUCTIVE. IT IS > OFF-TOPICS FOR THIS LIST. > > Anatoly, the next time you bring this up on any list you WILL be > banned. This is your last warning. (But, Antoine, better sit on your > hands than provoke him. :-) Just to make this clear for future reference. Isn't the python-legal-sig created just to handle this stuff? May I at least bring these topics there? It is the last chance for me to become the committer. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pieter at nagel.co.za Thu May 2 18:49:50 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Thu, 02 May 2013 18:49:50 +0200 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() Message-ID: <1367513390.2868.381.camel@basilisk> I propose adding methods like isfile(), isdir(), islink(), isfifo() and so on - basically everything that would currently be done via code like "stat.S_ISREG(s.st_mode)". Please indicate support or not, so I can know whether to draft a PEP and work on implementation. My motivation is twofold: Firstly, it would make code that needs to interpret stat() results using the existing S_ISREG etc. methods in the stat module look cleaner, more Pythonic, and less like C code manipulating bitmasks. 
Secondly, in a recent discussion on python-dev [1] the issue was raised that the stat() call can perform badly under certain situations, and that some form of caching of the result of stat() calls is therefore desirable. This proposal makes it easier to do one form of caching stat() results: the kind where the result is manually cached by storing it in some variable. Think of code such as: if os.path.isfile(f) or os.path.isdir(f): # do something This will indirectly cause two calls to stat(). Currently, if you want to manually cache that stat call, you'll need to write: s = os.stat(f) if stat.S_ISREG(s.st_mode) or stat.S_ISDIR(s.st_mode): # do something This not only looks more convoluted and requires an extra import of stat, but it also looks wildly different from the previous code even though it basically has the same semantics. Under my proposal, this could become: s = os.stat(f) if s.isfile() or s.isdir(): # do something This proposal is independent of the current PEP 428 Path object proposal. However, if accepted, users of PEP 428 Path objects will also benefit, since those can also return results of stat() calls. -- Pieter Nagel From python at mrabarnett.plus.com Thu May 2 20:12:28 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 02 May 2013 19:12:28 +0100 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <1367513390.2868.381.camel@basilisk> References: <1367513390.2868.381.camel@basilisk> Message-ID: <5182AC8C.9020704@mrabarnett.plus.com> On 02/05/2013 17:49, Pieter Nagel wrote: > I propose adding methods like isfile(), isdir(), islink(), isfifo() and > so on - basically everything that would currently be done via code like > "stat.S_ISREG(s.st_mode)". > > Please indicate support or not, so I can know whether to draft a PEP and > work on implementation. > > My motivation is twofold: > > Firstly, it would make code that needs to interpret stat() results using > the existing S_ISREG etc. 
methods in the stat module look cleaner, more > Pythonic, and less like C code manipulating bitmasks. > > Secondly, in a recent discussion on python-dev [1] the issue was raised > that the stat() call can perform badly under certain situations, and > that some form of caching of the result of stat() calls is therefore > desirable. > > This proposal makes it easier to do one form of caching stat() results: > the kind where the result is manually cached by storing it in some > variable. > > Think of code such as: > > if os.path.isfile(f) or os.path.isdir(f): > # do something > > This will indirectly cause two calls to stat(). > > Currently, if you want to manually cache that stat call, you'll need to > write: > > s = os.stat(f) > if stat.S_ISREG(s.st_mode) or stat.S_ISDIR(s.st_mode): > # do something > > This not only looks more convoluted and requires an extra import of > stat, but it also looks wildly different from the previous code even > though it basically has the same semantics. > > Under my proposal, this could become: > > s = os.stat(f) > if s.isfile() or s.isdir(): > # do something > > This proposal is independent of the current PEP 428 Path object > proposal. However, if accepted, users of PEP 428 Path objects will also > benefit, since those can also return results of stat() calls. > > +1 It also means not having to import the stat module to get the strangely-named (to me) constants (why the "S_" prefix? Yes, I do know why, BTW. :-)). 
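The manual-caching pattern from the proposal can be tried today with a small wrapper. `StatResult` here is purely illustrative; the proposal would put these methods on the object `os.stat()` itself returns:

```python
import os
import stat
import tempfile


class StatResult:
    """Illustrative wrapper adding the proposed isfile()/isdir()
    methods to a cached os.stat() result."""

    def __init__(self, st):
        self._st = st  # one stat() call, cached here

    def isfile(self):
        return stat.S_ISREG(self._st.st_mode)

    def isdir(self):
        return stat.S_ISDIR(self._st.st_mode)

    def islink(self):
        return stat.S_ISLNK(self._st.st_mode)


path = tempfile.gettempdir()
s = StatResult(os.stat(path))   # a single stat() instead of two
if s.isfile() or s.isdir():
    print(path, 'is a regular file or a directory')
```

The call sites read like the familiar `os.path` tests, but only one `stat()` system call is made no matter how many predicates are checked.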
From mertz at gnosis.cx Thu May 2 20:14:37 2013 From: mertz at gnosis.cx (David Mertz) Date: Thu, 2 May 2013 11:14:37 -0700 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <5182AC8C.9020704@mrabarnett.plus.com> References: <1367513390.2868.381.camel@basilisk> <5182AC8C.9020704@mrabarnett.plus.com> Message-ID: +1 On May 2, 2013 11:13 AM, "MRAB" wrote: > On 02/05/2013 17:49, Pieter Nagel wrote: > >> I propose adding methods like isfile(), isdir(), islink(), isfifo() and >> so on - basically everything that would currently be done via code like >> "stat.S_ISREG(s.st_mode)". >> >> Please indicate support or not, so I can know whether to draft a PEP and >> work on implementation. >> >> My motivation is twofold: >> >> Firstly, it would make code that needs to interpret stat() results using >> the existing S_ISREG etc. methods in the stat module look cleaner, more >> Pythonic, and less like C code manipulating bitmasks. >> >> Secondly, in a recent discussion on python-dev [1] the issue was raised >> that the stat() call can perform badly under certain situations, and >> that some form of caching of the result of stat() calls is therefore >> desirable. >> >> This proposal makes it easier to do one form of caching stat() results: >> the kind where the result is manually cached by storing it in some >> variable. >> >> Think of code such as: >> >> if os.path.isfile(f) or os.path.isdir(f): >> # do something >> >> This will indirectly cause two calls to stat(). >> >> Currently, if you want to manually cache that stat call, you'll need to >> write: >> >> s = os.stat(f) >> if stat.S_ISREG(s.st_mode) or stat.S_ISDIR(s.st_mode): >> # do something >> >> This not only looks more convoluted and requires an extra import of >> stat, but it also looks wildly different from the previous code even >> though it basically has the same semantics. 
>> >> Under my proposal, this could become: >> >> s = os.stat(f) >> if s.isfile() or s.isdir(): >> # do something >> >> This proposal is independent of the current PEP 428 Path object >> proposal. However, if accepted, users of PEP 428 Path objects will also >> benefit, since those can also return results of stat() calls. >> >> >> +1 > > It also means not having to import the stat module to get the > strangely-named (to me) constants (why the "S_" prefix? Yes, I do know > why, BTW. :-)). > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu May 2 21:07:28 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 2 May 2013 20:07:28 +0100 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <1367513390.2868.381.camel@basilisk> References: <1367513390.2868.381.camel@basilisk> Message-ID: On 2 May 2013 17:49, Pieter Nagel wrote: > I propose adding methods like isfile(), isdir(), islink(), isfifo() and > so on - basically everything that would currently be done via code like > "stat.S_ISREG(s.st_mode)". > > Please indicate support or not, so I can know whether to draft a PEP and > work on implementation. > +1 for all the reasons you mention. I would never think of using stat.S_ISREG(s.st_mode) - it looks too low level. But s.isfile() looks completely obvious. Paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From andrew.svetlov at gmail.com Fri May 3 00:38:31 2013 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Fri, 3 May 2013 01:38:31 +0300 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: References: <1367513390.2868.381.camel@basilisk> Message-ID: +1 On Thu, May 2, 2013 at 10:07 PM, Paul Moore wrote: > > On 2 May 2013 17:49, Pieter Nagel wrote: >> >> I propose adding methods like isfile(), isdir(), islink(), isfifo() and >> so on - basically everything that would currently be done via code like >> "stat.S_ISREG(s.st_mode)". >> >> Please indicate support or not, so I can know whether to draft a PEP and >> work on implementation. > > > +1 for all the reasons you mention. I would never think of using > stat.S_ISREG(s.st_mode) - it looks too low level. But s.isfile() looks > completely obvious. > > Paul > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Thanks, Andrew Svetlov From ncoghlan at gmail.com Fri May 3 01:00:16 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 May 2013 09:00:16 +1000 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: References: <1367513390.2868.381.camel@basilisk> Message-ID: +1 here, too. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christian at python.org Fri May 3 01:48:18 2013 From: christian at python.org (Christian Heimes) Date: Fri, 03 May 2013 01:48:18 +0200 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <1367513390.2868.381.camel@basilisk> References: <1367513390.2868.381.camel@basilisk> Message-ID: <5182FB42.6020808@python.org> Am 02.05.2013 18:49, schrieb Pieter Nagel: > Currently, if you want to manually cache that stat call, you'll need to > write: > > s = os.stat(f) > if stat.S_ISREG(s.st_mode) or stat.S_ISDIR(s.st_mode): > # do something > > This not only looks more convoluted and requires an extra import of > stat, but it also looks wildly different from the previous code even > though it basically has the same semantics. > > Under my proposal, this could become: > > s = os.stat(f) > if s.isfile() or s.isdir(): > # do something > > This proposal is independent of the current PEP 428 Path object > proposal. However, if accepted, users of PEP 428 Path objects will also > benefit, since those can also return results of stat() calls. Hi Pieter, I like your proposal. We could take the opportunity now and push the proposal one or two steps further. First step: drop the function call stat_result.isfile() or stat_result.isdir() don't have to be functions. The feature can also be implemented with properties, e.g. stat_result.is_file. Or can somebody think of a reason why they have to be callables anymore? Second step: get file type as string A property stat_result.file_type that returns the type of the file as string makes checks like "s.is_dir or s.is_file" even easier: s = os.stat(f) if s.file_type in {'reg', 'dir'}: do_something() We have to agree on a set of names, though. IMHO the abbreviations from stat.h are clear and distinct: {'fifo', 'chr', 'dir', 'blk', 'reg', 'lnk', 'sock', 'door', 'port'}. door and port are special file types on Solaris. 
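Christian's two refinements, properties instead of methods plus a `file_type` string, can be mocked up the same way. The class and helper names here are illustrative only, and the type-name set is the POSIX-style one proposed above:

```python
import os
import stat
import tempfile

# Ordered (name, predicate) pairs for the proposed short type names.
_TYPE_CHECKS = (
    ('reg', stat.S_ISREG), ('dir', stat.S_ISDIR), ('lnk', stat.S_ISLNK),
    ('fifo', stat.S_ISFIFO), ('sock', stat.S_ISSOCK),
    ('chr', stat.S_ISCHR), ('blk', stat.S_ISBLK),
)


class StatInfo:
    """Illustrative property-based variant of the proposal."""

    def __init__(self, st):
        self._st = st

    @property
    def is_file(self):
        return stat.S_ISREG(self._st.st_mode)

    @property
    def is_dir(self):
        return stat.S_ISDIR(self._st.st_mode)

    @property
    def file_type(self):
        for name, check in _TYPE_CHECKS:
            if check(self._st.st_mode):
                return name
        return 'unknown'


s = StatInfo(os.stat(tempfile.gettempdir()))
if s.file_type in {'reg', 'dir'}:
    print('temp dir classified as:', s.file_type)
```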
Christian From eric at trueblade.com Fri May 3 01:53:19 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 02 May 2013 19:53:19 -0400 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <5182FB42.6020808@python.org> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> Message-ID: <5182FC6F.8020307@trueblade.com> On 5/2/2013 7:48 PM, Christian Heimes wrote: > Second step: get file type as string > > A property stat_result.file_type that returns the type of the file as > string makes checks like "s.is_dir or s.is_file" even easier: > > s = os.stat(f) > if s.file_type in {'reg', 'dir'}: > do_something() > > We have to agree on a set of names, though. IMHO the abbreviations from > stat.h are clear and distinct: {'fifo', 'chr', 'dir', 'blk', 'reg', > 'lnk', 'sock', 'door', 'port'}. door and port are special file types on > Solaris. Seems like a use case for a flag-based enum! -- Eric. From abarnert at yahoo.com Fri May 3 02:09:25 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 2 May 2013 17:09:25 -0700 (PDT) Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <5182FB42.6020808@python.org> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> Message-ID: <1367539765.17087.YahooMailNeo@web184703.mail.ne1.yahoo.com> From: Christian Heimes Sent: Thursday, May 2, 2013 4:48 PM > Am 02.05.2013 18:49, schrieb Pieter Nagel: >> Under my proposal, this could become: >> >> s = os.stat(f) >> if s.isfile() or s.isdir(): >> # do something +1 > First step: drop the function call > > stat_result.isfile() or stat_result.isdir() don't have to be functions. > The feature can also be implemented with properties, e.g. > stat_result.is_file. Or can somebody think of a reason why they have to > be callables anymore? Well, there's the fact that os.path.isfile is a callable.
And I've actually seen code that uses isfile in a filter call, and operator.attrgetter('is_file') obviously isn't as nice. But then a genexp is probably nicer than filter here anyway. So, two very trivial downsides. I guess +0. > Second step: get file type as string > > A property stat_result.file_type that returns the type of the file as > string makes checks like "s.is_dir or s.is_file" even easier: > > s = os.stat(f) > if s.file_type in {'reg', 'dir'}: > do_something() If this is _in addition to_ the methods/attributes, +1. If it's in place of them, -1. There are cases where this will be simpler, but for the most common case, s.isdir is much nicer than s.file_type == 'dir'. > We have to agree on a set of names, though. IMHO the abbreviations from > stat.h are clear and distinct: {'fifo', 'chr', 'dir', > 'blk', 'reg', > 'lnk', 'sock', 'door', 'port'}. door and port > are special file types on > Solaris. This one's actually a problem. If os.path.isfile(name) is true, and so is s.isfile, but s.file_type=='file' is false, that's going to be confusing. Especially to novices and Windows programmers, the very people who write code like os.path.islink(f) or os.path.isdir(x) today because they're afraid of the stat module, who we're trying to help here. In fact, I suspect that, even after they learn that it's "reg" rather than "file", they're going to have a hard time remembering it. But calling it 'file' is confusing to everyone who _does_ know stat. Anything you can call stat on is a file. And I don't know of a good answer here.
From abarnert at yahoo.com Fri May 3 02:20:37 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 2 May 2013 17:20:37 -0700 (PDT) Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <5182FB42.6020808@python.org> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> Message-ID: <1367540437.8073.YahooMailNeo@web184701.mail.ne1.yahoo.com> From: Christian Heimes Sent: Thursday, May 2, 2013 4:48 PM Also: > We have to agree on a set of names, though. IMHO the abbreviations from > stat.h are clear and distinct: {'fifo', 'chr', 'dir', > 'blk', 'reg', > 'lnk', 'sock', 'door', 'port'}. door and port > are special file types on > Solaris. Does Python have stat.S_ISDOOR on Solaris? (It doesn't on other POSIX systems, and it's not mentioned in the docs.) Meanwhile, if we're going to add non-standard platform-specific flags, these aren't the only two. Mac and most other *BSD have WHT. (I believe recent linux/glibc doesn't expose it anymore, because it's treated as internal to certain unionfs implementations?) POSIX 1.b also defines MQ, SEM, and SHM (although these aren't required to be stored inside the S_IFMT bits of mode). 
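Which of these platform-specific predicates a given Python actually exposes is easy to check at runtime. This probe simply lists what the local stat module defines; the output varies by platform and Python version, so only the universal POSIX predicates can be relied on:

```python
import stat

# Collect every callable S_IS* predicate the local stat module defines.
# The callable() filter skips the S_ISUID/S_ISGID/S_ISVTX permission
# bits, which share the S_IS prefix but are plain integer constants.
available = sorted(name for name in dir(stat)
                   if name.startswith('S_IS')
                   and callable(getattr(stat, name)))
print(available)
```

Whether names like `S_ISDOOR`, `S_ISPORT`, or `S_ISWHT` show up in that list is exactly the portability question being raised here.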
From python at mrabarnett.plus.com Fri May 3 03:56:56 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 03 May 2013 02:56:56 +0100 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <1367539765.17087.YahooMailNeo@web184703.mail.ne1.yahoo.com> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367539765.17087.YahooMailNeo@web184703.mail.ne1.yahoo.com> Message-ID: <51831968.80300@mrabarnett.plus.com> On 03/05/2013 01:09, Andrew Barnert wrote: > From: Christian Heimes > > Sent: Thursday, May 2, 2013 4:48 PM > > >> Am 02.05.2013 18:49, schrieb Pieter Nagel: >>> Under my proposal, this could become: >>> >>> s = os.stat(f) >>> if s.isfile() or s.isdir(): >>> # do something > > > +1 > >> First step: drop the function call >> >> stat_result.isfile() or stat_result.isdir() don't have to be functions. >> The feature can also be implemented with properties, e.g. >> stat_result.is_file. Or can somebody think of a reason why they have to >> be callables anymore? > > > Well, there's the fact that os.path.isfile is a callable. > True. > And I've actually seen code that uses isfile in a filter call, and operator.attrgettr('is_file') obviously isn't as nice. But then a genexp is probably nicer than filter here anyway. > > So, two very trivial downsides. I guess +0. > >> Second step: get file type as string >> >> A property stat_result.file_type that returns the type of the file as >> string makes checks like "s.is_dir or s.is_file" even easier: >> >> s = os.stat(f) >> if s.file_type in {'reg', 'dir'}: >> do_something() > > If this is _in addition to_ the methods/attributes, +1. > > If it's in place of them, -1. There are cases where this will be simpler, but for the most common case, s.isdir is much nicer than s.file_type == 'dir'. > >> We have to agree on a set of names, though. 
IMHO the abbreviations from >> stat.h are clear and distinct: {'fifo', 'chr', 'dir', >> 'blk', 'reg', >> 'lnk', 'sock', 'door', 'port'}. door and port >> are special file types on >> Solaris. > > > This one's actually a problem. > > If os.path.isfile(name) is true, and so is s.isfile, but s.file_type=='file' is false, that's going to be confusing. Especially to novices and Windows programmers?the very people who write code like os.path.islink(f) or os.path.isdir(x) today because they're afraid of the stat module, who we're trying to help here. In fact, I suspect that, even after they learn that it's "reg" rather than "file", they're going to have a hard time remembering it. > It wouldn't be """s.file_type=='file'""", but """'file' in s.file_type""". And I agree about 'reg'. > But calling it 'file' is confusing to everyone who _does_ know stat. Anything you can call stat on is a file. > ...even if os.path.isfile(...) says it isn't. > And I don't know of a good answer here. > Maybe "file_type" is the wrong name for it. From pieter at nagel.co.za Fri May 3 06:55:10 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Fri, 03 May 2013 06:55:10 +0200 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <5182FB42.6020808@python.org> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> Message-ID: <1367556910.2868.415.camel@basilisk> On Fri, 2013-05-03 at 01:48 +0200, Christian Heimes wrote: > stat_result.isfile() or stat_result.isdir() don't have to be functions. > The feature can also be implemented with properties, e.g. > stat_result.is_file. Or can somebody think of a reason why they have to > be callables anymore? I lean towards keeping it a function call for symmetry with os.path.isfile() and friends. > s = os.stat(f) > if s.file_type in {'reg', 'dir'}: > do_something() If something like this were to be done, I wouldn't like doing it with magic string constants. 
I agree that the new enums would be better to do this with. This also raises the issue of whether, if there is a file type enumeration on the stat() result, whether there should be a symmetric os.path.file_type(f) call added. But I'll remain open to these kinds of discussions as the PEP is discussed, It seems there's enough support for the basic principle for me to go and work on the PEP. -- Pieter Nagel From pieter at nagel.co.za Fri May 3 07:22:26 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Fri, 03 May 2013 07:22:26 +0200 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <1367540437.8073.YahooMailNeo@web184701.mail.ne1.yahoo.com> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367540437.8073.YahooMailNeo@web184701.mail.ne1.yahoo.com> Message-ID: <1367558546.2868.419.camel@basilisk> On Thu, 2013-05-02 at 17:20 -0700, Andrew Barnert wrote: > Does Python have stat.S_ISDOOR on Solaris? (It doesn't on other POSIX systems, and it's not mentioned in the docs.) In principle I'm all for looking at missing platform-specific stat flags while that region of the stdlib is being worked on. In practice, though, I only have access to Linux when it comes to implementing this. Support for other platforms will most likely depend on the availability of volunteers when it comes to implementation. -- Pieter Nagel From cf.natali at gmail.com Fri May 3 08:56:37 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 3 May 2013 08:56:37 +0200 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <5182FB42.6020808@python.org> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> Message-ID: > First step: drop the function call > > stat_result.isfile() or stat_result.isdir() don't have to be functions. > The feature can also be implemented with properties, e.g. 
> stat_result.is_file. Or can somebody think of a reason why they have to > be callables anymore? Callables sound more consistent. > Second step: get file type as string > > A property stat_result.file_type that returns the type of the file as > string makes checks like "s.is_dir or s.is_file" even easier: > > s = os.stat(f) > if s.file_type in {'reg', 'dir'}: > do_something() Strings shouldn't be used for anything except text. It defeats the typing system, prevents static check, offers poor performance, etc. This kind of attribute should ideally be an enum ;-) Note that you have to be careful when changing os.stat() return type: we absolutely don't want to break backward compatibility: for example, the returned object should look like a tuple (among other things, support indexing). From christian at python.org Fri May 3 13:57:12 2013 From: christian at python.org (Christian Heimes) Date: Fri, 03 May 2013 13:57:12 +0200 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <1367558546.2868.419.camel@basilisk> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367540437.8073.YahooMailNeo@web184701.mail.ne1.yahoo.com> <1367558546.2868.419.camel@basilisk> Message-ID: <5183A618.7070005@python.org> Am 03.05.2013 07:22, schrieb Pieter Nagel: > On Thu, 2013-05-02 at 17:20 -0700, Andrew Barnert wrote: > >> Does Python have stat.S_ISDOOR on Solaris? (It doesn't on other POSIX systems, and it's not mentioned in the docs.) > > In principle I'm all for looking at missing platform-specific stat flags > while that region of the stdlib is being worked on. > > In practice, though, I only have access to Linux when it comes to > implementing this. Support for other platforms will most likely depend > on the availability of volunteers when it comes to implementation. You can ask Trent Nelson for snakebite.net access. He has lots important operation systems in his setup. 
I can also help you if you need information or testing.

So far I was able to identify this set of file types:

S_ISDIR()
S_ISCHR()
S_ISBLK()
S_ISREG()
S_ISLNK()
S_ISSOCK()
S_ISFIFO()

# Solaris
S_ISDOOR()
S_ISPORT()

# POSIX 1.b real-time extension
S_ISMSG()
S_ISSEM()
S_ISSHM()

# whiteout, translucent file systems
S_ISWHT

From christian at python.org Fri May 3 14:10:52 2013
From: christian at python.org (Christian Heimes)
Date: Fri, 03 May 2013 14:10:52 +0200
Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir()
In-Reply-To: <51831968.80300@mrabarnett.plus.com>
References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367539765.17087.YahooMailNeo@web184703.mail.ne1.yahoo.com> <51831968.80300@mrabarnett.plus.com>
Message-ID: <5183A94C.6010604@python.org>

Am 03.05.2013 03:56, schrieb MRAB:
> It wouldn't be """s.file_type=='file'""", but """'file' in s.file_type""".
>
> And I agree about 'reg'.
>
>> But calling it 'file' is confusing to everyone who _does_ know stat.
>> Anything you can call stat on is a file.
>>
> ...even if os.path.isfile(...) says it isn't.
>
>> And I don't know of a good answer here.
>>
> Maybe "file_type" is the wrong name for it.

It's the POSIX nomenclature and the Plan 9 concept "everything is a file". POSIX calls it "file type" all over the place, e.g. in the documentation of stat's st_mode field. A new term is going to confuse lots of Unix developers.

Windows developers are only used to two kinds of files: regular files and directories. Even symlinks are rarely used on Windows. I agree that Windows developers are going to be confused by the concept of 'reg' or 'regular file'.

Andrew's suggestion of an enum instead of strings has a nice benefit. We can have both concepts if file_types.FILE == file_types.REG.
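Sketched with the enum module, such a file_types concept might look like the following. This is purely illustrative — the names FileType and from_mode are made up here and were not part of any proposal:

```python
import os
import stat
from enum import IntEnum


class FileType(IntEnum):
    """Hypothetical enum of stat() file types (illustrative only)."""
    DIR = stat.S_IFDIR
    CHR = stat.S_IFCHR
    BLK = stat.S_IFBLK
    REG = stat.S_IFREG
    FIFO = stat.S_IFIFO
    LNK = stat.S_IFLNK
    SOCK = stat.S_IFSOCK
    # Duplicate value makes FILE an alias, so
    # FileType.FILE is FileType.REG holds by construction.
    FILE = stat.S_IFREG


def from_mode(mode):
    """Map a stat() st_mode value to a FileType member."""
    # S_IFMT() masks off the permission bits, leaving only the type bits.
    return FileType(stat.S_IFMT(mode))


# Usage sketch:
#     s = os.stat('.')
#     from_mode(s.st_mode) is FileType.DIR
```
Because enum members with equal values are aliases of one canonical member, the Unix spelling (REG) and the Windows-friendly spelling (FILE) can coexist without two distinct concepts.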
Christian

From ronaldoussoren at mac.com Fri May 3 14:12:29 2013
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 3 May 2013 14:12:29 +0200
Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir()
In-Reply-To: <5183A618.7070005@python.org>
References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367540437.8073.YahooMailNeo@web184701.mail.ne1.yahoo.com> <1367558546.2868.419.camel@basilisk> <5183A618.7070005@python.org>
Message-ID:

There also exist macros S_TYPEISMQ, S_TYPEISSEM, S_TYPEISSHM and S_TYPEISTMO; these take a struct stat as their argument and return whether it refers to a message queue, semaphore, shared memory segment or typed memory object (see http://pubs.opengroup.org/onlinepubs/009604599/basedefs/sys/stat.h.html).

I don't know if there are systems where these macros can return a value other than 0 (both OSX and Linux always "return" 0 from these macros).

Ronald

On 3 May, 2013, at 13:57, Christian Heimes wrote:

> Am 03.05.2013 07:22, schrieb Pieter Nagel:
>> On Thu, 2013-05-02 at 17:20 -0700, Andrew Barnert wrote:
>>
>>> Does Python have stat.S_ISDOOR on Solaris? (It doesn't on other POSIX systems, and it's not mentioned in the docs.)
>>
>> In principle I'm all for looking at missing platform-specific stat flags
>> while that region of the stdlib is being worked on.
>>
>> In practice, though, I only have access to Linux when it comes to
>> implementing this. Support for other platforms will most likely depend
>> on the availability of volunteers when it comes to implementation.
>
> You can ask Trent Nelson for snakebite.net access. He has lots of important
> operating systems in his setup. I can also help you if you need
> information or testing.
>
> So far I was able to identify this set of file types:
>
> S_ISDIR()
> S_ISCHR()
> S_ISBLK()
> S_ISREG()
> S_ISLNK()
> S_ISSOCK()
> S_ISFIFO()
>
> # Solaris
> S_ISDOOR()
> S_ISPORT()
>
> # POSIX 1.b real-time extension
> S_ISMSG()
> S_ISSEM()
> S_ISSHM()
>
> # whiteout, translucent file systems
> S_ISWHT
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From christian at python.org Fri May 3 14:27:08 2013
From: christian at python.org (Christian Heimes)
Date: Fri, 03 May 2013 14:27:08 +0200
Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir()
In-Reply-To:
References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367540437.8073.YahooMailNeo@web184701.mail.ne1.yahoo.com> <1367558546.2868.419.camel@basilisk> <5183A618.7070005@python.org>
Message-ID: <5183AD1C.9010604@python.org>

Am 03.05.2013 14:12, schrieb Ronald Oussoren:
> There also exist macros S_TYPEISMQ, S_TYPEISSEM, S_TYPEISSHM and S_TYPEISTMO; these take a struct stat as their argument and return whether it refers to a message queue, semaphore, shared memory segment or typed memory object (see http://pubs.opengroup.org/onlinepubs/009604599/basedefs/sys/stat.h.html).
>
> I don't know if there are systems where these macros can return a value other than 0 (both OSX and Linux always "return" 0 from these macros).

I've checked stat.h on some additional machines. Solaris 11 and AIX 7 have the macros, but they always evaluate to 0. FreeBSD doesn't have the macros at all. I could not find typed memory object macros on any system.

I guess we can safely ignore these file types, as they aren't available on any supported platform.
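The platform survey above suggests how code could cope with predicates that exist only on some systems: look them up optionally instead of assuming them. A hedged sketch — the helper name classify() is invented here, and only the seven POSIX predicates are assumed to exist everywhere:

```python
import os
import stat

# Predicates every platform provides, per POSIX.
_COMMON = ('S_ISDIR', 'S_ISCHR', 'S_ISBLK', 'S_ISREG',
           'S_ISLNK', 'S_ISSOCK', 'S_ISFIFO')
# Platform-specific predicates; the stat module may or may not expose them.
_OPTIONAL = ('S_ISDOOR', 'S_ISPORT', 'S_ISWHT')


def classify(mode):
    """Return the names of all S_IS* predicates matching an st_mode."""
    names = []
    for name in _COMMON + _OPTIONAL:
        func = getattr(stat, name, None)  # optional macro may be absent
        if func is not None and func(mode):
            names.append(name)
    return names


# Usage sketch: classify(os.stat('.').st_mode) matches only S_ISDIR.
```
Feature detection with getattr() keeps the same code working whether or not a given macro exists on the running platform.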
Christian

From pieter at nagel.co.za Fri May 3 17:14:21 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Fri, 03 May 2013 17:14:21 +0200
Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir()
In-Reply-To:
References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org>
Message-ID: <1367594061.2868.451.camel@basilisk>

To all the proponents of a file_type() attribute: can you please show some use-cases for this? I don't want to complicate the PEP just for some speculative nice-to-have.

--
Pieter Nagel

From abarnert at yahoo.com Fri May 3 18:06:33 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Fri, 3 May 2013 09:06:33 -0700
Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir()
In-Reply-To: <1367594061.2868.451.camel@basilisk>
References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367594061.2868.451.camel@basilisk>
Message-ID: <9CA5418D-0206-4D0D-B1E1-DC359AC8E6DF@yahoo.com>

On May 3, 2013, at 8:14, Pieter Nagel wrote:
> To all the proponents of a file_type() attribute: can you please show
> some use-cases for this?

There's one really obvious one for os.walk or similar functions:

types_to_recurse = ('dir', 'link') if follow else ('dir',)
# ...
if s.file_type in types_to_recurse:
    try_to_recurse()

Meanwhile, it strikes me that if you just change this to is_type(types_to_check), it gives a parallel with isinstance and friends, which has two benefits.

First, it means you can handle synonyms. Just like isinstance can handle subclasses/ABC registration/etc. while type() cannot, is_type can handle both 'reg' and 'file' as the same type while file_type cannot.

Second, by following the usual Python pattern for pre-checking, it makes it blindingly obvious that you're violating EAFP, forcing you to think about whether you have a good reason to do so.
In the example above, I do (I don't want to try recursing into symlinks if follow is false, even though it would work), but that may not be true for every use case. > I don't want to complicate the PEP just for some speculative > nice-to-have. > > -- > Pieter Nagel > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From random832 at fastmail.us Fri May 3 19:21:59 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 03 May 2013 13:21:59 -0400 Subject: [Python-ideas] Extend the os.stat() result objects with methods like isfile() and isdir() In-Reply-To: <5183A618.7070005@python.org> References: <1367513390.2868.381.camel@basilisk> <5182FB42.6020808@python.org> <1367540437.8073.YahooMailNeo@web184701.mail.ne1.yahoo.com> <1367558546.2868.419.camel@basilisk> <5183A618.7070005@python.org> Message-ID: <1367601719.12582.140661226202865.076F6A80@webmail.messagingengine.com> On Fri, May 3, 2013, at 7:57, Christian Heimes wrote: > You can ask Trent Nelson for snakebite.net access. He has lots important > operation systems in his setup. I can also help you if you need > information or testing. > > So far I was able to identify this set of file types Heirloom toolchest "ls" supports: http://heirloom.cvs.sourceforge.net/viewvc/heirloom/heirloom/ls/ls.c?revision=1.9&view=markup http://heirloom.cvs.sourceforge.net/viewvc/heirloom/heirloom/ls/ls.1?revision=1.5&view=markup S_IFNWK HP-UX network special file S_IFNAM XENIX special named file S_INSEM XENIX semaphore subtype of IFNAM (looked up from s->rdev) S_INSHD XENIX shared data subtype of IFNAM " " " " Of these, GNU coreutils ls only supports doors and whiteouts. 
Chasing after a random hunch (something about AIX), I found these: http://cd.textfiles.com/transameritech2/EXTRAS/JOVE-4.6/ASK.C S_ISHIDDEN Hidden Directory [aix] S_ISCDF Context Dependent Files [hpux] S_ISNWK Network Special [hpux] http://lists.gnu.org/archive/html/bug-gnulib/2012-12/msg00084.html S_ISMPX AIX "MPX" file (multiplex device?) https://github.com/gagern/gnulib/blob/master/tests/test-sys_stat.c has a massive pile of macros with no comments S_ISCTG S_ISMPB S_ISMPX S_ISNAM S_ISNWK S_ISOFD S_ISOFL S_ISPORT http://lists.gnu.org/archive/html/bug-gnulib/2004-08/msg00017.html S_ISOFD Cray DMF (data migration facility): off line, with data S_ISOFL Cray DMF (data migration facility): off line, with no data S_ISCTG Contiguous (It's possible that these may not be file types) http://doiso.googlecode.com/svn/trunk/Source/mkisofs-1.12b5/include/statdefs.h S_ISMPC UNUSED multiplexed c S_ISNAM Named file (XENIX) S_ISMPB UNUSED multiplexed b S_ISCNT Contiguous file S_ISSHAD Solaris shadow inode http://www.opensource.apple.com/source/gnutar/gnutar-450/gnutar/lib/sys_stat_.h S_ISMPB /* V7 */ S_ISPORT /* Solaris 10 and up */ S_TYPEISSEM S_TYPEISSHM - macros to check the XENIX IFNAM types mentioned above S_TYPEISMQ S_TYPEISTMO From jbvsmo at gmail.com Sat May 4 19:47:22 2013 From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=) Date: Sat, 4 May 2013 14:47:22 -0300 Subject: [Python-ideas] global and nonlocal with atributes Message-ID: Hi, I couldn't find whether this was proposed or not, but seems interesting to me: Whenever there's an assignment inside a big function, we think it is a local variable, but it could have been defined as global on top and that can mess things up if not checked. So, I why not have attribute syntax on "global" and "nonlocal" keywords? 
Something that is written like:

x = 10
def increment():
    global x
    x += 1

could be replaced by

x = 10
def increment():
    global.x += 1

another example:

def foo(x):
    def bar():
        return nonlocal.x + 1
    return bar

These should generate the same "LOAD_GLOBAL" and "LOAD_DEREF" bytecodes, so "global" and "nonlocal" will not be objects nor passed as arguments.

That way you can have both "global.x", "nonlocal.x" and "x" variables without conflict.

x = 10
def f(x):
    def g():
        # how can I set both "x" variables inside here?

"Namespaces are one honking great idea -- let's do more of those!"

-- João Bernardo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ubershmekel at gmail.com Sat May 4 20:03:33 2013
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Sat, 4 May 2013 21:03:33 +0300
Subject: [Python-ideas] global and nonlocal with atributes
In-Reply-To:
References:
Message-ID:

On 4 May 2013 19:48, "João Bernardo" wrote:
> These should generate the same "LOAD_GLOBAL" and "LOAD_DEREF" bytecodes, so "global" and "nonlocal" will not be objects nor passed as arguments.
>
> That way you can have both "global.x", "nonlocal.x" and "x" variables without conflict.
>

I think the best language would work like that. But having more than one way to do it is bad for your health. So perhaps this can be the new way to do it in Python 4. I'd also expect "global" to be iterable, i.e. replacing "globals()".

Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu Sun May 5 07:36:45 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Sun, 05 May 2013 01:36:45 -0400
Subject: [Python-ideas] global and nonlocal with atributes
In-Reply-To:
References:
Message-ID:

On 5/4/2013 1:47 PM, João Bernardo wrote:
> Hi,
> I couldn't find whether this was proposed or not, but seems interesting
> to me:
>
> Whenever there's an assignment inside a big function,

If a function is 'too big', it can be split.
Nested functions that rebind a nonlocal name (closures), which is more often sensible than rebinding global names, are and should usually be short enough already.

> we think it is a
> local variable, but it could have been defined as global on top and that
> can mess things up if not checked.
>
> So, I why not have attribute syntax on "global" and "nonlocal" keywords?

Because attributes are attributes of objects, and keywords are not objects.

> Something that is written like:
>
> x = 10
> def increment():
>     global x
>     x += 1
>
> could be replaced by
>
> x = 10
> def increment():
>     global.x += 1

One can already do something similar:

g = globals()
...
g['x'] += 1

--
Terry Jan Reedy

From steve at pearwood.info Sun May 5 08:41:02 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 05 May 2013 16:41:02 +1000
Subject: [Python-ideas] global and nonlocal with atributes
In-Reply-To:
References:
Message-ID: <5185FEFE.6030000@pearwood.info>

On 05/05/13 03:47, João Bernardo wrote:
> Hi,
> I couldn't find whether this was proposed or not, but seems interesting to
> me:
>
> Whenever there's an assignment inside a big function, we think it is a
> local variable, but it could have been defined as global on top and that
> can mess things up if not checked.

Are you suggesting that checking the top of the function for a global declaration is so difficult that Python needs special syntax to fix that?

Your description of the problem that this new syntax will fix is based on two design flaws:

* it's a function using global variables;

* it's a BIG function.

Both of which are signs that the function should be re-written, not that the language needs a band-aid to fix some of the pain from poor-quality code. A well-designed language should encourage good code. Both global variables and large monolithic functions are signs of potentially bad code.

> So, I why not have attribute syntax on "global" and "nonlocal" keywords?
[...]
> x = 10
> def increment():
>     global.x += 1

As Python exists today, this would be a big change for very little benefit. global is a keyword, not an object, so the language would need to allow *some* keywords to be used as if they were objects, but not all keywords:

y = global.x + 1  # allowed

y = for.x + 1  # not allowed

Even though here global looks like it is an object, it isn't, since this will cause a SyntaxError:

y = global  # global what?

Of course, global must remain a keyword, for backwards compatibility with code that writes things the old way.

However, I can see some merit in your suggestion, for Python 4000 when backwards compatibility is no longer an issue. Get rid of the global and nonlocal declarations, and instead introduce two, or possibly three, reserved names, similar to None:

globals
nonlocals
builtins

each of which is a true object. Now globals.x is equivalent to globals()['x'], and there is no need for a globals() function or a global keyword.

Obviously I have not thought this through in any great detail. For instance, there is only a single None object in the entire Python application, but every module will need its own globals object, and every nested function its own nonlocals object. Maybe this idea is only superficially interesting and no good in practice. But we've got plenty of time to think about it.

[...]
> That way you can have both "global.x", "nonlocal.x" and "x" variables
> without conflict.
>
> x = 10
> def f(x):
>     def g():
>         # how can I set both "x" variables inside here?

You can't, and you shouldn't need to. How hard is it for you to use a different name for the global and the local? There are billions of possible names to choose from, why use the same name for both? Again, your proposal encourages poorly written code. Good language features should encourage good code, not bad code. It's actually a good thing that using global variables is slightly inconvenient, as that helps discourage people from using them.
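The globals.x-is-globals()['x'] equivalence discussed here can already be prototyped in today's Python with a small proxy over a namespace dict, no new syntax required. The class name GlobalsProxy is invented for illustration; this is a sketch, not a proposed API:

```python
class GlobalsProxy:
    """Sketch: attribute access over a namespace dict such as globals()."""

    def __init__(self, namespace):
        # Bypass our own __setattr__ so '_ns' lands on the instance itself.
        object.__setattr__(self, '_ns', namespace)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so '_ns' is safe.
        try:
            return self._ns[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Every assignment goes straight into the wrapped namespace.
        self._ns[name] = value


# Usage sketch, at module level:
#     g = GlobalsProxy(globals())
#     g.x += 1      # same effect as: global x; x += 1
```
Wrapping globals() this way gives the attribute spelling without touching the language; the same class works over any dict-like namespace.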
-- Steven From rosuav at gmail.com Sun May 5 10:46:06 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 5 May 2013 18:46:06 +1000 Subject: [Python-ideas] global and nonlocal with atributes In-Reply-To: <5185FEFE.6030000@pearwood.info> References: <5185FEFE.6030000@pearwood.info> Message-ID: On Sun, May 5, 2013 at 4:41 PM, Steven D'Aprano wrote: >> That way you can have both "global.x", "nonlocal.x" and "x" variables >> without conflict. >> >> x = 10 >> def f(x): >> def g(): >> # how can I set both "x" variables inside here? > > > You can't, and you shouldn't need to. How hard is it for you to use a > different name for the global and the local? There are billions of possible > names to choose from, why use the same name for both? The conflict already exists in another form, and with the same solution: >>> def foo(): list=[1,2,3] # ... return __builtins__.list() >>> foo() [] We're allowed to shadow builtins, and what's more, there's a way to explicitly call up the builtin even after it's been shadowed. If nothing else, it allows a measure of simplicity with common names like 'id' - instead of having to pick a different name, you just go right ahead and use the obvious one, knowing that the builtin _is_ retrievable. With globals, yes it's possible to reference them via globals()["some_name"], but that looks ugly. Would the OP's proposal look better if, instead of "globals.x", it were "__globals__.x"? Or possibly "__module__.x", that being a magic word that references your current module, whatever-it-may-be? ChrisA From python at 2sn.net Sun May 5 12:40:02 2013 From: python at 2sn.net (Alexander Heger) Date: Sun, 5 May 2013 20:40:02 +1000 Subject: [Python-ideas] global and nonlocal with atributes In-Reply-To: References: <5185FEFE.6030000@pearwood.info> Message-ID: > We're allowed to shadow builtins, and what's more, there's a way to > explicitly call up the builtin even after it's been shadowed. 
If > nothing else, it allows a measure of simplicity with common names like > 'id' - instead of having to pick a different name, you just go right > ahead and use the obvious one, knowing that the builtin _is_ > retrievable. With globals, yes it's possible to reference them via > globals()["some_name"], but that looks ugly. Would the OP's proposal > look better if, instead of "globals.x", it were "__globals__.x"? Or > possibly "__module__.x", that being a magic word that references your > current module, whatever-it-may-be? I suppose this way you could also go several levels up as well __nonlocal__.__nonlocal__.x -Alexander From steve at pearwood.info Sun May 5 13:10:29 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 05 May 2013 21:10:29 +1000 Subject: [Python-ideas] global and nonlocal with atributes In-Reply-To: References: <5185FEFE.6030000@pearwood.info> Message-ID: <51863E25.3030800@pearwood.info> On 05/05/13 18:46, Chris Angelico wrote: > On Sun, May 5, 2013 at 4:41 PM, Steven D'Aprano wrote: >>> That way you can have both "global.x", "nonlocal.x" and "x" variables >>> without conflict. >>> >>> x = 10 >>> def f(x): >>> def g(): >>> # how can I set both "x" variables inside here? >> >> >> You can't, and you shouldn't need to. How hard is it for you to use a >> different name for the global and the local? There are billions of possible >> names to choose from, why use the same name for both? > > The conflict already exists in another form, and with the same solution: > >>>> def foo(): > list=[1,2,3] > # ... > return __builtins__.list() > >>>> foo() > [] > > We're allowed to shadow builtins, and what's more, there's a way to > explicitly call up the builtin even after it's been shadowed. If > nothing else, it allows a measure of simplicity with common names like > 'id' - instead of having to pick a different name, you just go right > ahead and use the obvious one, knowing that the builtin _is_ > retrievable. 
With globals, yes it's possible to reference them via > globals()["some_name"], but that looks ugly. Shadowing is a real issue, but any solution must be reasonable. Adding special magic syntax for some, but not all, keywords to act almost, but not quite, like an object is a million lightyears beyond reasonable when there are already two perfectly good solutions to the shadowing problem: - pick a different name; - use globals()['some_name'] Only the first applies to non-locals, of course, but even so, "pick a different name" remains the best solution to this problem. Of course you are right, we are allowed to shadow builtins, or anything else. But the usual "consenting adults" disclaimer applies: - shadow things only when you need to (although the barrier for "need" can be quite low); - provided doing so doesn't cause you pain; - if it does cause you pain, then the old doctor's advice still stands: "Doc, it hurts when I do this." "Then don't do that!" - And if you absolutely insist on shooting yourself in the foot, then Python provides some tools for recovering from your self-inflicted wounds (in this case, doing a lookup on globals()) without promising to go to extraordinary efforts to disguise the fact that your code is bad and you should feel bad. If you are deliberately shadowing names *that you need*, then you are doing something dumb, and it is not Python's responsibility to make dumb things less painful. Magic syntax as suggested counts as extraordinary effort. It spoils the nice clean design of the language. "Everything is an object" becomes "everything is an object, except for global and nonlocal". How far do we want the global keyword to simulate being an object? For example, what happens if I do this? print(global) Can I do this? f = global.__getitem__ f("x") # like global.x Is global.__dict__ another way to spell globals()? Can I pass global as an argument to functions? If I can do this: eval(expr, global) then why can't I do this? 
x = global

but in that case, what is type(x), given that global is not actually an object?

Note that for backwards compatibility, we cannot just make global a reserved name, like None, that refers to the global namespace. Doing this might be good Python 4000 territory. But for Python 3.x, this proposal adds complexity and complication to the language to solve no problem that actually needs to be solved.

> Would the OP's proposal
> look better if, instead of "globals.x", it were "__globals__.x"? Or
> possibly "__module__.x", that being a magic word that references your
> current module, whatever-it-may-be?

Whether you spell it globals() or __globals__ or __module__, the OP's suggestion isn't to add another name for the current global namespace. That would merely be redundant. We already have at least three ways to do the same thing:

global x; x
globals()['x']
eval('x', globals())

If somebody wants a fourth way, using attribute access instead of key lookup, I'm sure they could write a proxy to globals() that works fine. Put it on ActiveState, and if it gets lots of interest, then maybe there would be a case for adding it as standard.

But the OP's proposal is specifically to allow the global and nonlocal keywords to appear in places where currently expressions can appear, as well as still appearing as statements, which implies that they behave almost but not quite like objects without actually being objects. That's the sort of nonsense that you get in PHP, where there are things that look like functions or operators, like list() and (int), but are actually magic handled by the parser.
http://me.veekun.com/blog/2012/04/09/php-a-fractal-of-bad-design/ -- Steven From rosuav at gmail.com Sun May 5 13:32:13 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 5 May 2013 21:32:13 +1000 Subject: [Python-ideas] global and nonlocal with atributes In-Reply-To: <51863E25.3030800@pearwood.info> References: <5185FEFE.6030000@pearwood.info> <51863E25.3030800@pearwood.info> Message-ID: On Sun, May 5, 2013 at 9:10 PM, Steven D'Aprano wrote: > But the OP's proposal is specifically to allow the global and nonlocal > keywords to appear in places where currently expressions can appear, as well > as still appearing as statements, which implies that they behave almost but > not quite like objects without actually being objects. That's the sort of > nonsense that you get in PHP, where there are things that look like > functions or operators, like list() and (int), but are actually magic > handled by the parser. Fair enough. And I absolutely agree about the PHP mess, that's not something to desire by any means. Maybe my modified suggestion has an alternative use, though. Is there value in having a token that refers to the current module? Every language has a way for class methods/instance methods/whatever you call them/etc to reference the current object instance, even if attribute lookup is implicit - eg C++ with 'this': foo &foo::do_something() { abcd += 3; return *this; } Needing to reference your own module isn't common, but it would allow you to look up your own constants via calculated names, for instance, so maybe there's value in it. Or maybe not, I dunno. 
ChrisA From ncoghlan at gmail.com Sun May 5 13:39:50 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 5 May 2013 21:39:50 +1000 Subject: [Python-ideas] global and nonlocal with atributes In-Reply-To: References: <5185FEFE.6030000@pearwood.info> <51863E25.3030800@pearwood.info> Message-ID: On Sun, May 5, 2013 at 9:32 PM, Chris Angelico wrote: > On Sun, May 5, 2013 at 9:10 PM, Steven D'Aprano wrote: >> But the OP's proposal is specifically to allow the global and nonlocal >> keywords to appear in places where currently expressions can appear, as well >> as still appearing as statements, which implies that they behave almost but >> not quite like objects without actually being objects. That's the sort of >> nonsense that you get in PHP, where there are things that look like >> functions or operators, like list() and (int), but are actually magic >> handled by the parser. > > Fair enough. And I absolutely agree about the PHP mess, that's not > something to desire by any means. > > Maybe my modified suggestion has an alternative use, though. Is there > value in having a token that refers to the current module? Every > language has a way for class methods/instance methods/whatever you > call them/etc to reference the current object instance, even if > attribute lookup is implicit - eg C++ with 'this': > > foo &foo::do_something() > { > abcd += 3; > return *this; > } > > Needing to reference your own module isn't common, but it would allow > you to look up your own constants via calculated names, for instance, > so maybe there's value in it. Or maybe not, I dunno. It's easy to do already if you want it: import sys thismod = sys.modules[__name__] And yes, after doing that at the module level, you can then use "thismod.x" instead of "global x; x" or "globals()['x']" In general though, wanting to do this kind of thing is a sign that you have a class or closure waiting to be defined somewhere in your code. 
Some things become painful because they indicate a structural problem in the affected code, and this is one of those cases.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jbvsmo at gmail.com Sun May 5 17:44:23 2013
From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=)
Date: Sun, 5 May 2013 12:44:23 -0300
Subject: [Python-ideas] global and nonlocal with atributes
In-Reply-To:
References:
Message-ID:

(posting to the list)

>> Whenever there's an assignment inside a big function,
>
> If a function is 'too big', it can be split. Nested functions that rebind
> a nonlocal name (closures), which is more often sensible than rebinding
> global names, are and should usually be short enough already.

Programming is also about reading other people's ugly code.

>> So, I why not have attribute syntax on "global" and "nonlocal" keywords?
>
> Because attributes are attributes of objects and keywords are not objects

Read again the part where I said: 'so "global" and "nonlocal" will not be objects'.

BTW, some other keywords do multiple jobs, like "if/else", "yield" and "super". Somehow, "super" is not in the list of Python keywords, but it alters the environment where it is inserted in a way that no other variable can. Try doing "mega = super" and initializing a class with "mega().__init__()". I still think "global" and "nonlocal" shouldn't become objects, though.

> One can already do something similar
> g = globals()
> ...
> g['x'] += 1

This won't work with "locals". Also, there's not a way to access "nonlocal" variables.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From jbvsmo at gmail.com Sun May 5 17:45:46 2013 From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=) Date: Sun, 5 May 2013 12:45:46 -0300 Subject: [Python-ideas] global and nonlocal with atributes In-Reply-To: <5185FEFE.6030000@pearwood.info> References: <5185FEFE.6030000@pearwood.info> Message-ID: (posting to the list) Your description of the problem that this new syntax will fix is based on two design flaws: Sorry, but I think you're missing the point... People write bad code and other people read it. This is just an example and there are many other use cases. > > > As Python exists today, this would be a big change for very little > benefit. global is a keyword, not an object, so the language would need to > allow *some* keywords to be used as if they were objects, but not all > keywords: > > Even though here global looks like it is an object, it isn't, since this > will cause a SyntaxError: > > y = global # global what? Well, I didn't proposed them to become objects... There's nothing wrong with some keywords behaving *partially* like objects. Of course, global must remain a keyword, for backwards compatibility with > code writing things the old way. > > Yes. This is what I said > However, I can see some merit in your suggestion, for Python 4000 when > backwards compatibility is no longer an issue. Get rid of the global and > nonlocal declarations, and instead introduce two, or possibly three, > reserved names, similar to None: > > globals > nonlocals > builtins > > "builtins" is already a module, so it works already. The other two doesn't need to break compatibility > each of which is a true object. Now globals.x is equivalent to > globals()['x'], and there is no need for a globals() function or a global > keyword. > > If they do become objects, thats true. But the "nonlocal" case is a little more difficult than the "globals" > You can't, and you shouldn't need to. How hard is it for you to use a > different name for the global and the local? 
There are billions of possible > names to choose from, why use the same name for both? Again, your proposal > encourages poorly written code. Good language features should encourage > good code, not bad code. It's actually a good thing that using global > variables is slightly inconvenient, as that helps discourage people from > using them. > > Imagine you got the job of writing the inner function. Some crazy person did the other part and you cannot break compatibility or something... I'm not sayin it is a real case, just it would be nice to differ those variables Jo?o Bernardo 2013/5/5 Steven D'Aprano > On 05/05/13 03:47, Jo?o Bernardo wrote: > >> Hi, >> I couldn't find whether this was proposed or not, but seems interesting to >> me: >> >> Whenever there's an assignment inside a big function, we think it is a >> local variable, but it could have been defined as global on top and that >> can mess things up if not checked. >> > > Are you suggesting that checking the top of the function for a global > declaration is so difficult that Python needs special syntax to fix that? > > Your description of the problem that this new syntax will fix is based on > two design flaws: > > * it's a function using global variables; > > * it's a BIG function; > > Both of which are signs that the function should be re-written, not that > the language needs a band-aid to fix some of the pain from poor-quality > code. A well-designed language should encourage good code. Both global > variables and large monolithic functions are signs of potentially bad code. > > > > > So, I why not have attribute syntax on "global" and "nonlocal" keywords? >> > [...] > > x = 10 >> def increment(): >> global.x += 1 >> > > > As Python exists today, this would be a big change for very little > benefit. 
global is a keyword, not an object, so the language would need to > allow *some* keywords to be used as if they were objects, but not all > keywords: > > y = global.x + 1 # allowed > > y = for.x + 1 # not allowed > > > Even though here global looks like it is an object, it isn't, since this > will cause a SyntaxError: > > y = global # global what? > > > Of course, global must remain a keyword, for backwards compatibility with > code writing things the old way. > > However, I can see some merit in your suggestion, for Python 4000 when > backwards compatibility is no longer an issue. Get rid of the global and > nonlocal declarations, and instead introduce two, or possibly three, > reserved names, similar to None: > > globals > nonlocals > builtins > > each of which is a true object. Now globals.x is equivalent to > globals()['x'], and there is no need for a globals() function or a global > keyword. > > Obviously I have not thought this through in any great detail. For > instance, there is only a single None object in the entire Python > application, but every module will need its own globals object, and every > nested function its own nonlocals object. Maybe this idea is only > superficially interesting and no good in practice. But we've got plenty of > time to think about it. > > > > [...] > > That way you can have both "global.x", "nonlocal.x" and "x" variables >> without conflict. >> >> x = 10 >> def f(x): >> def g(): >> # how can I set both "x" variables inside here? >> > > You can't, and you shouldn't need to. How hard is it for you to use a > different name for the global and the local? There are billions of possible > names to choose from, why use the same name for both? Again, your proposal > encourages poorly written code. Good language features should encourage > good code, not bad code. It's actually a good thing that using global > variables is slightly inconvenient, as that helps discourage people from > using them. 
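For readers skimming the thread, here is a minimal runnable sketch of today's semantics that the discussion revolves around; the proposed `global.x` spelling is hypothetical and does not parse in any current Python:

```python
# Today's spelling: a declaration at the top of the function, not the
# proposed global.x attribute syntax.
x = 10

def increment():
    global x          # "x" below now refers to the module-level global
    x += 1

def make_counter():
    n = 0
    def bump():
        nonlocal n    # rebinds the enclosing function's n, not a new local
        n += 1
        return n
    return bump

increment()           # x is now 11
counter = make_counter()
```

Without the `global x` declaration, the `x += 1` would raise UnboundLocalError, which is the confusion the thread's proposal aims at.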
> > > > -- > Steven > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun May 5 18:09:17 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 06 May 2013 02:09:17 +1000 Subject: [Python-ideas] global and nonlocal with attributes In-Reply-To: References: <5185FEFE.6030000@pearwood.info> Message-ID: <5186842D.4050900@pearwood.info> On 06/05/13 01:45, João Bernardo wrote: > (posting to the list) > > Your description of the problem that this new syntax will fix is based on > two design flaws: > > Sorry, but I think you're missing the point... People write bad code and > other people read it. You are correct, people do write bad code, and a good programming language should encourage them to write good code, and discourage them from writing bad code. Changing "global x" to "global.x" will not make bad code less bad, it will still be bad. This proposal does not encourage people to write better code. All it does is give them a second way to write bad code. > This is just an example and there are many other use cases. Great! Let's hear what those other use-cases are. Maybe some of them are more compelling. > > Even though here global looks like it is an object, it isn't, since this >> will cause a SyntaxError: >> >> y = global # global what? > > > Well, I didn't propose them to become objects... There's nothing wrong >> with some keywords behaving *partially* like objects. Of course there is something wrong with keywords behaving partially like objects. It is bad design that makes it harder to reason about what code does. It makes it harder to learn the language. It complicates the execution model. It complicates the parser. Good programming languages must be consistent. Things which are similar should look similar.
Things which are different should look different. In your proposal, things which are different (the global keyword, and objects) look similar. That makes Python less consistent, which makes it a worse language. -- Steven From ram.rachum at gmail.com Sun May 5 17:29:36 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Sun, 5 May 2013 08:29:36 -0700 (PDT) Subject: [Python-ideas] Implement `Executor.map_as_completed` Message-ID: <0feb50ca-2bb9-44ee-9480-daab00ce763d@googlegroups.com> I suggest a combination of `futures.as_completed` and `futures.Executor.map`. I think it should be a method `futures.Executor.map_as_completed`. It'll be the same as `.map`, but yield the results according to the order of completion. What do you think? (Background: this Stack Overflow question.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbvsmo at gmail.com Sun May 5 18:46:35 2013 From: jbvsmo at gmail.com (João Bernardo) Date: Sun, 5 May 2013 13:46:35 -0300 Subject: [Python-ideas] global and nonlocal with attributes In-Reply-To: <5186842D.4050900@pearwood.info> References: <5185FEFE.6030000@pearwood.info> <5186842D.4050900@pearwood.info> Message-ID: > > This is just an example and there are many other use cases. >> > > Great! Let's hear what those other use-cases are. Maybe some of them are > more compelling. > > - Being able to identify the variable more easily. - UnboundLocalError is too confusing for starters. - This is a counterpart for "from foo import bar". Have you never scrolled up 500 lines of code in a module to see where this "do_weird_stuff()" function came from? - Maybe "big function" I wrote is not just what you're thinking: If you're reading a diff output, 10 lines could be enough to hide the information. If you're on an 80x25 terminal window, 30 lines could be too much. If you have a critical security bug at 3am after drinking two liters of coffee, 1 line can be too much.
And, if you're one of the people my boss hires to "help" the project, I won't even guess what is too much for your twisted mind. :) Yes, there are use cases, you can probably think of others too. >> Well, I didn't propose them to become objects... There's nothing wrong >> with some keywords behaving *partially* like objects. >> > > Of course there is something wrong with keywords behaving partially like > objects. It is bad design that makes it harder to reason about what code > does. It makes it harder to learn the language. It complicates the > execution model. It complicates the parser. > > Good programming languages must be consistent. Things which are similar > should look similar. Things which are different should look different. In > your proposal, things which are different (the global keyword, and objects) > look similar. That makes Python less consistent, which makes it a worse > language. > > The form of "super" without arguments is super inconsistent. It is just black magic, but it works. Another point is that your text editor (usually) will color the word "global" differently, so it's easy to accept this may not be an object like others. Also, I would argue it is more consistent to use namespaces than to declare the variable. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yannickhold at gmail.com Sun May 5 19:13:30 2013 From: yannickhold at gmail.com (Yannick Hold-Geoffroy) Date: Sun, 5 May 2013 13:13:30 -0400 Subject: [Python-ideas] Implement `Executor.map_as_completed` Message-ID: Hello, It may be interesting to highlight that multiprocessing handles this by providing an imap_unordered function. I provide an interface compatible with the futures (PEP-3148) in my parallel framework (SCOOP) and I also feel the need for a simple way to iterate through results as they are completed.
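The behaviour Ram requests is already a few lines on top of the existing PEP 3148 API; a sketch (the name `map_as_completed` is the proposal's, not an existing method):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def map_as_completed(executor, fn, *iterables):
    """Like Executor.map(), but yield results in order of completion."""
    futures = [executor.submit(fn, *args) for args in zip(*iterables)]
    for future in as_completed(futures):
        yield future.result()

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(map_as_completed(pool, lambda n: n * n, [1, 2, 3, 4]))
```

Unlike `Executor.map()`, the order of `results` here depends on which calls happen to finish first.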
I understand that the *map()* function should be as generic as possible and finer grained concurrency should be explicitly handled by the user, but as Mr. Rachum said, a shortcut between the finest granularity (users have to submit futures manually) and a generic serial-like map could be of use (it would help write DRY code since this is a really common use case). I am not necessarily for the creation of a new function implicitly doing an unordered iteration, though. Since futures.map returns an iterator, I think we should provide users with a function that wraps futures.map and allows users to explicitly call it for unordered iteration. This could be very intuitive -- like allowing futures.as_completed to accept futures.map() output. This could be done using the object-oriented philosophy (by keeping some internal state linking the map result to its related future) or the functional philosophy (by returning an object behaving like the result with the addition of an identifier to the future). Internally, the function could either implement a callback returning the value to the user as soon as results are available or call as_completed on the related futures. Either way (new function or new wrapper) would greatly simplify some usual use cases of concurrent or parallel programs using or inspired by PEP-3148. Have a nice day, Yannick Hold -------------- next part -------------- An HTML attachment was scrubbed... URL: From foolistbar at googlemail.com Sun May 5 22:03:13 2013 From: foolistbar at googlemail.com (Geoffrey Sneddon) Date: Sun, 05 May 2013 21:03:13 +0100 Subject: [Python-ideas] Trie ABC? Message-ID: <5186BB01.8080805@googlemail.com> Currently there are a large number of trie implementations for Python, all with slightly different APIs. It would be nice to introduce an ABC for Tries to attempt to unify these. Why do people want tries in Python? Typically because they provide quick lookup of prefixes and keys starting with a prefix.
html5lib uses a pluggable trie based around a pseudo-ABC for tokenizing entities, for example (see below for links). It would likely make sense to introduce Trie and MutableTrie ABCs inheriting from Mapping and (MutableMapping, Trie) respectively. It is suggested to add at least two mixins: longest_prefix(prefix) returning the longest prefix of "prefix" that is a member of the Trie. keys_with_prefix(prefix) (or maybe keys_starting_with, but I'm bike-shedding myself now!) returning an iterable of keys that start with prefix. Some implementations simply override keys for this, adding an optional first argument of prefix. Many implementations also include something like has_keys_with_prefix returning len(keys_with_prefix(prefix)) > 0. A selection of existing trie implementations for Python: https://github.com/kmike/datrie https://github.com/buriy/python-chartrie https://github.com/kmike/hat-trie https://github.com/dhain/trie I can attempt to summarize the various APIs these provide if there is any interest whatsoever (it seems like there's little use if it's decided against on principle!). It would be nice to get an implementation into the stdlib if people are in favour of an ABC, but given there's no real singular solution that has been proven in the field I'm not rushing to force this. was a case of trying to get a trie implementation into the stdlib. There's something vaguely along the lines of the above ABC that I implemented for html5lib in , as well as a light-weight pure-Python implementation in . /gsnedders From stephen at xemacs.org Mon May 6 07:03:32 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 06 May 2013 14:03:32 +0900 Subject: [Python-ideas] Trie ABC? In-Reply-To: <5186BB01.8080805@googlemail.com> References: <5186BB01.8080805@googlemail.com> Message-ID: <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> Geoffrey Sneddon writes: > Currently there are a large number of trie implementations for Python, > all with slightly different APIs.
It would be nice to introduce a ABC > for Tries to attempt to unify these. I don't understand why you want an ABC. Mapping is the ABC, Trie is a concrete implementation, and an actual trie is an instance of Trie. Wouldn't canonizing one of the existing implementations into the stdlib be the straight way forward? The fact that that was tried once and failed doesn't mean it's not the right thing. Cf. enums (multiple PEPs, finally succeeded with PEP 435). > Why do people want tries in Python? I don't think that question needs to be asked, tries are a well-known data structure with clear (if somewhat specialized) uses. From abarnert at yahoo.com Mon May 6 09:38:55 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 6 May 2013 00:38:55 -0700 Subject: [Python-ideas] Trie ABC? In-Reply-To: <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <419CCF48-1282-48C5-81E0-5EF59C941675@yahoo.com> On May 5, 2013, at 22:03, "Stephen J. Turnbull" wrote: > Geoffrey Sneddon writes: > >> Currently there are a large number of trie implementations for Python, >> all with slightly different APIs. It would be nice to introduce a ABC >> for Tries to attempt to unify these. > > I don't understand why you want an ABC. Mapping is the ABC, Trie is a > concrete implementation, and an actual trie is an instance of Trie. No, in his proposal, Trie adds new methods to the interface, on top of those required for Mapping (longest_prefix, etc.), making it an ABC. > Wouldn't canonizing one of the existing implementations into the > stdlib be the straight way forward? Even if we did, is it conceivable that someone might want to use another implementation, or an extension to the concept, or a wrapper, and want to signify that it implements a Trie, in the same way we can with MutableMapping (and all the other ABCs)? I'm not actually sure, but it's not something to dismiss out of hand. 
Put another way: we canonized a set implementation, and that didn't mean we had no use for a Set ABC. Why is Trie inherently different? From ncoghlan at gmail.com Mon May 6 09:47:39 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 May 2013 17:47:39 +1000 Subject: [Python-ideas] Trie ABC? In-Reply-To: <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, May 6, 2013 at 3:03 PM, Stephen J. Turnbull wrote: > Geoffrey Sneddon writes: > > > Currently there are a large number of trie implementations for Python, > > all with slightly different APIs. It would be nice to introduce a ABC > > for Tries to attempt to unify these. > > I don't understand why you want an ABC. Mapping is the ABC, Trie is a > concrete implementation, and an actual trie is an instance of Trie. > Wouldn't canonizing one of the existing implementations into the > stdlib be the straight way forward? I believe it is the extra trie specific method names that Geoffrey is interested in standardising: longest_prefix(item) keys_with_prefix(prefix) has_key_with_prefix(prefix) Note that using "prefix" as the parameter for the first operation is incorrect (as it is not a prefix, it is the candidate item to be matched against), and that using "has_keys_with_prefix" as the name for the last operation is wrong if the condition is "at least one key" - including the plural in the name suggests the condition is "at least two keys". This seems reasonable, although I think it may make more sense in conjunction with a reference implementation in the standard library. Cheers, Nick. 
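A rough illustration of what such an ABC could look like, using the corrected names Nick suggests, together with a toy dict-backed implementation; none of this is an agreed or existing API, it only makes the proposed method set concrete:

```python
from abc import abstractmethod
from collections.abc import Mapping

class Trie(Mapping):
    """Illustrative ABC: Mapping plus the prefix operations discussed."""

    @abstractmethod
    def longest_prefix(self, item):
        """Return the longest key that is a prefix of `item`."""

    @abstractmethod
    def keys_with_prefix(self, prefix):
        """Return an iterable of keys starting with `prefix`."""

    def has_key_with_prefix(self, prefix):
        # Mixin default: derivable from keys_with_prefix().
        return any(True for _ in self.keys_with_prefix(prefix))

class DictTrie(Trie):
    """Toy dict-backed implementation, for illustration only (a real
    trie would walk a character tree instead of scanning all keys)."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def longest_prefix(self, item):
        candidates = [k for k in self._data if item.startswith(k)]
        return max(candidates, key=len) if candidates else None

    def keys_with_prefix(self, prefix):
        return (k for k in self._data if k.startswith(prefix))

t = DictTrie({'foo': 1, 'foobar': 2, 'bar': 3})
```

Because `Trie` inherits from `Mapping`, an implementation gets `get()`, `items()`, containment and equality for free, which is the usual argument for the ABC route.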
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pieter at nagel.co.za Mon May 6 10:30:04 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 10:30:04 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) Message-ID: <1367829004.2868.619.camel@basilisk> Following our discussion of last week, here is a first draft of the PEP PEP: XXX Title: Extended stat_result Version: $Revision$ Last-Modified: $Date$ Author: Pieter Nagel Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 03-May-2013 Python-Version: 3.4 Abstract ======== This PEP proposes extending the result of ``os.stat()``, ``os.fstat()`` and ``os.lstat()`` calls with added methods such as ``is_file()``. These added methods will obviate the need to use the ``stat`` module to interpret the result of these calls. Motivation ========== Currently, there are two different mechanisms for interrogating the file types of filesystem paths, each with distinctly different appearance and performance characteristics. The first mechanism is a set of functions in the module ``os.path``, such as ``os.path.isfile()`` and ``os.path.isdir()``. These functions' names express their semantics relatively directly, but performance-wise each call entails an ``os.stat()`` call, which could potentially be redundant if another ``os.*stat()`` call had been done earlier for the same path in order to query other similar properties of the path. The second mechanism is by first calling ``os.stat()``, ``os.fstat()`` or ``os.lstat()`` (henceforth collectively referred to as just "``os.*stat()``") for a particular path, and then interpreting the result using functions in the ``stat`` module. Performance-wise, these only require a single ``os.*stat()`` call, no matter how many times different properties of the result object are interrogated.
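The two mechanisms contrasted above can be put side by side; a small self-contained demonstration on a file that exists:

```python
import os
import os.path
import stat
import tempfile

with tempfile.NamedTemporaryFile() as tmp:
    f = tmp.name

    # Mechanism 1: readable names, but one os.stat() call per predicate.
    readable = os.path.isfile(f) or os.path.isdir(f)

    # Mechanism 2: a single os.stat() call, interrogated via the stat module.
    st = os.stat(f)
    fast = stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode)

    # Both mechanisms agree; only the number of system calls differs.
```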
But the downside is that the names of the functions needed to interrogate the result object, such as ``stat.S_ISREG()``, are relatively opaque, and motivated more by a desire to conform to standards for the names of the underlying C macros than by a desire to be semantically meaningful in English or to be Pythonic. There are situations where the performance penalty of ``os.*stat()`` calls can be significant enough to take into consideration. For example, on some networked filesystems they can be quite slow. Another consideration is that each call releases the GIL, which can also have negative performance effects especially on multi-threaded code. The end result of all this is that performance-agnostic code can be written in a relatively straightforward way:: if os.path.isfile(f) or os.path.isdir(f): # do something Whereas in contrast, similar code that wishes to avoid the penalty of two potential calls to ``os.stat()``, will look radically different:: st = os.stat(f) if stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode): # do something The cost is even worse if one takes into account that the second code fragment still needs to take the nonexistence of ``f`` into account in order to be completely semantically equivalent to the first, and also has the extra cost of needing to import the ``stat`` module. This PEP proposes ameliorating the situation by adding higher-level predicates such as ``is_file()`` and ``is_dir()`` directly to the ``stat_result`` object, so that (assuming the file ``f`` exists) the second code example can become:: st = os.stat(f) if st.is_file() or st.is_dir(): # do something Specification ============= Added methods on ``stat_result`` -------------------------------- is_dir() Equivalent to ``bool(stat.S_ISDIR(self.st_mode))``. is_character_device() Equivalent to ``bool(stat.S_ISCHR(self.st_mode))``. is_block_device() Equivalent to ``bool(stat.S_ISBLK(self.st_mode))``. is_file() Equivalent to ``bool(stat.S_ISREG(self.st_mode))``. 
is_fifo() Equivalent to ``bool(stat.S_ISFIFO(self.st_mode))``. is_symbolic_link() Equivalent to ``bool(stat.S_ISLNK(self.st_mode))``. is_socket() Equivalent to ``bool(stat.S_ISSOCK(self.st_mode))``. same_stat(other) Equivalent to ``os.path.samestat(self, other)``. file_mode() This shall return ``stat.filemode(stat.S_IMODE(self.st_mode))``, i.e. a string of the form "-rwxrwxrwx". permission_bits() This shall return ``stat.S_IMODE(self.st_mode)``. format() This shall return ``stat.S_IFMT(self.st_mode)``. Added functions in ``os.path`` ------------------------------ is_dir(f) This shall be an alias for the existing isdir(f). is_character_device(f) This shall return ``os.stat(f).is_character_device()``, or ``False`` if ``f`` does not exist. is_block_device(f) This shall return ``os.stat(f).is_block_device()``, or ``False`` if ``f`` does not exist. is_file(f) This shall be an alias for the existing isfile(f). is_fifo(f) This shall return ``os.stat(f).is_fifo()``, or ``False`` if ``f`` does not exist. is_symbolic_link(f) This shall return ``os.stat(f).is_symbolic_link()``, or ``False`` if ``f`` does not exist. is_socket(f) This shall return ``os.stat(f).is_socket()``, or ``False`` if ``f`` does not exist. Rationale ========= The PEP is strongly motivated by a desire for symmetry between functions in ``os.path`` and methods on ``stat_result``. Therefore, for each predicate function in ``os.path`` that is essentially just an interrogation of ``os.*stat()``, given an existing path, the similarly-named predicate method on ``stat_result`` should have the exact same semantics. This definition does not cover the case where the path being interrogated does not exist. In those cases, predicate functions in ``os.path``, such as ``os.path.isfile()``, will return ``False``, whereas ``os.*stat()`` will raise FileNotFoundError even before any ``stat_result`` is returned that could have been interrogated.
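The proposed methods are thin shims over the existing ``stat`` module, which makes them easy to prototype today; a sketch, where ``StatResult`` is an illustrative wrapper and not the PEP's actual implementation (note this sketch passes the full ``st_mode`` to ``stat.filemode()`` so the leading type character can be rendered):

```python
import os
import stat

class StatResult:
    """Illustrative wrapper adding a few of the PEP's proposed predicates."""

    def __init__(self, st):
        self._st = st

    def is_dir(self):
        return stat.S_ISDIR(self._st.st_mode)

    def is_file(self):
        return stat.S_ISREG(self._st.st_mode)

    def is_symbolic_link(self):
        return stat.S_ISLNK(self._st.st_mode)

    def permission_bits(self):
        return stat.S_IMODE(self._st.st_mode)

    def file_mode(self):
        # e.g. "drwxr-xr-x" for a directory
        return stat.filemode(self._st.st_mode)

st = StatResult(os.stat(os.getcwd()))
```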
This renders considerations of how the proposed new predicates on ``stat_result`` could have been symmetrical with functions in ``os.path``, if their ``stat_result`` had existed, moot, and this PEP does not propose doing anything about the situation (but see `Open Issues`_ below). Secondly, this definition refers to "similarly-named" predicates instead of "identically-named" predicates, because the names in ``os.path`` pre-date PEP 8 [#PEP-8]_, and are not compliant with it. This PEP takes the position that it is better that the new predicate methods on ``stat_result`` be named in compliance with PEP 8 [#PEP-8]_ (i.e. ``is_file()``), than that they be precisely identical to the names in ``os.path`` (i.e. ``isfile()``). Note also that PEP 428 [#PEP-428]_ also specifies PEP-8 compliant names such as ``is_file()`` for the exact same concepts, and if PEP 428 [#PEP-428]_ should be accepted, the issue would be even more pertinent. Lastly, this PEP takes the notion of symmetry as far as adding methods and aliases to the existing ``os.path`` in order to be symmetrical with the added behaviour on ``stat_result``. But the author is least strongly convinced of this latter point, and may be persuaded to abandon it. Backwards Compatibility ======================= This PEP neither removes current behavior of ``stat_result``, nor changes the semantics of any current behavior. Likewise, it adds functions and aliases for functions to ``os.path``, but does not remove or change any existing ones. Therefore, this PEP should not cause any backwards incompatibilities, except in the rare and esoteric cases where code is dependent on the *nonexistence* of the proposed new names. It is not deemed important to remain compatible with code that mistakenly holds the Python Standard Library to be closed for new additions. Open Issues =========== Whether it is more desirable for the proposed added methods' names to follow PEP 8 [#PEP-8]_ (i.e.
``is_file()`` etc.), or to mirror the pre-existing names in ``os.path`` (i.e. ``isfile()`` etc.) is still open for debate. The existing attributes on ``stat_result`` follow the pattern ``st_*`` in conformance to the relevant POSIX names for the fields of the C-level ``stat`` structure. The new names for the behaviours proposed here do not contain such an ``st_`` prefix (nor could they, for that would suggest a conformance with ``stat`` structure names which do not exist in POSIX). But the resulting asymmetry of names is annoying. Should aliases for the existing ``st_*`` names be added that omit the ``st_`` prefix? This PEP does not address a higher-level mechanism for exposing the owner/group/other read/write/execute permissions. Is there a need for this? This PEP does not address a higher-level mechanism for exposing the contents of the underlying ``st_flags`` field. Is there a need for this? This PEP proposes aliases and methods to make ``os.path`` conform more to the added ``stat_result`` methods proposed here. But is the impedance mismatch between ``isfile`` and ``is_file`` really that much of an issue to warrant this? As it stands, this PEP does not address the asymmetry between the existing ``os.path.isfile()`` etc. functions and the new proposed mechanism in the case where the underlying file does not exist. There is a way to handle this, though: an optional flag could be added to ``os.*stat()`` that would return a null object implementation of ``stat_result`` whenever the file does not exist. Then that null object could return ``False`` to ``is_file()`` etc. That means that the following code would behave identically, even when the file ``f`` does not exist:: if os.path.isfile(f) or os.path.isdir(f): # do something st = os.stat(f, null_if_missing=True) if st.is_file() or st.is_dir(): # do something Would this be a useful mechanism?
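The ``null_if_missing`` idea can be prototyped outside the stdlib to see how it feels; a sketch where ``NullStat`` and ``null_stat`` are illustrative names only (today's ``stat_result`` has no ``is_file()`` method, so only the missing-path half of the idea is shown):

```python
import os

class NullStat:
    """Null object standing in for the stat result of a missing path."""

    def is_file(self):
        return False

    def is_dir(self):
        return False

    def exists(self):
        return False

def null_stat(path):
    """Like os.stat(), but return a NullStat instead of raising."""
    try:
        return os.stat(path)
    except FileNotFoundError:
        return NullStat()

st_missing = null_stat('/no/such/path')
st_real = null_stat('.')
```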
Rejected Proposals ================== It has been proposed [#filetype]_ that a mechanism be added whereby ``stat_result`` could return some sort of type code identifying the file type. Originally these type codes were proposed as strings such as 'reg', 'dir', and the like, but others suggested enumerations instead. The author rejected that proposal to keep the current PEP focused on ameliorating existing asymmetries rather than adding new behavior, but is not opposed to the notion in principle (assuming enums are used instead of strings). Experience with creating the reference implementation for this PEP may yet change the author's mind. Concerns have been raised [#isdoor]_ about platform-specific stat flags (such as S_ISDOOR on Solaris) that Python does not currently support, and which could be added as part of this proposal. The author has rejected such proposals, yet again in order to keep the PEP focused. The author may, yet again, be persuaded otherwise. References ========== .. [#PEP-8] PEP 8, Style Guide for Python Code , Rossum, Warsaw (http://www.python.org/dev/peps/pep-0008) .. [#PEP-428] PEP 428, The pathlib module -- object-oriented filesystem paths , Pitrou (http://www.python.org/dev/peps/pep-0428) .. [#filetype] http://mail.python.org/pipermail/python-ideas/2013-May/020378.html .. [#isdoor] http://mail.python.org/pipermail/python-ideas/2013-May/020378.html Copyright ========= This document has been placed in the public domain. .. 
Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -- Pieter Nagel From ncoghlan at gmail.com Mon May 6 10:50:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 May 2013 18:50:09 +1000 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367829004.2868.619.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> Message-ID: On Mon, May 6, 2013 at 6:30 PM, Pieter Nagel wrote: > Following our discussion of last week, here is a first draft of the PEP > > PEP: XXX > Title: Extended stat_result > Version: $Revision$ > Last-Modified: $Date$ > Author: Pieter Nagel > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 03-May-2013 > Python-Version: 3.4 > > > Abstract > ======== > > This PEP proposes extending the result of ``os.stat()``, ``os.fstat()`` and > ``os.lstat()`` calls with added methods such as ``is_file()``. These added > methods will obviate the need to use the ``stat`` module to interpret the > result of these calls. Good proposal. Something it doesn't yet cover, and should, is a Python level API for creating these new objects. If we go down the path of allowing a "null" stat object, it would also make sense to add an exists() method to the method API. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pieter at nagel.co.za Mon May 6 11:18:15 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 11:18:15 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: References: <1367829004.2868.619.camel@basilisk> Message-ID: <1367831895.2868.626.camel@basilisk> On Mon, 2013-05-06 at 18:50 +1000, Nick Coghlan wrote: > Something it doesn't yet cover, and should, is a Python level API for > creating these new objects. 
It seems you see a need to be able to create stat_result objects that do not represent any actual files on your filesystem, but rather represent potential files that could exist on your (or another) filesystem. I'm curious as to the use cases you see? > If we go down the path of allowing a "null" stat object, it would also > make sense to add an exists() method to the method API. Actually exists() is what led me down that path, I just forgot it when I wrote the PEP. But for now, I'm not sure if the whole null object proposal is going to fly at all. I kind of like it, but I can't call to mind any other prior art for such null objects in the stdlib. -- Pieter Nagel From solipsis at pitrou.net Mon May 6 11:30:23 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 6 May 2013 11:30:23 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> Message-ID: <20130506113023.07e1ce27@pitrou.net> Le Mon, 06 May 2013 11:18:15 +0200, Pieter Nagel a écrit : > > > If we go down the path of allowing a "null" stat object, it would > > also make sense to add an exists() method to the method API. > > Actually exists() is what led me down that path, I just forgot it > when I wrote the PEP. > > But for now, I'm not sure if the whole null object proposal is going > to fly at all. I kind of like it, but I can't call to mind any other > prior art for such null objects in the stdlib. I don't really understand the point of a null object here, since os.stat() will raise when called on a non-existent path. Regards Antoine.
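Antoine's objection rests on the existing idiom already making the missing-path case explicit; e.g. this common EAFP pattern (the helper name is illustrative):

```python
import os

def stat_if_exists(path):
    """One os.stat() call; None signals a missing path explicitly."""
    try:
        return os.stat(path)
    except FileNotFoundError:
        return None

present = stat_if_exists('.')
absent = stat_if_exists('/no/such/path')
```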
From cf.natali at gmail.com Mon May 6 11:31:39 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Mon, 6 May 2013 11:31:39 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367829004.2868.619.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> Message-ID: Hello, Looks good, a couple of remarks. > Added methods on ``stat_result`` > -------------------------------- There are too many is_XXX methods. > is_dir() > Equivalent to ``bool(stat.S_ISDIR(self.st_mode))``. > is_file() > Equivalent to ``bool(stat.S_ISREG(self.st_mode))``. > is_symbolic_link() > Equivalent to ``bool(stat.S_ISLNK(self.st_mode))``. OK. > is_character_device() > Equivalent to ``bool(stat.S_ISCHR(self.st_mode))``. > > is_block_device() > Equivalent to ``bool(stat.S_ISBLK(self.st_mode))``. > > is_fifo() > Equivalent to ``bool(stat.S_ISFIFO(self.st_mode))``. > > is_socket() > Equivalent to ``bool(stat.S_ISSOCK(self.st_mode))``. Those are IMO useless. If we go down this road, we can also add Solaris door files, and another bazillion file types (see the previous thread). Something like is_other() or is_special() is enough. Code needing more specific information about the file type knows how to use S_XXX, and has to be non-portable anyway. > same_stat(other) > Equivalent to ``os.path.samestat(self, other)``. I think it could be better to add a "key" attribute, which would return (st_dev, st_ino). Then, checking that two stat results refer to the same file is simply a matter of comparing their key. Could the following be properties? > permission_bits() > This shall return ``stat.S_IMODE(self.st_mode)``. good > file_mode() > This shall return ``stat.filemode(stat.S_IMODE(self.st_mode))``, i.e. a > string of the form "-rwxrwxrwx". Interesting. I don't like the name, though. Also, if we provide a way to return a string representation from the permission bits, we also probably want a helper to do it the other way around, i.e.
a permission bit array from a string. So I think those two methods would be better as static helper methods. > format() > This shall return ``stat.S_IFMT(self.st_mode)``. Is this really necessary? AFAICT, S_IFMT is only useful as a helper for S_ISREG/etc. I don't see any added value in exposing it. > Added functions in ``os.path`` > ------------------------------ > > is_dir(f) > This shall be an alias for the existing isdir(f). Why? > is_character_device(f) > This shall return ``os.stat(f).is_character_device()``, or ``False`` if > ``f`` does not exist. > > is_block_device(f) > This shall return ``os.stat(f).is_block_device()``, or ``False`` if > ``f`` does not exist. > > is_file(f) > This shall be an alias for the existing isfile(f). > > is_fifo(f) > This shall return ``os.stat(f).is_fifo()``, or ``False`` if > ``f`` does not exist. > > is_symbolic_link(f) > This shall return ``os.stat(f).is_symbolic_link()``, or ``False`` if > ``f`` does not exist. > > is_socket(f) > This shall return ``os.stat(f).is_socket()``, or ``False`` if > ``f`` does not exist. Same remark as above, I'm not convinced that all those special cases are necessary. > Rationale > ========= > > The PEP is strongly motivated by a desire for symmetry between functions in > ``os.path`` and methods on ``stat_result``. > > Therefore, for each predicate function in ``os.path`` that is essentially > just an interrogation of ``os.*stat()``, given an existing path, the > similarly-named predicate method on ``stat_result`` should have the exact > same semantics. > > This definition does not cover the case where the path being interrogated > does not exist. In those cases, predicate functions in ``os.path``, such > as ``os.path.isfile()``, will return ``False``, whereas ``os.*stat()`` will > raise FileNotFoundError even before any ``stat_result`` is returned that > could have been interrogated.
> This renders considerations of how the proposed new predicates on
> ``stat_result`` could have been symmetrical with functions in ``os.path``,
> if their ``stat_result`` had existed, moot, and this PEP does not propose
> doing anything about the situation (but see `Open Issues`_ below).
>
> Secondly, this definition refers to 'similarly-named' predicates instead of
> 'identically-named' predicates, because the names in ``os.path`` pre-date
> PEP 8 [#PEP-8]_, and are not compliant with it.  This PEP takes the
> position that it is better that the new predicate methods on
> ``stat_result`` be named in compliance with PEP 8 [#PEP-8]_ (i.e.
> ``is_file()``), than that they be precisely identical to the names in
> ``os.path`` (i.e. ``isfile()``).  Note also that PEP 428 [#PEP-428]_ also
> specifies PEP-8 compliant names such as ``is_file()`` for the exact same
> concepts, and if PEP 428 [#PEP-428]_ should be accepted, the issue would be
> even more pertinent.
>
> Lastly, this PEP takes the notion of symmetry as far as adding methods and
> aliases to the existing ``os.path`` in order to be symmetrical with the
> added behaviour on ``stat_result``.  But the author is least strongly
> convinced of this latter point, and may be persuaded to abandon it.

I'm not convinced either.

> Backwards Compatibility
> =======================
>
> This PEP neither removes current behavior of ``stat_result``, nor changes
> the semantics of any current behavior.  Likewise, it adds functions and
> aliases for functions to ``os.path``, but does not remove or change any
> existing ones.
>
> Therefore, this PEP should not cause any backwards incompatibilities,
> except in the rare and esoteric cases where code is dependent on the
> *nonexistence* of the proposed new names.  It is not deemed important to
> remain compatible with code that mistakenly holds the Python Standard
> Library to be closed for new additions.
You just want to make sure that your stat result is compatible with the
current implementation (tuple-struct, supporting indexing).

> Open Issues
> ===========
>
> Whether it is more desirable for the proposed added methods' names to
> follow PEP 8 [#PEP-8]_ (i.e. ``is_file()`` etc.), or to mirror the
> pre-existing names in ``os.path`` (i.e. ``isfile()`` etc.) is still open
> for debate.
>
> The existing attributes on ``stat_result`` follow the pattern ``st_*`` in
> conformance to the relevant POSIX names for the fields of the C-level
> ``stat`` structure.  The new names for the behaviours proposed here do not
> contain such an ``st_`` prefix (nor could they, for that would suggest a
> conformance with ``stat`` structure names which do not exist in POSIX).
> But the resulting asymmetry of names is annoying.  Should aliases for the
> existing ``st_*`` names be added that omit the ``st_`` prefix?

If we offer a higher level abstraction, then I think the 'st_' prefix
should be dropped.

> As it stands, this PEP does not address the asymmetry between the existing
> ``os.path.isfile()`` etc. functions and the new proposed mechanism in the
> case where the underlying file does not exist.  There is a way to handle
> this, though: an optional flag could be added to ``os.*stat()`` that would
> return a null object implementation of ``stat_result`` whenever the file
> does not exist.  Then that null object could return ``False`` to
> ``is_file()`` etc.  That means that the following code would behave
> identically, even when the file ``f`` does not exist::
>
>     if os.path.isfile(f) or os.path.isdir(f):
>         # do something
>
>     st = os.stat(f, null_if_missing=True)
>     if st.is_file() or st.is_dir():
>         # do something
>
> Would this be a useful mechanism?

I don't like the idea of adding an optional argument: stat()'ing a
non-existing file will raise an exception, that's it. Also, I don't like
the idea of a null object.
From stephen at xemacs.org Mon May 6 12:43:22 2013
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 06 May 2013 19:43:22 +0900
Subject: [Python-ideas] Trie ABC?
In-Reply-To: <419CCF48-1282-48C5-81E0-5EF59C941675@yahoo.com>
References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> <419CCF48-1282-48C5-81E0-5EF59C941675@yahoo.com>
Message-ID: <87wqrcv16d.fsf@uwakimon.sk.tsukuba.ac.jp>

Andrew Barnert writes:

 > Even if we did, is it conceivable that someone might want to use
 > another implementation, or an extension to the concept, or a
 > wrapper, and want to signify that it implements a Trie, in the same
 > way we can with MutableMapping (and all the other ABCs)? I'm not
 > actually sure, but it's not something to dismiss out of hand.

I asked precisely because I'm not dismissing it. If I really thought it
were dismissible, I'd just let somebody authoritative do so.

 > Put another way: we canonized a set implementation, and that didn't
 > mean we had no use for a Set ABC. Why is Trie inherently different?

It's not *inherently* so. As you point out, the additional methods
needed make it implicitly a different ABC from Mapping. But it's also
not clear to me that it's not different from Set, that we can't do well
enough with a single implementation that is well-optimized. We don't
need a Sorter ABC; timsort is good enough for practical purposes. Maybe
that's the way tries are, too. I'd like to hear more about it.

From random832 at fastmail.us Mon May 6 13:38:48 2013
From: random832 at fastmail.us (Random832)
Date: Mon, 06 May 2013 07:38:48 -0400
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367829004.2868.619.camel@basilisk>
References: <1367829004.2868.619.camel@basilisk>
Message-ID: <51879648.8010001@fastmail.us>

On 05/06/2013 04:30 AM, Pieter Nagel wrote:
> is_symbolic_link()
>     This shall return ``os.stat(f).is_symbolic_link()``, or ``False`` if
>     ``f`` does not exist.

lstat, surely.
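The distinction being drawn here is easy to see in a short sketch (the
temporary directory and file names below are only illustrative):

```python
import os
import stat
import tempfile

# Set up a regular file and a symlink pointing at it.
d = tempfile.mkdtemp()
target = os.path.join(d, "target")
link = os.path.join(d, "link")
open(target, "w").close()
os.symlink(target, link)

# os.stat() follows the symlink, so it describes the target file:
followed = os.stat(link)
print(stat.S_ISLNK(followed.st_mode))    # False
print(stat.S_ISREG(followed.st_mode))    # True

# os.lstat() does not follow, so it describes the link itself:
unfollowed = os.lstat(link)
print(stat.S_ISLNK(unfollowed.st_mode))  # True
```

So an is_symbolic_link() built on top of os.stat() could never return
True; only an os.lstat() result can ever describe the link itself.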
From pieter at nagel.co.za Mon May 6 14:29:27 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 14:29:27 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <20130506113023.07e1ce27@pitrou.net>
References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net>
Message-ID: <1367843367.2868.631.camel@basilisk>

On Mon, 2013-05-06 at 11:30 +0200, Antoine Pitrou wrote:

> I don't really understand the point of a null object here, since
> os.stat() will raise when called on a non-existent path.

The point is that I am proposing to (optionally) change the behaviour of
os.stat(), so that it will *not* raise when the path is non-existent,
but will instead return a null object.

That can be achieved by adding a keyword argument flag, for example
null_if_missing, to os.stat() that will select the new behaviour. The
default will be the current behaviour.

This null object will implement exists(), is_file() and the like so that
it returns False exactly as os.path.exists(), os.path.isfile() etc.
would have done, if one had used them instead of os.stat() to
interrogate the properties of the file.

-- 
Pieter Nagel

From pieter at nagel.co.za Mon May 6 14:37:50 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 14:37:50 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: 
References: <1367829004.2868.619.camel@basilisk>
Message-ID: <1367843870.2868.635.camel@basilisk>

On Mon, 2013-05-06 at 18:50 +1000, Nick Coghlan wrote:

> Something it doesn't yet cover, and should, is a Python level API for
> creating these new objects.

I see, based on the following code in os.py, that this mechanism already
exists:

    def _make_stat_result(tup, dict):
        return stat_result(tup, dict)

Since I don't propose removing any behaviour from the current
stat_result, my proposal will leave that untouched.
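The behaviour being proposed here can be prototyped today as a thin
wrapper around os.stat(); the null_if_missing flag and the null object
are of course hypothetical names from this thread, not an existing API:

```python
import os


class _NullStatResult:
    """Null object standing in for the stat_result of a missing path.

    Its predicates answer False, just as os.path.exists() and
    os.path.isfile() do for a non-existent path.
    """

    def exists(self):
        return False

    def is_file(self):
        return False

    def is_dir(self):
        return False


def stat(path, null_if_missing=False):
    """Sketch of the proposed os.stat(path, null_if_missing=True)."""
    try:
        return os.stat(path)
    except FileNotFoundError:
        if null_if_missing:
            return _NullStatResult()
        raise


st = stat("/no/such/path", null_if_missing=True)
print(st.is_file())  # False, where plain os.stat() would raise
```

A full prototype would also have to wrap successful results, so that
real stat_results grow the same is_*() predicates as the null object.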
-- 
Pieter Nagel

From solipsis at pitrou.net Mon May 6 14:38:45 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 6 May 2013 14:38:45 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk>
Message-ID: <20130506143845.1e961567@pitrou.net>

Le Mon, 06 May 2013 14:29:27 +0200, Pieter Nagel a écrit :

> On Mon, 2013-05-06 at 11:30 +0200, Antoine Pitrou wrote:
>
> > I don't really understand the point of a null object here, since
> > os.stat() will raise when called on a non-existent path.
>
> The point is that I am proposing to (optionally) change the behaviour
> of os.stat(), so they will *not* raise when the path is non-existent,
> but instead will return a null object.

I don't really think that's satisfactory:

1. it's making the API more complicated
2. None should really be returned, not a "null object"
3. other os functions which expect a file will raise when passed a
   non-existing path, not return a "null object"

I understand why you would like to do that, but IMO the problems above
outweigh the advantages. If we were designing os.stat() right now,
perhaps returning None would be ok. But I don't think a "null object" is
a good proposition.

Regards

Antoine.

From christian at python.org Mon May 6 14:50:29 2013
From: christian at python.org (Christian Heimes)
Date: Mon, 06 May 2013 14:50:29 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367829004.2868.619.camel@basilisk>
References: <1367829004.2868.619.camel@basilisk>
Message-ID: 

Am 06.05.2013 10:30, schrieb Pieter Nagel:

> Abstract
> ========
>
> This PEP proposes extending the result of ``os.stat()``, ``os.fstat()`` and
> ``os.lstat()`` calls with added methods such as ``is_file()``.  These added
> methods will obviate the need to use the ``stat`` module to interpret the
> result of these calls.
A side note: the stat module has a flaw. It's a pure Python module with
hard-coded constants. So far this has worked on all known platforms
because the platforms are all using the same constants with equal
meaning. AFAIK the POSIX specs only specify the names of the constants
but not any values.

I'm planning to make the stat module a built-in.
http://bugs.python.org/issue11016 contains a first draft. In order to
test the code you have to add "stat statmodule.c" to Modules/Setup and
run ./configure && make.

Christian

From christian at python.org Mon May 6 14:53:53 2013
From: christian at python.org (Christian Heimes)
Date: Mon, 06 May 2013 14:53:53 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367843870.2868.635.camel@basilisk>
References: <1367829004.2868.619.camel@basilisk> <1367843870.2868.635.camel@basilisk>
Message-ID: 

Am 06.05.2013 14:37, schrieb Pieter Nagel:

> I see, based on the following code in os.py that this mechanism already
> exists:
>
>     def _make_stat_result(tup, dict):
>         return stat_result(tup, dict)
>
> Since I don't propose removing any behaviour from the current
> stat_result, my proposal will leave that untouched.

That's just for pickle support. You are going to have to reimplement the
stat_result class in C. It's currently implemented as a PyStructSequence
but you can't subclass it.

Christian

From pieter at nagel.co.za Mon May 6 14:56:57 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 14:56:57 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: 
References: <1367829004.2868.619.camel@basilisk>
Message-ID: <1367845017.2868.651.camel@basilisk>

Comments inline.

On Mon, 2013-05-06 at 11:31 +0200, Charles-François Natali wrote:

> Hello,
>
> Looks good, a couple of remarks.
>
> > Added methods on ``stat_result``
> > --------------------------------
>
> There are too many is_XXX methods.
>
> > is_dir()
> >     Equivalent to ``bool(stat.S_ISDIR(self.st_mode))``.
> > is_file()
> >     Equivalent to ``bool(stat.S_ISREG(self.st_mode))``.
> > is_symbolic_link()
> >     Equivalent to ``bool(stat.S_ISLNK(self.st_mode))``.
>
> OK.
>
> > is_character_device()
> >     Equivalent to ``bool(stat.S_ISCHR(self.st_mode))``.
> >
> > is_block_device()
> >     Equivalent to ``bool(stat.S_ISBLK(self.st_mode))``.
> >
> > is_fifo()
> >     Equivalent to ``bool(stat.S_ISFIFO(self.st_mode))``.
> >
> > is_socket()
> >     Equivalent to ``bool(stat.S_ISSOCK(self.st_mode))``.
>
> Those are IMO useless.
> If we go down this road, we can also add Solaris door files, and
> another bazillion file types (see the previous thread).

I agree that adding methods like is_socket(), which are already
Unix-specific, will create precedent and pressure to add concepts like
Solaris door files here as well. And I agree that decisions taken here
will have to be consistent with what we want for these platform-specific
concepts as well.

But I am not so sure that adding the concepts on stat_result is
necessarily such a bad thing.

It seems we will soon have the precedent of PEP 428, which sequesters
the platform-specific concepts into separate classes such as PosixPath
and NTPath. Perhaps one can similarly manage the potential future
explosion of platform-specific is_door() etc. methods you are concerned
about, by implementing stat_result using different concrete types, like
solaris_stat_result and nt_stat_result and the like?

> Something like is_other() or is_special() is enough. Code needing more
> specific information about the file type knows how to use S_XXX, and
> has to be non-portable anyway.

But note that if Python is going to handle Solaris doors etc. at all,
the concept will have to be added *somewhere*. I am not sure why letting
stat.S_XXX functions accumulate is so much better than allowing
stat_result.is_xxx() methods to accumulate.

> I think it could be better to add a "key" attribute, which would
> return (st_dev, st_ino).
> Then, checking that two stat results refer to the same file is simply
> a matter of comparing their keys.

Good idea. This will allow more to be done with stat keys than just
comparing them for equality.

> Could the following be properties?
>
> > permission_bits()
> >     This shall return ``stat.S_IMODE(self.st_mode)``.
>
> good
>
> > file_mode()
> >     This shall return ``stat.filemode(stat.S_IMODE(self.st_mode))``, i.e. a
> >     string of the form '-rwxrwxrwx'.
>
> Interesting.
> I don't like the name, though.

Note that the only reason I proposed this is because the functionality
already exists in the stat module as stat.filemode(), which is also what
I based the name on.

> Also, if we provide a way to return a string representation from the
> permission bits, we probably also want a helper to do it the other
> way around, i.e. a permission bit array from a string.
> So I think those two methods would be better as static helper methods.

I agree that adding the converse functionality would be symmetrical,
but AFAIK, unlike filemode(), that functionality does not already exist
in the stdlib.

I suppose in the end it depends on how much I want to merely regularise
existing features in the stdlib, versus expanding functionality out to
make a more cohesive whole. I'll take this into account.

> > format()
> >     This shall return ``stat.S_IFMT(self.st_mode)``.
>
> Is this really necessary?
> AFAICT, S_IFMT is only useful as a helper for S_ISREG/etc. I don't
> see any added value in exposing it.

I was guided (maybe too much?) by a desire to have all the functionality
of the stat module available without importing stat. But you convinced
me it is not needed.

> > Added functions in ``os.path``
> > ------------------------------
> >
> > is_dir(f)
> >     This shall be an alias for the existing isdir(f).
>
> Why?
That depends on the open questions "how important is it that new stdlib
code be completely PEP-8 compliant?" and "how important is it that the
same concepts in os.path have the same names as they will have on the
stat_result object?". I'm very keen on feedback on the latter.

> > is_character_device(f)
> >     This shall return ``os.stat(f).is_character_device()``, or ``False`` if
> >     ``f`` does not exist.
> >
> > is_block_device(f)
> >     This shall return ``os.stat(f).is_block_device()``, or ``False`` if
> >     ``f`` does not exist.
> >
> > is_file()
> >     This shall be an alias for the existing isfile(f).
> >
> > is_fifo()
> >     This shall return ``os.stat(f).is_fifo()``, or ``False`` if
> >     ``f`` does not exist.
> >
> > is_symbolic_link()
> >     This shall return ``os.stat(f).is_symbolic_link()``, or ``False`` if
> >     ``f`` does not exist.
> >
> > is_socket()
> >     This shall return ``os.stat(f).is_socket()``, or ``False`` if
> >     ``f`` does not exist.
>
> Same remark as above, I'm not convinced that all those special cases
> are necessary.
>
> > Rationale
> > =========
> >
> > The PEP is strongly motivated by a desire for symmetry between functions in
> > ``os.path`` and methods on ``stat_result``.
> >
> > Therefore, for each predicate function in ``os.path`` that is essentially
> > just an interrogation of ``os.*stat()``, given an existing path, the
> > similarly-named predicate method on ``stat_result`` should have the exact
> > same semantics.
> >
> > This definition does not cover the case where the path being interrogated
> > does not exist.  In those cases, predicate functions in ``os.path``, such
> > as ``os.path.isfile()``, will return ``False``, whereas ``os.*stat()`` will
> > raise FileNotFoundError even before any ``stat_result`` is returned that
> > could have been interrogated.
> > This renders considerations of how the proposed new predicates on
> > ``stat_result`` could have been symmetrical with functions in ``os.path``,
> > if their ``stat_result`` had existed, moot, and this PEP does not propose
> > doing anything about the situation (but see `Open Issues`_ below).
> >
> > Secondly, this definition refers to 'similarly-named' predicates instead of
> > 'identically-named' predicates, because the names in ``os.path`` pre-date
> > PEP 8 [#PEP-8]_, and are not compliant with it.  This PEP takes the
> > position that it is better that the new predicate methods on
> > ``stat_result`` be named in compliance with PEP 8 [#PEP-8]_ (i.e.
> > ``is_file()``), than that they be precisely identical to the names in
> > ``os.path`` (i.e. ``isfile()``).  Note also that PEP 428 [#PEP-428]_ also
> > specifies PEP-8 compliant names such as ``is_file()`` for the exact same
> > concepts, and if PEP 428 [#PEP-428]_ should be accepted, the issue would be
> > even more pertinent.
> >
> > Lastly, this PEP takes the notion of symmetry as far as adding methods and
> > aliases to the existing ``os.path`` in order to be symmetrical with the
> > added behaviour on ``stat_result``.  But the author is least strongly
> > convinced of this latter point, and may be persuaded to abandon it.
>
> I'm not convinced either.
>
> > Backwards Compatibility
> > =======================
> >
> > This PEP neither removes current behavior of ``stat_result``, nor changes
> > the semantics of any current behavior.  Likewise, it adds functions and
> > aliases for functions to ``os.path``, but does not remove or change any
> > existing ones.
> >
> > Therefore, this PEP should not cause any backwards incompatibilities,
> > except in the rare and esoteric cases where code is dependent on the
> > *nonexistence* of the proposed new names.  It is not deemed important to
> > remain compatible with code that mistakenly holds the Python Standard
> > Library to be closed for new additions.
> You just want to make sure that your stat result is compatible with
> the current implementation (tuple-struct, supporting indexing).
>
> > Open Issues
> > ===========
> >
> > Whether it is more desirable for the proposed added methods' names to
> > follow PEP 8 [#PEP-8]_ (i.e. ``is_file()`` etc.), or to mirror the
> > pre-existing names in ``os.path`` (i.e. ``isfile()`` etc.) is still open
> > for debate.
> >
> > The existing attributes on ``stat_result`` follow the pattern ``st_*`` in
> > conformance to the relevant POSIX names for the fields of the C-level
> > ``stat`` structure.  The new names for the behaviours proposed here do not
> > contain such an ``st_`` prefix (nor could they, for that would suggest a
> > conformance with ``stat`` structure names which do not exist in POSIX).
> > But the resulting asymmetry of names is annoying.  Should aliases for the
> > existing ``st_*`` names be added that omit the ``st_`` prefix?
>
> If we offer a higher level abstraction, then I think the 'st_' prefix
> should be dropped.
>
> > As it stands, this PEP does not address the asymmetry between the existing
> > ``os.path.isfile()`` etc. functions and the new proposed mechanism in the
> > case where the underlying file does not exist.  There is a way to handle
> > this, though: an optional flag could be added to ``os.*stat()`` that would
> > return a null object implementation of ``stat_result`` whenever the file
> > does not exist.  Then that null object could return ``False`` to
> > ``is_file()`` etc.  That means that the following code would behave
> > identically, even when the file ``f`` does not exist::
> >
> >     if os.path.isfile(f) or os.path.isdir(f):
> >         # do something
> >
> >     st = os.stat(f, null_if_missing=True)
> >     if st.is_file() or st.is_dir():
> >         # do something
> >
> > Would this be a useful mechanism?
>
> I don't like the idea of adding an optional argument: stat()'ing a
> non-existing file will raise an exception, that's it.
> Also, I don't like the idea of a null object.

-- 
Pieter Nagel

From pieter at nagel.co.za Mon May 6 15:04:59 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 15:04:59 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <51879648.8010001@fastmail.us>
References: <1367829004.2868.619.camel@basilisk> <51879648.8010001@fastmail.us>
Message-ID: <1367845499.2868.657.camel@basilisk>

On Mon, 2013-05-06 at 07:38 -0400, Random832 wrote:

> On 05/06/2013 04:30 AM, Pieter Nagel wrote:
> > is_symbolic_link()
> >     This shall return ``os.stat(f).is_symbolic_link()``, or ``False`` if
> >     ``f`` does not exist.
>
> lstat, surely.

Interesting point. If is_symbolic_link() is added to stat_result, then
what file it pertains to depends purely on whether one called os.stat()
or os.lstat() to start with. Once the stat_result is returned, the inode
it refers to is fixed.

Currently, all existing isxxx() functions on os.path are implemented in
terms of os.stat(), so they follow symlinks. That is why I reflexively
specified the new is_symbolic_link() the same way. But I agree that
basing it on os.lstat() will be more meaningful.

But will the fact that some isxxx() functions in os.path are based on
os.stat(), and others on os.lstat(), be a sign of a fundamental smell?
Does this argue against mirroring all of the functionality on
stat_result to os.path?
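For reference, the pattern behind the existing os.path predicates looks
roughly like this; it is a simplified sketch of the posixpath/genericpath
approach, not the literal stdlib source:

```python
import os
import stat


def isfile(path):
    """Follows symlinks, like os.path.isfile() (built on os.stat())."""
    try:
        st = os.stat(path)
    except OSError:
        # A missing (or inaccessible) path is simply "not a file".
        return False
    return stat.S_ISREG(st.st_mode)


def islink(path):
    """Does not follow symlinks: the one predicate built on os.lstat()."""
    try:
        st = os.lstat(path)
    except OSError:
        return False
    return stat.S_ISLNK(st.st_mode)


print(isfile("/no/such/path"))  # False: the OSError is swallowed
```

Note that islink() is already the odd one out in using os.lstat(), so a
mixed stat/lstat basis is not without precedent in os.path.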
-- 
Pieter Nagel

From pieter at nagel.co.za Mon May 6 15:11:37 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 15:11:37 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: 
References: <1367829004.2868.619.camel@basilisk>
Message-ID: <1367845897.2868.662.camel@basilisk>

On Mon, 2013-05-06 at 14:50 +0200, Christian Heimes wrote:

> Am 06.05.2013 10:30, schrieb Pieter Nagel:
> > Abstract
> > ========
> >
> > This PEP proposes extending the result of ``os.stat()``, ``os.fstat()`` and
> > ``os.lstat()`` calls with added methods such as ``is_file()``.  These added
> > methods will obviate the need to use the ``stat`` module to interpret the
> > result of these calls.
>
> A side note:
>
> The stat module has a flaw. It's a pure Python module with hard-coded
> constants. So far this has worked on all known platforms because the
> platforms are all using the same constants with equal meaning. AFAIK the
> POSIX specs only specify the names of the constants but not any values.

In theory my PEP provides a mechanism for abstracting away the precise
bit values of the stat flags.

The only wrinkle is that I currently specify the new is_xxx() methods'
behaviour in terms of the current hard-coded constants in stat.

To decouple my PEP from the precise values of the ST_ constants, I'll
have to find a way to formulate the behaviour of is_file() etc. on
stat_result, and make it clear that it is most likely the same as
stat.S_ISREG() on current platforms, without tying myself to the current
values of stat.S_ISREG. I'll think on it.
-- 
Pieter Nagel

From pieter at nagel.co.za Mon May 6 15:24:46 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 15:24:46 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: 
References: <1367829004.2868.619.camel@basilisk>
Message-ID: <1367846686.2868.668.camel@basilisk>

On Mon, 2013-05-06 at 11:31 +0200, Charles-François Natali wrote:

> If we go down this road, we can also add Solaris door files, and
> another bazillion file types (see the previous thread).
> Something like is_other() or is_special() is enough. Code needing more
> specific information about the file type knows how to use S_XXX, and
> has to be non-portable anyway.

Having looked at the discussion at http://bugs.python.org/issue11016, I
am even more convinced that leaving such concepts off stat_result (if
and when Python supports them), just because code could rather use S_XXX
constants in the stat module, is a flawed idea.

To summarise the thread: POSIX guarantees the *names* of the S_XXX
constants, not their numeric values. But the stat module propagates the
misapprehension that the values are standardised.

It seems that what is needed is precisely to abstract the concepts away,
and adding them to stat_result is one way to have polymorphic behaviour
that depends on the platform.

If the concern is that stat_result could grow too large due to too many
different platform-specific concepts, then notions such as splitting
solaris_stat_result from nt_stat_result etc. seem to be the way to go.
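The splitting idea can be sketched in a few lines. Everything here is
hypothetical illustration: base_stat_result, solaris_stat_result,
is_door() and the door constant are names invented in this thread, not
stdlib API, and the numeric value used for the door type is made up
(the real one is platform-defined):

```python
import stat


class base_stat_result:
    """Minimal stand-in for an extended stat_result (illustration only)."""

    def __init__(self, st_mode):
        self.st_mode = st_mode

    def is_file(self):
        return bool(stat.S_ISREG(self.st_mode))

    def is_dir(self):
        return bool(stat.S_ISDIR(self.st_mode))


class solaris_stat_result(base_stat_result):
    """Platform subclass: only Solaris results ever answer is_door()."""

    _S_IFDOOR = 0xD000  # hypothetical value, platform-defined in reality

    def is_door(self):
        return stat.S_IFMT(self.st_mode) == self._S_IFDOOR


# os.stat() would construct the subclass matching the running platform,
# so is_door() simply would not exist on, say, an nt_stat_result.
st = solaris_stat_result(0xD000 | 0o755)
print(st.is_door())  # True
print(st.is_file())  # False
```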
-- 
Pieter Nagel

From christian at python.org Mon May 6 15:26:33 2013
From: christian at python.org (Christian Heimes)
Date: Mon, 06 May 2013 15:26:33 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367845897.2868.662.camel@basilisk>
References: <1367829004.2868.619.camel@basilisk> <1367845897.2868.662.camel@basilisk>
Message-ID: <5187AF89.8020802@python.org>

Am 06.05.2013 15:11, schrieb Pieter Nagel:

> In theory my PEP provides a mechanism for abstracting away the precise
> bit values of the stat flags.
>
> The only wrinkle is that I currently specify the new is_xxx() methods'
> behaviour in terms of the current hard-coded constants in stat.
>
> To decouple my PEP from the precise values of the ST_ constants, I'll
> have to find a way to formulate the behaviour of is_file() etc. on
> stat_result, and make it clear that it is most likely the same as
> stat.S_ISREG() on current platforms, without tying myself to the current
> values of stat.S_ISREG.

How is your code going to work? On POSIX you *have* to rely on the
functions in the stat module. They are the only and authoritative way to
interpret the meaning of st_mode. Once I have checked in my C
implementation of the stat module, it will provide a dependable
interface to the POSIX interface in stat.h.

Please don't come up with your own way of interpreting st_mode. Your
high-level API should only use the low-level
stat.S_ISxxx(stat_result.st_mode) API and forget about the S_IFxxx
integer constants.
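Christian's advice in concrete terms: go through the stat module's
predicate functions, and avoid comparing st_mode against the numeric
values behind the S_IFxxx constants, since POSIX standardises only the
constants' names:

```python
import os
import stat

st = os.stat(os.curdir)  # stat the current directory

# Portable: let the stat module interpret st_mode.
print(stat.S_ISDIR(st.st_mode))  # True

# Fragile: depends on the numeric value behind stat.S_IFDIR, which
# POSIX leaves unspecified (only the name S_IFDIR is standard).  It
# happens to agree on the platform at hand, but nothing guarantees it
# elsewhere.
print(stat.S_IFMT(st.st_mode) == stat.S_IFDIR)
```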
Christian

From christian at python.org Mon May 6 15:33:30 2013
From: christian at python.org (Christian Heimes)
Date: Mon, 06 May 2013 15:33:30 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367846686.2868.668.camel@basilisk>
References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk>
Message-ID: 

Am 06.05.2013 15:24, schrieb Pieter Nagel:

> To summarise the thread: POSIX guarantees the *names* of the S_XXX
> constants, not their numeric values. But the stat module propagates the
> misapprehension that the values are standardised.

For POSIX you can safely ignore the issue as long as you just use the
S_ISxxx() functions from the stat module. I'll take care of the rest.
Promise! :)

From pieter at nagel.co.za Mon May 6 15:46:53 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 15:46:53 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: 
References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk>
Message-ID: <1367848013.2868.671.camel@basilisk>

On Mon, 2013-05-06 at 15:33 +0200, Christian Heimes wrote:

> For POSIX you can safely ignore the issue as long as you just use the
> S_ISxxx() functions from the stat module. I'll take care of the rest.

At this point I'm not sure whether it will be best to implement this
change to stat_result in C or in Python, so I'm not sure if I can use
your new stat code.

I noted with interest that the thread you linked to advocated that the
stat module should be deprecated, and in a way that is what my PEP here
does - at least in the sense of providing an alternative.
-- 
Pieter Nagel

From random832 at fastmail.us Mon May 6 16:09:24 2013
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Mon, 06 May 2013 10:09:24 -0400
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367845499.2868.657.camel@basilisk>
References: <1367829004.2868.619.camel@basilisk> <51879648.8010001@fastmail.us> <1367845499.2868.657.camel@basilisk>
Message-ID: <1367849364.3783.140661227139565.77C9FD20@webmail.messagingengine.com>

On Mon, May 6, 2013, at 9:04, Pieter Nagel wrote:

> Currently, all existing isxxx() functions on os.path are implemented in
> terms of os.stat(), they follow symlinks.

That's not true of islink, either. I'm actually confused that you
defined is_file and is_dir as aliases to the existing functions, but not
is_symbolic_link.

From pieter at nagel.co.za Mon May 6 16:21:51 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 16:21:51 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <5187AF89.8020802@python.org>
References: <1367829004.2868.619.camel@basilisk> <1367845897.2868.662.camel@basilisk> <5187AF89.8020802@python.org>
Message-ID: <1367850111.2868.679.camel@basilisk>

On Mon, 2013-05-06 at 15:26 +0200, Christian Heimes wrote:

> How is your code going to work?

At the moment both os.stat() and stat_result are implemented in C, and I
have to keep open the option that the work I do may need to be done in C
as well. I've written C extensions to Python before that reference types
defined in plain .py files, but I'm not sure if that's idiomatic or
permitted in CPython, so I have to keep my options open.

> On POSIX you *have* to rely on the
> functions in the stat module.
> They are the only and authoritative way to
> interpret the meaning of st_mode.

Technically, the definitions of the S_IS* macros in the C header files
are the only authoritative way to interpret st_mode; the Python stat
module is not a normative part of POSIX at all ;-)

If I'm forced to implement in C, I'll use the POSIX libc macros. If I
implement in Python, I'll use the existing stat module, and benefit from
your patch when it is accepted.

-- 
Pieter Nagel

From pieter at nagel.co.za Mon May 6 16:25:49 2013
From: pieter at nagel.co.za (Pieter Nagel)
Date: Mon, 06 May 2013 16:25:49 +0200
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367849364.3783.140661227139565.77C9FD20@webmail.messagingengine.com>
References: <1367829004.2868.619.camel@basilisk> <51879648.8010001@fastmail.us> <1367845499.2868.657.camel@basilisk> <1367849364.3783.140661227139565.77C9FD20@webmail.messagingengine.com>
Message-ID: <1367850349.2868.681.camel@basilisk>

On Mon, 2013-05-06 at 10:09 -0400, random832 at fastmail.us wrote:

> That's not true of islink, either. I'm actually confused that you
> defined is_file and is_dir as aliases to the existing functions, but not
> is_symbolic_link.

That was an oversight; I'll update the PEP accordingly in the next
round.

-- 
Pieter Nagel

From python at mrabarnett.plus.com Mon May 6 18:14:16 2013
From: python at mrabarnett.plus.com (MRAB)
Date: Mon, 06 May 2013 17:14:16 +0100
Subject: [Python-ideas] PEP: Extended stat_result (First Draft)
In-Reply-To: <1367829004.2868.619.camel@basilisk>
References: <1367829004.2868.619.camel@basilisk>
Message-ID: <5187D6D8.5040508@mrabarnett.plus.com>

On 06/05/2013 09:30, Pieter Nagel wrote:
[snip]

> Specification
> =============
>
>
> Added methods on ``stat_result``
> --------------------------------
>
> is_dir()
>     Equivalent to ``bool(stat.S_ISDIR(self.st_mode))``.
>
> is_character_device()
>     Equivalent to ``bool(stat.S_ISCHR(self.st_mode))``.
> > is_block_device() > Equivalent to ``bool(stat.S_ISBLK(self.st_mode))``. > > is_file() > Equivalent to ``bool(stat.S_ISREG(self.st_mode))``. > > is_fifo() > Equivalent to ``bool(stat.S_ISFIFO(self.st_mode))``. > > is_symbolic_link() > Equivalent to ``bool(stat.S_ISLNK(self.st_mode))``. > > is_socket() > Equivalent to ``bool(stat.S_ISSOCK(self.st_mode))``. > > same_stat(other) > Equivalent to ``os.path.samestat(self, other)``. > > file_mode() > This shall return ``stat.filemode(stat.S_IMODE(self.st_mode))``, i.e. a > string of the form "-rwxrwxrwx". > > permission_bits() > This shall return ``stat.S_IMODE(self.st_mode)``. > > format() > This shall return ``stat.S_IFMT(self.st_mode)``. > Some of the names seem too long for me. > > Added functions in ``os.path`` > ------------------------------ > > is_dir(f) > This shall be an alias for the existing isdir(f). > Do we really need 2 names for the same thing? No. > is_character_device(f) > This shall return ``os.stat(f).is_character_device()``, or ``False`` if > ``f`` does not exist. > > is_block_device(f) > This shall return ``os.stat(f).is_block_device()``, or ``False`` if > ``f`` does not exist. > > is_file() > This shall be an alias for the existing isfile(f). > > is_fifo() > This shall return ``os.stat(f).is_fifo()``, or ``False`` if > ``f`` does not exist. > > is_symbolic_link() > This shall return ``os.stat(f).is_symbolic_link()``, or ``False`` if > ``f`` does not exist. > > is_socket() > This shall return ``os.stat(f).is_socket()``, or ``False`` if > ``f`` does not exist. > Some of the names seem too long for me. > > Rationale > ========= > > The PEP is strongly motivated by a desire for symmetry between functions in > ``os.path`` and methods on ``stat_result``. > > Therefore, for each predicate function in ``os.path`` that is essentially > just an interrogation of ``os.*stat()``, given an existing path, the > similarly-named predicate method on ``stat_result`` should have the exact > same semantics.
> > This definition does not cover the case where the path being interrogated > does not exist. In those cases, predicate functions in ``os.path``, such > as ``os.path.isfile()``, will return ``False``, whereas ``os.*stat()`` will > raise FileNotFoundError even before any ``stat_result`` is returned that > could have been interrogated. This renders considerations of how the > proposed new predicates on ``stat_result`` could have been symmetrical with > functions in ``os.path``, if their ``stat_result`` had existed, moot, and > this PEP does not propose doing anything about the situation (but see `Open > Issues`_ below). > > Secondly, this definition refers to "similarly-named" predicates instead of > "identically-named" predicates, because the names in ``os.path`` pre-date > PEP 8 [#PEP-8]_, and are not compliant with it. This PEP takes the > position that it is better that the new predicate methods on > ``stat_result`` be named in compliance with PEP 8 [#PEP-8]_ (i.e. > ``is_file()``), than that they be precisely identical to the names in > ``os.path`` (i.e. ``isfile()``). Note also that PEP 428 [#PEP-428]_ also > specifies PEP-8 compliant names such as ``is_file()`` for the exact same > concepts, and if PEP 428 [#PEP-428]_ should be accepted, the issue would be > even more pertinent. > > Lastly, this PEP takes the notion of symmetry as far as adding methods and > aliases to the existing ``os.path`` in order to be symmetrical with the > added behaviour on ``stat_result``. But the author is least strongly > convinced of this latter point, and may be convinced to abandon it. > > > Backwards Compatibility > ======================= > > This PEP neither removes current behavior of ``stat_result``, nor changes > the semantics of any current behavior. Likewise, it adds functions and > aliases for functions to ``os.path``, but does not remove or change any > existing ones.
> > Therefore, this PEP should not cause any backwards incompatibilities, > except in the rare and esoteric cases where code is dependent on the > *nonexistence* of the proposed new names. It is not deemed important to > remain compatible with code that mistakenly holds the Python Standard > Library to be closed for new additions. > > > Open Issues > =========== > > Whether it is more desirable for the proposed added methods' names to > follow PEP 8 [#PEP-8]_ (i.e. ``is_file()`` etc.), or to mirror the > pre-existing names in ``os.path`` (i.e. ``isfile()`` etc.) is still open > for debate. > I think it is best to follow PEP 8, except when there's an established pattern, as there is for 'isfile', etc. > The existing attributes on ``stat_result`` follow the pattern ``st_*`` in > conformance to the relevant POSIX names for the fields of the C-level > ``stat`` structure. The new names for the behaviours proposed here do not > contain such an ``st_`` prefix (nor could they, for that would suggest a > conformance with ``stat`` structure names which do not exist in POSIX). > But the resulting asymmetry of names is annoying. Should aliases for the > existing ``st_*`` names be added that omit the ``st_`` prefix? > Do we really need 2 names for the same thing? No. > This PEP does not address a higher-level mechanism for exposing the > owner/group/other read/write/execute permissions. Is there a need for > this? > > This PEP does not address a higher-level mechanism for exposing the > underlying ``st_flags`` field. Is there a need for this? > > This PEP proposes aliases and methods to make ``os.path`` conform more to > the added ``stat_result`` methods proposed here. But is the impedance > mismatch between ``isfile`` and ``is_file`` really that much of an issue to > warrant this? > > As it stands, this PEP does not address the asymmetry between the existing > ``os.path.isfile()`` etc.
functions and the new proposed mechanism in the > case where the underlying file does not exist. There is a way to handle > this, though: an optional flag could be added to ``os.*stat()`` that would > return a null object implementation of ``stat_result`` whenever the file > does not exist. Then that null object could return ``False`` to > ``is_file()`` etc. That means that the following code would behave > identically, even when the file ``f`` does not exist:: > > if os.path.isfile(f) or os.path.isdir(f): > # do something > > st = os.stat(f, null_if_missing=True) > if st.is_file() or st.is_dir(): > # do something > > Would this be a useful mechanism? > I can see that it could be useful, but also possibly confusing (how can you get the 'stat' of something that doesn't exist?). From stefan at drees.name Mon May 6 18:25:01 2013 From: stefan at drees.name (Stefan Drees) Date: Mon, 06 May 2013 18:25:01 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <5187D6D8.5040508@mrabarnett.plus.com> References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> Message-ID: <5187D95D.3010501@drees.name> On 06.05.13 18:14, MRAB wrote: > On 06/05/2013 09:30, Pieter Nagel wrote: > [snip] >> Specification >> ============= >> >> >> Added methods on ``stat_result`` >> -------------------------------- >> >> is_dir() >> Equivalent to ``bool(stat.S_ISDIR(self.st_mode))``. >> >> is_character_device() >> Equivalent to ``bool(stat.S_ISCHR(self.st_mode))``. >> is_char_dev() >> is_block_device() >> Equivalent to ``bool(stat.S_ISBLK(self.st_mode))``. >> is_block_dev() >> is_file() >> Equivalent to ``bool(stat.S_ISREG(self.st_mode))``. >> >> is_fifo() >> Equivalent to ``bool(stat.S_ISFIFO(self.st_mode))``. >> >> is_symbolic_link() >> Equivalent to ``bool(stat.S_ISLNK(self.st_mode))``. >> is_sym_link() >> is_socket() >> Equivalent to ``bool(stat.S_ISSOCK(self.st_mode))``.
>> > same_stat(other) >> Equivalent to ``os.path.samestat(self, other)``. >> >> file_mode() >> This shall return ``stat.filemode(stat.S_IMODE(self.st_mode))``, >> i.e. a >> string of the form "-rwxrwxrwx". >> >> permission_bits() >> This shall return ``stat.S_IMODE(self.st_mode)``. >> perm_bits() or maybe better file_perm() ? >> format() >> This shall return ``stat.S_IFMT(self.st_mode)``. >> > Some of the names seem too long for me. please cf. shortening suggestions inline above. >> >> Added functions in ``os.path`` >> ------------------------------ >> >> is_dir(f) >> This shall be an alias for the existing isdir(f). >> > Do we really need 2 names for the same thing? No. > >> is_character_device(f) >> This shall return ``os.stat(f).is_character_device()``, or >> ``False`` if >> ``f`` does not exist. >> is_char_dev() >> is_block_device(f) >> This shall return ``os.stat(f).is_block_device()``, or ``False`` if >> ``f`` does not exist. >> is_block_dev() >> is_file() >> This shall be an alias for the existing isfile(f). >> >> is_fifo() >> This shall return ``os.stat(f).is_fifo()``, or ``False`` if >> ``f`` does not exist. >> >> is_symbolic_link() >> This shall return ``os.stat(f).is_symbolic_link()``, or ``False`` if >> ``f`` does not exist. >> is_sym_link() >> is_socket() >> This shall return ``os.stat(f).is_socket()``, or ``False`` if >> ``f`` does not exist. >> > Some of the names seem too long for me. ditto. > ... All the best, Stefan.
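All of the predicate spellings being debated above are thin wrappers over the existing ``stat`` module, so the proposed behaviour can be sketched in a few lines of pure Python. The wrapper class below is illustrative only (the PEP would put these methods directly on ``os.stat_result``, not on a separate class):

```python
import os
import stat

class ExtendedStatResult:
    """Illustrative wrapper; the PEP adds these methods to os.stat_result itself."""

    def __init__(self, st):
        self._st = st

    def is_dir(self):
        return bool(stat.S_ISDIR(self._st.st_mode))

    def is_file(self):
        return bool(stat.S_ISREG(self._st.st_mode))

    def is_symbolic_link(self):
        return bool(stat.S_ISLNK(self._st.st_mode))

    def is_fifo(self):
        return bool(stat.S_ISFIFO(self._st.st_mode))

    def permission_bits(self):
        # Just the rwx bits, with the file-type bits masked off.
        return stat.S_IMODE(self._st.st_mode)

    def format(self):
        # The file-type bits, with the permission bits masked off.
        return stat.S_IFMT(self._st.st_mode)

st = ExtendedStatResult(os.stat("."))
print(st.is_dir())   # True: "." is a directory
print(st.is_file())  # False
```

Whatever names win the bikeshed, the bodies stay one-liners like these.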
From joshua.landau.ws at gmail.com Mon May 6 18:41:35 2013 From: joshua.landau.ws at gmail.com (Joshua Landau) Date: Mon, 6 May 2013 17:41:35 +0100 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <5187D95D.3010501@drees.name> References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> <5187D95D.3010501@drees.name> Message-ID: On 6 May 2013 17:25, Stefan Drees wrote: > is_sym_link() is_symlink() Glad to help, Joshua Landau From phd at phdru.name Mon May 6 18:55:03 2013 From: phd at phdru.name (Oleg Broytman) Date: Mon, 6 May 2013 20:55:03 +0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367829004.2868.619.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> Message-ID: <20130506165503.GA17695@iskra.aviel.ru> Hi! On Mon, May 06, 2013 at 10:30:04AM +0200, Pieter Nagel wrote: > Title: Extended stat_result > Author: Pieter Nagel Good job! > is_character_device() > is_block_device() > is_symbolic_link() Long names. Make them (and others if possible) shorter. is_chardev(), is_blockdev() or even is_blkdev(), is_symlink(). After all, it is called is_dir(), not is_directory(), right? ;-) > file_mode() > This shall return ``stat.filemode( Shouldn't it be called filemode()? I see you are fond of underscores but Python style guides discourage using them AFAIK. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN.
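For reference in the filemode() naming question: the rendering that the proposed ``file_mode()`` would wrap already exists as ``stat.filemode()`` (available since Python 3.3). One detail worth noting: ``stat.filemode()`` takes the full ``st_mode``, and passing it only ``S_IMODE(...)`` as the draft spec appears to specify would drop the leading file-type character:

```python
import os
import stat

st = os.stat(".")

# The full st_mode renders as an ls(1)-style string, type char included.
print(stat.filemode(st.st_mode))                 # e.g. 'drwxr-xr-x'

# The permission_bits() part of the proposal.
print(oct(stat.S_IMODE(st.st_mode)))             # e.g. '0o755'

# The draft's stat.filemode(stat.S_IMODE(...)) loses the 'd'/'l' prefix,
# since the file-type bits have been masked off.
print(stat.filemode(stat.S_IMODE(st.st_mode)))   # e.g. '-rwxr-xr-x'
```

So ``file_mode()`` may want to pass ``st_mode`` unmasked, whatever the method ends up being called.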
From phd at phdru.name Mon May 6 18:58:01 2013 From: phd at phdru.name (Oleg Broytman) Date: Mon, 6 May 2013 20:58:01 +0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367843367.2868.631.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> Message-ID: <20130506165801.GB17695@iskra.aviel.ru> On Mon, May 06, 2013 at 02:29:27PM +0200, Pieter Nagel wrote: > On Mon, 2013-05-06 at 11:30 +0200, Antoine Pitrou wrote: > > > I don't really understand the point of a null object here, since > > os.stat() will raise when called on a non-existent path. > > The point is that I am proposing to (optionally) change the behaviour of > os.stat(), so it will *not* raise when the path is non-existent, but > instead will return a null object. > > That can be achieved by adding a keyword argument flag like, for > example, null_if_missing, to os.stat() that will select the new > behaviour. The default will be the current behaviour. > > This null object will implement exists(), is_file() and the like so that > it returns False exactly as os.path.exists(), os.path.isfile() etc. > would have done, if one had used them instead of os.stat() to > interrogate the properties of the file. I don't like the idea of changing os.stat() behaviour. If you want to have a different type of return value use a different function. Call it os.stat_ex() or something. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN.
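The null-object idea under discussion is easy to sketch. Every name below (``stat_ex``, the two classes) is hypothetical, not an existing ``os`` API; a ``null_if_missing`` keyword on ``os.stat()`` would behave the same way internally as the separate function Oleg suggests:

```python
import os
import stat

class _StatView:
    """Minimal stand-in for the extended stat_result (illustrative only)."""
    def __init__(self, st):
        self._st = st
    def exists(self):
        return True
    def is_file(self):
        return bool(stat.S_ISREG(self._st.st_mode))
    def is_dir(self):
        return bool(stat.S_ISDIR(self._st.st_mode))

class _NullStat:
    """Null object: every predicate answers False, just as
    os.path.exists()/isfile()/isdir() do for a missing path."""
    def exists(self):
        return False
    def is_file(self):
        return False
    def is_dir(self):
        return False

def stat_ex(path):
    # Oleg's suggested spelling: a separate function rather than a
    # flag on os.stat() itself.
    try:
        return _StatView(os.stat(path))
    except FileNotFoundError:
        return _NullStat()

st = stat_ex("no/such/path")
print(st.exists(), st.is_file())  # False False -- no exception raised
```

Callers can then interrogate the result uniformly, with the missing-file case folded into ordinary ``False`` answers.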
From pieter at nagel.co.za Mon May 6 19:31:08 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 19:31:08 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <5187D6D8.5040508@mrabarnett.plus.com> References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> Message-ID: <1367861468.2868.701.camel@basilisk> On Mon, 2013-05-06 at 17:14 +0100, MRAB wrote: > Some of the names seem too long for me. Neither my nor your naming preference is relevant here, although I personally prefer long names over abbreviations - the time spent writing code is infinitesimal compared to the time reading, debugging, extending and maintaining it. But at issue here is which names fit with Guido's vision for the stdlib. PEP 8 says "All identifiers in the Python standard library... SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren't English)". This is a bit ambiguous, but I understand the parentheses to refer to a practice that Guido would like to see stopped, by using English words instead. > Do we really need 2 names for the same thing? No. So you said twice, but I cannot do much with any feedback if it is expressed as a mere assertion. I gave my rationale for doing so in the PEP. If you disagree, please give a rationale of your own. -- Pieter Nagel From pieter at nagel.co.za Mon May 6 19:33:13 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 19:33:13 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <20130506165503.GA17695@iskra.aviel.ru> References: <1367829004.2868.619.camel@basilisk> <20130506165503.GA17695@iskra.aviel.ru> Message-ID: <1367861593.2868.702.camel@basilisk> On Mon, 2013-05-06 at 20:55 +0400, Oleg Broytman wrote: > Shouldn't it be called filemode()? I see you are fond of underscores > but Python style guides discourage using them AFAIK.
Actually, PEP 8 says "Function names should be lowercase, with words separated by underscores as necessary to improve readability." -- Pieter Nagel From pieter at nagel.co.za Mon May 6 19:49:44 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 19:49:44 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <20130506165801.GB17695@iskra.aviel.ru> References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> Message-ID: <1367862584.2868.714.camel@basilisk> On Mon, 2013-05-06 at 20:58 +0400, Oleg Broytman wrote: > I don't like the idea of changing os.stat() behaviour. I'm actually not proposing changing os.stat() behaviour, not for existing code. I'm proposing new optional behaviour that code could request. There is precedent for functions not raising exceptions, depending on the way they're called. dict.pop will raise KeyError, unless a default is given. str.encode can raise UnicodeErrors, depending on the value of the 'errors' parameter. In that light, a function that might or might not raise FileNotFoundError depending on how it was called is not that weird. The motivation is to make "performant" code that calls stat() once look more similar to naive code that potentially calls it many times. And if naive code can say "os.path.isfile(f)" and get False even when f does not exist, then performant code should be able to do something similar by just statting and then interrogating the results, without adding exception handling. > If you want to have a different type of return value use a different > function. Call it os.stat_ex() or something. This is more feasible nowadays, now that there's the pattern of specifying file descriptors as path, and passing follow_symlinks. That means I won't need to do os.fstat_ex() and os.lstat_ex() as well.
-- Pieter Nagel From phd at phdru.name Mon May 6 19:56:17 2013 From: phd at phdru.name (Oleg Broytman) Date: Mon, 6 May 2013 21:56:17 +0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367861593.2868.702.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <20130506165503.GA17695@iskra.aviel.ru> <1367861593.2868.702.camel@basilisk> Message-ID: <20130506175617.GA20261@iskra.aviel.ru> On Mon, May 06, 2013 at 07:33:13PM +0200, Pieter Nagel wrote: > On Mon, 2013-05-06 at 20:55 +0400, Oleg Broytman wrote: > > > Shouldn't it be called filemode()? I see you are fond of underscores > > but Python style guides discourage using them AFAIK. > > Actually, PEP 8 says "Function names should be lowercase, with words > separated by underscores as necessary to improve readability." Yes, but stdlib doesn't follow this style. See, for example, built-in functions -- http://docs.python.org/library/functions.html : staticmethod() isinstance() basestring() execfile() issubclass() bytearray() frozenset() classmethod() getattr() hasattr() memoryview() delattr() setattr() I'm sure this style is codified somewhere, don't know where. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From g.brandl at gmx.net Mon May 6 20:08:05 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 06 May 2013 20:08:05 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367861468.2868.701.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> <1367861468.2868.701.camel@basilisk> Message-ID: Am 06.05.2013 19:31, schrieb Pieter Nagel: > On Mon, 2013-05-06 at 17:14 +0100, MRAB wrote: > >> Some of the names seem too long for me. 
> > Neither my nor your naming preference is relevant here, although I > personally prefer long names over abbreviations - the time spent writing > code is infinitesimal compared to the time reading, debugging, extending > and maintaining it. > > But at issue here is which names fit with Guido's vision for the stdlib. > PEP 8 says "All identifiers in the Python standard library... SHOULD use > English words wherever feasible (in many cases, abbreviations and > technical terms are used which aren't English)". This is a bit > ambiguous, but I understand the parentheses to refer to a practice that > Guido would like to see stopped, by using English words instead. This is one view; the other is that this is not important enough (and the existing names not ugly enough) to introduce redundant names. I already don't like the different names as methods on stat_result, but this is a completely new API so the confusion will be less (but still present). I don't understand why os.path needs to be touched at all. >> Do we really need 2 names for the same thing? No. > > So you said twice, but I cannot do much with any feedback if it is > expressed as a mere assertion. When debating API design, personal preference (or call it "intuition about what would be Pythonic") is a valid piece of feedback. Georg From python at mrabarnett.plus.com Mon May 6 20:11:49 2013 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 06 May 2013 19:11:49 +0100 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367861468.2868.701.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> <1367861468.2868.701.camel@basilisk> Message-ID: <5187F265.6040609@mrabarnett.plus.com> On 06/05/2013 18:31, Pieter Nagel wrote: > On Mon, 2013-05-06 at 17:14 +0100, MRAB wrote: >> Some of the names seem too long for me.
> > Neither my nor your naming preference is relevant here, although I > personally prefer long names over abbreviations - the time spent writing > code is infinitesimal compared to the time reading, debugging, extending > and maintaining it. > > But at issue here is which names fit with Guido's vision for the stdlib. > PEP 8 says "All identifiers in the Python standard library... SHOULD use > English words wherever feasible (in many cases, abbreviations and > technical terms are used which aren't English)". This is a bit > ambiguous, but I understand the parentheses to refer to a practice that > Guido would like to see stopped, by using English words instead. > Python has a lot of abbreviations. The os.path module, for example, has isdir, dirname and splitext. There are also commonly-used names like len, str and int, and method names like lstrip, and reserved words like def and elif. Because of all that, long names like is_character_device seem unPythonic, IMHO. >> Do we really need 2 names for the same thing? No. > > So you said twice, but I cannot do much with any feedback if it is > expressed as a mere assertion. > > I gave my rationale for doing so in the PEP. If you disagree, please > give a rationale of your own. > Such aliases won't add anything to the language. If we wanted underscores in such places as isdir, we should probably have done it in the move from Python 2 to Python 3, along with the renaming of modules to lowercase. Again, IMHO. From g.rodola at gmail.com Mon May 6 20:28:24 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Mon, 6 May 2013 20:28:24 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367829004.2868.619.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> Message-ID: 2013/5/6 Pieter Nagel : > Abstract > ======== > > This PEP proposes extending the result of ``os.stat()``, ``os.fstat()`` and > ``os.lstat()`` calls with added methods such as ``is_file()``.
These added > methods will obviate the need to use the ``stat`` module to interpret the > result of these calls. I'm not convinced this is a good proposal. It duplicates a consistent amount of already existent functionality just for the sake of avoiding importing the stat module and using stat's S_IS* functions which "are ugly" because upper-cased. > Specification > ============= > > > Added methods on ``stat_result`` > -------------------------------- > > is_dir() > Equivalent to ``bool(stat.S_ISDIR(self.st_mode))``. > > is_character_device() > Equivalent to ``bool(stat.S_ISCHR(self.st_mode))``. > > is_block_device() > Equivalent to ``bool(stat.S_ISBLK(self.st_mode))``. > > is_file() > Equivalent to ``bool(stat.S_ISREG(self.st_mode))``. > > is_fifo() > Equivalent to ``bool(stat.S_ISFIFO(self.st_mode))``. > > is_symbolic_link() > Equivalent to ``bool(stat.S_ISLNK(self.st_mode))``. > > is_socket() > Equivalent to ``bool(stat.S_ISSOCK(self.st_mode))``. These look way too long to me. If added I'd prefer the naming convention used so far in the os.path.is* to be kept (therefore isfile, islink, etc.). > same_stat(other) > Equivalent to ``os.path.samestat(self, other)``. Isn't this redundant? Aren't you introducing multiple ways for doing the same thing and going against the Zen? > Added functions in ``os.path`` > ------------------------------ > > is_dir(f) > This shall be an alias for the existing isdir(f). > is_file() > This shall be an alias for the existing isfile(f). I'm just -1 about this. It doesn't add anything and overcrowds an already crowded API. > Added functions in ``os.path`` > ------------------------------ > > is_character_device(f) > This shall return ``os.stat(f).is_character_device()``, or ``False`` if > ``f`` does not exist. > > is_block_device(f) > This shall return ``os.stat(f).is_block_device()``, or ``False`` if > ``f`` does not exist. > > is_fifo() > This shall return ``os.stat(f).is_fifo()``, or ``False`` if > ``f`` does not exist.
> > is_socket() > This shall return ``os.stat(f).is_socket()``, or ``False`` if > ``f`` does not exist. -1 about these too. os.path provides only isfile(), isdir() and islink() because those are the most common and portable file types, and that's fine. Anything else is too specific (also *platform* specific) and does not deserve a new utility function in os.path. --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ From pieter at nagel.co.za Mon May 6 20:31:02 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 20:31:02 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <20130506175617.GA20261@iskra.aviel.ru> References: <1367829004.2868.619.camel@basilisk> <20130506165503.GA17695@iskra.aviel.ru> <1367861593.2868.702.camel@basilisk> <20130506175617.GA20261@iskra.aviel.ru> Message-ID: <1367865062.2868.717.camel@basilisk> On Mon, 2013-05-06 at 21:56 +0400, Oleg Broytman wrote: > Yes, but stdlib doesn't follow this style. See, for example, built-in > functions -- http://docs.python.org/library/functions.html : > > staticmethod() isinstance() basestring() execfile() issubclass() bytearray() > frozenset() classmethod() getattr() hasattr() memoryview() delattr() setattr() > > I'm sure this style is codified somewhere, don't know where. Actually, PEP 8 *is* the coding convention for the stdlib: "This document gives coding conventions for the Python code comprising the standard library in the main Python distribution." 
-- Pieter Nagel From pieter at nagel.co.za Mon May 6 20:44:39 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 20:44:39 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> <1367861468.2868.701.camel@basilisk> Message-ID: <1367865879.2868.726.camel@basilisk> On Mon, 2013-05-06 at 20:08 +0200, Georg Brandl wrote: > I already > don't like the different names as methods on stat_result, but this is a > completely new API so the confusion will be less (but still present). > I don't understand why os.path needs to be touched at all. My understanding is that PEP 8 applies to all new code intended for the stdlib, so that forced me to change the names on the stat_result side; the aliases on the os.path side were an attempt to then heal the divergence that was created - since like you, I actually don't like the fact that there are different names either. The threading module, for example, introduced PEP 8 aliases as early as 2.6 (activeCount vs. active_count and the like). It's not like there's been any kind of concerted effort to do this en-masse, but on the other hand maybe the trend is to do this as and when parts of the stdlib are 'touched', as I do here. I suspect this will only be clarified if this PEP reaches python-dev. -- Pieter Nagel From pieter at nagel.co.za Mon May 6 20:58:04 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 20:58:04 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: References: <1367829004.2868.619.camel@basilisk> Message-ID: <1367866684.2868.739.camel@basilisk> On Mon, 2013-05-06 at 20:28 +0200, Giampaolo Rodola' wrote: > I'm not convinced this is a good proposal. Interesting, you seem to be the first that is against the proposal itself, as opposed to details of the proposal.
> It duplicates a consistent amount of already existent functionality > just for the sake of avoiding importing the stat module and using > stat's S_IS* functions which "are ugly" because upper-cased. Agreed, it duplicates existent functionality. But there is precedent. When os.stat() was originally extended to return a result object instead of a tuple, the tuple behaviour was kept, and since then there has also been two different ways of getting at the same data in the stat() result. The question is whether the new way is enough of a code clarity win to justify itself. "st.is_file()" vs "stat.S_ISREG(st.st_mode)", to my eyes, is. But the primary motivation is not to avoid an import and "ugly" method names; the primary motivation is to make it easier to switch from writing naive code that stat()s a file many times to performant code that does stat() only once. The rest is just a happy side effect. And the *ulterior* motivation is to reduce the need for automagic stat() caching in PEP 428. > -1 about these too. > os.path provides only isfile(), isdir() and islink() because those are > the most common and portable file types, and that's fine. > Anything else is too specific (also *platform* specific) and does not > deserve a new utility function in os.path. The extent to which my additions to stat_result are "mirrored back" to os.path is one of the least certain parts of my proposal. And when PEP 428 is accepted, there'll be *three* places where these methods could live, so I'll have to take that into account. 
-- Pieter Nagel From python at mrabarnett.plus.com Mon May 6 21:12:13 2013 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 06 May 2013 20:12:13 +0100 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367865879.2868.726.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> <1367861468.2868.701.camel@basilisk> <1367865879.2868.726.camel@basilisk> Message-ID: <5188008D.6050304@mrabarnett.plus.com> On 06/05/2013 19:44, Pieter Nagel wrote: > On Mon, 2013-05-06 at 20:08 +0200, Georg Brandl wrote: >> I already >> don't like the different names as methods on stat_result, but this is a >> completely new API so the confusion will be less (but still present). >> I don't understand why os.path needs to be touched at all. > > My understanding is that PEP 8 applies to all new code intended for the > stdlib, so that forced me to change the names on the stat_result side; > the aliases on the os.path side were an attempt to then heal the > divergence that was created - since like you, I actually don't like the > fact that there are different names either. > I suppose it depends on what it means by "new code". If you're adding to an existing module, is consistency more important? > The threading module, for example, introduced PEP 8 aliases as early as > 2.6 (activeCount vs. active_count and the like). It's not like there's > been any kind of concerted effort to do this en-masse, but on the other > hand maybe the trend is to do this as and when parts of the stdlib are > 'touched', as I do here. > > I suspect this will only be clarified if this PEP reaches python-dev.
> From carl at oddbird.net Mon May 6 20:36:32 2013 From: carl at oddbird.net (Carl Meyer) Date: Mon, 06 May 2013 12:36:32 -0600 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367861593.2868.702.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <20130506165503.GA17695@iskra.aviel.ru> <1367861593.2868.702.camel@basilisk> Message-ID: <5187F830.6060807@oddbird.net> On 05/06/2013 11:33 AM, Pieter Nagel wrote: > On Mon, 2013-05-06 at 20:55 +0400, Oleg Broytman wrote: > >> Shouldn't it be called filemode()? I see you are fond of underscores >> but Python style guides discourage using them AFAIK. > > Actually, PEP 8 says "Function names should be lowercase, with words > separated by underscores as necessary to improve readability." I interpret "as necessary to improve readability" to actually discourage underscores: use them only when _necessary_ for readability, else leave them out. It does not say "separate all words with underscores." So I think this is consistent with actual stdlib usage, though obviously there's a lot of interpretative leeway in what is "necessary for readability." Carl From pieter at nagel.co.za Mon May 6 21:28:13 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 21:28:13 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <5188008D.6050304@mrabarnett.plus.com> References: <1367829004.2868.619.camel@basilisk> <5187D6D8.5040508@mrabarnett.plus.com> <1367861468.2868.701.camel@basilisk> <1367865879.2868.726.camel@basilisk> <5188008D.6050304@mrabarnett.plus.com> Message-ID: <1367868493.2868.749.camel@basilisk> On Mon, 2013-05-06 at 20:12 +0100, MRAB wrote: > I suppose it depends on what it means by "new code". If you're adding > to an existing module, is consistency more important? I'm adding code to stat_result, so that's what I have to be consistent with. 
But all of the current names there, st_size, st_mode and so on, derive 1:1 from the POSIX standard for the relevant struct in C. And POSIX does not provide any guidance for *new* names for additional behaviour in unrelated languages that expose the underlying stat struct. Yes, I could name the methods "st_is_file" and the like for "consistency" with the other st_ prefixed names, but that would be just silly. So what else remains in stat_result for me to be consistent with? Nothing. When it comes to names for methods, stat_result is currently a blank slate. Which, to my mind, means that PEP 8 should be followed here. -- Pieter Nagel From random832 at fastmail.us Mon May 6 21:32:40 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Mon, 06 May 2013 15:32:40 -0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367866684.2868.739.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <1367866684.2868.739.camel@basilisk> Message-ID: <1367868760.25143.140661227285297.3344D56F@webmail.messagingengine.com> > On Mon, 2013-05-06 at 20:28 +0200, Giampaolo Rodola' wrote: > > -1 about these too. > > os.path provides only isfile(), isdir() and islink() because those are > > the most common and portable file types, and that's fine. > > Anything else is too specific (also *platform* specific) and does not > > deserve a new utility function in os.path. I think the question of whether platform-specific functions belong in the os module can be answered by counting the number of occurrences of "Availability:" in its documentation. That ship has long sailed, for good or ill. On Mon, May 6, 2013, at 14:58, Pieter Nagel wrote: > The extent to which my additions to stat_result are "mirrored back" to > os.path is one of the least certain parts of my proposal. 
I'm not sure I like these functions either - of the existing ones, only ntpath.isdir does anything more efficient than just calling stat and examining the mode - and all of the new ones would be the same. Also, they silently fail (i.e. return False) rather than let exceptions be raised from stat (this is an inconsistency between them and the supposed philosophy people have claimed on this thread, of all file functions raising exceptions when the file doesn't exist, incidentally) From g.rodola at gmail.com Mon May 6 21:46:00 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Mon, 6 May 2013 21:46:00 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367866684.2868.739.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <1367866684.2868.739.camel@basilisk> Message-ID: 2013/5/6 Pieter Nagel : > But the primary motivation is not to avoid an import and "ugly" method > names; the primary motivation is to make it easier to switch from > writing naive code that stat()s a file many times to performant code > that does stat() only once. The rest is just a happy side effect. I don't understand what your proposal has to do with calling os.stat() once or twice (you can already call it once and use stat.S_IS* functions). > And the *ulterior* motivation is to reduce the need for automagic stat() > caching in PEP 428. Can you elaborate more (and possibly also update the PEP including this motivation)? >> -1 about these too. >> os.path provides only isfile(), isdir() and islink() because those are >> the most common and portable file types, and that's fine. >> Anything else is too specific (also *platform* specific) and does not >> deserve a new utility function in os.path. > > The extent to which my additions to stat_result are "mirrored back" to > os.path is one of the least certain parts of my proposal. Fair enough. 
> And when PEP 428 is accepted, there'll be *three* places where these > methods could live, so I'll have to take that into account. Another reason to leave os.path.* alone. =) --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ From ethan at stoneleaf.us Mon May 6 20:57:51 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 06 May 2013 11:57:51 -0700 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367862584.2868.714.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> Message-ID: <5187FD2F.5070304@stoneleaf.us> On 05/06/2013 10:49 AM, Pieter Nagel wrote: > > The motivation is to make "performant" code that calls stat() once look > more similar to naive code that potentially calls it many times. And if > naive code can say "os.path.isfile(f)" and get False even when f does > not exist, then performant code should be able to do something similar > by just statting and then interogating the results, without adding > exception handling. +1 From g.brandl at gmx.net Mon May 6 21:47:47 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 06 May 2013 21:47:47 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367865062.2868.717.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <20130506165503.GA17695@iskra.aviel.ru> <1367861593.2868.702.camel@basilisk> <20130506175617.GA20261@iskra.aviel.ru> <1367865062.2868.717.camel@basilisk> Message-ID: Am 06.05.2013 20:31, schrieb Pieter Nagel: > On Mon, 2013-05-06 at 21:56 +0400, Oleg Broytman wrote: > >> Yes, but stdlib doesn't follow this style. 
See, for example, built-in >> functions -- http://docs.python.org/library/functions.html : >> >> staticmethod() isinstance() basestring() execfile() issubclass() bytearray() >> frozenset() classmethod() getattr() hasattr() memoryview() delattr() setattr() >> >> I'm sure this style is codified somewhere, don't know where. > > Actually, PEP 8 *is* the coding convention for the stdlib: > > "This document gives coding conventions for the Python code comprising > the standard library in the main Python distribution." Please: we *do* know PEP 8 around here. It is a good coding style, but its second section is the most important one of all. (I saw you quoting PEP 8 text in RFC style with upper-case "SHOULD": this is usually not the meaning that should be applied here.) Georg From pieter at nagel.co.za Mon May 6 21:46:59 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 21:46:59 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367868760.25143.140661227285297.3344D56F@webmail.messagingengine.com> References: <1367829004.2868.619.camel@basilisk> <1367866684.2868.739.camel@basilisk> <1367868760.25143.140661227285297.3344D56F@webmail.messagingengine.com> Message-ID: <1367869619.2868.753.camel@basilisk> On Mon, 2013-05-06 at 15:32 -0400, random832 at fastmail.us wrote: > Also, > they silently fail (i.e. return False) rather than let exceptions be > raised from stat I think this is actually good design. It makes total sense to me that os.path.isfile('/nonexistent') return False. Of course it isn't a file, since it doesn't exist. It's not a directory, a symlink, etc., either, for the same reason. It would be most inconvenient if one needed to guard os.path.isfile() with an os.path.exists() first. 
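The behaviour being defended here is easy to demonstrate; a small self-contained illustration (the path is manufactured inside a fresh temporary directory so that it cannot exist):

```python
import os
import tempfile

# A path that is guaranteed not to exist: a name inside a fresh,
# empty temporary directory.
missing = os.path.join(tempfile.mkdtemp(), "no-such-file")

# The os.path predicates swallow the underlying stat() failure and
# answer the question "is it a file/dir/link?" with a plain False.
print(os.path.isfile(missing))   # False
print(os.path.isdir(missing))    # False
print(os.path.islink(missing))   # False

# The raw os.stat() call, by contrast, raises.
try:
    os.stat(missing)
    raised = False
except FileNotFoundError:
    raised = True
print(raised)                    # True
```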
-- Pieter Nagel From pieter at nagel.co.za Mon May 6 22:02:12 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 22:02:12 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: References: <1367829004.2868.619.camel@basilisk> <1367866684.2868.739.camel@basilisk> Message-ID: <1367870532.2868.767.camel@basilisk> On Mon, 2013-05-06 at 21:46 +0200, Giampaolo Rodola' wrote: > I don't understand what your proposal has to do with calling os.stat() > once or twice (you can already call it once and use stat.S_IS* > functions). I am referring to the impedance mismatch between the way the code is spelled in the two cases. Given code like "os.path.isfile(f)", one has to make a rather convoluted mental mapping to stat.S_ISREG if one wants to do essentially the same thing, but this time on a stat result. That convoluted mapping creates a mental impedance mismatch, and that mismatch makes it more difficult / less likely that code would be transformed into the more performant form. Then one has to take into account that novices tend to be more familiar with the easier, straightforward ways of doing things, so for them I think that would be isfile() and the like. For them, bitmasks on a subfield are even more of a leap. In fact, one could say that the stat module as it is today already breaks TOOWTDI (because it provides a totally different mechanism for interrogating file types than os.path does). My proposal is more in the spirit of fixing a violation of TOOWTDI than of adding extra ways to do things just for the heck of it. 
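To make the impedance mismatch concrete, here are the two spellings of the same check side by side (a sketch only; the scratch file stands in for any real path):

```python
import os
import stat
import tempfile

tf = tempfile.NamedTemporaryFile(delete=False)
tf.close()
f = tf.name

# Naive spelling: readable, but each predicate may stat() the path again.
naive = os.path.isfile(f) or os.path.isdir(f)

# "Performant" spelling: one stat() call, but the vocabulary changes
# completely -- isfile() becomes stat.S_ISREG(st.st_mode).
st = os.stat(f)
performant = stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode)

print(naive, performant)   # True True
os.unlink(f)
```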
-- Pieter Nagel From pieter at nagel.co.za Mon May 6 22:11:02 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Mon, 06 May 2013 22:11:02 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: References: <1367829004.2868.619.camel@basilisk> <20130506165503.GA17695@iskra.aviel.ru> <1367861593.2868.702.camel@basilisk> <20130506175617.GA20261@iskra.aviel.ru> <1367865062.2868.717.camel@basilisk> Message-ID: <1367871062.2868.775.camel@basilisk> On Mon, 2013-05-06 at 21:47 +0200, Georg Brandl wrote: > Please: we *do* know PEP 8 around here. It is a good coding style, but its > second section is the most important one of all. If there had been pre-existing isxxx methods on stat_result, I would certainly not have proposed adding is_yyy methods alongside them. But what tips the scales for me is the fact that PEP 428 has been heavily discussed over at python-dev for quite some time now, and it also proposes is_file and the like, without that having been shot down. I don't want to submit a proposal for "isfile" shortly after "is_file" has already been accepted in a different context. One could argue that os.path.isfile is what the new method should conform to. But if that is not the route that PEP 428 takes, why should I do it differently in a related PEP that also deals with files? If they change their mind on the methods in PEP 428 context, I'll follow suit. -- Pieter Nagel From tjreedy at udel.edu Mon May 6 22:30:25 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 06 May 2013 16:30:25 -0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367861593.2868.702.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <20130506165503.GA17695@iskra.aviel.ru> <1367861593.2868.702.camel@basilisk> Message-ID: On 5/6/2013 1:33 PM, Pieter Nagel wrote: > On Mon, 2013-05-06 at 20:55 +0400, Oleg Broytman wrote: > >> Shouldn't it be called filemode()? 
I see you are fond of underscores >> but Python style guides discourage using them AFAIK. > > Actually, PEP 8 says "Function names should be lowercase, with words > separated by underscores as necessary to improve readability." *as necessary* to improve readability To me, filemode is as readable as blackbird. English is full of compounds. From abarnert at yahoo.com Mon May 6 23:42:52 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 6 May 2013 14:42:52 -0700 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367869619.2868.753.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <1367866684.2868.739.camel@basilisk> <1367868760.25143.140661227285297.3344D56F@webmail.messagingengine.com> <1367869619.2868.753.camel@basilisk> Message-ID: <094DC195-A798-49BA-8FC4-55F2F3D588E4@yahoo.com> One thing nobody's mentioned yet is that os.path.isdir(f) or os.path.islink(f) isn't just a performance issue, it's a race condition. In fact, most code that uses os.path.isfoo is just wrong for similar reasons. Usually you just want EAFP--and, when you don't, usually you want to open and then fstat. But even when you do want stat, you don't want to call it multiple times. Discouraging naive users from writing this kind of code is the whole reason the proposal is a good idea. So, maybe the right answer isn't adding more methods to os.path or renaming the existing ones, but deprecating them entirely. And with that in mind: On May 6, 2013, at 12:46, Pieter Nagel wrote: > On Mon, 2013-05-06 at 15:32 -0400, random832 at fastmail.us wrote: > >> Also, >> they silently fail (i.e. return False) rather than let exceptions be >> raised from stat > > I think this is actually good design. It makes total sense to me that > os.path.isfile('/nonexistent') return False. Of course it isn't a file, > since it doesn't exist. It's not a directory, a symlink, etc., either, for the > same reason. 
> > It would be most inconvenient if one needed to guard os.path.isfile() > with an os.path.exists() first. I was against the return-false part of the proposal, but now I'm not sure. You almost always want to guard with try, not os.path.exists. The fact that even people who obviously know better still get this wrong implies that making novices guard it manually may be dangerous. From jimjjewett at gmail.com Tue May 7 00:09:32 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 6 May 2013 18:09:32 -0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367829004.2868.619.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> Message-ID: On Mon, May 6, 2013 at 4:30 AM, Pieter Nagel wrote: > This PEP proposes extending the result of ``os.stat()``, ``os.fstat()`` and > ``os.lstat()`` calls with added methods such as ``is_file()``. These added > methods will obviate the need to use the ``stat`` module to interpret the > result of these calls. Another alternative would be to modify os.path.isfile, os.path.isdir, etc so that they can accept a stat_result in place of a filename (or open-file handle). Yet another (albeit more complicated, with questionable backwards-compatibility) alternative would be to have the os.path.* functions maintain a very short-duration cache, so that if the same file is queried multiple times within a second or so, the stat_result could be reused.

> Whereas in contrast, similar code that wishes to avoid the penalty of two
> potential calls to ``os.stat()``, will look radically different::
>
>     st = os.stat(f)
>     if stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode):
>         # do something

Out of curiosity, is it common to call more than one function, except in the following cases: (1) stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode) (2) try all the type functions until successful If those are the only real use cases, it might make sense to just add a pair of functions for those two specific cases. 
> if os.path.isfile_or_dir(filename)

Or maybe just for the latter, with the first spelled either

> from os.path import filekind
> if filekind(filename) in (filekind.REGULAR, filekind.DIR) #symlinks?

or

> from os.path import filekind
> if filekind(filename) isinstance (filekind.REGULAR, filekind.DIR)

> This PEP proposes ameliorating the situation by adding higher-level
> predicates such as ``is_file()`` and ``is_dir()`` directly to the
> ``stat_result`` object, so that (assuming the file ``f`` exists) the second
> code example can become::
>
>     st = os.stat(f)
>     if st.is_file() or st.is_dir():
>         # do something

Even assuming these are added individually (as opposed to a single filekind), is there a reason not to make them properties? I understand that a property normally shouldn't hide something as expensive as a system call, but in this case the system call is already complete before the caller has a stat_result with attributes.

> Added methods on ``stat_result``
>
> same_stat(other)
>     Equivalent to ``os.path.samestat(self, other)``.

Why is this not just an equality test? Is there just too much of a backward-compatibility problem for stat_result objects that refer to the same device/inode, but have differences in the way other attributes are set?

> format()
>     This shall return ``stat.S_IFMT(self.st_mode)``.

I don't think this is important enough to justify the confusion with "".format

> Rejected Proposals
> ==================
>
> It has been proposed [#filetype]_ that a mechanism be added whereby
> ``stat_result`` could return some sort of type code identifying the file
> type. Originally these type codes were proposed as strings such as 'reg',
> 'dir', and the like, but others suggested enumerations instead. The author
> rejected that proposal to keep the current PEP focused on ameliorating
> existing asymmetries rather than adding new behavior, but is not opposed to
> the notion in principle (assuming enums are used instead of strings). 
> Experience with creating the reference implementation for this PEP may yet > change the author's mind. I don't think an Enum is quite the right fit, because of symbolic links -- if the original filename was to a link, that can be important, but asking everyone to check for both regular and link_to_regular is ugly. Special types (anything other than File and Directory) may vary by system, and may have a subclass relationship. On the other hand, using isinstance on marker classes might lose the efficiency you were concerned about. Maybe the answer is to use marker classes plus convenience methods for the special case methods of is_file and is_dir? -jJ From random832 at fastmail.us Tue May 7 00:12:17 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Mon, 06 May 2013 18:12:17 -0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <094DC195-A798-49BA-8FC4-55F2F3D588E4@yahoo.com> References: <1367829004.2868.619.camel@basilisk> <1367866684.2868.739.camel@basilisk> <1367868760.25143.140661227285297.3344D56F@webmail.messagingengine.com> <1367869619.2868.753.camel@basilisk> <094DC195-A798-49BA-8FC4-55F2F3D588E4@yahoo.com> Message-ID: <1367878337.32334.140661227357021.64A3EA3A@webmail.messagingengine.com> On Mon, May 6, 2013, at 17:42, Andrew Barnert wrote: > So, maybe the right answer isn't adding more methods to os.path or > renaming the existing ones, but deprecating them entirely. That was what I was driving at with "I don't like these methods" - I meant I don't like _any_ of them. Even if nt._isdir is a nice example of taking advantage of something the platform does that Unix doesn't. 
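For readers following the race-condition point above: the check-then-open (LBYL) pattern versus the open-then-fstat (EAFP) pattern can be sketched minimally like this (a scratch file stands in for the real target):

```python
import os
import stat
import tempfile

tf = tempfile.NamedTemporaryFile(delete=False)
tf.close()
path = tf.name

# LBYL: between the isfile() check and the open(), another process can
# remove or replace the file -- a time-of-check/time-of-use race.
if os.path.isfile(path):
    with open(path, "rb") as fp:
        data = fp.read()

# EAFP: open unconditionally and, if the file type matters, fstat() the
# descriptor that is already open -- no window for the race.
try:
    with open(path, "rb") as fp:
        st = os.fstat(fp.fileno())
        data = fp.read() if stat.S_ISREG(st.st_mode) else None
except FileNotFoundError:
    data = None

print(data == b"")   # True: the scratch file is empty
os.unlink(path)
```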
From abarnert at yahoo.com Tue May 7 01:07:45 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 6 May 2013 16:07:45 -0700 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: References: <1367829004.2868.619.camel@basilisk> Message-ID: <0CF0E6B9-2007-4C84-8A2E-D0B293DA6A5C@yahoo.com> On May 6, 2013, at 15:09, Jim Jewett wrote: > Out of curiosity, is it common to call more than one function, except in > the following cases: > > (1) stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode) > (2) try all the type functions until successful > > If those are the only real use cases, it might make sense to just add > a pair of functions for those two specific cases. (3) DIR or LNK (4) REG or LNK I think the former is even more common than (1). From greg.ewing at canterbury.ac.nz Tue May 7 01:42:12 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 07 May 2013 11:42:12 +1200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367846686.2868.668.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> Message-ID: <51883FD4.1070503@canterbury.ac.nz> Pieter Nagel wrote: > To summarise the thread: POSIX guarantees the *names* of the S_XXX > constants, not their numeric values. But the stat module propagates the > misapprehension that the values are standardised. How does it do that? The values may happen to be the same in all current implementations, but the docs don't promise that. -- Greg From turnbull at sk.tsukuba.ac.jp Tue May 7 02:34:40 2013 From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull) Date: Tue, 07 May 2013 09:34:40 +0900 Subject: [Python-ideas] Trie ABC? In-Reply-To: References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <8738tzmxun.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > I believe it is the extra trie specific method names that Geoffrey is > interested in standardising: Understood. 
> This seems reasonable, although I think it may make more sense in > conjunction with a reference implementation in the standard library. Reasonable in the abstract, yes. But if it's possible to do as well as timsort[1], IMHO providing an ABC rather than going directly to a concrete implementation violates Occam's Razor. All I want[2] here is somebody to tell me that "no, trie cannot (at this time) be reduced to a single algorithm that can be tuned to perform very well in almost all cases encountered in practice, and it will be common for different users[3] to want to implement different algorithms." Footnotes: [1] With all due respect to Tim, it's not "God's own sort". But who cares? [2] Of course you can just ignore me. :-) [3] Who aren't graduate students writing a MS thesis on tries. From abarnert at yahoo.com Tue May 7 03:19:13 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 6 May 2013 18:19:13 -0700 (PDT) Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <51883FD4.1070503@canterbury.ac.nz> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> <51883FD4.1070503@canterbury.ac.nz> Message-ID: <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> > From: Greg Ewing > Sent: Monday, May 6, 2013 4:42 PM > > Pieter Nagel wrote: >> To summarise the thread: POSIX guarantees the *names* os the S_XXX >> constants, not their numeric values. But the stat module propagates the >> misapprehension that the values are standardised. > > How does it do that? The values may happen to be the > same in all current implementations, but the docs don't > promise that. Well, to someone who reads the _source_, it kind of implies that misapprehension. But fixing that requires just a one-line comment saying something like, "These values are not standardized, they just happen to work on every platform this version of this module is used with." 
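The practical upshot of "names, not values" in the exchange above: check mode bits through the stat module's named helpers, never through literal numbers. A quick illustration (the hardcoded value happens to be right on the usual platforms, which is exactly why it is misleading):

```python
import os
import stat
import tempfile

st = os.stat(tempfile.gettempdir())

# Portable: POSIX specifies the *names* S_IFMT, S_IFDIR and S_ISDIR,
# so this works whatever numeric values the platform chose.
is_dir = stat.S_ISDIR(st.st_mode)
fmt_matches = stat.S_IFMT(st.st_mode) == stat.S_IFDIR

# Fragile: 0o040000 merely happens to equal S_IFDIR on common
# platforms; nothing in the standard promises that value.
hardcoded = (st.st_mode & 0o170000) == 0o040000

print(is_dir, fmt_matches, hardcoded)
```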
From abarnert at yahoo.com Tue May 7 04:22:25 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 6 May 2013 19:22:25 -0700 (PDT) Subject: [Python-ideas] Trie ABC? In-Reply-To: <8738tzmxun.fsf@uwakimon.sk.tsukuba.ac.jp> References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> <8738tzmxun.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <1367893345.62115.YahooMailNeo@web184703.mail.ne1.yahoo.com> Before getting further into trie... The most common data structure people ask for in Python isn't tries, it's sorted mappings, like C++ std::map and friends. There are a dozen or more implementations out there, and they all extend the (Mutable)Mapping ABC in different ways. For example, https://pypi.python.org/pypi/bintrees/ adds bisect methods, key slices, and heapq-like methods; http://stutzbachenterprises.com/blist/sorteddict.html adds a different set of bisect methods, some subset methods, and a few other things. Wouldn't it be nice if they could all just implement the part that's actually interesting and worth competing over, and inherit collections.abc.MutableSortedMapping to get an ideal set of methods? MutableMapping is useful for exactly that reason. Maybe we should go through the different sorteddict implementations, pick the best one, improve its API with the best features of the others (and drop any misfeatures), and stick it in the stdlib. But I think people are still going to want to use the alternatives. (In fact, blist was rejected for stdlib inclusion a while back for exactly that reason.) So, I think it makes sense to add the ABC without adding a reference implementation. Anyway, on to trie... > From: Stephen J. Turnbull > Sent: Monday, May 6, 2013 5:34 PM > > Nick Coghlan writes: > >> This seems reasonable, although I think it may make more sense in >> conjunction with a reference implementation in the standard library. > > Reasonable in the abstract, yes. 
But if it's possible to do as well > as timsort[1], IMHO providing an ABC rather than going directly to a > concrete implementation violates Occam's Razor. All I want[2] here is > somebody to tell me that "no, trie cannot (at this time) be reduced to > a single algorithm that can be tuned to perform very well in almost > all cases encountered in practice, and it will be common for different > users[3] to want to implement different algorithms." First, it's not an algorithm like sort, it's a data structure like set. And there's at least one important related but different data structure that has the same API as an (immutable) trie, but a different implementation: a branch-merged trie. This is a huge space savings for sparse lookup tables, like Unicode data. (Actually, there's a whole class of compacted tries that are useful, but the branch-merged one seems to be the one that half the Unicode-data implementations out there use, so I'm being conservative here and saying maybe just that one would be good enough.) Meanwhile, even if you stick with just the standard data structure, many implementations are hardcoded to deal only with ASCII (or UCS-2 or UTF-32 or some other fixed-size) characters instead of arbitrary objects. And the reason for this isn't just faster and simpler implementation, or ability to wrap up C code with a decade of optimization and testing behind it, but space. People store giant dictionaries in tries, and if our One True Trie took 4x as much space as an ASCII-only trie, people would still use the latter. So, could we get away with a single trie? No, but maybe a single frozentrie and a single trie (although note that converting between them wouldn't be constant space/time, unlike frozenset/set), with both doing the same kind of transparent fallback as str (use ASCII if possible, UCS-2 if possible, UTF-32 if possible, PyObject* if the elements aren't single-char strings at all), would be a good enough 80% solution. 
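For concreteness, this is roughly the kind of object the subthread is talking about: a minimal pure-Python trie with one trie-specific operation (longest stored prefix). This is only a sketch of the API shape, not any of the compact ASCII/UCS-2/branch-merged implementations discussed above.

```python
class Trie:
    """A minimal trie mapping strings to values (sketch only)."""

    _MISSING = object()   # sentinel: "no value stored at this node"

    def __init__(self):
        self._children = {}          # char -> child Trie node
        self._value = self._MISSING

    def __setitem__(self, key, value):
        node = self
        for ch in key:
            node = node._children.setdefault(ch, Trie())
        node._value = value

    def __getitem__(self, key):
        node = self
        for ch in key:
            node = node._children[ch]   # KeyError if path absent
        if node._value is self._MISSING:
            raise KeyError(key)
        return node._value

    def longest_prefix(self, key):
        """Return (prefix, value) for the longest stored prefix of key,
        or None if no stored key is a prefix of it."""
        node, best = self, None
        for i, ch in enumerate(key):
            if node._value is not self._MISSING:
                best = (key[:i], node._value)
            if ch not in node._children:
                break
            node = node._children[ch]
        else:
            if node._value is not self._MISSING:
                best = (key, node._value)
        return best


t = Trie()
t["foo"] = 1
t["foobar"] = 2
print(t.longest_prefix("foobaz"))   # ('foo', 1)
```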
Is anyone volunteering to build that? From stephen at xemacs.org Tue May 7 05:12:10 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 07 May 2013 12:12:10 +0900 Subject: [Python-ideas] Trie ABC? In-Reply-To: <1367893345.62115.YahooMailNeo@web184703.mail.ne1.yahoo.com> References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> <8738tzmxun.fsf@uwakimon.sk.tsukuba.ac.jp> <1367893345.62115.YahooMailNeo@web184703.mail.ne1.yahoo.com> Message-ID: <87sj1zlbzp.fsf@uwakimon.sk.tsukuba.ac.jp> Andrew Barnert writes: > First, [trie is] not an algorithm like sort, it's a data structure > like set. I don't see that that distinction matters here. I'm simply using timsort as the platinum standard of quality of implementation. :-) The question is "does somebody need multiple implementations?" The answer is evidently "yes": > So, could we get away with a single trie? No, /me can go back to GSoC mentoring now. From random832 at fastmail.us Tue May 7 06:09:14 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 07 May 2013 00:09:14 -0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> <51883FD4.1070503@canterbury.ac.nz> <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> Message-ID: <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> On Mon, May 6, 2013, at 21:19, Andrew Barnert wrote: > But fixing that requires just a one-line comment saying something like, > "These values are not standardized, they just happen to work on every > platform this version of this module is used with." On the subject of platforms they _don't_ work with... What happens to code that hardcodes values like 0o644 on such platforms? 
I suspect there is more code than there otherwise would have been due to these constants not being in the os module where stat, open, mkdir, and chmod live (and it doesn't break on the author's system). os.open and os.mkdir are even documented as having a default value of 0o777, not stat.S_IRWXU|stat.S_IRWXG|stat.S_IRWXO. Also, a browse of the Infozip source code shows that on Amiga and THEOS, there are mode bits that are uncomfortably close enough to POSIX (in their names, not their values) that it would make sense to include them in the stat module, but don't use the same model with r/w/x user/group/other. I bring this up because it looks like the Amiga version of Python 2.3 actually does a mapping to unix values (i.e. to the specific concrete values we are discussing). Can anyone give an example of a POSIX platform where these values don't work? I swear I remember reading about one but don't remember what it was. From pieter at nagel.co.za Tue May 7 08:48:55 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Tue, 07 May 2013 08:48:55 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: References: <1367829004.2868.619.camel@basilisk> Message-ID: <1367909335.2868.801.camel@basilisk> On Mon, 2013-05-06 at 18:09 -0400, Jim Jewett wrote: > Another alternative would be to modify os.path.isfile, os.path.isdir, > etc so that they can accept a stat_result in place of a filename (or > open-file handle). Interesting proposal. The downside is that it would place even more pressure on os.path to accumulate all kinds of platform-specific things like os.path.isdoor (for Solaris). Or, alternatively, that Python will never support these things in order not to pollute os.path. With stat_result, one can conceptually have platform-specific types of stat_result, and place is_door only on solaris_stat_result. The other downside is it overloads the meaning of the parameters even more. 
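Jim's alternative, a predicate that accepts either a path or an already-obtained stat result, could be sketched roughly like this (the function here is hypothetical, written for illustration, and not wording from any proposal):

```python
import os
import stat
import tempfile

def isfile(path_or_stat):
    """Like os.path.isfile(), but also accept an os.stat() result so
    callers who already paid for the stat() can reuse it.
    (Hypothetical sketch, not proposed API.)"""
    if isinstance(path_or_stat, os.stat_result):
        st = path_or_stat
    else:
        try:
            st = os.stat(path_or_stat)
        except OSError:
            return False   # mirror os.path.isfile's silent False
    return stat.S_ISREG(st.st_mode)

tf = tempfile.NamedTemporaryFile(delete=False)
tf.close()
f = tf.name
st = os.stat(f)
print(isfile(f), isfile(st))   # True True: one extra stat() vs. none
os.unlink(f)
```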
> Yet another (albeit more complicated, with questionable > backwards-compatibility) alternative would be to have the os.path.* > functions maintain a very short-duration cache, so that if the same > file is queried multiple times within a second or so, the stat_result > could be reused. In a recent discussion on python-dev regarding similar proposed behaviour in PEP 428, Guido pronounced that in general, he wants APIs that cache to also expose their uncached variants, because both use cases are usually needed. So this caching would need to be optional on os.path.*, implying yet another parameter. I'm -1 on this. Also note that you might anyway end up getting something like this with PEP 428. At the moment it has essentially infinite caching, not sure how it will be changed after Guido's statement. > Out of curiosity, is it common to call more than one function, except in > the following cases: What about, for example, admin script code that walks a bunch of files and wants to do different things to different types of files? I.e.

    if os.path.isfile(f):
        # do something
    elif os.path.isdir(f):
        # do something
    elif os.path.islink(f):
        # do something
    # etc.

> > (1) stat.S_ISREG(st.st_mode) or stat.S_ISDIR(st.st_mode) > > (2) try all the type functions until successful > > > > If those are the only real use cases, it might make sense to just add > > a pair of functions for those two specific cases.

> > > if os.path.isfile_or_dir(filename)

I don't think this will cover all use cases.

> Or maybe just for the latter, with the first spelled either
>
> > from os.path import filekind
> > if filekind(filename) in (filekind.REGULAR, filekind.DIR) #symlinks?
>
> or
>
> > from os.path import filekind
> > if filekind(filename) isinstance (filekind.REGULAR, filekind.DIR)

The basic notion has been proposed before. I'm holding off on it, because: I don't think it'll ever be the *only* mechanism for interrogating file types (we need to retain os.path.isfile for backwards compatibility). 
I expect naive and newbie code to still prefer os.path, and so this filekind notion will be just another way in which performant code that calls stat() only once will look totally different from naive code. I'm in favour of it being a potential *additional* way to interrogate file types, but I'm not going to champion it for now. This proposal will need a lot of work to get the modelling of the filekinds correct, taking into account questions like "is a fifo filekind a kind of file filekind", and will need a survey of platform-specific stat flags that Python may want to support in the near future. > Even assuming these are added individually (as opposed to a single > filekind), is there a reason not to make them properties? I > understand that a property normally shouldn't hide something as > expensive as a system call, but in this case the system call is > already complete before the caller has a stat_result with attributes. I'll want to follow PEP 428's lead here, and both it (and os.path!) currently have them as methods. My next draft will make it clear why I consider PEP 428 relevant here. > > same_stat(other) > > Equivalent to ``os.path.samestat(self, other)``. > > Why is this not just an equality test? > > Is there just too much of a backward-compatibility problem for > stat_result objects that refer to the same device/inode, but have > differences in the way other attributes are set? Equality, to my mind, implies that all visible state is being compared, so if I were to add __eq__ to stat_result, it would compare st_size, st_mtime and the whole lot too. Anything else would be confusing. Plus, imagine you call stat() on a file now, and store the result for some reason. Later, you call stat() again, and want to see if any of the old stat_results you stored refer to the same file. But in the meantime the st_size etc. changed, even though the file itself is still "the same file". So this is a totally different operation than equality.
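The distinction described here is observable with today's os.path.samestat(), which compares only st_dev and st_ino, whereas comparing two stat results directly also compares st_size, the timestamps, and the rest:

```python
import os
import os.path
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
old = os.stat(path)            # stat the file and store the result

with open(path, "ab") as f:
    f.write(b"more data")      # st_size (and st_mtime) change

new = os.stat(path)
print(os.path.samestat(old, new))  # True: same device and inode
print(old == new)                  # False: st_size differs
os.unlink(path)
```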
Agreed, the name same_stat() is not ideal. I think I should relax my desire to use only names that echo os.path in cases where it makes sense. > > > format() > > This shall return ``stat.S_IFMT(self.st_mode)``. > > I don't think this is important enough to justify the confusion with "".format The next PEP will likely omit it entirely. -- Pieter Nagel From pieter at nagel.co.za Tue May 7 08:51:50 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Tue, 07 May 2013 08:51:50 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> <51883FD4.1070503@canterbury.ac.nz> <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> Message-ID: <1367909510.2868.804.camel@basilisk> On Tue, 2013-05-07 at 00:09 -0400, random832 at fastmail.us wrote: > On the subject of platforms they _don't_ work with... What happens to > code that hardcodes values like 0o644 on such platforms? At issue here is the file type stat flags such as S_IS*. The permission bits are not a problem. I'm confident that the permission bits are specified by POSIX, and thus cross-platform (haven't double-checked though). -- Pieter Nagel From pieter at nagel.co.za Tue May 7 09:00:59 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Tue, 07 May 2013 09:00:59 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367829004.2868.619.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> Message-ID: <1367910059.2868.811.camel@basilisk> I want to thank everyone for their feedback so far. I have a lot of changes to make and a lot to mull over; I anticipate a revised draft sometime next week. One large area of contention is the naming of is_file vs file.
I realise that I've been following an implicit argument to follow PEP 428 rather than os.path's names, and I'll make my rationale very clear in the next draft. My legalistic invocations of PEP 8 were more an attempt to reverse-engineer why PEP 428 uses is_file and why python-dev seems, to me, to take that as a given. -- Pieter Nagel From solipsis at pitrou.net Tue May 7 11:09:41 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 7 May 2013 11:09:41 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> Message-ID: <20130507110941.2b3b4cb2@pitrou.net> Le Mon, 06 May 2013 19:49:44 +0200, Pieter Nagel a écrit : > > The motivation is to make "performant" code that calls stat() once > look more similar to naive code that potentially calls it many times. > And if naive code can say "os.path.isfile(f)" and get False even when > f does not exist, then performant code should be able to do something > similar by just statting and then interrogating the results, without > adding exception handling. Then perhaps make it return None and let users write:

st = os.rich_stat(filename)
is_a_file = st and st.isfile()

(which is a rather clean and concise coding style as far as I'm humbly concerned :-)) Regards Antoine.
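os.rich_stat() is hypothetical (it is the suggested spelling in this message, not an existing API), but the None-returning pattern can be sketched today; stat_or_none and the S_ISREG check below stand in for the proposed rich_stat() and st.isfile():

```python
import os
import stat

def stat_or_none(path):
    """Return os.stat(path), or None if the path does not exist.
    Other errors (e.g. PermissionError) still propagate."""
    try:
        return os.stat(path)
    except FileNotFoundError:
        return None

st = stat_or_none("/no/such/path")
is_a_file = st is not None and stat.S_ISREG(st.st_mode)
print(is_a_file)  # False
```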
From solipsis at pitrou.net Tue May 7 11:13:45 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 7 May 2013 11:13:45 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> Message-ID: <20130507111345.5bd9e394@pitrou.net> Le Mon, 06 May 2013 19:49:44 +0200, Pieter Nagel a écrit : > On Mon, 2013-05-06 at 20:58 +0400, Oleg Broytman wrote: > > > I don't like the idea of changing os.stat() behaviour. > > I'm actually not proposing changing os.stat() behaviour, not for > existing code. I'm proposing new optional behaviour that code could > request. > > There is precedent for functions not raising exceptions, depending on > the way they're called. dict.pop will raise KeyError, unless a default > is given. str.encode can raise UnicodeErrors, depending on the value > of the 'errors' parameter. > > In that light, a function that might or might not raise > FileNotFoundError depending on how it was called is not that weird. However, I should point out that FileNotFoundError is not the only error that may be raised:

>>> os.stat("/proc/1/fd/1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [Errno 13] Permission denied: '/proc/1/fd/1'

(meaning that None may hide more information than you'd like) Regards Antoine.
From pieter at nagel.co.za Tue May 7 12:05:33 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Tue, 07 May 2013 12:05:33 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <20130507111345.5bd9e394@pitrou.net> References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> <20130507111345.5bd9e394@pitrou.net> Message-ID: <1367921133.2868.817.camel@basilisk> On Tue, 2013-05-07 at 11:13 +0200, Antoine Pitrou wrote: > However, I should point out that FileNotFoundError is not the only > error that may be raised: > > >>> os.stat("/proc/1/fd/1") > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > PermissionError: [Errno 13] Permission denied: '/proc/1/fd/1' This raises the question of whether the current os.path.isfile etc. behaviour is correct. Currently it swallows any os.error and returns False. But in a situation like the above, it is not necessarily true that the path is not a file just because you don't have permission to interrogate it. And even if os.path.isfile is arguably incorrect in this situation, is it likely to be fixed? If not, should new code, like my proposal or your PEP 428, follow its incorrect example or do it correctly, but differently? Speaking of which, your current implementation of PEP 428 raises exceptions when querying is_file() or is_dir() on nonexistent paths. Is that by design, or an oversight?
-- Pieter Nagel From solipsis at pitrou.net Tue May 7 12:18:33 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 7 May 2013 12:18:33 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> <20130507111345.5bd9e394@pitrou.net> <1367921133.2868.817.camel@basilisk> Message-ID: <20130507121833.76074186@pitrou.net> Le Tue, 07 May 2013 12:05:33 +0200, Pieter Nagel a écrit : > On Tue, 2013-05-07 at 11:13 +0200, Antoine Pitrou wrote: > > > However, I should point out that FileNotFoundError is not the only > > error that may be raised: > > > > >>> os.stat("/proc/1/fd/1") > > Traceback (most recent call last): > > File "<stdin>", line 1, in <module> > > PermissionError: [Errno 13] Permission denied: '/proc/1/fd/1' > > This raises the question of whether the current os.path.isfile etc. > behaviour is correct. > > Currently it swallows any os.error and returns False. But in a > situation like the above, it is not necessarily true that the path is > not a file just because you don't have permission to interrogate it. Well, at least it's not usable as a file by the current user. Which is generally the point :) But isfile() and friends are higher-level helpers, it makes more sense for them to swallow an exception than os.stat() (IMHO). > Speaking of which, your current implementation of PEP 428 raises > exceptions when querying is_file() or is_dir() on nonexistent paths. Is > that by design, or an oversight? To be honest I haven't given much thought to it, but I guess they could also swallow exceptions. With the current default caching behaviour, it is not detrimental to write `path.exists() and path.is_file()`, though. Regards Antoine.
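The swallowing behaviour under discussion is easy to observe: os.path.isfile() returns False wherever the underlying os.stat() raises, discarding the reason:

```python
import os
import os.path

missing = "/no/such/path"
print(os.path.isfile(missing))  # False -- the OSError is swallowed

try:
    os.stat(missing)
except FileNotFoundError as exc:
    # The detail that isfile() hides: which error actually occurred.
    print("os.stat raised errno", exc.errno)
```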
From ethan at stoneleaf.us Tue May 7 12:46:08 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 07 May 2013 03:46:08 -0700 Subject: [Python-ideas] PEP 428 and is_file [was Re: PEP: Extended stat_result (First Draft)] In-Reply-To: <20130507121833.76074186@pitrou.net> References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> <20130507111345.5bd9e394@pitrou.net> <1367921133.2868.817.camel@basilisk> <20130507121833.76074186@pitrou.net> Message-ID: <5188DB70.1030802@stoneleaf.us> On 05/07/2013 03:18 AM, Antoine Pitrou wrote: > Le Tue, 07 May 2013 12:05:33 +0200, Pieter Nagel a écrit : >> >> Speaking of which, your current implementation of PEP 428 raises >> exceptions when querying is_file() or is_dir() on nonexistent paths. Is >> that by design, or an oversight? > > To be honest I haven't given much thought to it, but I guess they > could also swallow exceptions. +1 > With the current default caching > behaviour, it is not detrimental to write `path.exists() and > path.is_file()`, though. It may not be detrimental to the code, but it is to the user (at least to me ;). If the path doesn't exist, it's obviously not a file. Why should I have to do two checks instead of one? It would be like having to guard every == with a try/except NotImplementedError and substituting False. And yes, I realize that .exists() would tell me the path is not a file, but .exists is not the only reason why it might not be a file, which means in the general case I would have to do two checks.
-- ~Ethan~ From python at mrabarnett.plus.com Tue May 7 13:26:07 2013 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 07 May 2013 12:26:07 +0100 Subject: [Python-ideas] PEP 428 and is_file [was Re: PEP: Extended stat_result (First Draft)] In-Reply-To: <5188DB70.1030802@stoneleaf.us> References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> <20130507111345.5bd9e394@pitrou.net> <1367921133.2868.817.camel@basilisk> <20130507121833.76074186@pitrou.net> <5188DB70.1030802@stoneleaf.us> Message-ID: <5188E4CF.2000706@mrabarnett.plus.com> On 07/05/2013 11:46, Ethan Furman wrote: > On 05/07/2013 03:18 AM, Antoine Pitrou wrote: >> Le Tue, 07 May 2013 12:05:33 +0200, Pieter Nagel a écrit : >>> >>> Speaking of which, your current implementation of PEP 428 raises >>> exceptions when querying is_file() or is_dir() on nonexistent paths. Is >>> that by design, or an oversight? >> >> To be honest I haven't given much thought to it, but I guess they >> could also swallow exceptions. > > +1 > >> With the current default caching >> behaviour, it is not detrimental to write `path.exists() and >> path.is_file()`, though. > > It may not be detrimental to the code, but it is to the user (at least to me ;). If the path doesn't exist, it's > obviously not a file. Why should I have to do two checks instead of one? It would be like having to guard every == > with a try/except NotImplementedError and substituting False. > Isn't that a little like duck-typing? If I can't access it as a file, it might as well not be a file (if it doesn't quack, then it is, for all intents and purposes, not a duck, even if, in reality, it is).
> And yes, I realize that .exists() would tell me the path is not a file, but .exists is not the only reason why it might > not be a file, which means in the general case I would have to do two checks. > From random832 at fastmail.us Tue May 7 13:44:45 2013 From: random832 at fastmail.us (Random832) Date: Tue, 07 May 2013 07:44:45 -0400 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367909510.2868.804.camel@basilisk> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> <51883FD4.1070503@canterbury.ac.nz> <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> <1367909510.2868.804.camel@basilisk> Message-ID: <5188E92D.2080100@fastmail.us> On 05/07/2013 02:51 AM, Pieter Nagel wrote: > At issue here is the file type stat flags such as S_IS*. > > The permission bits are not a problem. I'm confident that the permission > bits are specified by POSIX, and thus cross-platform (haven't > double-checked though). They are not. What is specified, which you may be thinking of, is the meaning of a numeric argument to the chmod _command_, from which you can no more infer that the bits themselves are the same than you can for signal numbers vs. the seven numeric values specified as arguments to the kill command.
From pieter at nagel.co.za Tue May 7 13:55:32 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Tue, 07 May 2013 13:55:32 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <5188E92D.2080100@fastmail.us> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> <51883FD4.1070503@canterbury.ac.nz> <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> <1367909510.2868.804.camel@basilisk> <5188E92D.2080100@fastmail.us> Message-ID: <1367927732.2868.820.camel@basilisk> On Tue, 2013-05-07 at 07:44 -0400, Random832 wrote: > On 05/07/2013 02:51 AM, Pieter Nagel wrote: > > At issue here is the file type stat flags such as S_IS*. > > > > The permission bits are not a problem. I'm confident that the permission > > bits are specified by POSIX, and thus cross-platform (haven't > > double-checked though). > They are not. What is specified, which you may be thinking of, is the > meaning of a numeric argument to the chmod _command_, from which you can > no more infer that the bits themselves are the same than you can for > signal numbers vs. the seven numeric values specified as arguments to > the kill command. Ah, I was not aware of that. However, the permission bits are not as open-ended for extension as the stat S_I* flags are. So if there aren't any clashes by now, there might never be. Whereas in the bug-report thread I linked, it was pointed out that platforms already exist with different meanings assigned to the same bits of the stat flags, so the issue is more pressing there. 
-- Pieter Nagel From cf.natali at gmail.com Tue May 7 13:58:47 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Tue, 7 May 2013 13:58:47 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <5188E92D.2080100@fastmail.us> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> <51883FD4.1070503@canterbury.ac.nz> <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> <1367909510.2868.804.camel@basilisk> <5188E92D.2080100@fastmail.us> Message-ID: >> The permission bits are not a problem. I'm confident that the permission >> bits are specified by POSIX, and thus cross-platform (haven't >> double-checked though). > > They are not. What is specified, which you may be thinking of, is the > meaning of a numeric argument to the chmod _command_, from which you can no > more infer that the bits themselves are the same than you can for signal > numbers vs. the seven numeric values specified as arguments to the kill > command. No, they are actually standardized: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_stat.h.html

"""
Name     Numeric Value  Description
S_IRWXU  0700           Read, write, execute/search by owner.
"""

There's a *lot* of code which uses hardcoded permission values. File types are another matter, and I wouldn't rely on hardcoded values too much.
But this problem will soon be solved by Christian's reimplementation of the stat module, see http://bugs.python.org/issue11016 cf From christian at python.org Tue May 7 14:29:58 2013 From: christian at python.org (Christian Heimes) Date: Tue, 07 May 2013 14:29:58 +0200 Subject: [Python-ideas] PEP: Extended stat_result (First Draft) In-Reply-To: <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> References: <1367829004.2868.619.camel@basilisk> <1367846686.2868.668.camel@basilisk> <51883FD4.1070503@canterbury.ac.nz> <1367889553.528.YahooMailNeo@web184704.mail.ne1.yahoo.com> <1367899754.23623.140661227452109.1DF07949@webmail.messagingengine.com> Message-ID: On 07.05.2013 06:09, random832 at fastmail.us wrote: > On Mon, May 6, 2013, at 21:19, Andrew Barnert wrote: >> But fixing that requires just a one-line comment saying something like, >> "These values are not standardized, they just happen to work on every >> platform this version of this module is used with." > > On the subject of platforms they _don't_ work with... What happens to > code that hardcodes values like 0o644 on such platforms? I suspect there > is more code than there otherwise would have been due to these constants > not being in the os module where stat, open, mkdir, and chmod live (and > it doesn't break on the author's system). The Open Group specifies names *and* numeric values for the permission bits: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_stat.h.html Any POSIX-compatible system ought to use these values. I don't see an issue for permission bits. The S_IF* constants are a whole different story, though.
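Because the numeric values are fixed by the standard, the stat constants and hardcoded octal literals agree on conforming systems; a quick sanity check:

```python
import stat

# Permission-bit values fixed by POSIX <sys/stat.h>:
assert stat.S_IRWXU == 0o700  # read/write/execute by owner
assert stat.S_IRUSR == 0o400  # read by owner
assert stat.S_IWGRP == 0o020  # write by group
assert stat.S_IXOTH == 0o001  # execute/search by others

# So the usual literals are portable spellings of the symbolic forms:
assert stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO == 0o777
assert stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH == 0o644
print("permission bits match POSIX values")
```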
From rosuav at gmail.com Tue May 7 17:18:02 2013 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 8 May 2013 01:18:02 +1000 Subject: [Python-ideas] PEP 428 and is_file [was Re: PEP: Extended stat_result (First Draft)] In-Reply-To: <5188DB70.1030802@stoneleaf.us> References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> <20130507111345.5bd9e394@pitrou.net> <1367921133.2868.817.camel@basilisk> <20130507121833.76074186@pitrou.net> <5188DB70.1030802@stoneleaf.us> Message-ID: On Tue, May 7, 2013 at 8:46 PM, Ethan Furman wrote: > It may not be detrimental to the code, but it is to the user (at least to me > ;). If the path doesn't exist, it's obviously not a file. +1. It makes good sense. Though it probably won't help your argument to point out that shell scripts enjoy the same feature - see 'man 1 test' [1] - where "-f" means "exists and is a file", "-x" means "exists and is executable", etc. [1] eg http://unixhelp.ed.ac.uk/CGI/man-cgi?test ChrisA From ethan at stoneleaf.us Tue May 7 17:58:04 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 07 May 2013 08:58:04 -0700 Subject: [Python-ideas] PEP 428 and is_file [was Re: PEP: Extended stat_result (First Draft)] In-Reply-To: References: <1367829004.2868.619.camel@basilisk> <1367831895.2868.626.camel@basilisk> <20130506113023.07e1ce27@pitrou.net> <1367843367.2868.631.camel@basilisk> <20130506165801.GB17695@iskra.aviel.ru> <1367862584.2868.714.camel@basilisk> <20130507111345.5bd9e394@pitrou.net> <1367921133.2868.817.camel@basilisk> <20130507121833.76074186@pitrou.net> <5188DB70.1030802@stoneleaf.us> Message-ID: <5189248C.7060704@stoneleaf.us> On 05/07/2013 08:18 AM, Chris Angelico wrote: > On Tue, May 7, 2013 at 8:46 PM, Ethan Furman wrote: >> It may not be detrimental to the code, but it is to the user (at least to me >> ;). 
If the path doesn't exist, it's obviously not a file. > > +1. It makes good sense. Though it probably won't help your argument > to point out that shell scripts enjoy the same feature - see 'man 1 > test' [1] - where "-f" means "exists and is a file", "-x" means > "exists and is executable", etc. Not at all! If `stat.is_file` returns True it must exist and it must be a file! ;) -- ~Ethan~ From foolistbar at googlemail.com Tue May 7 20:03:14 2013 From: foolistbar at googlemail.com (Geoffrey Sneddon) Date: Tue, 07 May 2013 19:03:14 +0100 Subject: [Python-ideas] Trie ABC? In-Reply-To: <1367893345.62115.YahooMailNeo@web184703.mail.ne1.yahoo.com> References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> <8738tzmxun.fsf@uwakimon.sk.tsukuba.ac.jp> <1367893345.62115.YahooMailNeo@web184703.mail.ne1.yahoo.com> Message-ID: <518941E2.8060309@googlemail.com> On 07/05/13 03:22, Andrew Barnert wrote: >> >From: Stephen J. Turnbull >> >Sent: Monday, May 6, 2013 5:34 PM >> > >> >Nick Coghlan writes: >> > >>> >> This seems reasonable, although I think it may make more sense in >>> >> conjunction with a reference implementation in the standard library. >> > >> >Reasonable in the abstract, yes. But if it's possible to do as well >> >as timsort[1], IMHO providing an ABC rather than going directly to a >> >concrete implementation violates Occam's Razor. All I want[2] here is >> >somebody to tell me that "no, trie cannot (at this time) be reduced to >> >a single algorithm that can be tuned to perform very well in almost >> >all cases encountered in practice, and it will be common for different >> >users[3] to want to implement different algorithms." > > First, it's not an algorithm like sort, it's a data structure like set. > > And there's at least one important related but different data structure that has the same API as an (immutable) trie, but a different implementation: a branch-merged trie. 
This is a huge space savings for sparse lookup tables, like Unicode data. (Actually, there's a whole class of compacted tries that are useful, but the branch-merged one seems to be the one that half the Unicode-data implementations out there use, so I'm being conservative here and saying maybe just that one would be good enough.) > > Meanwhile, even if you stick with just the standard data structure, many implementations are hardcoded to deal only with ASCII (or UCS-2 or UTF-32 or some other fixed-size) characters instead of arbitrary objects. And the reason for this isn't just faster and simpler implementation, or ability to wrap up C code with a decade of optimization and testing behind it, but space. People store giant dictionaries in tries, and if our One True Trie took 4x as much space as an ASCII-only trie, people would still use the latter. Even more extreme than this is what the marisa-trie package does: it has different implementations depending on value type. It has one that can hold any Python object, one that can hold identically typed tuples using magic with the struct module, and one that can hold bytes objects. When dealing with giant dictionaries, how you encode the value is just as important as the key. Note also libdatrie uses an alphabetmap, which allows arbitrary subsets of the Unicode plane to be used. Of course, this then makes it impossible to add a new arbitrary key without recreating the entire trie, so the mutable/immutable distinction is important. > So, could we get away with a single trie? No, but maybe a single frozentrie and a single trie (although note that converting between them wouldn't be constant space/time, unlike frozenset/set), with both doing the same kind of transparent fallback as str (use ASCII if possible, UCS-2 if possible, UTF-32 if possible, PyObject* if the elements aren't single-char strings at all), would be a good enough 80% solution. I'd be totally in favour of putting such a thing in the stdlib ...
if someone ever wrote it. :) But yes: I think Andrew has succinctly explained why I'm not rushing into trying to push a specific implementation into the stdlib. I'm not against putting one in, but it'll need to be carefully chosen: any pure-Python one will likely have undesirable memory usage due to the overhead per string object (there's a reason my implementation implements the interface using a binary-search-tree and not a trie!); and most modern C implementations are separate libraries: the Python wrapper is only really interesting from an API POV, so it doesn't matter too much if we choose something not used before. /gsnedders From abarnert at yahoo.com Tue May 7 23:22:21 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 7 May 2013 14:22:21 -0700 (PDT) Subject: [Python-ideas] Trie ABC? In-Reply-To: <518941E2.8060309@googlemail.com> References: <5186BB01.8080805@googlemail.com> <871u9klmxn.fsf@uwakimon.sk.tsukuba.ac.jp> <8738tzmxun.fsf@uwakimon.sk.tsukuba.ac.jp> <1367893345.62115.YahooMailNeo@web184703.mail.ne1.yahoo.com> <518941E2.8060309@googlemail.com> Message-ID: <1367961741.99230.YahooMailNeo@web184702.mail.ne1.yahoo.com> From: Geoffrey Sneddon Sent: Tuesday, May 7, 2013 11:03 AM > But yes: I think Andrew has succinctly explained why I'm not rushing into > trying to push a specific implementation into the stdlib. I'm not against > putting one in, but it'll need to be carefully chosen: any pure-Python one > will likely have undesirable memory usage due to the overhead per string object > (there's a reason my implementation implements the interface using a > binary-search-tree and not a trie!); and most modern C implementations are > separate libraries: the Python wrapper is only really interesting from an API > POV, so it doesn't matter too much if we choose something not used before.
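For concreteness, the kind of naive pure-Python trie being weighed here (one dict per node) takes only a few lines; its per-node dict overhead is exactly the memory problem mentioned above:

```python
class Trie:
    """Naive mapping-style trie: one dict per node.  Sketch only --
    a serious stdlib candidate would need compaction to control memory."""
    _MISSING = object()

    def __init__(self):
        self._children = {}
        self._value = Trie._MISSING

    def __setitem__(self, key, value):
        node = self
        for ch in key:
            node = node._children.setdefault(ch, Trie())
        node._value = value

    def __getitem__(self, key):
        node = self
        for ch in key:
            if ch not in node._children:
                raise KeyError(key)
            node = node._children[ch]
        if node._value is Trie._MISSING:
            raise KeyError(key)
        return node._value

    def keys_with_prefix(self, prefix):
        """Yield all stored keys starting with prefix -- the query that
        makes a trie worth having over a plain dict."""
        node = self
        for ch in prefix:
            if ch not in node._children:
                return
            node = node._children[ch]
        stack = [(prefix, node)]
        while stack:
            key, n = stack.pop()
            if n._value is not Trie._MISSING:
                yield key
            for ch, child in n._children.items():
                stack.append((key + ch, child))

t = Trie()
t["foo"] = 1
t["food"] = 2
print(sorted(t.keys_with_prefix("foo")))  # ['foo', 'food']
```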
That raises another interesting point: The same implementation can support both a Trie interface and a SortedMapping interface (or, for that matter, both SortedMapping and Heap), but not _every_ implementation can, or should. Which is a good argument for having both ABCs in the stdlib. If you've got a binary tree that can support Trie (which I believe means a slower log N, but still log N, for the relevant operations?), why not inherit collections.abc.MutableTrie, collections.abc.MutableSortedMapping? From cf.natali at gmail.com Wed May 8 14:39:29 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Wed, 8 May 2013 14:39:29 +0200 Subject: [Python-ideas] improving C structs layout Message-ID: Hi, I was recently looking at the PyThreadState data structure (for issue #17912, but it's unimportant), and noticed that the layout of the members leaves some holes (due to alignment). While it doesn't matter too much for PyThreadState (because of trailing padding), I wondered whether other structures in the code base could benefit from a better layout. So I ran pahole [1], and found the following structures:

$ pahole -P python
PyMemberDef          40   32   8
wrapperbase          56   48   8
unicode_formatter_t 136  128   8
_expr                56   52   4
_stmt                72   68   4
_excepthandler       40   36   4
_node                40   32   8
compiler_unit       448  440   8
tok_state           992  984   8

The first column is the current size, and the second column the size after a more judicious layout. For example.
Before:

$ pahole -C wrapperbase python
struct wrapperbase {
    char *       name;           /*  0  8 */
    int          offset;         /*  8  4 */

    /* XXX 4 bytes hole, try to pack */

    void *       function;       /* 16  8 */
    wrapperfunc  wrapper;        /* 24  8 */
    char *       doc;            /* 32  8 */
    int          flags;          /* 40  4 */

    /* XXX 4 bytes hole, try to pack */

    PyObject *   name_strobj;    /* 48  8 */

    /* size: 56, cachelines: 1, members: 7 */
    /* sum members: 48, holes: 2, sum holes: 8 */
    /* last cacheline: 56 bytes */
};

After:

$ pahole -C wrapperbase -R python
struct wrapperbase {
    char *       name;           /*  0  8 */
    int          offset;         /*  8  4 */
    int          flags;          /* 12  4 */
    void *       function;       /* 16  8 */
    wrapperfunc  wrapper;        /* 24  8 */
    char *       doc;            /* 32  8 */
    PyObject *   name_strobj;    /* 40  8 */

    /* size: 48, cachelines: 1, members: 7 */
    /* last cacheline: 48 bytes */
}; /* saved 8 bytes! */

While some of the structs above aren't worth the trouble (like tok_state), I think some might be interesting candidates. This could lead to reduced memory usage (well, of course it depends on the number of instances), and better cache usage/locality of reference. So what do you think, is it worth it? cf [1] https://github.com/acmel/dwarves From asolano at icai.es Wed May 8 15:45:09 2013 From: asolano at icai.es (Alfredo Solano) Date: Wed, 08 May 2013 15:45:09 +0200 Subject: [Python-ideas] improving C structs layout In-Reply-To: References: Message-ID: <518A56E5.7090203@icai.es> Hi, Interesting observation, but isn't C struct alignment platform/compiler dependent? That would mean optimizing the member declaration order for one architecture may have a performance hit for another one. Alfredo On 05/08/2013 02:39 PM, Charles-François Natali wrote: > Hi, > > I was recently looking at the PyThreadState data structure (for issue > #17912, but it's unimportant), and noticed that the layout of the > members leaves some holes (due to alignment).
> While it doesn't matter too much for PyThreadState (because of
> trailing padding), I wondered whether other structures in the code
> base could benefit from a better layout.
> So I ran pahole [1], and found the following structures:
>
> $ pahole -P python
> PyMemberDef 40 32 8
> wrapperbase 56 48 8
> unicode_formatter_t 136 128 8
> _expr 56 52 4
> _stmt 72 68 4
> _excepthandler 40 36 4
> _node 40 32 8
> compiler_unit 448 440 8
> tok_state 992 984 8
>
> The first column is the current size, and the second column the size
> after a more judicious layout.
>
> For example.
> Before:
> $ pahole -C wrapperbase python
> struct wrapperbase {
> char * name; /* 0 8 */
> int offset; /* 8 4 */
>
> /* XXX 4 bytes hole, try to pack */
>
> void * function; /* 16 8 */
> wrapperfunc wrapper; /* 24 8 */
> char * doc; /* 32 8 */
> int flags; /* 40 4 */
>
> /* XXX 4 bytes hole, try to pack */
>
> PyObject * name_strobj; /* 48 8 */
>
> /* size: 56, cachelines: 1, members: 7 */
> /* sum members: 48, holes: 2, sum holes: 8 */
> /* last cacheline: 56 bytes */
> };
>
> After:
> $ pahole -C wrapperbase -R python
> struct wrapperbase {
> char * name; /* 0 8 */
> int offset; /* 8 4 */
> int flags; /* 12 4 */
> void * function; /* 16 8 */
> wrapperfunc wrapper; /* 24 8 */
> char * doc; /* 32 8 */
> PyObject * name_strobj; /* 40 8 */
>
> /* size: 48, cachelines: 1, members: 7 */
> /* last cacheline: 48 bytes */
> }; /* saved 8 bytes! */
>
> While some of the structs above aren't worth the trouble (like
> tok_state), I think some might be interesting candidates.
> This could lead to reduced memory usage (well, of course it depends on
> the number of instances), and better cache usage/locality of
> reference.
>
> So what do you think, is it worth it?
> > cf > > [1] https://github.com/acmel/dwarves > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From solipsis at pitrou.net Wed May 8 15:52:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 8 May 2013 15:52:51 +0200 Subject: [Python-ideas] improving C structs layout References: <518A56E5.7090203@icai.es> Message-ID: <20130508155251.5d139068@fsol> On Wed, 08 May 2013 15:45:09 +0200 Alfredo Solano wrote: > Hi, > > Interesting observation, but isn't C struct alignment platform/compiler > dependent? The ABIs are standardized, so I would answer no. Even if they weren't, there are common sense rules to minimize padding, such as to put fields of the same width next to each other (e.g. put chars together instead of intermingling them with ints and floats). Regards Antoine. From ncoghlan at gmail.com Wed May 8 16:47:58 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 May 2013 00:47:58 +1000 Subject: [Python-ideas] improving C structs layout In-Reply-To: References: Message-ID: General +0 from me. -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Wed May 8 17:55:38 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 08 May 2013 11:55:38 -0400 Subject: [Python-ideas] improving C structs layout In-Reply-To: <518A56E5.7090203@icai.es> References: <518A56E5.7090203@icai.es> Message-ID: <1368028538.6569.140661228173977.6F467288@webmail.messagingengine.com> On Wed, May 8, 2013, at 9:45, Alfredo Solano wrote: > Hi, > > Interesting observation, but isn't C struct alignment platform/compiler > dependent? That would mean optimizing the member declaration order for > one architecture may have a performance hit for another one. 
It's platform-dependent, yes, but generally any given platform is going to either A) not care or B) generally place holes when there are smaller types before larger types, and not place them when there are larger types before smaller types. And it's unlikely that you'd get a performance hit for this, the whole point of the padding is to _not_ have member order impact performance. From abarnert at yahoo.com Wed May 8 18:17:27 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 8 May 2013 09:17:27 -0700 Subject: [Python-ideas] improving C structs layout In-Reply-To: <20130508155251.5d139068@fsol> References: <518A56E5.7090203@icai.es> <20130508155251.5d139068@fsol> Message-ID: On May 8, 2013, at 6:52, Antoine Pitrou wrote: > On Wed, 08 May 2013 15:45:09 +0200 > Alfredo Solano wrote: >> Hi, >> >> Interesting observation, but isn't C struct alignment platform/compiler >> dependent? > > The ABIs are standardized, so I would answer no. What standard are you talking about? There's certainly no ABI standard that covers both Win64 and ARM7 Linux. There are very definitely different packing and alignment rules for different platforms that python runs on. > Even if they weren't, there are common sense rules to minimize padding, > such as to put fields of the same width next to each other (e.g. put > chars together instead of intermingling them with ints and floats). This is true... But you have to keep in mind that the width of different types is itself platform-dependent. If you've got an int, a long, a pointer, and a double, how do you pack them in a way that makes sense for all platforms, or even just the big 3 of x86_64 Mac/Linux/BSD, Win64, and Win32? All that being said, I think the right thing is to abandon the pretense of portability and look at the actual platforms that matter, and prioritize accordingly. 
If something is ideal on both Win32 and x86_64 Mac/Linux/BSD, good on Win64, no worse than today on ARM7 and x86 Linux/BSD, but worse on, say, 32-bit PowerPC Linux/AIX... That's probably a tradeoff worth having, right? From random832 at fastmail.us Wed May 8 18:25:36 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 08 May 2013 12:25:36 -0400 Subject: [Python-ideas] improving C structs layout In-Reply-To: References: <518A56E5.7090203@icai.es> <20130508155251.5d139068@fsol> Message-ID: <1368030336.14958.140661228187209.124346AA@webmail.messagingengine.com> On Wed, May 8, 2013, at 12:17, Andrew Barnert wrote: > This is true... But you have to keep in mind that the width of different > types is itself platform-dependent. If you've got an int, a long, a > pointer, and a double, how do you pack them in a way that makes sense for > all platforms, or even just the big 3 of x86_64 Mac/Linux/BSD, Win64, and > Win32? A double is always going to be at least 64 bits. A long is usually going to be 32 or 64 bits. A pointer is usually going to be 32 or 64 bits, and on win64 (the only common one where long and pointer are different), long is 32 and pointer is 64. An int is usually 32 bits. From solipsis at pitrou.net Wed May 8 18:59:09 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 8 May 2013 18:59:09 +0200 Subject: [Python-ideas] improving C structs layout References: <518A56E5.7090203@icai.es> <20130508155251.5d139068@fsol> Message-ID: <20130508185909.5ba47e0c@fsol> On Wed, 8 May 2013 09:17:27 -0700 Andrew Barnert wrote: > On May 8, 2013, at 6:52, Antoine Pitrou wrote: > > > On Wed, 08 May 2013 15:45:09 +0200 > > Alfredo Solano wrote: > >> Hi, > >> > >> Interesting observation, but isn't C struct alignment platform/compiler > >> dependent? > > > > The ABIs are standardized, so I would answer no. > > What standard are you talking about? There's certainly no ABI standard that covers both Win64 and ARM7 Linux. Per-platform ABI standards. 
Compilers aren't generally free to invent things if they want to be interoperable with each other. > There are very definitely different packing and alignment rules for different platforms that python runs on. > > > Even if they weren't, there are common sense rules to minimize padding, > > such as to put fields of the same width next to each other (e.g. put > > chars together instead of intermingling them with ints and floats). > > This is true... But you have to keep in mind that the width of different types is itself platform-dependent. If you've got an int, a long, a pointer, and a double, how do you pack them in a way that makes sense for all platforms, or even just the big 3 of x86_64 Mac/Linux/BSD, Win64, and Win32? Well, you can be sure that int <= long, and in most cases you can assume other inequalities such as int <= pointer and int <= double (and even pointer <= double). The long <=> pointer relationship is less predictable, but on common platforms long <= pointer. So, double then pointer then long then int. (of course, there may be other concerns such as ensuring proximity of fields which are often looked up together) Regards Antoine. From python at mrabarnett.plus.com Wed May 8 19:10:39 2013 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 08 May 2013 18:10:39 +0100 Subject: [Python-ideas] improving C structs layout In-Reply-To: References: <518A56E5.7090203@icai.es> <20130508155251.5d139068@fsol> Message-ID: <518A870F.4020502@mrabarnett.plus.com> On 08/05/2013 17:17, Andrew Barnert wrote: > On May 8, 2013, at 6:52, Antoine Pitrou wrote: > >> On Wed, 08 May 2013 15:45:09 +0200 >> Alfredo Solano wrote: >>> Hi, >>> >>> Interesting observation, but isn't C struct alignment platform/compiler >>> dependent? >> >> The ABIs are standardized, so I would answer no. > > What standard are you talking about? There's certainly no ABI standard that covers both Win64 and ARM7 Linux. 
> > There are very definitely different packing and alignment rules for different platforms that python runs on. > >> Even if they weren't, there are common sense rules to minimize padding, >> such as to put fields of the same width next to each other (e.g. put >> chars together instead of intermingling them with ints and floats). > > This is true... But you have to keep in mind that the width of different types is itself platform-dependent. If you've got an int, a long, a pointer, and a double, how do you pack them in a way that makes sense for all platforms, or even just the big 3 of x86_64 Mac/Linux/BSD, Win64, and Win32? > > All that being said, I think the right thing is to abandon the pretense of portability and look at the actual platforms that matter, and prioritize accordingly. If something is ideal on both Win32 and x86_64 Mac/Linux/BSD, good on Win64, no worse than today on ARM7 and x86 Linux/BSD, but worse on, say, 32-bit PowerPC Linux/AIX... That's probably a tradeoff worth having, right? > If you're dealing with members whose sizes are powers of 2, I think that ordering them in decreasing size works pretty well. In fact, I think that it might be a good compromise even if the sizes of some of the members aren't a power of 2. From abarnert at yahoo.com Wed May 8 19:51:17 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 8 May 2013 10:51:17 -0700 Subject: [Python-ideas] improving C structs layout In-Reply-To: <20130508185909.5ba47e0c@fsol> References: <518A56E5.7090203@icai.es> <20130508155251.5d139068@fsol> <20130508185909.5ba47e0c@fsol> Message-ID: <22600743-888E-4E16-BDEA-34AF329C0DC8@yahoo.com> On May 8, 2013, at 9:59, Antoine Pitrou wrote: > On Wed, 8 May 2013 09:17:27 -0700 > Andrew Barnert wrote: > >> On May 8, 2013, at 6:52, Antoine Pitrou wrote: >> >>> On Wed, 08 May 2013 15:45:09 +0200 >>> Alfredo Solano wrote: >>>> Hi, >>>> >>>> Interesting observation, but isn't C struct alignment platform/compiler >>>> dependent? 
>>> >>> The ABIs are standardized, so I would answer no. >> >> What standard are you talking about? There's certainly no ABI standard that covers both Win64 and ARM7 Linux. > > Per-platform ABI standards. Compilers aren't generally free to invent > things if they want to be interoperable with each other. That's my point. You said it wasn't platform dependent, and that's clearly not true. And python runs on multiple platforms. So Unless you want to have different layouts for each platform, there is no standard. >>> Even if they weren't, there are common sense rules to minimize padding, >>> such as to put fields of the same width next to each other (e.g. put >>> chars together instead of intermingling them with ints and floats). >> >> This is true... But you have to keep in mind that the width of different types is itself platform-dependent. If you've got an int, a long, a pointer, and a double, how do you pack them in a way that makes sense for all platforms, or even just the big 3 of x86_64 Mac/Linux/BSD, Win64, and Win32? > > Well, you can be sure that int <= long, and in most cases you can assume > other inequalities such as int <= pointer and int <= double (and even > pointer <= double). The long <=> pointer relationship is less > predictable, but on common platforms long <= pointer. > > So, double then pointer then long then int. On a platform with 64-bit long and 32-bit pointer, this would leave a gap. Fortunately, even though this is perfectly legal, no major platform has these sizes. And that's exactly why you have to prioritize for the most important platforms, instead of only pretending that standards make that unnecessary. I think you were already implicitly doing that, but you were explicitly claiming not to. 
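The platform dependence being debated here is observable from Python itself: the struct module's native mode applies the host compiler's padding rules, and a ctypes.Structure that declares a small field before a larger one picks up the same kind of hole. The concrete sizes in the comments are what a typical 64-bit build reports; other ABIs may differ, which is exactly the point under discussion.

```python
import ctypes
import struct

# Native mode: an int after a char forces padding; the reverse order does not.
print(struct.calcsize("ci"))   # typically 8 (3 padding bytes after the char)
print(struct.calcsize("ic"))   # typically 5 (no hole)

class Holey(ctypes.Structure):
    _fields_ = [("flag", ctypes.c_char),    # 1 byte, then a hole
                ("ptr", ctypes.c_void_p),   # pointer-aligned
                ("flag2", ctypes.c_char)]   # 1 byte, then trailing padding

class Grouped(ctypes.Structure):
    _fields_ = [("ptr", ctypes.c_void_p),
                ("flag", ctypes.c_char),    # chars kept together
                ("flag2", ctypes.c_char)]

print(ctypes.sizeof(Holey), ctypes.sizeof(Grouped))  # e.g. 24 vs 16 on LP64
```

Whatever the exact numbers on a given platform, the grouped ordering never comes out larger, which is why the "decreasing alignment" rule of thumb travels reasonably well.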
From cf.natali at gmail.com Wed May 8 19:58:00 2013
From: cf.natali at gmail.com (Charles-François Natali)
Date: Wed, 8 May 2013 19:58:00 +0200
Subject: [Python-ideas] improving C structs layout
In-Reply-To:
References:
Message-ID:

> General +0 from me.

Thanks for your help Nick ;-)

Otherwise, yes, a good rule of thumb is to order fields by decreasing alignment constraints (which is in turn dependent on the field size, since they're usually aligned on a multiple of their size).

So what's the consensus, should I bother creating an entry in the tracker?

cf

From solipsis at pitrou.net Wed May 8 19:58:59 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 08 May 2013 19:58:59 +0200
Subject: [Python-ideas] improving C structs layout
In-Reply-To: <22600743-888E-4E16-BDEA-34AF329C0DC8@yahoo.com>
References: <518A56E5.7090203@icai.es> <20130508155251.5d139068@fsol> <20130508185909.5ba47e0c@fsol> <22600743-888E-4E16-BDEA-34AF329C0DC8@yahoo.com>
Message-ID: <1368035939.2531.13.camel@fsol>

Le mercredi 08 mai 2013 à 10:51 -0700, Andrew Barnert a écrit :
> > Well, you can be sure that int <= long, and in most cases you can assume
> > other inequalities such as int <= pointer and int <= double (and even
> > pointer <= double). The long <=> pointer relationship is less
> > predictable, but on common platforms long <= pointer.
> >
> > So, double then pointer then long then int.
>
> On a platform with 64-bit long and 32-bit pointer, this would leave a
> gap.

And this isn't very interesting since:

> Fortunately, even though this is perfectly legal, no major platform
> has these sizes.

... which is precisely my point ("on common platforms long <= pointer").

> And that's exactly why you have to prioritize for the most important
> platforms, instead of only pretending that standards make that
> unnecessary.

I didn't pretend so, but perhaps you are trolling. Besides, we are talking about an optimization, not something which breaks expected behaviour.
Regards Antoine. From haoyi.sg at gmail.com Wed May 8 22:04:33 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 8 May 2013 16:04:33 -0400 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: Just an update for people who are interested, The project (https://github.com/lihaoyi/macropy) is more or less done for now, in its current state as a proof of concept/demo. Almost all of it runs perfectly on both CPython and PyPy, except for the pattern matcher which has some bugs on PyPy we haven't ironed out yet. Jython doesn't work at all: it seems to handle a number of things about the ast module pretty differently from either PyPy or CPython. We've got a pretty impressive list of feature demos: - Quasiquotes , a quick way to manipulate fragments of a program - String Interpolation, a common feature in many languages - Pyxl , integrating XML markup into a Python program - Tracing and Smart Asserts - Case Classes , easy Algebraic Data Types from Scala - Pattern Matching from the Functional Programming world - LINQ to SQL from C# - Quick Lambdas from Scala and Groovy, - Parser Combinators, inspired by Scala's . And have pushed a release to PyPI (https://pypi.python.org/pypi/MacroPy), to make it easier for people to download it and mess around. Hopefully somebody will find this useful in messing around with the Python language! Thanks! -Haoyi On Sat, Apr 27, 2013 at 11:05 PM, Haoyi Li wrote: > I pushed a simple implementation of case classes using > Macros, as well as a really nice to use parser combinator library. > The case classes are interesting because they overlap a lot with > enumerations: auto-generated __str__, __repr__, inheritence via nesting, > they can have members and methods, etc. > > They also show off pretty well how far Python's syntax (and semantic!) 
can > be stretched using macros, so if anyone still has some crazy ideas for > enumerations and wants to prototype them without hacking the CPython > interpreter, this is your chance! > > Thanks! > -Haoyi > > > On Wed, Apr 24, 2013 at 3:15 PM, Haoyi Li wrote: > >> @Jonathan: That would be possible, although I can't say I know how to do >> it. A naive macro that wraps everything and has a "substitute awaits for >> yields, wrap them in inlineCallbacks(), and substitute returns for >> returnValue()s" may work, but I'm guessing it would run into a forest of >> edge cases where the code isn't so simple (what if you *want* a return? >> etc.). >> >> pdb *should* show the code after macro expansion. Without source maps, >> I'm not sure there's any way around that, so debugging may be hard. >> >> Of course, if the alternative is macros of forking the interpreter, maybe >> macros is the easier way to do it =) Debugging a buggy custom-forked >> interpreter probably isn't easy either! >> >> >> On Wed, Apr 24, 2013 at 5:48 PM, Jonathan Slenders wrote: >> >>> One use case I have is for Twisted's inlineCallbacks. I forked the >>> pypy project to implement the await-keyword. Basically it transforms: >>> >>> def async_function(deferred_param): >>> a = await deferred_param >>> b = await some_call(a) >>> return b >>> >>> into: >>> >>> @defer.inlineCallbacks >>> def async_function(deferred_param): >>> a = yield deferred_param >>> b = yield some_call(a) >>> yield defer.returnValue(b) >>> >>> >>> Are such things possible? And if so, what lines of code would pdb show >>> during introspection of the code? >>> >>> It's interesting, but when macros become more complicated, the >>> debugging of these things can turn out to be really hard, I think. >>> >>> >>> 2013/4/24 Haoyi Li : >>> > I haven't tested in on various platforms, so hard to say for sure. 
>>> MacroPy >>> > basically relies on a few things: >>> > >>> > - exec/eval >>> > - PEP 302 >>> > - the ast module >>> > >>> > All of these are pretty old pieces of python (almost 10 years old!) so >>> it's >>> > not some new-and-fancy functionality. Jython seems to have all of >>> them, I >>> > couldn't find any information about PyPy. >>> > >>> > When the project is more mature and I have some time, I'll see if I >>> can get >>> > it to work cross platform. If anyone wants to fork the repo and try it >>> out, >>> > that'd be great too! >>> > >>> > -Haoyi >>> > >>> > >>> > >>> > >>> > >>> > On Wed, Apr 24, 2013 at 11:55 AM, Andrew Barnert >>> wrote: >>> >> >>> >> On Apr 24, 2013, at 8:05, Haoyi Li wrote: >>> >> >>> >> You actually can get a syntax like that without macros, using >>> >> stack-introspection, locals-trickery and lots of `eval`. The question >>> is >>> >> whether you consider macros more "extreme" than stack-introspection, >>> >> locals-trickery and `eval`! A JIT compiler will probably be much >>> happier >>> >> with macros. >>> >> >>> >> >>> >> That last point makes this approach seem particularly interesting to >>> me, >>> >> which makes me wonder: Is your code CPython specific, or does it also >>> work >>> >> with PyPy (or Jython or Iron)? While PyPy is obviously a whole lot >>> easier to >>> >> mess with in the first place than CPython, having macros at the same >>> >> language level as your code is just as interesting in both >>> implementations. >>> >> >>> >> >>> >> On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy >>> >> wrote: >>> >>> >>> >>> On 4/23/2013 11:49 PM, Haoyi Li wrote: >>> >>>> >>> >>>> I thought this may be of interest to some people on this list, even >>> if >>> >>>> not strictly an "idea". >>> >>>> >>> >>>> I'm working on MacroPy , a >>> little >>> >>>> >>> >>>> pure-python library that allows user-defined AST rewrites as part >>> of the >>> >>>> import process (using PEP 302). 
>>> >>> >>> >>> >>> >>> From the readme >>> >>> ''' >>> >>> String Interpolation >>> >>> >>> >>> a, b = 1, 2 >>> >>> c = s%"%{a} apple and %{b} bananas" >>> >>> print c >>> >>> #1 apple and 2 bananas >>> >>> ''' >>> >>> I am a little surprised that you would base a cutting edge extension >>> on >>> >>> Py 2. Do you have it working with 3.3 also? >>> >>> >>> >>> '''Unlike the normal string interpolation in Python, MacroPy's string >>> >>> interpolation allows the programmer to specify the variables to be >>> >>> interpolated inline inside the string.''' >>> >>> >>> >>> Not true as I read that. >>> >>> >>> >>> a, b = 1, 2 >>> >>> print("{a} apple and {b} bananas".format(**locals())) >>> >>> print("%(a)s apple and %(b)s bananas" % locals()) >>> >>> #1 apple and 2 bananas >>> >>> #1 apple and 2 bananas >>> >>> >>> >>> I rather like the anon funcs with anon params. That only works when >>> each >>> >>> param is only used once in the expression, but that restriction is >>> the >>> >>> normal case. >>> >>> >>> >>> I am interested to see what you do with pattern matching. >>> >>> >>> >>> tjr >>> >>> >>> >>> _______________________________________________ >>> >>> Python-ideas mailing list >>> >>> Python-ideas at python.org >>> >>> http://mail.python.org/mailman/listinfo/python-ideas >>> >> >>> >> >>> >> _______________________________________________ >>> >> Python-ideas mailing list >>> >> Python-ideas at python.org >>> >> http://mail.python.org/mailman/listinfo/python-ideas >>> > >>> > >>> > >>> > _______________________________________________ >>> > Python-ideas mailing list >>> > Python-ideas at python.org >>> > http://mail.python.org/mailman/listinfo/python-ideas >>> > >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
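The machinery MacroPy leans on is all standard library: a PEP 302 import hook feeds the source through the ast module before it is compiled. A minimal toy version of such a rewrite — not MacroPy's actual API, and applied by hand rather than at import time — looks like this:

```python
import ast

class SwapAddForMult(ast.NodeTransformer):
    """Toy 'macro': rewrite every `a + b` expression into `a * b`."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # rewrite nested operands first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("result = 3 + 4")
tree = ast.fix_missing_locations(SwapAddForMult().visit(tree))

ns = {}
exec(compile(tree, "<macro-expanded>", "exec"), ns)
print(ns["result"])  # 12 -- the + really was rewritten before compilation
```

A real macro system wraps the same three steps (parse, transform, compile) inside an import hook so that every module opting in gets rewritten transparently — which is also why pdb shows the post-expansion code.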
URL:

From tjreedy at udel.edu Wed May 8 23:32:45 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Wed, 08 May 2013 17:32:45 -0400
Subject: [Python-ideas] improving C structs layout
In-Reply-To:
References:
Message-ID:

On 5/8/2013 1:58 PM, Charles-François Natali wrote:
>> General +0 from me.
>
> Thanks for your help Nick ;-)

First question: is there any downside in terms of breaking code? Is the order of fields so private that we ignore any order-dependent code? For someone accessing structure bytes by system-dependent pointer offset, I imagine so. But are these structs things that ctypes users would use, and do such users have to duplicate the struct definition, including field order, in their code? (I am not a ctypes user, but know there are ctypes issues around structs and alignment.)

> Otherwise, yes, a good rule of thumb is to order fields by decreasing
> alignment constraints (which is in turn dependent on the field size,
> since they're usually aligned on a multiple of their size).
>
> So what's the consensus, should I bother creating an entry in the tracker?

I think this is a 'scratch my itch' kind of issue. Assuming no downside, I suggest opening an issue with a patch for the change you think would give the most benefit and see what response you get. How much space might be saved in a real application? If necessary, try python-dev also.
tjr From duda.piotr at gmail.com Thu May 9 12:29:45 2013 From: duda.piotr at gmail.com (Piotr Duda) Date: Thu, 9 May 2013 12:29:45 +0200 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects Message-ID: In discussion about http://www.python.org/dev/peps/pep-0435/ being mentioned problems with enums created with functional syntax: - need for explicit pass of type name to "function" that create enum type: Animals = Enum('Animals', 'dog cat bird') which violates DRY - pickling which requires __module__ atribute to be set, proposed solutions for this was either use non-portable and fragile _getframe hack, or explicit pass module name: Animals = Enum('animals.Animals', 'dog cat bird') aside violating DRY, this solution also has other problems, it won't work correctly if module is executed directly, and will break if pickling nested classes from http://www.python.org/dev/peps/pep-3154/ will be implemented (it may by bypassed by either provide separate arguments for module and type name, or using different separator for separating module and class name) These also apply for other objects like NamedTuple or mentioned NamedValues. To solve these problems I propose to add simple syntax that assigns these attributes to arbitrary object: def name = expression other possible forms may be: def name from expression class name = expression class name from expression name := expression # new operator which would be equivalent for: _tmp = expression _tmp.__name__ = 'name' _tmp.__qualname__ = ... # corresponding qualname _tmp.__module__ = __name__ # apply decorators if present name = _tmp with new syntax declaring Enum will look like def Animals = Enum('dog cat bird') as pointed by Larry it may be done using existing syntax in form: @Enum('dog cat bird') def Animals(): pass but it's ugly, and may by confusing. Other examples: def MyTuple = NamedTuple("a b c d") def PI = NamedValue(3.1415926) -- ???????? ?????? 
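The semantics Piotr spells out can be emulated today with an explicit helper. This is only a sketch — `def_bind` and `FakeEnum` are made-up names, and a real implementation would also compute a proper dotted __qualname__ for nested scopes — but it shows what the proposed statement would have to do at runtime:

```python
def def_bind(namespace, name, obj):
    """Runtime equivalent of the proposed `def name = expression`."""
    obj.__name__ = name
    obj.__qualname__ = name                       # no enclosing scope here
    obj.__module__ = namespace.get("__name__", "__main__")
    namespace[name] = obj
    return obj

class FakeEnum:
    """Stand-in for what Enum('dog cat bird') would return."""
    def __init__(self, members):
        self.members = members.split()

# emulates: def Animals = Enum('dog cat bird')
def_bind(globals(), "Animals", FakeEnum("dog cat bird"))
print(Animals.__name__, Animals.__module__, Animals.members)
```

Note that the helper still has to be handed the namespace and the target name explicitly — which is precisely the DRY violation the syntax proposal is trying to remove.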
From p.f.moore at gmail.com Thu May 9 13:19:10 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 9 May 2013 12:19:10 +0100 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: Message-ID: On 9 May 2013 11:29, Piotr Duda wrote: > > To solve these problems I propose to add simple syntax that assigns > these attributes to arbitrary object: > def name = expression > other possible forms may be: > def name from expression > class name = expression > class name from expression > name := expression # new operator > > > which would be equivalent for: > _tmp = expression > _tmp.__name__ = 'name' > _tmp.__qualname__ = ... # corresponding qualname > _tmp.__module__ = __name__ > # apply decorators if present > name = _tmp > Just for clarification, if you used this syntax with an expression which returned an object which *didn't* allow attributes to be set, I assume it would simply fail at runtime with an AttributeError? For example, def x = 12 This isn't a point against the syntax, I just think it's worth being explicit that this is what would happen. Overall, I'm somewhat indifferent. The use case seems fairly specialised to me, and yet the syntax "def name = value" seems like it's worth reserving for something a bit more generally useful. Maybe the def name=value syntax should implement a protocol, that objects like enum and namedtuple subclasses can hook into (in the same way that the context manager and iterator protocols work, or indeed the whole class definition mechanism). Paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From duda.piotr at gmail.com Thu May 9 13:52:01 2013 From: duda.piotr at gmail.com (Piotr Duda) Date: Thu, 9 May 2013 13:52:01 +0200 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: Message-ID: 2013/5/9 Paul Moore : > On 9 May 2013 11:29, Piotr Duda wrote: >> >> >> To solve these problems I propose to add simple syntax that assigns >> these attributes to arbitrary object: >> def name = expression >> other possible forms may be: >> def name from expression >> class name = expression >> class name from expression >> name := expression # new operator >> >> >> which would be equivalent for: >> _tmp = expression >> _tmp.__name__ = 'name' >> _tmp.__qualname__ = ... # corresponding qualname >> _tmp.__module__ = __name__ >> # apply decorators if present >> name = _tmp > > > Just for clarification, if you used this syntax with an expression which > returned an object which *didn't* allow attributes to be set, I assume it > would simply fail at runtime with an AttributeError? For example, > > def x = 12 Yes, it fails, I thought about ignoring exceptions on attribute assignment, but then the syntax wouldn't provide any guarantees and in those cases it will be equivalent of simple assignment. > > This isn't a point against the syntax, I just think it's worth being > explicit that this is what would happen. > > Overall, I'm somewhat indifferent. The use case seems fairly specialised to > me, and yet the syntax "def name = value" seems like it's worth reserving > for something a bit more generally useful. > > Maybe the def name=value syntax should implement a protocol, that objects > like enum and namedtuple subclasses can hook into (in the same way that the > context manager and iterator protocols work, or indeed the whole class > definition mechanism). This may be good idea. -- ???????? ?????? 
From storchaka at gmail.com Thu May 9 15:53:34 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 09 May 2013 16:53:34 +0300
Subject: [Python-ideas] improving C structs layout
In-Reply-To:
References:
Message-ID:

09.05.13 00:32, Terry Jan Reedy wrote:
> First question: is there any downside in terms of breaking code?

It will break some sys.getsizeof() tests. These tests will need to be corrected whenever the structure layout changes.

From ncoghlan at gmail.com Thu May 9 16:08:52 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 May 2013 00:08:52 +1000
Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects
In-Reply-To:
References:
Message-ID:

On 9 May 2013 21:53, "Piotr Duda" wrote:
>
> 2013/5/9 Paul Moore :
> > On 9 May 2013 11:29, Piotr Duda wrote:
> >>
> >>
> >> To solve these problems I propose to add simple syntax that assigns
> >> these attributes to arbitrary object:
> >> def name = expression
> >> other possible forms may be:
> >> def name from expression
> >> class name = expression
> >> class name from expression
> >> name := expression # new operator

One more possible colour for the bikeshed:

name def= expression
> > > > > This isn't a point against the syntax, I just think it's worth being > > explicit that this is what would happen. > > > > Overall, I'm somewhat indifferent. The use case seems fairly specialised to > > me, and yet the syntax "def name = value" seems like it's worth reserving > > for something a bit more generally useful. > > > > Maybe the def name=value syntax should implement a protocol, that objects > > like enum and namedtuple subclasses can hook into (in the same way that the > > context manager and iterator protocols work, or indeed the whole class > > definition mechanism). > > This may be good idea. An intriguing idea, indeed. I can't promise I'll approve of the end result, but I think a PEP proposing a name binding protocol that passes in the module name, the "location" within the module (when inside a function or class) and the target name could be worth reading. Directly setting __module__, __name__ and __qualname__ may be a reasonable default behaviour. The new syntax is essentially competing with the current implicit-but-fragile stack introspection and the explicit-but-cumbersome passing of the target name. Even if the ultimate verdict ends being "not worth the hassle", we would at least have a common reference point when this discussion next comes up (it seems to be every couple of years or so). Cheers, Nick. > > > -- > ???????? > ?????? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python at mrabarnett.plus.com Thu May 9 16:57:58 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 09 May 2013 15:57:58 +0100 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: Message-ID: <518BB976.5080001@mrabarnett.plus.com> On 09/05/2013 15:08, Nick Coghlan wrote: > > On 9 May 2013 21:53, "Piotr Duda" > wrote: > > > > 2013/5/9 Paul Moore >: > > > On 9 May 2013 11:29, Piotr Duda > wrote: > > >> > > >> > > >> To solve these problems I propose to add simple syntax that assigns > > >> these attributes to arbitrary object: > > >> def name = expression > > >> other possible forms may be: > > >> def name from expression > > >> class name = expression > > >> class name from expression > > >> name := expression # new operator > > One more possible colour for the bikeshed: > > name def= expression > Considering that attributes of 'name' are also being set, how about: name .= expression > > >> > > >> > > >> which would be equivalent for: > > >> _tmp = expression > > >> _tmp.__name__ = 'name' > > >> _tmp.__qualname__ = ... # corresponding qualname > > >> _tmp.__module__ = __name__ > > >> # apply decorators if present > > >> name = _tmp > > > > > > > > > Just for clarification, if you used this syntax with an expression > > > which returned an object which *didn't* allow attributes to be set, > > > I assume it would simply fail at runtime with an AttributeError? For > > > example, > > > > > > def x = 12 > > > > Yes, it fails, I thought about ignoring exceptions on attribute > > assignment, but then the syntax wouldn't provide any guarantees and in > > those cases it will be equivalent of simple assignment. > > > > > > > > This isn't a point against the syntax, I just think it's worth being > > > explicit that this is what would happen. > > > > > > Overall, I'm somewhat indifferent. 
The use case seems fairly > > > specialised to me, and yet the syntax "def name = value" seems like > > > it's worth reserving for something a bit more generally useful. > > > > > > Maybe the def name=value syntax should implement a protocol, that > > > objects like enum and namedtuple subclasses can hook into (in the same > > > way that the context manager and iterator protocols work, or indeed the > > > whole class definition mechanism). > > > > This may be good idea. > > An intriguing idea, indeed. I can't promise I'll approve of the end > result, but I think a PEP proposing a name binding protocol that passes > in the module name, the "location" within the module (when inside a > function or class) and the target name could be worth reading. > > Directly setting __module__, __name__ and __qualname__ may be a > reasonable default behaviour. > > The new syntax is essentially competing with the current > implicit-but-fragile stack introspection and the explicit-but-cumbersome > passing of the target name. > > Even if the ultimate verdict ends being "not worth the hassle", we would > at least have a common reference point when this discussion next comes > up (it seems to be every couple of years or so). > From greg.ewing at canterbury.ac.nz Fri May 10 01:08:46 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 10 May 2013 11:08:46 +1200 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: Message-ID: <518C2C7E.6000104@canterbury.ac.nz> Paul Moore wrote: > Overall, I'm somewhat indifferent. The use case seems fairly specialised > to me, and yet the syntax "def name = value" seems like it's worth > reserving for something a bit more generally useful. Not sure about the syntax, but I for one would find something like this useful for other purposes. For example, in some of my libraries I have a function that creates a special kind of property that needs to know its own name. 
Currently you have to write it like this:

    class Foo:
        blarg = overridable_property('blarg', "The blarginess of the Foo")

which is an annoying DRY violation. Using the proposed syntax, it could be written:

    class Foo:
        def blarg = overridable_property("The blarginess of the Foo")

> Maybe the def name=value syntax should implement a protocol,

Hmmm. Maybe

    def name = value

could turn into

    name = value.__def__('name', __name__)

-- Greg

From p.f.moore at gmail.com Fri May 10 09:06:54 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 10 May 2013 08:06:54 +0100 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <518C2C7E.6000104@canterbury.ac.nz> References: <518C2C7E.6000104@canterbury.ac.nz> Message-ID: On 10 May 2013 00:08, Greg Ewing wrote: > Hmmm. Maybe > > def name = value > > could turn into > > name = value.__def__('name', __name__) > Yes, that's the sort of thing I had in mind, although it hadn't occurred to me that it would be as simple as this. To satisfy the original use case, it would need the module the name is defined in passed to __def__ as well. One other possibility would be passing the containing class, too, but I don't think that's needed - the metaclass machinery gives us the means to play class-based tricks already. Paul From stefan_ml at behnel.de Fri May 10 09:06:59 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 10 May 2013 09:06:59 +0200 Subject: [Python-ideas] improving C structs layout In-Reply-To: References: Message-ID: Terry Jan Reedy, 08.05.2013 23:32: > On 5/8/2013 1:58 PM, Charles-François Natali wrote: >>> General +0 from me. >> >> Thanks for your help Nick ;-) > > First question: is there any downside in terms of breaking code? Certainly. It should not be done for public structs, which includes basically everything that resides in header files.
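Picking up the name-binding thread above: Greg's `__def__` hook can be approximated today with an explicit helper, at the cost of writing the name twice, which is exactly the duplication the proposed syntax would remove. A minimal sketch (the helper name `bind_name` is made up, and a real implementation would compute a proper dotted `__qualname__`):

```python
from collections import namedtuple

def bind_name(name, value, module=None):
    """Hypothetical helper approximating 'def name = expression': set the
    naming metadata on the object, then return it for assignment."""
    value.__name__ = name
    value.__qualname__ = name  # real version: full dotted path
    value.__module__ = module if module is not None else __name__
    return value

# Works for objects that accept attribute assignment, such as classes:
Point = bind_name("Point", namedtuple("PointBase", "x y"))
print(Point.__name__)  # Point

# And fails loudly for ones that don't, as the thread expects for `def x = 12`:
try:
    bind_name("x", 12)
except AttributeError as exc:
    print("AttributeError:", exc)
```

This mirrors the desugaring given earlier in the thread (`_tmp = expression; _tmp.__name__ = 'name'; ...`), just packaged as a callable instead of syntax.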
Modifying public structs changes the ABI, so a module compiled for one CPython version would need to be recompiled for the one that changes the structs *if* it uses them. I don't think this change is worth that risk and hassle. Stefan From ubershmekel at gmail.com Fri May 10 09:25:06 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Fri, 10 May 2013 10:25:06 +0300 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <518C2C7E.6000104@canterbury.ac.nz> References: <518C2C7E.6000104@canterbury.ac.nz> Message-ID: On Fri, May 10, 2013 at 2:08 AM, Greg Ewing wrote: > Paul Moore wrote: > >> Overall, I'm somewhat indifferent. The use case seems fairly specialised >> to me, and yet the syntax "def name = value" seems like it's worth >> reserving for something a bit more generally useful. >> > > Hmmm. Maybe > > def name = value > > could turn into > > name = value.__def__('name', __name__) > > C-esque macros (and perhaps macropy) could implement this and decorators under the same roof. Perhaps we can somehow solve the more general problem without introducing macro kludge? Yuval From rosuav at gmail.com Fri May 10 09:39:23 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 10 May 2013 17:39:23 +1000 Subject: [Python-ideas] improving C structs layout In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 5:06 PM, Stefan Behnel wrote: > Terry Jan Reedy, 08.05.2013 23:32: >> On 5/8/2013 1:58 PM, Charles-François Natali wrote: >>>> General +0 from me. >>> >>> Thanks for your help Nick ;-) >> >> First question: is there any downside in terms of breaking code? > > Certainly. It should not be done for public structs, which includes > basically everything that resides in header files.
Modifying public structs > changes the ABI, so a module compiled for one CPython version would need to > be recompiled for the one that changes the structs *if* it uses them. I > don't think this change is worth that risk and hassle. Would that be a problem if the change is done only in 3.4, though? ChrisA From p.f.moore at gmail.com Fri May 10 09:58:47 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 10 May 2013 08:06:54 +0100 Subject: [Python-ideas] improving C structs layout In-Reply-To: References: Message-ID: On 10 May 2013 08:39, Chris Angelico wrote: > > Certainly. It should not be done for public structs, which includes > > basically everything that resides in header files. Modifying public > structs > > changes the ABI, so a module compiled for one CPython version would need > to > > be recompiled for the one that changes the structs *if* it uses them. I > > don't think this change is worth that risk and hassle. > > Would that be a problem if the change is done only in 3.4, though? It would affect the stable ABI. Structs in the stable ABI can't be changed until Python 4, AIUI. Paul From solipsis at pitrou.net Fri May 10 10:20:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 10 May 2013 10:20:16 +0200 Subject: [Python-ideas] improving C structs layout References: Message-ID: <20130510102016.39b3d420@pitrou.net> Le Fri, 10 May 2013 08:58:47 +0100, Paul Moore a écrit : > On 10 May 2013 08:39, Chris Angelico > wrote: > > > > Certainly. It should not be done for public structs, which > > > includes basically everything that resides in header files. > > > Modifying public > > structs > > > changes the ABI, so a module compiled for one CPython version > > > would need > > to > > > be recompiled for the one that changes the structs *if* it uses > > > them. I don't think this change is worth that risk and hassle.
> > > > Would that be a problem if the change is done only in 3.4, though? > > > It would affect the stable ABI. Structs in the stable ABI can't be > changed until Python 4, AIUI. You are right. We can only change the layout of those structs which are not in the stable ABI. (fortunately, I think Martin has generally been wise enough to exclude implementation details from the ABI :-)) Regards Antoine. From alexandre.boulay59 at gmail.com Fri May 10 11:36:59 2013 From: alexandre.boulay59 at gmail.com (Alexandre Boulay) Date: Fri, 10 May 2013 11:36:59 +0200 Subject: [Python-ideas] improve Idle Message-ID: I think it could be a good idea to put colored dots on IDLE's scroll bar for each def or class created, each with its own color. That's not a big conceptual improvement, but it could be helpful to show the structure: what is what, and which class is in which class. From dholth at gmail.com Fri May 10 14:32:17 2013 From: dholth at gmail.com (Daniel Holth) Date: Fri, 10 May 2013 08:32:17 -0400 Subject: [Python-ideas] Add a .pop() method to ZipFile Message-ID: Check out this efficient way to remove the last file from any ordinary zip file.

    import os
    import zipfile

    class TruncatingZipFile(zipfile.ZipFile):
        """ZipFile that can pop files off the end. This works for ordinary zip
        files that do not contain non-ZIP data interleaved between the
        compressed files."""

        def pop(self):
            """Truncate the last file off this zipfile."""
            if not self.fp:
                raise RuntimeError(
                    "Attempt to pop from ZIP archive that was already closed")
            last = self.infolist().pop()
            del self.NameToInfo[last.filename]
            self.fp.seek(last.header_offset, os.SEEK_SET)
            self.fp.truncate()
            self._didModify = True

From rovitotv at gmail.com Fri May 10 14:52:57 2013 From: rovitotv at gmail.com (Todd V.
Rovito) Date: Fri, 10 May 2013 08:52:57 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: References: Message-ID: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> On May 10, 2013, at 5:36 AM, Alexandre Boulay wrote: > I think that could be a good idea to put colored dots on idle's scroll bar for each def or class created, each got its own color, that's not a big conceptual improvement but that could be helpful to show the structure, show what is what and which class is in which class Alexandre, Sounds like a great idea to me! I recommend you open up an enhancement issue on bugs.python.org then write a patch. You could make this an IDLE extension for the editor window; that way people could turn it off/on as they desire. Another thought is to create the issue then post a link and a brief email to the idle-dev mailing list. From solipsis at pitrou.net Fri May 10 15:01:14 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 10 May 2013 15:01:14 +0200 Subject: [Python-ideas] Add a .pop() method to ZipFile References: Message-ID: <20130510150114.56f776af@pitrou.net> Le Fri, 10 May 2013 08:32:17 -0400, Daniel Holth a écrit : > Check out this efficient way to remove the last file from any > ordinary zip file. But why would you care about such an operation? A generic remove() operation would sound more useful. Regards Antoine. From masklinn at masklinn.net Fri May 10 15:22:08 2013 From: masklinn at masklinn.net (Masklinn) Date: Fri, 10 May 2013 15:22:08 +0200 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <20130510150114.56f776af@pitrou.net> References: <20130510150114.56f776af@pitrou.net> Message-ID: <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> On 2013-05-10, at 15:01 , Antoine Pitrou wrote: > Le Fri, 10 May 2013 08:32:17 -0400, > Daniel Holth a écrit : >> Check out this efficient way to remove the last file from any >> ordinary zip file. > > But why would you care about such an operation?
A generic remove() operation > would sound more useful. Guessing it's because a generic `remove()` is less trivial to implement, which would also be why the suggestion is for .pop() but not .pop([index]): removing files at the end means you can just truncate the archive and append the central directory[0], removing files in the middle means either zeroing and leaving a hole or moving all following files. Although technically it's only simple if you assume file content and central directory entries are in the same order, which is unwarranted. So, guessing it's because you can do a half-assed job at implementing pop(), it will usually work (and will silently corrupt your archive when it does not) [0] you may also need to rewrite all offsets in the central directory, I don't remember if they are offset from file start or central directory record start. From dholth at gmail.com Fri May 10 15:35:47 2013 From: dholth at gmail.com (Daniel Holth) Date: Fri, 10 May 2013 09:35:47 -0400 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> Message-ID: On Fri, May 10, 2013 at 9:22 AM, Masklinn wrote: > On 2013-05-10, at 15:01 , Antoine Pitrou wrote: > >> Le Fri, 10 May 2013 08:32:17 -0400, >> Daniel Holth a écrit : >>> Check out this efficient way to remove the last file from any >>> ordinary zip file. >> >> But why would you care about such an operation? A generic remove() operation >> would sound more useful. > > Guessing it's because a generic `remove()` is less trivial to implement, > which would also be why the suggestion is for .pop() but not > .pop([index]): removing files at the end means you can just truncate the > archive and append the central directory[0], removing files in the middle > means either zeroing and leaving a hole or moving all following files. Yes, only the last file can be removed efficiently.
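For contrast with the truncation trick under discussion, a portable way to drop the last-written member is simply to rewrite the archive without it: O(archive size) instead of O(1), but it never touches ZipFile internals and cannot silently corrupt the file. A sketch, not code from the thread (the name `pop_rewrite` is made up):

```python
import io
import zipfile

def pop_rewrite(data):
    """Return (new_archive_bytes, removed_name): rebuild the archive
    without the member whose data starts last in the file."""
    with zipfile.ZipFile(io.BytesIO(data)) as src:
        infos = src.infolist()
        if not infos:
            raise KeyError("pop from an empty archive")
        # The member with the highest header_offset is the last one on disk.
        last = max(infos, key=lambda info: info.header_offset)
        out = io.BytesIO()
        with zipfile.ZipFile(out, "w") as dst:
            for info in infos:
                if info is not last:
                    dst.writestr(info, src.read(info))
    return out.getvalue(), last.filename

# Build a small archive in memory and pop its last member:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "first")
    zf.writestr("b.txt", "second")

data, removed = pop_rewrite(buf.getvalue())
print(removed)  # b.txt
```

This matches the generic `remove()` Antoine asks for (any member could be skipped, not just the last), at the cost of recompressing everything that stays.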
> Although technically it's only simple if you assume file content and > central directory entries are in the same order, which is unwarranted. The implementation assumes several things that could be false, but are usually true. A sort and several other checks would be warranted. > So, guessing it's because you can do a half-assed job at implementing > pop(), it will usually work (and will silently corrupt your archive > when it does not) > > [0] you may also need to rewrite all offsets in the central directory, I > don't remember if they are offset from file start or central directory > record start. _didModify = True takes care of rewriting the central directory. From solipsis at pitrou.net Fri May 10 15:44:04 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 10 May 2013 15:44:04 +0200 Subject: [Python-ideas] Add a .pop() method to ZipFile References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> Message-ID: <20130510154404.35b2287f@pitrou.net> Le Fri, 10 May 2013 09:35:47 -0400, Daniel Holth a écrit : > On Fri, May 10, 2013 at 9:22 AM, Masklinn > wrote: > > On 2013-05-10, at 15:01 , Antoine Pitrou wrote: > > > >> Le Fri, 10 May 2013 08:32:17 -0400, > >> Daniel Holth a > >> écrit : > >>> Check out this efficient way to remove the last file from any > >>> ordinary zip file. > >> > >> But why would you care about such an operation? A generic remove() > >> operation would sound more useful. > > > > Guessing it's because a generic `remove()` is less trivial to > > implement, which would also be why the suggestion is for .pop() but > > not .pop([index]): removing files at the end means you can just > > truncate the archive and append the central directory[0], removing > > files in the middle means either zeroing and leaving a hole or > > moving all following files. > > Yes, only the last file can be removed efficiently. But what are the situations where you want to remove the last file with certainty?
It sounds like a rather specialized use-case. If you make an analogy with sequences (list, bytearray), you can delete any slice in a sequence, but it will be more efficient towards the end (with a straightforward implementation, anyway :-)). Regards Antoine. From dholth at gmail.com Fri May 10 15:53:51 2013 From: dholth at gmail.com (Daniel Holth) Date: Fri, 10 May 2013 09:53:51 -0400 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <20130510154404.35b2287f@pitrou.net> References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <20130510154404.35b2287f@pitrou.net> Message-ID: On Fri, May 10, 2013 at 9:44 AM, Antoine Pitrou wrote: > Le Fri, 10 May 2013 09:35:47 -0400, > Daniel Holth a écrit : >> On Fri, May 10, 2013 at 9:22 AM, Masklinn >> wrote: >> > On 2013-05-10, at 15:01 , Antoine Pitrou wrote: >> > >> >> Le Fri, 10 May 2013 08:32:17 -0400, >> >> Daniel Holth a >> >> écrit : >> >>> Check out this efficient way to remove the last file from any >> >>> ordinary zip file. >> >> >> >> But why would you care about such an operation? A generic remove() >> >> operation would sound more useful. >> > >> > Guessing it's because a generic `remove()` is less trivial to >> > implement, which would also be why the suggestion is for .pop() but >> > not .pop([index]): removing files at the end means you can just >> > truncate the archive and append the central directory[0], removing >> > files in the middle means either zeroing and leaving a hole or >> > moving all following files. >> >> Yes, only the last file can be removed efficiently. > > But what are the situations where you want to remove the last file with > certainty?
Come to think of it you probably could come up with a pretty decent .remove() implementation that didn't give each file special attention, just by copying blocks from later to earlier in the file. This was written for wheel which intentionally puts the metadata and digital signatures at the end of the file. You can add or remove signers by replacing that file. From abarnert at yahoo.com Fri May 10 17:53:03 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 10 May 2013 08:53:03 -0700 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> Message-ID: <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> On May 10, 2013, at 6:22, Masklinn wrote: > Although technically it's only simple if you assume file content and > central directory entries are in the same order, which is unwarranted. Correct me if I'm wrong, but you can detect this case, and the case where there's a gap between the last file and the directory, and anything else that would break this code, just by parsing the directory, right? > So, guessing it's because you can do a half-assed job at implementing > pop(), it will usually work (and will silently corrupt your archive > when it does not) ... but a version that had the checks would instead usually work, and raise an exception when it does not. If I'm right, that's just a reason to fix the implementation, not to throw the idea out. That being said, I'm still not sure what the benefit is. Sure, you could use a zipfile as a stack of temporary files or something, but it's still going to be slower than just a temp dir full of gzip files. Or you could destructively recursive-process a zipfile (reversed), but why? There must be a use case I'm missing that made the OP write this in the first place. 
From tjreedy at udel.edu Fri May 10 19:16:06 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Fri, 10 May 2013 13:16:06 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: On 5/10/2013 8:52 AM, Todd V. Rovito wrote: > > On May 10, 2013, at 5:36 AM, Alexandre Boulay > wrote: >> I think that could be a good idea to put colored dots on idle's >> scroll bar for each def or class created, each got its own color, I cannot really understand what you are proposing. The scroll bar is for scrolling, and it has the arrow buttons and the bar itself that would interfere with placing dots. Furthermore, scroll bars are widgets defined by tk and as far as I know, IDLE has no control over the detailed appearance. >> that's not a big conceptual improvement but that could be helpful >> to show the structure, show what is what and which class is in >> which class Furthermore, I do not see how dots would really show that structure. Are you familiar with the Code Context option on the Options menu? Although I think it needs some polishing (to show all context, not just the three innermost lines), it already does what you seem to want, but with indented names rather than by nameless dots. > Alexandre, Sounds like a great idea to me! I recommend you open up > an enhancement issue on bugs.python.org If, after looking at the existing Code Context option, you still have an idea for improvement, please post to idle-dev first. Todd, please don't suggest that people post half-baked, possibly impossible to code, ideas to the tracker. The tracker already has a thousand enhancement requests. Many are dead clutter. Others need discussion that they will never get on the tracker.
In general, I think it is much better for code ideas to come first to this list or, for Idle ideas, idle-dev, to see if they are new, feasible, and have sufficient support to be applied once coded. Terry From dholth at gmail.com Fri May 10 19:30:18 2013 From: dholth at gmail.com (Daniel Holth) Date: Fri, 10 May 2013 13:30:18 -0400 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> Message-ID: On Fri, May 10, 2013 at 11:53 AM, Andrew Barnert wrote: > On May 10, 2013, at 6:22, Masklinn wrote: > >> Although technically it's only simple if you assume file content and >> central directory entries are in the same order, which is unwarranted. > > Correct me if I'm wrong, but you can detect this case, and the case where there's a gap between the last file and the directory, and anything else that would break this code, just by parsing the directory, right? > >> So, guessing it's because you can do a half-assed job at implementing >> pop(), it will usually work (and will silently corrupt your archive >> when it does not) > > ... but a version that had the checks would instead usually work, and raise an exception when it does not. > > If I'm right, that's just a reason to fix the implementation, not to throw the idea out. The improved implementation would need to sort the central directory entries by the location of their data, making sure the files to be affected had no extra bytes between them and did not overlap. Archives created by zipfile follow those rules already. > That being said, I'm still not sure what the benefit is. Sure, you could use a zipfile as a stack of temporary files or something, but it's still going to be slower than just a temp dir full of gzip files. Or you could destructively recursive-process a zipfile (reversed), but why? 
There must be a use case I'm missing that made the OP write this in the first place. I wrote it because the last file in a particular zip archive is an embedded digital signature. To re-sign the file you may remove the file, add or remove signatures from that file, and append the new signatures file. From storchaka at gmail.com Fri May 10 19:42:17 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 10 May 2013 20:42:17 +0300 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> Message-ID: 10.05.13 20:30, Daniel Holth wrote: > I wrote it because the last file in a particular zip archive is an > embedded digital signature. To re-sign the file you may remove the > file, add or remove signatures from that file, and append the new > signatures file. Well. There is no need to include this specialized feature in the stdlib. Just use it in your project. Instead of touching the private _didModify attribute, set `self.comment = self.comment`. From tjreedy at udel.edu Fri May 10 19:55:13 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Fri, 10 May 2013 13:55:13 -0400 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: References: Message-ID: On 5/10/2013 8:32 AM, Daniel Holth wrote: > Check out this efficient way to remove the last file from any ordinary zip file.
>
> class TruncatingZipFile(zipfile.ZipFile):
>     """ZipFile that can pop files off the end. This works for ordinary zip
>     files that do not contain non-ZIP data interleaved between the
>     compressed files."""
>
>     def pop(self):
>         """Truncate the last file off this zipfile."""
>         if not self.fp:
>             raise RuntimeError(
>                 "Attempt to pop from ZIP archive that was already closed")
>         last = self.infolist().pop()
>         del self.NameToInfo[last.filename]
>         self.fp.seek(last.header_offset, os.SEEK_SET)
>         self.fp.truncate()
>         self._didModify = True

I object to the name. Pop methods -- list.pop, set.pop, dict.pop, and dict.popitem -- remove *and return* an item from a collection. They raise exceptions when attempting to pop from an empty collection. (Dict.pop is a semi-exception). This usage of 'pop' in Python derives from the pop functions/methods of classical stacks. This method merely removes. If self.infolist is a list, I guess it would also raise IndexError when empty. tjr From guido at python.org Fri May 10 20:48:51 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 10 May 2013 11:48:51 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? Message-ID: I just spent a few minutes staring at a bug caused by a missing comma -- I got a mysterious argument count error because instead of foo('a', 'b') I had written foo('a' 'b'). This is a fairly common mistake, and IIRC at Google we even had a lint rule against this (there was also a Python dialect used for some specific purpose where this was explicitly forbidden). Now, with modern compiler technology, we can (and in fact do) evaluate compile-time string literal concatenation with the '+' operator, so there's really no reason to support 'a' 'b' any more. (The reason was always rather flimsy; I copied it from C but the reason why it's needed there doesn't really apply to Python, as it is mostly useful inside macros.) Would it be reasonable to start deprecating this and eventually remove it from the language?
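The failure mode Guido describes is easy to reproduce (a minimal illustration, not code from the thread):

```python
def foo(a, b):
    return (a, b)

# With the comma, two arguments:
print(foo('a', 'b'))  # ('a', 'b')

# Without it, the two literals are silently concatenated into ONE
# argument, and the call fails with an argument-count error:
try:
    foo('a' 'b')
except TypeError as exc:
    print(exc)

# The concatenation itself is invisible at the call site:
assert 'a' 'b' == 'ab'
```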
-- --Guido van Rossum (python.org/~guido) From matt at whoosh.ca Fri May 10 20:50:47 2013 From: matt at whoosh.ca (Matt Chaput) Date: Fri, 10 May 2013 14:50:47 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518D4187.5030902@whoosh.ca> On 5/10/2013 2:48 PM, Guido van Rossum wrote: > Would it be reasonable to start deprecating this and eventually remove > it from the language? Yes please! I've been bitten by the same issue more than once. Matt From dave at krondo.com Fri May 10 20:58:49 2013 From: dave at krondo.com (Dave Peticolas) Date: Fri, 10 May 2013 11:58:49 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: 2013/5/10 Guido van Rossum > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). > > Now, with modern compiler technology, we can (and in fact do) evaluate > compile-time string literal concatenation with the '+' operator, so > there's really no reason to support 'a' 'b' any more. (The reason was > always rather flimsy; I copied it from C but the reason why it's > needed there doesn't really apply to Python, as it is mostly useful > inside macros.) > > Would it be reasonable to start deprecating this and eventually remove > it from the language? From my perspective as a Python user (not knowing anything about the ramifications for the required changes to the parser, etc.) it is very reasonable. This bug is very hard to spot when it happens, and an argument count error is really one of the more benign forms it can take.
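Guido's aside that the compiler already evaluates '+' between string literals at compile time can be checked directly on CPython with `dis` (an illustrative sketch; constant folding is a CPython implementation detail, not a language guarantee):

```python
import dis

# Compile a statement that concatenates two literals with '+' ...
code = compile("s = 'a' + 'b'", "<demo>", "exec")

# ... and check that the compiler already folded them into one constant,
# so the explicit '+' costs nothing at runtime.
consts = [instr.argval for instr in dis.get_instructions(code)
          if instr.opname == "LOAD_CONST"]
print("ab" in consts)  # True on CPython
```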
> > -- > --Guido van Rossum (python.org/~guido) -- --Dave Peticolas From solipsis at pitrou.net Fri May 10 21:16:13 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 10 May 2013 21:16:13 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? References: Message-ID: <20130510211613.53f7649d@fsol> On Fri, 10 May 2013 11:48:51 -0700 Guido van Rossum wrote: > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). > > Now, with modern compiler technology, we can (and in fact do) evaluate > compile-time string literal concatenation with the '+' operator, so > there's really no reason to support 'a' 'b' any more. (The reason was > always rather flimsy; I copied it from C but the reason why it's > needed there doesn't really apply to Python, as it is mostly useful > inside macros.) > > Would it be reasonable to start deprecating this and eventually remove > it from the language? I'm rather -1. It's quite convenient and I don't want to add some '+' signs everywhere I use it. I'm sure many people also have long string literals out there and will have to endure the pain of a dull task to "fix" their code. However, in your case, foo('a' 'b') could raise a SyntaxWarning, since the "continuation" is on the same line. Regards Antoine.
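The same-line check Antoine sketches is also the kind of rule a linter can enforce today with the `tokenize` module: flag adjacent string literals only when they sit on one physical line, while leaving deliberate multi-line wrapping alone. A rough sketch (the function name is made up):

```python
import io
import token
import tokenize

def same_line_concat(source):
    """Return line numbers with two adjacent string literals on one
    physical line -- roughly the condition Antoine suggests promoting
    to a SyntaxWarning."""
    hits = []
    prev = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if (tok.type == token.STRING and prev is not None
                and prev.type == token.STRING
                and prev.end[0] == tok.start[0]):
            hits.append(tok.start[0])
        # Skip layout tokens so wrapped literals keep their previous STRING.
        if tok.type not in (tokenize.NL, tokenize.NEWLINE, tokenize.COMMENT):
            prev = tok
    return hits

print(same_line_concat("foo('a' 'b')\n"))             # [1] -- suspicious
print(same_line_concat("s = ('abc'\n     'def')\n"))  # []  -- deliberate wrap
```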
From ezio.melotti at gmail.com Fri May 10 21:18:19 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Fri, 10 May 2013 22:18:19 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <20130510211613.53f7649d@fsol> References: <20130510211613.53f7649d@fsol> Message-ID: On Fri, May 10, 2013 at 10:16 PM, Antoine Pitrou wrote: > On Fri, 10 May 2013 11:48:51 -0700 > Guido van Rossum wrote: >> >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? > > I'm rather -1. It's quite convenient and I don't want to add some '+' > signs everywhere I use it. I'm sure many people also have long string > literals out there and will have to endure the pain of a dull task to > "fix" their code. > > However, in your case, foo('a' 'b') could raise a SyntaxWarning, since > the "continuation" is on the same line. > I was going to say the exact same thing -- you just read my mind :) > Regards > > Antoine. > From alexander.belopolsky at gmail.com Fri May 10 21:26:10 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 10 May 2013 15:26:10 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 2:48 PM, Guido van Rossum wrote: > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). I had a similar experience just a few weeks ago. The bug was in a long list written like this:

    ['item11', 'item12', ..., 'item17',
     'item21', 'item22', ..., 'item27'
     ...
     'item91', 'item92', ..., 'item97']

Clearly the bug crept in when more items were added. (I try to keep redundant commas at the end of the list to avoid this, but not everyone likes this style.) > > Would it be reasonable to start deprecating this and eventually remove > it from the language?
> +1, but I would start by requiring () around concatenated strings. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Fri May 10 21:28:48 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 10 May 2013 21:28:48 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <20130510211613.53f7649d@fsol> References: <20130510211613.53f7649d@fsol> Message-ID: <518D4A70.1030807@egenix.com> On 10.05.2013 21:16, Antoine Pitrou wrote: > On Fri, 10 May 2013 11:48:51 -0700 > Guido van Rossum wrote: >> I just spent a few minutes staring at a bug caused by a missing comma >> -- I got a mysterious argument count error because instead of foo('a', >> 'b') I had written foo('a' 'b'). >> >> This is a fairly common mistake, and IIRC at Google we even had a lint >> rule against this (there was also a Python dialect used for some >> specific purpose where this was explicitly forbidden). >> >> Now, with modern compiler technology, we can (and in fact do) evaluate >> compile-time string literal concatenation with the '+' operator, so >> there's really no reason to support 'a' 'b' any more. (The reason was >> always rather flimsy; I copied it from C but the reason why it's >> needed there doesn't really apply to Python, as it is mostly useful >> inside macros.) >> >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? > > I'm rather -1. It's quite convenient and I don't want to add some '+' > signs everywhere I use it. I'm sure many people also have long string > literals out there and will have to endure the pain of a dull task to > "fix" their code. > > However, in your case, foo('a' 'b') could raise a SyntaxWarning, since > the "continuation" is on the same line. Nice idea. 
I mostly use this feature when writing multi-line or too-long-to-fit-on-one-editor-line string literals: s = ('abc\n' 'def\n' 'ghi\n') t = ('some long paragraph spanning multiple lines in an editor, ' 'without newlines') This looks and works much better than triple-quoted string literals, esp. when defining such string literals in indented code. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 10 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From guido at python.org Fri May 10 21:30:15 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 10 May 2013 12:30:15 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <20130510211613.53f7649d@fsol> References: <20130510211613.53f7649d@fsol> Message-ID: On Fri, May 10, 2013 at 12:16 PM, Antoine Pitrou wrote: > On Fri, 10 May 2013 11:48:51 -0700 > Guido van Rossum wrote: >> I just spent a few minutes staring at a bug caused by a missing comma >> -- I got a mysterious argument count error because instead of foo('a', >> 'b') I had written foo('a' 'b'). >> >> This is a fairly common mistake, and IIRC at Google we even had a lint >> rule against this (there was also a Python dialect used for some >> specific purpose where this was explicitly forbidden).
>> >> Now, with modern compiler technology, we can (and in fact do) evaluate >> compile-time string literal concatenation with the '+' operator, so >> there's really no reason to support 'a' 'b' any more. (The reason was >> always rather flimsy; I copied it from C but the reason why it's >> needed there doesn't really apply to Python, as it is mostly useful >> inside macros.) >> >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? > > I'm rather -1. It's quite convenient and I don't want to add some '+' > signs everywhere I use it. I'm sure many people also have long string > literals out there and will have to endure the pain of a dull task to > "fix" their code. Fixing this is an easy task for lib2to3 though. I think the "convenience" argument doesn't cut it -- if Python didn't have it, can you imagine it being added? It would never make it past all the examples of code broken by missing commas. > However, in your case, foo('a' 'b') could raise a SyntaxWarning, since > the "continuation" is on the same line. There are plenty of examples where the continuation isn't on the same line (some were already posted here). -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Fri May 10 21:07:10 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 10 May 2013 12:07:10 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518D455E.9010402@stoneleaf.us> On 05/10/2013 11:48 AM, Guido van Rossum wrote: > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). 
> > Now, with modern compiler technology, we can (and in fact do) evaluate > compile-time string literal concatenation with the '+' operator, so > there's really no reason to support 'a' 'b' any more. (The reason was > always rather flimsy; I copied it from C but the reason why it's > needed there doesn't really apply to Python, as it is mostly useful > inside macros.) > > Would it be reasonable to start deprecating this and eventually remove > it from the language? Sounds good to me. -- ~Ethan~ From solipsis at pitrou.net Fri May 10 21:37:07 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 10 May 2013 21:37:07 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? References: <20130510211613.53f7649d@fsol> Message-ID: <20130510213707.0df3f992@fsol> On Fri, 10 May 2013 12:30:15 -0700 Guido van Rossum wrote: > On Fri, May 10, 2013 at 12:16 PM, Antoine Pitrou wrote: > > On Fri, 10 May 2013 11:48:51 -0700 > > Guido van Rossum wrote: > >> I just spent a few minutes staring at a bug caused by a missing comma > >> -- I got a mysterious argument count error because instead of foo('a', > >> 'b') I had written foo('a' 'b'). > >> > >> This is a fairly common mistake, and IIRC at Google we even had a lint > >> rule against this (there was also a Python dialect used for some > >> specific purpose where this was explicitly forbidden). > >> > >> Now, with modern compiler technology, we can (and in fact do) evaluate > >> compile-time string literal concatenation with the '+' operator, so > >> there's really no reason to support 'a' 'b' any more. (The reason was > >> always rather flimsy; I copied it from C but the reason why it's > >> needed there doesn't really apply to Python, as it is mostly useful > >> inside macros.) > >> > >> Would it be reasonable to start deprecating this and eventually remove > >> it from the language? > > > > I'm rather -1. It's quite convenient and I don't want to add some '+' > > signs everywhere I use it. 
I'm sure many people also have long string > > literals out there and will have to endure the pain of a dull task to > > "fix" their code. > > Fixing this is an easy task for lib2to3 though. Assuming someone does it :-) You may also have to "fix" other software. For example, I don't know if gettext supports fetching literals from triple-quoted Python strings, while it works with string continuations. As for "+", saying it is a replacement is a bit simplified, because the syntax definition (for method calls) or operator precedence (for e.g. %-formatting) may force you to add parentheses. Regards Antoine. From barry at python.org Fri May 10 21:41:16 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 10 May 2013 15:41:16 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? References: <20130510211613.53f7649d@fsol> <518D4A70.1030807@egenix.com> Message-ID: <20130510154116.70f2b4ce@anarchist> On May 10, 2013, at 09:28 PM, M.-A. Lemburg wrote: >>> Would it be reasonable to start deprecating this and eventually remove >>> it from the language? I'm pretty mixed. OT1H, you're right, it's a common mistake and often *very* hard to spot. A SyntaxWarning when it appears on a single line doesn't help because I'm much more likely to forget a trailing comma in situations like: files = [ '/tmp/foo', '/etc/passwd' '/etc/group', '/var/cache', ] (g'wan, spot the missing comma ;). OTOH, doing things like: >s = ('abc\n' > 'def\n' > 'ghi\n') >t = ('some long paragraph spanning multiple lines in an editor, ' > 'without newlines') Is pretty common in code I see all the time. I'm not sure why; I use it occasionally, but only very rarely. A lot of folks like this style a lot though from what I can tell. >This looks and works much better than triple-quoted string literals, >esp. when defining such string literals in indented code. 
I also see this code a lot: from textwrap import dedent s = dedent("""\ abc def ghi """) I think having to deal with indentation could be a common reason why people use implicit concatenation instead of TQS. All things considered, I think the difficult-to-spot bugginess of implicit concatenation outweighs the occasional convenience of it. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From abarnert at yahoo.com Fri May 10 21:38:47 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 10 May 2013 12:38:47 -0700 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> Message-ID: <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> On May 10, 2013, at 10:42, Serhiy Storchaka wrote: > 10.05.13 20:30, Daniel Holth wrote: >> I wrote it because the last file in a particular zip archive is an >> embedded digital signature. To re-sign the file you may remove the >> file, add or remove signatures from that file, and append the new >> signatures file. > > Well. There is no need to include this specialized feature in the stdlib. Just use it in your project. Instead of touching private _didModify attribute, set `self.comment = self.comment`. It seems like the code is already making unwarranted assumptions about the internals of ZipFile, and taking out the access to a private attribute doesn't fix that, it just makes it less obvious. Either way, you'll want a comment explaining why this works (and which versions of the stdlib it's been verified to work with) if you're doing this from outside.
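Barry's `textwrap.dedent` idiom above behaves as described; a small self-contained check (the strings are illustrative):

```python
from textwrap import dedent

# The backslash after the opening quotes suppresses the leading newline;
# dedent() then strips the whitespace that is common to all lines, so
# the literal can be indented along with the surrounding code.
s = dedent("""\
    abc
    def
    ghi
    """)
assert s == "abc\ndef\nghi\n"
```

This is why people reach for implicit concatenation or dedent() rather than raw triple-quoted strings inside indented code.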
From ned at nedbatchelder.com Fri May 10 21:43:52 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 10 May 2013 15:43:52 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: <518D4DF8.9030604@nedbatchelder.com> On 5/10/2013 1:16 PM, Terry Jan Reedy wrote: > Are you familiar with the Code Context option on the Options > menu? Although I think it needs some polishing (to show all context, > not just the three innermost lines), it already does what you seem to > want, but with indented names rather than by nameless dots. IDLE is an odd beast. I never knew this existed! --Ned. From mal at egenix.com Fri May 10 21:46:44 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 10 May 2013 21:46:44 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> Message-ID: <518D4EA4.5070203@egenix.com> On 10.05.2013 21:30, Guido van Rossum wrote: > On Fri, May 10, 2013 at 12:16 PM, Antoine Pitrou wrote: >> On Fri, 10 May 2013 11:48:51 -0700 >> Guido van Rossum wrote: >>> I just spent a few minutes staring at a bug caused by a missing comma >>> -- I got a mysterious argument count error because instead of foo('a', >>> 'b') I had written foo('a' 'b'). >>> >>> This is a fairly common mistake, and IIRC at Google we even had a lint >>> rule against this (there was also a Python dialect used for some >>> specific purpose where this was explicitly forbidden). >>> >>> Now, with modern compiler technology, we can (and in fact do) evaluate >>> compile-time string literal concatenation with the '+' operator, so >>> there's really no reason to support 'a' 'b' any more. (The reason was >>> always rather flimsy; I copied it from C but the reason why it's >>> needed there doesn't really apply to Python, as it is mostly useful >>> inside macros.)
>>> >>> Would it be reasonable to start deprecating this and eventually remove >>> it from the language? >> >> I'm rather -1. It's quite convenient and I don't want to add some '+' >> signs everywhere I use it. I'm sure many people also have long string >> literals out there and will have to endure the pain of a dull task to >> "fix" their code. > > Fixing this is an easy task for lib2to3 though. Think about code written to work in Python 2 and 3. Python 2 would have to get the compile-time concatenation as well, to prevent slow-downs due to run-time concatenation. And there would have to be a tool to add the '+' signs and parens to the Python 2 code... s = ('my name is %s and ' 'I live on %s street' % ('foo', 'bar')) --> s = ('my name is %s and ' + 'I live on %s street' % ('foo', 'bar')) results in: Traceback (most recent call last): File "", line 2, in TypeError: not all arguments converted during string formatting The second line is also a good example of how removing the feature would introduce a new difficult-to-see error :-) IMO, the issue is a task for an editor or a lint tool to highlight, not the Python compiler. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 10 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From python at mrabarnett.plus.com Fri May 10 21:54:58 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 10 May 2013 20:54:58 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518D5092.6010605@mrabarnett.plus.com> On 10/05/2013 20:26, Alexander Belopolsky wrote: > > > > On Fri, May 10, 2013 at 2:48 PM, Guido van Rossum > wrote: > > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > > I had a similar experience just few weeks ago. The bug was in a long > list written like this: > > ['item11', 'item12', ..., 'item17', > 'item21', 'item22', ..., 'item27' > ... > 'item91', 'item92', ..., 'item97'] > > Clearly the bug crept in when more items were added. (I try to keep > redundant commas at the end of the list to avoid this, but not everyone > likes this style.) > > > Would it be reasonable to start deprecating this and eventually remove > it from the language? > > > +1, but I would start by requiring () around concatenated strings. > I'm not so sure. Currently, parentheses, brackets and braces effectively make Python ignore a newline within them. (1 +2) is the same as: (1+2) and: [1 +2] is the same as: [1+2] Under the proposal: ("a" "b") or: ("a" "b") would be the same as: ("ab") but: ["a" "b"] or: ["a" "b"] would be a syntax error. From fuzzyman at gmail.com Fri May 10 22:09:08 2013 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 10 May 2013 21:09:08 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <20130510211613.53f7649d@fsol> References: <20130510211613.53f7649d@fsol> Message-ID: On 10 May 2013 20:16, Antoine Pitrou wrote: > On Fri, 10 May 2013 11:48:51 -0700 > Guido van Rossum wrote: > > I just spent a few minutes staring at a bug caused by a missing comma > > -- I got a mysterious argument count error because instead of foo('a', > > 'b') I had written foo('a' 'b'). > > > > This is a fairly common mistake, and IIRC at Google we even had a lint > > rule against this (there was also a Python dialect used for some > > specific purpose where this was explicitly forbidden). > > > > Now, with modern compiler technology, we can (and in fact do) evaluate > > compile-time string literal concatenation with the '+' operator, so > > there's really no reason to support 'a' 'b' any more. (The reason was > > always rather flimsy; I copied it from C but the reason why it's > > needed there doesn't really apply to Python, as it is mostly useful > > inside macros.) > > > > Would it be reasonable to start deprecating this and eventually remove > > it from the language? > > I'm rather -1. It's quite convenient and I don't want to add some '+' > signs everywhere I use it. I'm sure many people also have long string > literals out there and will have to endure the pain of a dull task to > "fix" their code. > > However, in your case, foo('a' 'b') could raise a SyntaxWarning, since > the "continuation" is on the same line. > > I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines. Michael > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Fri May 10 22:24:32 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Fri, 10 May 2013 16:24:32 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518D5092.6010605@mrabarnett.plus.com> References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: +1; I've been bitten by this many times. As already mentioned, one big use case where this is useful is having multiline string literals without having all the annoying indentation leak into your code. I think this could be easily fixed with a convenient .dedent() or .strip_margin() function. On Fri, May 10, 2013 at 3:54 PM, MRAB wrote: > On 10/05/2013 20:26, Alexander Belopolsky wrote: > >> >> >> >> On Fri, May 10, 2013 at 2:48 PM, Guido van Rossum > > wrote: >> >> I just spent a few minutes staring at a bug caused by a missing comma >> -- I got a mysterious argument count error because instead of foo('a', >> 'b') I had written foo('a' 'b'). >> >> >> I had a similar experience just few weeks ago. The bug was in a long >> list written like this: >> >> ['item11', 'item12', ..., 'item17', >> 'item21', 'item22', ..., 'item27' >> ... >> 'item91', 'item92', ..., 'item97'] >> >> Clearly the bug crept in when more items were added. (I try to keep >> redundant commas at the end of the list to avoid this, but not everyone >> likes this style.) >> >> >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? >> >> >> +1, but I would start by requiring () around concatenated strings. >> >> I'm not so sure. > > Currently, parentheses, brackets and braces effectively make Python ignore > a newline within them. 
> > (1 > +2) > > is the same as: > > (1+2) > > and: > > [1 > +2] > > is the same as: > > [1+2] > > Under the proposal: > > ("a" > "b") > > or: > > ("a" "b") > > would be the same as: > > ("ab") > > but: > > ["a" > "b"] > > or: > > ["a" "b"] > > would be a syntax error. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Fri May 10 22:25:12 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 10 May 2013 23:25:12 +0300 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> Message-ID: 10.05.13 22:38, Andrew Barnert wrote: > It seems like the code is already making unwarranted assumptions about the internals of ZipFile, and taking out the access to a private attribute doesn't fix that, it just makes it less obvious. Indeed. infolist() may return a copy or non-modifiable proxy, NameToInfo and fp are private attributes. ZipFile may save the offset of the central directory in a private attribute. From ezio.melotti at gmail.com Fri May 10 22:40:08 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Fri, 10 May 2013 23:40:08 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518D5092.6010605@mrabarnett.plus.com> References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: On Fri, May 10, 2013 at 10:54 PM, MRAB wrote: > Under the proposal: > > ("a" > "b") > > or: > > ("a" "b") > > would be the same as: > > ("ab") > > but: > > ["a" > "b"] > > or: > > ["a" "b"] > > would be a syntax error. > This would actually be fine with me.
I use implicit string literal concatenation mostly within (...), and even though I've seen (and sometimes written) code like ['this is a ' 'long string', 'this is another ' 'long string'] I agree that requiring extra (...) in this case is reasonable, i.e.: [('this is a ' 'long string'), ('this is another ' 'long string')] The same would apply to other literals like {...} (for both sets and dicts), and possibly for tuples too (assuming that it's possible to figure out when a tuple is being created). I also write code like: raise SomeException('this is a long message ' 'that spans 2 lines') or even: self.assertTrue(somefunc(), 'somefunc() returned ' 'a false value and this is wrong') In these cases I wouldn't like redundant (...) (or even worse extra '+'s), especially for the first case. I also think that forgetting a comma in a list of function args between two string literal args is quite uncommon, whereas forgetting it in a sequence of strings (list, set, dict, tuple) is much more common, so this approach should cover most of the cases. Best Regards, Ezio Melotti From dholth at gmail.com Fri May 10 22:47:31 2013 From: dholth at gmail.com (Daniel Holth) Date: Fri, 10 May 2013 16:47:31 -0400 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> Message-ID: On Fri, May 10, 2013 at 4:25 PM, Serhiy Storchaka wrote: > 10.05.13 22:38, Andrew Barnert wrote: > >> It seems like the code is already making unwarranted assumptions about the >> internals of ZipFile, and taking out the access to a private attribute >> doesn't fix that, it just makes it less obvious. > > > Indeed. infolist() may return a copy or non-modifiable proxy, NameToInfo and > fp are private attributes. ZipFile may save in private attribute an offset > of central directory.
It would be nice to have a better low-level API to ZipFile. Does it ever do these things that it may do? From storchaka at gmail.com Fri May 10 23:12:26 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 11 May 2013 00:12:26 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: 10.05.13 23:40, Ezio Melotti wrote: > I also think that forgetting a comma in a list of function args > between two string literal args is quite uncommon, whereas forgetting > it in a sequence of strings (list, set, dict, tuple) is much more > common, so this approach should cover most of the cases. Tuples. From ram.rachum at gmail.com Fri May 10 23:16:27 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Fri, 10 May 2013 14:16:27 -0700 (PDT) Subject: [Python-ideas] 2 ideas for `concurrent.futures` Message-ID: <4d27ae74-8813-4eca-bdef-4f832f06147b@googlegroups.com> I have 2 ideas I thought about for the `futures` module: 1. A method `Executor.filter` that will be to the built-in `filter` what `Executor.map` is to the built-in `map`. 2. A keyword argument `join_on_exit` to `Executor.__init__`, with a default of `False`. When `True`, upon `Executor.__exit__` all futures will be run to completion. This'll be useful to avoid the `Cannot schedule new futures after shutdown` exception without manually exhausting all the iterators returned by e.g. `Executor.map`. What do you think? Thanks, Ram. -------------- next part -------------- An HTML attachment was scrubbed... URL: From antonio.s.messina at gmail.com Fri May 10 23:17:21 2013 From: antonio.s.messina at gmail.com (Antonio Messina) Date: Fri, 10 May 2013 23:17:21 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: My 2 cents: as a user, I often split very long text lines (mostly log entries or exception messages) into multiple lines in order to stay under 80 chars (PEP8 docet), like: log.warning("Configuration item '%s' was renamed to '%s'," " please change occurrences of '%s' to '%s'" " in configuration file '%s'.", oldkey, newkey, oldkey, newkey, filename) This should become (if I understand the proposal) something like: log.warning("Configuration item '%s' was renamed to " % oldkey + "'%s', please change occurrences of '%s'" % (newkey, oldkey) + " to '%s' in configuration file '%s'." % (newkey, filename)) but imagine what would happen if you have to rephrase the text, and reorder the variables and fix the `+` signs... On the other hand, I think I've only got the ``func("a" "b")`` error once or twice in my life. .a. -- antonio.s.messina at gmail.com antonio.messina at uzh.ch +41 (0)44 635 42 22 GC3: Grid Computing Competence Center http://www.gc3.uzh.ch/ University of Zurich Winterthurerstrasse 190 CH-8057 Zurich Switzerland From graffatcolmingov at gmail.com Fri May 10 23:53:10 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Fri, 10 May 2013 17:53:10 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: On Fri, May 10, 2013 at 5:17 PM, Antonio Messina wrote: > My 2 cents: as an user, I often split very long text lines (mostly log > entries or exception messages) into multiple lines in order to stay > under 80chars (PEP8 docet), like: > > log.warning("Configuration item '%s' was renamed to '%s'," > " please change occurrences of '%s' to '%s'" > " in configuration file '%s'.", > oldkey, newkey, oldkey, newkey, filename) Actually it would just become log.warning(("Configuration item '%s' was renamed to '%s'," + " please change occurrences of '%s' to '%s'" + " in configuration file '%s'."), oldkey, newkey, oldkey, newkey, filename) Perhaps without the inner set of parentheses. The issue of string formatting wouldn't apply here since log.* does the formatting for you. A more apt example of what they were talking about earlier is s = ("foo %s bar bogus" % (var1) "spam %s spam %s spam" % (var2, var3)) Would have to become s = (("foo %s bar bogus" % (var1)) + ("spam %s spam %s spam" % (var2, var3))) Because + has operator precedence over % otherwise, var1 would be concatenated with "spam %s spam %s spam" and then you would have substitution take place. From antonio.s.messina at gmail.com Sat May 11 00:00:16 2013 From: antonio.s.messina at gmail.com (Antonio Messina) Date: Sat, 11 May 2013 00:00:16 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: On Fri, May 10, 2013 at 11:53 PM, Ian Cordasco wrote: > On Fri, May 10, 2013 at 5:17 PM, Antonio Messina > wrote: >> My 2 cents: as an user, I often split very long text lines (mostly log >> entries or exception messages) into multiple lines in order to stay >> under 80chars (PEP8 docet), like: >> >> log.warning("Configuration item '%s' was renamed to '%s'," >> " please change occurrences of '%s' to '%s'" >> " in configuration file '%s'.", >> oldkey, newkey, oldkey, newkey, filename) > > Actually it would just become > > log.warning(("Configuration item '%s' was renamed to '%s'," + > " please change occurrences of '%s' to '%s'" + > " in configuration file '%s'."), > oldkey, newkey, oldkey, newkey, filename) > > Perhaps without the inner set of parentheses. The issue of string > formatting wouldn't apply here since log.* does the formatting for > you. A more apt example of what they were talking about earlier is You are right, I've picked up the wrong example. Please rephrase it using "raise SomeException()" instead of "log.warning()", which is the other case where I often have to split the string over multiple lines: raise ConfigurationError("Configuration item '%s' was renamed to '%s'," " please change occurrences of '%s' to '%s'" " in configuration file '%s'." % (oldkey, newkey, oldkey, newkey, filename)) .a. -- antonio.s.messina at gmail.com antonio.messina at uzh.ch +41 (0)44 635 42 22 GC3: Grid Computing Competence Center http://www.gc3.uzh.ch/ University of Zurich Winterthurerstrasse 190 CH-8057 Zurich Switzerland From apalala at gmail.com Sat May 11 00:07:24 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Fri, 10 May 2013 17:37:24 -0430 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: References: <20130510211613.53f7649d@fsol> Message-ID: On Fri, May 10, 2013 at 3:00 PM, Guido van Rossum wrote: > There are plenty of examples where the continuation isn't on the same > line (some were already posted here). > +1 I've never used the feature and don't intend to. A related annoyance is the trailing comma at the end of stuff (probably a leftover from a previous edit). For example: def fun(a, b, c,): Parses fine. But the one that has bitten me is the comma at the end of a line: >>> x = 1, >>> x (1,) >>> x == 1, # inconsistency? (False,) >>> x == (1,) True >>> y = a_very_long_call(param1, param2, param3), # this trailing comma is difficult to spot I'd prefer that the syntax for creating one-tuples requires the parentheses, and that trailing commas are disallowed. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat May 11 00:23:13 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 10 May 2013 15:23:13 -0700 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> Message-ID: <007B250D-C694-4A66-A4C9-F34A8A9ED4E7@yahoo.com> On May 10, 2013, at 13:47, Daniel Holth wrote: > On Fri, May 10, 2013 at 4:25 PM, Serhiy Storchaka wrote: >> 10.05.13 22:38, Andrew Barnert wrote: >> >>> It seems like the code is already making unwarranted assumptions about the >>> internals of ZipFile, and taking out the access to a private attribute >>> doesn't fix that, it just makes it less obvious. >> >> >> Indeed. infolist() may return a copy or non-modifiable proxy, NameToInfo and >> fp are private attributes. ZipFile may save in private attribute an offset >> of central directory. > > It would be nice to have a better low-level API to ZipFile.
That might be nice. It would also answer the people who want to do the gzip themselves (to optimize p7zip style) but use the high level part of zipfile. > Does it > ever do these things that it may do? Depends. If by "it" you mean "Python" in the abstract... There's no answer to that. If you mean CPython 2.7.0-2.7.4 or 3.3.0-3.3.1, then you can verify it pretty easily (and I suspect the OP already did). Which means if you do this from outside, you'll want a comment explaining why it works, and which versions of Python you've verified it against. From jonathan.eunice at gmail.com Sat May 11 00:58:11 2013 From: jonathan.eunice at gmail.com (Jonathan Eunice) Date: Fri, 10 May 2013 18:58:11 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? Message-ID: Implicit line concatenation is one of those rare places where Python turns an oddly blind eye to the "Explicit is better than implicit" rule it otherwise loves. I can't speak to how much inconvenience/breakage it would cause, but deprecating it would seem to increase the language's "coherence" with--or at least, adherence to--its principles. But I have no real stake in it; I seldom if ever use the construct. I prefer a trick I learned in Perl: Using a "here document" cleanup function. This allows multi-line literal strings to be stated in a program, at whatever level of indentation is appropriate for code clarity, and the indentation to be automatically removed. For example, using [textdata](https://pypi.python.org/pypi/textdata): from textdata import * data = lines(""" There was an old woman who lived in a shoe. She had so many children, she didn't know what to do; """) will result in: ['There was an old woman who lived in a shoe.', "She had so many children, she didn't know what to do;"] This is dedented, but also had some blank lines at the start and end removed (the blanks make Python formatting look good, but might gunk up subsequent processing). 
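The behaviour Jonathan describes for textdata's `lines` can be approximated with the stdlib alone. The sketch below is not the actual textdata implementation — its signature and semantics are inferred purely from the quoted examples:

```python
import textwrap

def lines(text, join=False):
    """Stdlib-only approximation of the textdata ``lines`` helper:
    dedent a triple-quoted block, drop blank lines at the start and
    end, and optionally join the remaining lines."""
    body = textwrap.dedent(text).strip("\n").splitlines()
    if join is False:
        return body
    sep = "" if join is True else join  # join=True concatenates;
    return sep.join(body)               # a string value joins on it

data = lines("""
    There was an old woman who lived in a shoe.
    She had so many children, she didn't know what to do;
""")
assert data == [
    "There was an old woman who lived in a shoe.",
    "She had so many children, she didn't know what to do;",
]

joined = lines(join=True, text="""
    this
    that
""")
assert joined == "thisthat"
```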
`lines` can also do what implicit concatenation does: data = lines(join=True, text=""" this that """) gives a purely concatenated result: 'thisthat' The `join` kwarg, given a string, will join on that string. Some edge-case options aside, being able to state indented literal strings without fussing with line-by-line quoting is the jewel. Especially if Python's implicit line concatenation is going to be deprecated, you might consider adding an "ease of multi-line string specification" function like `lines` to the standard library (in `textwrap`?) to ease the passing. From eliben at gmail.com Sat May 11 01:27:32 2013 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 10 May 2013 16:27:32 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 11:48 AM, Guido van Rossum wrote: > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). > > Now, with modern compiler technology, we can (and in fact do) evaluate > compile-time string literal concatenation with the '+' operator, so > there's really no reason to support 'a' 'b' any more. (The reason was > always rather flimsy; I copied it from C but the reason why it's > needed there doesn't really apply to Python, as it is mostly useful > inside macros.) > > Would it be reasonable to start deprecating this and eventually remove > it from the language? > I would also be happy to see this error-prone syntax go (was bitten by it a number of times in the past), but I have a practical question here: Realistically, what do "start deprecating" and "eventually remove" mean here?
This is a significant backwards-compatibility breaking change that will *definitely* break existing code. So would it be removed just in "Python 4"? Or are you talking about an actual 3.x release like "deprecate in 3.4 and remove in 3.5" ? Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat May 11 00:33:59 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 10 May 2013 15:33:59 -0700 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: References: Message-ID: <518D75D7.4000201@stoneleaf.us> On 05/10/2013 05:32 AM, Daniel Holth wrote: > Check out this efficient way to remove the last file from any ordinary zip file. Seems pretty cool, but very specialized and potentially fragile. -1 on adding to the stdlib. -- ~Ethan~ From guido at python.org Sat May 11 01:41:34 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 10 May 2013 16:41:34 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 4:27 PM, Eli Bendersky wrote: > > On Fri, May 10, 2013 at 11:48 AM, Guido van Rossum wrote: >> >> I just spent a few minutes staring at a bug caused by a missing comma >> -- I got a mysterious argument count error because instead of foo('a', >> 'b') I had written foo('a' 'b'). >> >> This is a fairly common mistake, and IIRC at Google we even had a lint >> rule against this (there was also a Python dialect used for some >> specific purpose where this was explicitly forbidden). >> >> Now, with modern compiler technology, we can (and in fact do) evaluate >> compile-time string literal concatenation with the '+' operator, so >> there's really no reason to support 'a' 'b' any more. (The reason was >> always rather flimsy; I copied it from C but the reason why it's >> needed there doesn't really apply to Python, as it is mostly useful >> inside macros.) 
>> >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? > > > I would also be happy to see this error-prone syntax go (was bitten by it a > number of times in the past), but I have a practical question here: > > Realistically, what does "start deprecating" and "eventually remove" means > here? This is a significant backwards-compatibility breaking change that > will *definitely* break existing code. So would it be removed just in > "Python 4"? Or are you talking about an actual 3.x release like "deprecate > in 3.4 and remove in 3.5" ? It's probably common enough that we'd have to do a silent deprecation in 3.4, a nosy deprecation in 3.5, and then delete it in 3.6, or so. Or maybe even more conservative. Plus we should work on a conversion tool that adds + and () as needed, *and* tell authors of popular lint tools to add rules for this. The hacky proposals for making it a syntax warning "sometimes" don't feel right to me. I do realize that this will break a lot of code, and that's the only reason why we may end up punting on this, possibly until Python 4, or forever. But I don't think the feature is defensible from a language usability POV. It's just about backward compatibility at this point. -- --Guido van Rossum (python.org/~guido) From pjenvey at underboss.org Sat May 11 01:43:43 2013 From: pjenvey at underboss.org (Philip Jenvey) Date: Fri, 10 May 2013 16:43:43 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> Message-ID: <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> On May 10, 2013, at 1:09 PM, Michael Foord wrote: > On 10 May 2013 20:16, Antoine Pitrou wrote: > > I'm rather -1. It's quite convenient and I don't want to add some '+' > signs everywhere I use it. I'm sure many people also have long string > literals out there and will have to endure the pain of a dull task to > "fix" their code. 
> > However, in your case, foo('a' 'b') could raise a SyntaxWarning, since > the "continuation" is on the same line. > > I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines. Strongly -1 on this proposal, I also use this quite often. -- Philip Jenvey From ncoghlan at gmail.com Sat May 11 01:51:28 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 May 2013 09:51:28 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On 11 May 2013 04:50, "Guido van Rossum" wrote: > > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). > > Now, with modern compiler technology, we can (and in fact do) evaluate > compile-time string literal concatenation with the '+' operator, so > there's really no reason to support 'a' 'b' any more. (The reason was > always rather flimsy; I copied it from C but the reason why it's > needed there doesn't really apply to Python, as it is mostly useful > inside macros.) > > Would it be reasonable to start deprecating this and eventually remove > it from the language? I could live with it if we get "dedent()" as a string method. I'd be even happier if constant folding was extended to platform independent method calls on literals, but I don't believe there's a sane way to maintain the "platform independent" constraint. OTOH, it's almost on the scale of "remove string mod formatting". Shipping at least a basic linting tool in the stdlib would probably be almost as effective and substantially less disruptive. lib2to3 should provide some decent infrastructure for that. Cheers, Nick. 
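Nick's suggestion of shipping a basic linting tool is easy to prototype with the stdlib tokenize module. This is only a sketch — the function name and the choice to report line numbers are assumptions, not an existing tool:

```python
import io
import tokenize

def find_implicit_concat(source):
    """Report line numbers where two string literals sit side by side
    with no operator between them (a minimal lint-rule sketch)."""
    hits = []
    prev = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Skip tokens that may legally appear between the two halves of
        # an implicitly concatenated string (line breaks inside parens,
        # comments); a logical NEWLINE still resets `prev` below.
        if tok.type in (tokenize.NL, tokenize.COMMENT):
            continue
        if tok.type == tokenize.STRING and prev is not None \
                and prev.type == tokenize.STRING:
            hits.append(tok.start[0])  # line of the second literal
        prev = tok
    return hits

print(find_implicit_concat("foo('a' 'b')\n"))         # Guido's bug: [1]
print(find_implicit_concat("x = ('a'\n     'b')\n"))  # multi-line: [2]
print(find_implicit_concat("foo('a', 'b')\n"))        # clean: []
```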
> > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat May 11 01:57:02 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 10 May 2013 16:57:02 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: Well, I think if you can live with x = ('foo\n' 'bar\n' 'baz\n' ) I think you could live with x = ('foo\n' + 'bar\n' + 'baz\n' ) as well... (Extra points if you figure out how to have a + on the last line too. :-) So, as I said, it's not the convenience that matters, it's how much it is in use. :-( --Guido On Fri, May 10, 2013 at 4:51 PM, Nick Coghlan wrote: > > On 11 May 2013 04:50, "Guido van Rossum" wrote: >> >> I just spent a few minutes staring at a bug caused by a missing comma >> -- I got a mysterious argument count error because instead of foo('a', >> 'b') I had written foo('a' 'b'). >> >> This is a fairly common mistake, and IIRC at Google we even had a lint >> rule against this (there was also a Python dialect used for some >> specific purpose where this was explicitly forbidden). >> >> Now, with modern compiler technology, we can (and in fact do) evaluate >> compile-time string literal concatenation with the '+' operator, so >> there's really no reason to support 'a' 'b' any more. (The reason was >> always rather flimsy; I copied it from C but the reason why it's >> needed there doesn't really apply to Python, as it is mostly useful >> inside macros.) >> >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? > > I could live with it if we get "dedent()" as a string method. 
I'd be even > happier if constant folding was extended to platform independent method > calls on literals, but I don't believe there's a sane way to maintain the > "platform independent" constraint. > > OTOH, it's almost on the scale of "remove string mod formatting". Shipping > at least a basic linting tool in the stdlib would probably be almost as > effective and substantially less disruptive. lib2to3 should provide some > decent infrastructure for that. > > Cheers, > Nick. > >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From graffatcolmingov at gmail.com Sat May 11 01:57:47 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Fri, 10 May 2013 19:57:47 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On May 10, 2013 7:51 PM, "Nick Coghlan" wrote: > > > On 11 May 2013 04:50, "Guido van Rossum" wrote: > > > > I just spent a few minutes staring at a bug caused by a missing comma > > -- I got a mysterious argument count error because instead of foo('a', > > 'b') I had written foo('a' 'b'). > > > > This is a fairly common mistake, and IIRC at Google we even had a lint > > rule against this (there was also a Python dialect used for some > > specific purpose where this was explicitly forbidden). > > > > Now, with modern compiler technology, we can (and in fact do) evaluate > > compile-time string literal concatenation with the '+' operator, so > > there's really no reason to support 'a' 'b' any more. (The reason was > > always rather flimsy; I copied it from C but the reason why it's > > needed there doesn't really apply to Python, as it is mostly useful > > inside macros.) 
> > > > Would it be reasonable to start deprecating this and eventually remove > > it from the language? > > I could live with it if we get "dedent()" as a string method. I'd be even happier if constant folding was extended to platform independent method calls on literals, but I don't believe there's a sane way to maintain the "platform independent" constraint. > > OTOH, it's almost on the scale of "remove string mod formatting". Shipping at least a basic linting tool in the stdlib would probably be almost as effective and substantially less disruptive. lib2to3 should provide some decent infrastructure for that. I have cc'd the code-quality mailing list since several linter authors are subscribed there. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat May 11 02:08:00 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 10 May 2013 20:08:00 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 7:57 PM, Guido van Rossum wrote: > I think you could live with > > x = ('foo\n' + > 'bar\n' + > 'baz\n' > ) > > as well... (Extra points if you figure out how to have a + on the last > line too. :-) Does this earn a point? x = (+ 'foo\n' > > + 'bar\n' + 'baz\n' ) :-)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Sat May 11 02:29:28 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 10 May 2013 17:29:28 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: I got bit by this quite recently, leaving out a comma in a long list of strings and I only found the bug by accident. This being python "ideas" I'll throw one out.
Add another prefix character to strings: a = [m'abc' 'def'] # equivalent to ['abcdef'] A string with an m prefix is continued on one or more following lines. A string must have an m prefix to be continued (but this change would have to be phased in). A conversion tool need merely recognize the string continuations and insert m's. I chose the m character for multi-line but the character choice is available for bikeshedding. The m prefix can be combined with u and/or r but not with triple-quotes. The following are not allowed: b = ['abc' # syntax error (m is required for continuation) 'def'] c = [m'abc'] # syntax error (when m is used, continuation lines must be present) d = [m'abc' m'def'] # syntax error (m only allowed for first string) The reason to prohibit cases c and d is to guard against comma errors with these forms. Consider these cases with missing or extra commas. e = [m'abc', # extra comma causes syntax error 'def'] f = [m'abc' # missing comma causes syntax error m'def', 'ghi'] Yes, I know this doesn't guard against all comma errors. You could protect against more with prefix and suffix (e.g., an m at the end of the last string) but I'm skeptical it's worth it. Conversion to this could be done in three stages: (1) accept m's (case a), deprecate missing m's (case b), error for misused m's (case c-f) (2) warn on missing m's (case b) (3) error on missing m's (case b) --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From epsilonmichael at gmail.com Sat May 11 02:36:02 2013 From: epsilonmichael at gmail.com (Michael Mitchell) Date: Fri, 10 May 2013 19:36:02 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 7:08 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Does this earn a point?
> > x = (+ 'foo\n' > > + 'bar\n' > + 'baz\n' > ) > Plus doesn't make sense as a unary operator on strings. x = ('foo\n' + 'bar\n' + 'baz\n' + '') This would work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sat May 11 02:43:48 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 11 May 2013 12:43:48 +1200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <20130510213707.0df3f992@fsol> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> Message-ID: <518D9444.20200@canterbury.ac.nz> Antoine Pitrou wrote: > As for "+", saying it is a replacement is a bit simplified, because > the syntax definition (for method calls) or operator precedence (for > e.g. %-formatting) may force you to add parentheses. Maybe we could turn ... into a "string continuation operator": print("This is example %d of a line that is "... "too long" % example_number) -- Greg From greg.ewing at canterbury.ac.nz Sat May 11 02:50:28 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 11 May 2013 12:50:28 +1200 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> Message-ID: <518D95D4.1090505@canterbury.ac.nz> Serhiy Storchaka wrote: > Indeed. infolist() may return a copy or non-modifiable proxy, NameToInfo > and fp are private attributes. ZipFile may save in private attribute an > offset of central directory. Seems to me it would be better to provide ZipFile with a general remove() operation that does the right thing, with optimisation for the case where the file happens to be at the end. 
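Michael's trailing `+ ''` answer to Guido's "extra points" puzzle runs exactly as stated — every line carries a `+`, and the empty string is the identity for concatenation:

```python
# A '+' on every line, including the last, thanks to the trailing ''.
x = ('foo\n' +
     'bar\n' +
     'baz\n' +
     '')
assert x == 'foo\nbar\nbaz\n'
```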
-- Greg From python at mrabarnett.plus.com Sat May 11 03:12:56 2013 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 11 May 2013 02:12:56 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518D9B18.90605@mrabarnett.plus.com> On 11/05/2013 01:29, Bruce Leban wrote: > I got bit by this quite recently, leaving out a comma in a long list of > strings and I only found the bug by accident. > > This being python "ideas" I'll throw one out. > > Add another prefix character to strings: > > a = [m'abc' > 'def'] # equivalent to ['abcdef'] > > A string with an m prefix is continued on one or more following lines. A > string must have an m prefix to be continued (but this change would have > to be phased in). A conversion tool need merely recognize the string > continuations and insert m's. I chose the m character for multi-line but > the character choice is available for bikeshedding. The m prefix can be > combined with u and/or r but not with triple-quotes. The following are > not allowed: > > b = ['abc' # syntax error (m is required for continuation) > 'def') > > c = [m'abc'] # syntax error (when m is used, continuation lines > must be present) > > d = [m'abc' > m'def'] # syntax error (m only allowed for first string) > > The reason to prohibit cases c and d guard against comma errors with > these forms. Consider these cases with missing or extra commas. > > e = [m'abc', # extra comma causes syntax error > 'def'] > > f = [m'abc' # missing comma causes syntax error > m'def', > 'ghi'] > > Yes, I know this doesn't guard against all comma errors. You could > protect against more with prefix and suffix (e.g., an m at the end of > the last string) but I'm skeptical it's worth it. 
> > Conversion to this could be done in three stages: > > (1) accept m's (case a), deprecate missing m's (case b), error for > misused m's (case c-f) > (2) warn on missing m's (case b) > (3) error on missing m's (case b) > It wouldn't help with: f = [m'abc' 'def' 'ghi'] vs: f = [m'abc' 'def', 'ghi'] I think I'd go more for a triple-quoted string with a prefix for dedenting and removing newlines: f = [m''' abc def ghi '''] where f == ['abcdefghi']. From jsbueno at python.org.br Sat May 11 04:13:54 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 10 May 2013 23:13:54 -0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518D9B18.90605@mrabarnett.plus.com> References: <518D9B18.90605@mrabarnett.plus.com> Message-ID: Any chance that, along with this deprecation, there could come a syntax for ignoring indentation space inside multiline strings? Think something along: logger.warn(I'''C: 'Ello, Miss? Owner: What do you mean "miss"? C: I'm sorry, I have a cold. I wish to make a complaint! O: We're closin' for lunch. C: Never mind that, my lad. I wish to complain about this parrot what I purchased not half an hour ago from this very boutique.\ ''') Against: logger.warn( 'Owner: What do you mean "miss"?\n' + 'C: I\'m sorry, I have a cold. I wish to make a complaint!\n' + 'O: We\'re closin\' for lunch.\n' + 'C: Never mind that, my lad. I wish to complain about this\n' + 'parrot what I purchased not half an hour ago from this very boutique.\n' ) I know this suggestion has come and gone before - but it still looks like a good idea to me - there is no ambiguity there - you either punch enough spaces to get your content aligned with the " I''' " in the first line, or get a SyntaxError. From rovitotv at gmail.com Sat May 11 04:38:37 2013 From: rovitotv at gmail.com (Todd V.
Rovito) Date: Fri, 10 May 2013 22:38:37 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: <15F917E4-A038-4DE3-83DF-C5DD3992B3AB@gmail.com> On May 10, 2013, at 1:16 PM, Terry Jan Reedy wrote: > Todd, please don't suggest that people post half-baked, possibly impossible to code, ideas to the tracker. The tracker already has a thousand enhancement requests. Many are dead clutter. Others need discussion that they will never get on the tracker. In general, I think it is much better for code ideas to come first to this list or, for Idle ideas, idle-dev, to see if they are new, feasible, and have sufficient support to be applied once coded. No problem I won't do it again. Alexandre, I would like to talk about your idea more so feel free to bring the conversation over to idle-dev. Thanks. From stephen at xemacs.org Sat May 11 05:31:06 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 11 May 2013 12:31:06 +0900 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518D9B18.90605@mrabarnett.plus.com> References: <518D9B18.90605@mrabarnett.plus.com> Message-ID: <87txmadwg5.fsf@uwakimon.sk.tsukuba.ac.jp> MRAB writes: > I think I'd go more for a triple-quoted string with a prefix for > dedenting and removing newlines: > > f = [m''' > abc > def > ghi > '''] > > where f == ['abcdefghi']. Cool enough, but >>> f = [m''' ... abc ... def ... ghi ... '''] >>> f == ['abc def ghi'] True Worse, >>> f = [m''' ... abc ... def ... ghi ... '''] >>> f == ['abc def ghi'] True Yikes! (Yeah, I know about consenting adults.) From dreamingforward at gmail.com Sat May 11 05:43:00 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Fri, 10 May 2013 20:43:00 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
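MRAB's proposed m'''...''' literal can be emulated at runtime with textwrap.dedent. The helper below is hypothetical — it only mimics the quoted example, and Stephen's caveat still applies: indentation beyond the common margin survives into the result:

```python
import textwrap

def m(text):
    """Runtime stand-in for the proposed m'''...''' literal: dedent,
    drop surrounding blank lines, and join the remaining lines."""
    body = textwrap.dedent(text).strip("\n")
    return "".join(body.splitlines())

f = [m('''
    abc
    def
    ghi
''')]
assert f == ['abcdefghi']
```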
In-Reply-To: <518D4187.5030902@whoosh.ca> References: <518D4187.5030902@whoosh.ca> Message-ID: On Fri, May 10, 2013 at 11:50 AM, Matt Chaput wrote: > On 5/10/2013 2:48 PM, Guido van Rossum wrote: >> >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? > > Yes please! I've been bitten by the same issue more than once. +1 -m From ncoghlan at gmail.com Sat May 11 05:37:04 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 May 2013 13:37:04 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On Sat, May 11, 2013 at 10:29 AM, Bruce Leban wrote: > I got bit by this quite recently, leaving out a comma in a long list of > strings and I only found the bug by accident. > > This being python "ideas" I'll throw one out. > > Add another prefix character to strings: > > a = [m'abc' > 'def'] # equivalent to ['abcdef'] As MRAB suggested, a prefix for a compile time dedent would likely be more useful - then you'd just use a triple quoted string and be done with it. The other one I occasionally wish for is a compile time equivalent of str.split (if we had that, we likely wouldn't see APIs like collections.namedtuple and enum.Enum accepting space separated strings). Amongst my ideas-so-farfetched-I-never-even-wrote-them-up (which is saying something, given some of the ideas I *have* written up) is a notation like: !processor!"STRING LITERAL" Where the compile time string processors would have to be registered through an appropriate API (probably in the sys module). Then you would just define preprocessors like "merge" or "dedent" or "split" or "sh" or "format" and get the appropriate compile time raw string->AST translation. So for this use case, you would do: a = [!merge!"""\ abc def"""] Cheers, Nick.
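The space-separated-string convention Nick mentions is the one namedtuple (and later enum's functional API) already implements at runtime — the field string is split by the callee itself:

```python
from collections import namedtuple

# namedtuple accepts its field names as one space-separated string
# and calls str.split on it at class-creation time.
Point = namedtuple("Point", "x y")
assert Point._fields == ("x", "y")
assert Point(1, 2) == (1, 2)  # instances compare equal to plain tuples
```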
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From dreamingforward at gmail.com Sat May 11 05:50:56 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Fri, 10 May 2013 20:50:56 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518D9444.20200@canterbury.ac.nz> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> Message-ID: > Maybe we could turn ... into a "string continuation > operator": > > print("This is example %d of a line that is "... > "too long" % example_number) I think that is an awesome idea. -- MarkJ Tacoma, Washington From raymond.hettinger at gmail.com Sat May 11 06:04:56 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 10 May 2013 21:04:56 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <69CB3F9E-9E58-4D67-BF19-BE67F462A1D7@gmail.com> On May 10, 2013, at 11:48 AM, Guido van Rossum wrote: > Would it be reasonable to start deprecating this and eventually remove > it from the language? I don't think it would be missed. I very rarely see it used in practice. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjenvey at underboss.org Sat May 11 06:20:16 2013 From: pjenvey at underboss.org (Philip Jenvey) Date: Fri, 10 May 2013 21:20:16 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <69CB3F9E-9E58-4D67-BF19-BE67F462A1D7@gmail.com> References: <69CB3F9E-9E58-4D67-BF19-BE67F462A1D7@gmail.com> Message-ID: <930167C9-A12E-4E05-83AA-0EDC99168987@underboss.org> On May 10, 2013, at 9:04 PM, Raymond Hettinger wrote: > > On May 10, 2013, at 11:48 AM, Guido van Rossum wrote: > >> Would it be reasonable to start deprecating this and eventually remove >> it from the language? > > I don't think it would be missed. 
I very rarely see it used in practice. Really? I see it used over multiple lines all over the place. -- Philip Jenvey From abarnert at yahoo.com Sat May 11 07:05:55 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 10 May 2013 22:05:55 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> Message-ID: <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> On May 10, 2013, at 20:50, Mark Janssen wrote: >> Maybe we could turn ... into a "string continuation >> operator": >> >> print("This is example %d of a line that is "... >> "too long" % example_number) > > I think that is an awesome idea. How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols). Also, this gives two ways to do it, that have the exact same effect when they're both legal. The only difference is that the new way is only legal in a restricted set of cases. By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples? From random832 at fastmail.us Sat May 11 07:11:44 2013 From: random832 at fastmail.us (Random832) Date: Sat, 11 May 2013 01:11:44 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> Message-ID: <518DD310.10100@fastmail.us> On 05/11/2013 01:05 AM, Andrew Barnert wrote: > How is this any better than + in the same position? 
It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols). > > By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples? You just answered your own question. The reason it's better than + in the same position, for those people, is that it would have higher precedence than %. From abarnert at yahoo.com Sat May 11 07:12:58 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 10 May 2013 22:12:58 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <50153BC4-AB57-4F54-95CC-C13540E120E9@yahoo.com> On May 10, 2013, at 20:37, Nick Coghlan wrote: > On Sat, May 11, 2013 at 10:29 AM, Bruce Leban wrote: >> I got bit by this quite recently, leaving out a comma in a long list of >> strings and I only found the bug by accident. >> >> This being python "ideas" I'll throw one out. >> >> Add another prefix character to strings: >> >> a = [m'abc' >> 'def'] # equivalent to ['abcdef'] > > As MRAB suggested, a prefix for a compile time dedent would likely be > more useful - then you'd just use a triple quoted string and be done > with it. The other one I occasionally wish for is a compile time > equivalent of str.split (if we had that, we likely wouldn't see APIs > like collections.namedtuple and enum.Enum accepting space separated > strings). Why does it need to be compile time? Do people really run into cases that frequently where the cost of concatenating or dedenting strings at import time is significant? If so, it seems like something more dramatic might be warranted, like allowing the compiler to assume that method calls on literals have the same effect at compile time as at runtime so it can turn them into constants. (Doesn't the + optimization already make that assumption anyway?) 
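Both halves of this exchange are easy to verify: Random832's precedence point with a pair of one-liners, and the "+ optimization" Andrew asks about by inspecting compiled code objects. The constant-folding half reflects CPython's optimizer and may differ in other implementations:

```python
# Precedence: adjacent literals merge at compile time, BEFORE % applies.
merged = "%d of " "%d" % (1, 2)   # acts like '%d of %d' % (1, 2)
assert merged == "1 of 2"

# With '+', % binds tighter than +, so it formats only '%d' -- and
# '%d' % (1, 2) has one placeholder for two arguments.
err = None
try:
    "%d of " + "%d" % (1, 2)
except TypeError as exc:
    err = exc
assert err is not None

# Constant folding: in CPython, both spellings end up as a single
# constant 'ab' in the compiled code object.
implicit = compile("'a' 'b'", "<demo>", "eval")
explicit = compile("'a' + 'b'", "<demo>", "eval")
assert "ab" in implicit.co_consts
assert "ab" in explicit.co_consts
```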
From stephen at xemacs.org Sat May 11 07:16:00 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 11 May 2013 14:16:00 +0900 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> Message-ID: <87obcidrlb.fsf@uwakimon.sk.tsukuba.ac.jp> Mark Janssen writes: > > Maybe we could turn ... into a "string continuation > > operator": > > > > print("This is example %d of a line that is "... > > "too long" % example_number) > > I think that is an awesome idea. Violates TOOWTDI. >>> print("This is an" + # traditional explicit operator ... " %s idea." % ("awesome" if False else "unimpressive")) This is an unimpressive idea. >>> already works (and always has AFAIK -- modulo the ternary operator, of course). From abarnert at yahoo.com Sat May 11 07:15:48 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 10 May 2013 22:15:48 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518DD310.10100@fastmail.us> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518DD310.10100@fastmail.us> Message-ID: <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> On May 10, 2013, at 22:11, Random832 wrote: > On 05/11/2013 01:05 AM, Andrew Barnert wrote: >> How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols). >> >> By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples? > You just answered your own question. 
The reason it's better than + in the same position, for those people, is that it would have higher precedence than %. Ah, that makes sense. Except that % formatting is supposed to be one of those "we haven't deprecated it, but we will, so stop using it" features, so it seems a little odd to add new syntax to make % formatting easier. Also, doesn't this imply that ... is now an operator in some contexts, but a literal in others? From g.brandl at gmx.net Sat May 11 07:24:39 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 11 May 2013 07:24:39 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> References: <20130510211613.53f7649d@fsol> <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> Message-ID: Am 11.05.2013 01:43, schrieb Philip Jenvey: > > On May 10, 2013, at 1:09 PM, Michael Foord wrote: > >> On 10 May 2013 20:16, Antoine Pitrou wrote: >> >> I'm rather -1. It's quite convenient and I don't want to add some '+' >> signs everywhere I use it. I'm sure many people also have long string >> literals out there and will have to endure the pain of a dull task to >> "fix" their code. >> >> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since >> the "continuation" is on the same line. >> >> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines. > > Strongly -1 on this proposal, I also use this quite often. -1 here. I use it a lot too, and find it very convenient, and while I could live with the change, I think it should have been made together with the lot of other syntax changes going to Python 3. Georg From random832 at fastmail.us Sat May 11 07:30:56 2013 From: random832 at fastmail.us (Random832) Date: Sat, 11 May 2013 01:30:56 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518DD310.10100@fastmail.us> <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> Message-ID: <518DD790.6090702@fastmail.us> On 05/11/2013 01:15 AM, Andrew Barnert wrote: > Ah, that makes sense. > > Except that % formatting is supposed to be one of those "we haven't deprecated it, but we will, so stop using it" features, so it seems a little odd to add new syntax to make % formatting easier. > Well, technically the same would apply to .format(), I guess. From steve at pearwood.info Sat May 11 07:36:39 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 11 May 2013 15:36:39 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518DD8E7.10204@pearwood.info> On 11/05/13 04:48, Guido van Rossum wrote: > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). > > Now, with modern compiler technology, we can (and in fact do) evaluate > compile-time string literal concatenation with the '+' operator, so > there's really no reason to support 'a' 'b' any more. (The reason was > always rather flimsy; I copied it from C but the reason why it's > needed there doesn't really apply to Python, as it is mostly useful > inside macros.) > > Would it be reasonable to start deprecating this and eventually remove > it from the language? Not unless you guarantee that compile-time folding of string literals with '+' will be a language feature rather than an optional optimization. 
I frequently use implicit string concatenation for long strings, or to keep within the 80 character line limit. I teach people to prefer it over '+' because: - constant folding is an implementation detail that is not guaranteed, and not all versions of Python support; - even when provided, constant folding is an optimization which might not be present in the future[1]; - implicit string concatenation is a language feature, so every Python must support it; - and is nicer than the current alternatives involving backslashes or triple-quoted strings. The problems caused by implicit string concatenation are uncommon and mild. Having two string literals immediately next to each other is uncommon; forgetting the comma makes it rarer. So I think the benefit of implicit string concatenation far outweighs the occasional problem. [1] I recall you (GvR) publicly complaining about CPython optimizations and suggesting that they are more effort than they are worth and should be dropped. I don't recall whether you explicitly included constant folding in that. -- Steven From greg.ewing at canterbury.ac.nz Sat May 11 07:39:57 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 11 May 2013 17:39:57 +1200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518DD310.10100@fastmail.us> <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> Message-ID: <518DD9AD.8030403@canterbury.ac.nz> Andrew Barnert wrote: > Except that % formatting is supposed to be one of those > "we haven't deprecated it, but we will, so stop using it" features, > so it seems a little odd to add new syntax to make % formatting easier. Except that the same problem also occurs with .format() formatting. 
>>> "a{}b" "c{}d" .format(1,2) 'a1bc2d' but >>> "a{}b" + "c{}d" .format(1,2) 'a{}bc1d' so you need >>> ("a{}b" + "c{}d") .format(1,2) 'a1bc2d' > Also, doesn't this imply that ... is now an operator in some contexts, > but a literal in others? It would have different meanings in different contexts, yes. But I wouldn't think of it as an operator, more as a token indicating string continuation, in the same way that the backslash indicates line continuation. -- Greg From random832 at fastmail.us Sat May 11 07:44:30 2013 From: random832 at fastmail.us (Random832) Date: Sat, 11 May 2013 01:44:30 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518DD8E7.10204@pearwood.info> References: <518DD8E7.10204@pearwood.info> Message-ID: <518DDABE.4050608@fastmail.us> On 05/11/2013 01:36 AM, Steven D'Aprano wrote: > Not unless you guarantee that compile-time folding of string literals > with '+' will be a language feature rather than an optional optimization. What makes you think that implicit concatenation being compile-time isn't optional? From steve at pearwood.info Sat May 11 07:53:53 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 11 May 2013 15:53:53 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <50153BC4-AB57-4F54-95CC-C13540E120E9@yahoo.com> References: <50153BC4-AB57-4F54-95CC-C13540E120E9@yahoo.com> Message-ID: <518DDCF1.3000408@pearwood.info> On 11/05/13 15:12, Andrew Barnert wrote: > Why does it need to be compile time? Do people really run into cases that frequently where the cost of concatenating or dedenting strings at import time is significant? String constants do not need to be concatenated only at import time. Strings frequently need to be concatenated at run-time, or at function call time, or inside loops. 
For constants known at compile time, it is better to use a string literal rather than a string calculated at run-time for the same reason that it is better to write 2468 rather than 2000+400+60+8 -- because it better reflects the way we think about the program, not just because of the run-time expense of extra unnecessary additions/concatenations. > If so, it seems like something more dramatic might be warranted, like allowing the compiler to assume that method calls on literals have the same effect at compile time as at runtime so it can turn them into constants. In principle, the keyhole optimizer could make that assumption. In practice, there is a limit to how much effort people put into the optimizer. Constant-folding method calls is probably past the point of diminishing returns. -- Steven From steve at pearwood.info Sat May 11 08:00:34 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 11 May 2013 16:00:34 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518DDABE.4050608@fastmail.us> References: <518DD8E7.10204@pearwood.info> <518DDABE.4050608@fastmail.us> Message-ID: <518DDE82.3080009@pearwood.info> On 11/05/13 15:44, Random832 wrote: > On 05/11/2013 01:36 AM, Steven D'Aprano wrote: >> Not unless you guarantee that compile-time folding of string literals with '+' will be a language feature rather than an optional optimization. > > What makes you think that implicit concatenation being compile-time isn't optional? http://docs.python.org/3/reference/lexical_analysis.html#string-literal-concatenation In the sense that there is no ISO standard for Python, *everything* is optional if Guido decrees that it is. But compile-time implicit concatenation is a documented language feature, not an implementation-dependent optimization. 
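Steven's distinction between a documented language feature and an optional optimization can be observed directly. A minimal sketch (the function `f` and its string are invented for illustration): because implicit concatenation happens in the compiler, the joined string shows up as a single constant in the code object, whereas folding of `"spam " + "eggs"` is left to the peephole optimizer and is not asserted here.

```python
def f():
    # Two adjacent literals; the lexer joins them into one constant.
    return ("spam and "
            "eggs")

# The joined string is a single entry in the function's constants:
print("spam and eggs" in f.__code__.co_consts)  # True
```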
-- Steven From abarnert at yahoo.com Sat May 11 09:13:27 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 11 May 2013 00:13:27 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518DDCF1.3000408@pearwood.info> References: <50153BC4-AB57-4F54-95CC-C13540E120E9@yahoo.com> <518DDCF1.3000408@pearwood.info> Message-ID: <638A25D9-DF34-4848-90F5-FE56F2FA1B65@yahoo.com> On May 10, 2013, at 22:53, Steven D'Aprano wrote: > On 11/05/13 15:12, Andrew Barnert wrote: > >> Why does it need to be compile time? Do people really run into cases that frequently where the cost of concatenating or dedenting strings at import time is significant? > > > String constants do not need to be concatenated only at import time. > > Strings frequently need to be concatenated at run-time, or at function call time, or inside loops. For constants known at compile time, it is better to use a string literal rather than a string calculated at run-time for the same reason that it is better to write 2468 rather than 2000+400+60+8 -- because it better reflects the way we think about the program, not just because of the run-time expense of extra unnecessary additions/concatenations. Well, you have the choice of either: count = 2000 + 400 + 60 + 8 for e in hugeiter: foo(e, count) Or: for e in hugeiter: foo(e, 2468) # 2000 + 400 + 60 + 8 And again, considering that the whole point of string concatenation is dealing with cases that are hard to fit into 80 cols otherwise, the former option is, if anything, even more appropriate. >> If so, it seems like something more dramatic might be warranted, like allowing the compiler to assume that method calls on literals have the same effect at compile time as at runtime so it can turn them into constants. > > In principle, the keyhole optimizer could make that assumption. In practice, there is a limit to how much effort people put into the optimizer. 
Constant-folding method calls is probably past the point of diminishing returns. Adding new optimizations just for the hell of it is obviously not a good idea. But we're talking about the cost of adding an optimization to vs. adding a new type of auto-dedenting string literal. It seems like about the same scope either way, and the former doesn't require any changes to the grammar, docs, other implementations, etc.--or, more importantly, existing user code. And it might even improve other related cases. If the problem is so important we're seriously considering changing the syntax, it seems a little unwarranted to reject the optimization out of hand. Or, contrarily, if the optimization is obviously not worth doing, changing the syntax to let people do the same optimization manually seems excessive. From stefan_ml at behnel.de Sat May 11 11:37:54 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 May 2013 11:37:54 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> Message-ID: Georg Brandl, 11.05.2013 07:24: > Am 11.05.2013 01:43, schrieb Philip Jenvey: >> On May 10, 2013, at 1:09 PM, Michael Foord wrote: >>> On 10 May 2013 20:16, Antoine Pitrou wrote: >>> >>> I'm rather -1. It's quite convenient and I don't want to add some '+' >>> signs everywhere I use it. I'm sure many people also have long string >>> literals out there and will have to endure the pain of a dull task to >>> "fix" their code. >>> >>> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since >>> the "continuation" is on the same line. >>> >>> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines. >> >> Strongly -1 on this proposal, I also use this quite often. > > -1 here. 
I use it a lot too, and find it very convenient, and while I could > live with the change, I think it should have been made together with the lot > of other syntax changes going to Python 3. I used to sort-of dislike it in the past and only recently started using it more often, specifically for dealing with long string literals. I really like it for that, although I've also been bitten by the "missing comma" bug. I guess I'm -0.5 on removing it. Stefan From stefan_ml at behnel.de Sat May 11 11:42:31 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 May 2013 11:42:31 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: Ezio Melotti, 10.05.2013 22:40: > ['this is a ' > 'long string', > 'this is another ' > 'long string'] > I agree that requiring extra (...) in this case is reasonable, i.e.: > [('this is a ' > 'long string'), > ('this is another ' > 'long string')] -1, IMHO this makes it more verbose and thus harder to read, because it takes a while to figure out that the parentheses are not meant to surround tuples in this case, which would be the one obvious reason to spot them inside of a list. In a way, it's the reverse of the "spot the missing comma" problem, more like a "spot that there's really no comma". That's just as bad, if you ask me. Stefan From stefan_ml at behnel.de Sat May 11 12:00:36 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 May 2013 12:00:36 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518DDCF1.3000408@pearwood.info> References: <50153BC4-AB57-4F54-95CC-C13540E120E9@yahoo.com> <518DDCF1.3000408@pearwood.info> Message-ID: Steven D'Aprano, 11.05.2013 07:53: > In principle, the keyhole optimizer could make that assumption. In > practice, there is a limit to how much effort people put into the > optimizer. 
Constant-folding method calls is probably past the point of > diminishing returns. Plus, such an optimisation can have a downside. Contrived example: if DEBUG: print('a'.replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa') .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa') .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa')) Expanding this into a string literal will trade space for time, whereas the original code clearly trades time for space. The same applies to string splitting. A list of many short strings takes up more space than a split call on one large string. May not seem like a major concern in most cases that involve string literals, but we shouldn't ignore the possibility that the author of the code might have used the explicit method call quite deliberately. Stefan From breamoreboy at yahoo.co.uk Sat May 11 13:10:37 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 11 May 2013 12:10:37 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518DD310.10100@fastmail.us> <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> Message-ID: On 11/05/2013 06:15, Andrew Barnert wrote: > > Ah, that makes sense. > > Except that % formatting is supposed to be one of those "we haven't deprecated it, but we will, so stop using it" features, so it seems a little odd to add new syntax to make % formatting easier. > I don't think so, see http://mail.python.org/pipermail/python-dev/2012-February/116790.html -- If you're using GoogleCrap? please read this http://wiki.python.org/moin/GoogleGroupsPython. 
Mark Lawrence From stefan at drees.name Sat May 11 13:43:53 2013 From: stefan at drees.name (Stefan Drees) Date: Sat, 11 May 2013 13:43:53 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518D9444.20200@canterbury.ac.nz> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> Message-ID: <518E2EF9.4080108@drees.name> On 11.05.13 02:43, Greg Ewing wrote: > Antoine Pitrou wrote: >> As for "+", saying it is a replacement is a bit simplified, because >> the syntax definition (for method calls) or operator precedence (for >> e.g. %-formatting) may force you to add parentheses. > > Maybe we could turn ... into a "string continuation > operator": > > print("This is example %d of a line that is "... > "too long" % example_number) > at least trying to follow the complete thread so only a late feedback on this proposal from me: The mysterious type [Ellipsis] comes to the rescue with all of its three characters - helping to stay below 80 chars ? In this message I avoid further adding or subtracting numbers to not overflow the result ;-) but I somehow like the current two possible ways of doing "it", as when - manually - migrating code eg. from php to python I may either remove dots or replace these with plus signs. So I have a fast working migrated code base and then - while the clients work with it - I have a more relaxed schedule to further clean it up. [Ellipsis]: http://docs.python.org/3.3/reference/datamodel.html#index-8 All the best, Stefan.

From ubershmekel at gmail.com Sat May 11 15:19:24 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Sat, 11 May 2013 16:19:24 +0300 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <518D95D4.1090505@canterbury.ac.nz> References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> <518D95D4.1090505@canterbury.ac.nz> Message-ID: http://bugs.python.org/issue6818 http://bugs.python.org/review/6818/show I'm not sure what ended up with that actually... Yuval On Sat, May 11, 2013 at 3:50 AM, Greg Ewing wrote: > Serhiy Storchaka wrote: > >> Indeed. infolist() may return a copy or non-modifiable proxy, NameToInfo >> and fp are private attributes. ZipFile may save in private attribute an >> offset of central directory. >> > > Seems to me it would be better to provide ZipFile with a > general remove() operation that does the right thing, with > optimisation for the case where the file happens to be > at the end. > > -- > Greg > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From storchaka at gmail.com Sat May 11 15:28:57 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 11 May 2013 16:28:57 +0300 Subject: [Python-ideas] Add a .pop() method to ZipFile In-Reply-To: <518D95D4.1090505@canterbury.ac.nz> References: <20130510150114.56f776af@pitrou.net> <44A6B3E4-13C8-461A-BB25-B7C87D50DBBF@masklinn.net> <6D66251B-CDA7-4ABD-9512-F78A51799A06@yahoo.com> <5FFA138A-CAB3-439C-B818-CEAF46B97B2A@yahoo.com> <518D95D4.1090505@canterbury.ac.nz> Message-ID: 11.05.13 03:50, Greg Ewing wrote: > Seems to me it would be better to provide ZipFile with a > general remove() operation that does the right thing, with > optimisation for the case where the file happens to be > at the end. http://bugs.python.org/issue6818 However in the general case this means opening a new file, copying the filtered content from the old file to the new one and then replacing the old file by the new one. And even this could fail for weird zipfiles with overlapped files. From storchaka at gmail.com Sat May 11 16:12:27 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 11 May 2013 17:12:27 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <50153BC4-AB57-4F54-95CC-C13540E120E9@yahoo.com> <518DDCF1.3000408@pearwood.info> Message-ID: 11.05.13 13:00, Stefan Behnel wrote: > Plus, such an optimisation can have a downside. Contrived example: > > if DEBUG: > print('a'.replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa') > .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa') > .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa')) > > Expanding this into a string literal will trade space for time, whereas the > original code clearly trades time for space. The same applies to string > splitting. A list of many short strings takes up more space than a split > call on one large string.
> > May not seem like a major concern in most cases that involve string > literals, but we shouldn't ignore the possibility that the author of the > code might have used the explicit method call quite deliberately. x = 0 if x: x = 9**9**9 From jsbueno at python.org.br Sat May 11 16:21:02 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Sat, 11 May 2013 11:21:02 -0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <87txmadwg5.fsf@uwakimon.sk.tsukuba.ac.jp> References: <518D9B18.90605@mrabarnett.plus.com> <87txmadwg5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Please - check my e-mail correctly On 11 May 2013 00:31, Stephen J. Turnbull wrote: > MRAB writes: > > > I think I'd go more for a triple-quoted string with a prefix for > > dedenting and removing newlines: > > > > f = [m''' > > abc > > def > > ghi > > '''] > > I think the prefix idea is obvious - and I used the letter "i" in my message - for "indented" -0 it may be a poor choice indeed since it looks like it may not be noticed sometimes close to the quotes. > > where f == ['abcdefghi']. > > Cool enough, but > >>>> f = [m''' > ... abc > ... def > ... ghi > ... '''] >>>> f == ['abc def ghi'] > True In my proposal, this would yield a Syntax Error - any contents of the string would have to be indented to the same level of the prefix. Sorry if that was not clear enough. > > Worse, > >>>> f = [m''' > ... abc > ... def > ... ghi > ... '''] >>>> f == ['abc def ghi'] > True > > Yikes! (Yeah, I know about consenting adults.) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From storchaka at gmail.com Sat May 11 16:28:48 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 11 May 2013 17:28:48 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: <518D9444.20200@canterbury.ac.nz> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> Message-ID: 11.05.13 03:43, Greg Ewing wrote: > Maybe we could turn ... into a "string continuation > operator": > > print("This is example %d of a line that is "... > "too long" % example_number) > Maybe "/"? ;) From jbvsmo at gmail.com Sat May 11 17:03:09 2013 From: jbvsmo at gmail.com (João Bernardo) Date: Sat, 11 May 2013 12:03:09 -0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: > > Would it be reasonable to start deprecating this and eventually remove > it from the language? > > -1... I find it very useful and clean for multiple lines and actually don't remember having bugs because of it. It could be deprecated/removed when used on a single line though. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kabie2011 at gmail.com Sat May 11 18:09:02 2013 From: kabie2011 at gmail.com (Kabie) Date: Sun, 12 May 2013 00:09:02 +0800 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: +1 for this I don't understand why would anyone wants to use this on a single line of text. But please keep it for multiple lines usage. 2013/5/11 João Bernardo > > It could be deprecated/removed when used on a single line though. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tismer at stackless.com Sat May 11 18:37:54 2013 From: tismer at stackless.com (Christian Tismer) Date: Sat, 11 May 2013 18:37:54 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: References: <20130510211613.53f7649d@fsol> <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> Message-ID: <518E73E2.5060408@stackless.com> On 11.05.13 11:37, Stefan Behnel wrote: > Georg Brandl, 11.05.2013 07:24: >> Am 11.05.2013 01:43, schrieb Philip Jenvey: >>> On May 10, 2013, at 1:09 PM, Michael Foord wrote: >>>> On 10 May 2013 20:16, Antoine Pitrou wrote: >>>> >>>> I'm rather -1. It's quite convenient and I don't want to add some '+' >>>> signs everywhere I use it. I'm sure many people also have long string >>>> literals out there and will have to endure the pain of a dull task to >>>> "fix" their code. >>>> >>>> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since >>>> the "continuation" is on the same line. >>>> >>>> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines. >>> Strongly -1 on this proposal, I also use this quite often. >> -1 here. I use it a lot too, and find it very convenient, and while I could >> live with the change, I think it should have been made together with the lot >> of other syntax changes going to Python 3. > I used to sort-of dislike it in the past and only recently started using it > more often, specifically for dealing with long string literals. I really > like it for that, although I've also been bitten by the "missing comma" bug. > > I guess I'm -0.5 on removing it. > I'm +1 on removing it, if it is combined with better indentation options for triple-quoted strings. So if there was some notation (not specified yet how) that triggers correct indentation at compile time without extra functional hacks, so that long_text = """ this text is left justified and this line indents by two spaces """ is stripped the leading and trailing \n and indentation is justified, then I think the need for the implicit whitespace operator would be small. cheers -- chris -- Christian Tismer :^) Software Consulting : Have a break! 
Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From apalala at gmail.com Sat May 11 18:39:35 2013 From: apalala at gmail.com (Juancarlo Añez) Date: Sat, 11 May 2013 12:09:35 -0430 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: After reading about other people's use-cases, I'm now: -1 I think that all that's required for solving Guido's original use case is a new warning in pylint, pep8, or flake. PEP8 could be updated to discourage the use of automatic concatenation in those places. The warning would apply only to automatic concatenations within parameter passing and structures, and not to assignments or formatting through %. Doing it this way would solve the use case by declaring certain uses of automatic concatenation a "code smell", and automating detection of the bad uses, without any changes to the language. All that Guido needs to do is change PEP8, and wait for the static analyzers to follow. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat May 11 18:48:30 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 02:48:30 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: <518E73E2.5060408@stackless.com> References: <20130510211613.53f7649d@fsol> <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> <518E73E2.5060408@stackless.com> Message-ID: On Sun, May 12, 2013 at 2:37 AM, Christian Tismer wrote: > So if there was some notation (not specified yet how) that triggers correct > indentation at compile time without extra functional hacks, so that > > long_text = """ > this text is left justified > and this line indents by two spaces > """ > > is stripped the leading and trailing \n and indentation is justified, > then I think the need for the implicit whitespace operator would be small. Through participating in this thread, I've realised that the distinction between when I use a triple quoted string (with or without textwrap.dedent()) and when I use implicit string concatenation is whether or not I want the newlines in the result. Often I can avoid the issue entirely by splitting a statement into multiple pieces, but I think Guido's right that if we didn't have implicit string concatenation there's no way we would add it ("just use a triple quoted string with escaped newlines" or "just use runtime string concatenation"), but given that we *do* have it, I don't think it's worth the hassle of removing it over a bug that a lint program should be able to pick up. So I'm back to where I started, which is that if this kind of problem really bothers anyone, start thinking seriously about the idea of a standard library linter. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat May 11 18:49:14 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 02:49:14 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: References: <20130510211613.53f7649d@fsol> <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> <518E73E2.5060408@stackless.com> Message-ID: On Sun, May 12, 2013 at 2:48 AM, Nick Coghlan wrote: > On Sun, May 12, 2013 at 2:37 AM, Christian Tismer wrote: > >> So if there was some notation (not specified yet how) that triggers correct >> indentation at compile time without extra functional hacks, so that >> >> long_text = """ >> this text is left justified >> and this line indents by two spaces >> """ >> >> is stripped the leading and trailing \n and indentation is justified, >> then I think the need for the implicit whitespace operator would be small. > > Through participating in this thread, I've realised that the > distinction between when I use a triple quoted string (with or without > textwrap.dedent()) and when I use implicit string concatenation is > whether or not I want the newlines in the result. Often I can avoid > the issue entirely by splitting a statement into multiple pieces, but ... not always. (Sorry, got distracted and left the sentence unfinished). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From python at mrabarnett.plus.com Sat May 11 18:55:43 2013 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 11 May 2013 17:55:43 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518E780F.5040505@mrabarnett.plus.com> On 11/05/2013 04:37, Nick Coghlan wrote: > On Sat, May 11, 2013 at 10:29 AM, Bruce Leban wrote: >> I got bit by this quite recently, leaving out a comma in a long list of >> strings and I only found the bug by accident. >> >> This being python "ideas" I'll throw one out. >> >> Add another prefix character to strings: >> >> a = [m'abc' >> 'def'] # equivalent to ['abcdef'] > > As MRAB suggested, a prefix for a compile time dedent would likely be > more useful - then you'd just use a triple quoted string and be done > with it. 
The other one I occasionally wish for is a compile time > equivalent of str.split (if we had that, we likely wouldn't see APIs > like collections.namedtuple and enum.Enum accepting space separated > strings). > > Amongst my ideas-so-farfetched-I-never-even-wrote-them-up (which is > saying something, given some of the ideas I *have* written up) is a > notation like: > > !processor!"STRING LITERAL" > > Where the compile time string processors had to be registered through > an appropriate API (probably in the sys module). Then you would just > define preprocessors like "merge" or "dedent" or "split" or "sh" of > "format" and get the appropriate compile time raw string->AST > translation. > > So for this use case, you would do: > > a = [!merge!"""\ > abc > def""" > Do you really need the "!"? String literals can already have a prefix, such as "r". At compile time, the string literal could be preprocessed according to its prefix (some kind of import hook, working on the AST?). The current prefixes are "" (plain literal), "r", "b", "u", etc. From tismer at stackless.com Sat May 11 19:05:05 2013 From: tismer at stackless.com (Christian Tismer) Date: Sat, 11 May 2013 19:05:05 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518E7A41.60903@stackless.com> On 11.05.13 05:37, Nick Coghlan wrote: > On Sat, May 11, 2013 at 10:29 AM, Bruce Leban wrote: >> I got bit by this quite recently, leaving out a comma in a long list of >> strings and I only found the bug by accident. >> >> This being python "ideas" I'll throw one out. >> >> Add another prefix character to strings: >> >> a = [m'abc' >> 'def'] # equivalent to ['abcdef'] > As MRAB suggested, a prefix for a compile time dedent would likely be > more useful - then you'd just use a triple quoted string and be done > with it. 
The other one I occasionally wish for is a compile time > equivalent of str.split (if we had that, we likely wouldn't see APIs > like collections.namedtuple and enum.Enum accepting space separated > strings). > > Amongst my ideas-so-farfetched-I-never-even-wrote-them-up (which is > saying something, given some of the ideas I *have* written up) is a > notation like: > > !processor!"STRING LITERAL" > > Where the compile time string processors had to be registered through > an appropriate API (probably in the sys module). Then you would just > define preprocessors like "merge" or "dedent" or "split" or "sh" of > "format" and get the appropriate compile time raw string->AST > translation. > > So for this use case, you would do: > > a = [!merge!"""\ > abc > def""" > Ah, I see we are on the same path here. Just not sure if it is right to move into a compile-time preprocessor language or to just handle the most common cases with a simple prefix? One example is code snippets which need proper de-indentation. I think a simple stripping of white-space in text = s""" leftmost column two-char indent """ would solve 95 % of common indentation and concatenation cases. I don't think provision for merging is needed very often. If text occurs deeply nested in code, then it is also quite likely to be part of an expression, anyway. My major use-case is text constants in a class or function that is multiple lines long and should be statically ready to use without calling a function. (here an 's' as a strip prefix, but I'm not sold on that) cheers - chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From mal at egenix.com Sat May 11 19:24:02 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 11 May 2013 19:24:02 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518E7A41.60903@stackless.com> References: <518E7A41.60903@stackless.com> Message-ID: <518E7EB2.30004@egenix.com> On 11.05.2013 19:05, Christian Tismer wrote: > I think a simple stripping of white-space in > > text = s""" > leftmost column > two-char indent > """ > > would solve 95 % of common indentation and concatenation cases. > I don't think provision for merging is needed very often. > If text occurs deeply nested in code, then it is also quite likely to > be part of an expression, anyway. > My major use-case is text constants in a class or function that > is multiple lines long and should be statically ready to use without > calling a function. > > (here an 's' as a strip prefix, but I'm not sold on that) This is not a good solution for long lines where you don't want to have embedded line endings. Taken from existing code: _litmonth = ('(?P<litmonth>' 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|' 'mär|mae|mrz|mai|okt|dez|' 'fev|avr|juin|juil|aou|aoû|déc|' 'ene|abr|ago|dic|' 'out' ')[a-z,\.;]*') or raise errors.DataError( 'Inconsistent revenue item currency: ' 'transaction=%r; transaction_position=%r' % (transaction, transaction_position)) We usually try to keep the code line length under 80 chars, so splitting literals in that way is rather common, esp. in nested code paths. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 11 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ...
http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From graffatcolmingov at gmail.com Sat May 11 20:18:45 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Sat, 11 May 2013 14:18:45 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> <518E73E2.5060408@stackless.com> Message-ID: On Sat, May 11, 2013 at 12:48 PM, Nick Coghlan wrote: > On Sun, May 12, 2013 at 2:37 AM, Christian Tismer wrote: > >> So if there was some notation (not specified yet how) that triggers correct >> indentation at compile time without extra functional hacks, so that >> >> long_text = """ >> this text is left justified >> and this line indents by two spaces >> """ >> >> is stripped the leading and trailing \n and indentation is justified, >> then I think the need for the implicit whitespace operator would be small. > > Through participating in this thread, I've realised that the > distinction between when I use a triple quoted string (with or without > textwrap.dedent()) and when I use implicit string concatenation is > whether or not I want the newlines in the result. Often I can avoid > the issue entirely by splitting a statement into multiple pieces, but > > I think Guido's right that if we didn't have implicit string > concatenation there's no way we would add it ("just use a triple > quoted string with escaped newlines" or "just use runtime string > concatenation"), but given that we *do* have it, I don't think it's > worth the hassle of removing it over a bug that a lint program should > be able to pick up. 
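A check along those lines is straightforward to sketch with the tokenize module: two consecutive STRING tokens in the token stream are exactly the pattern such a linter would flag (a rough sketch, not any existing tool; `find_implicit_concat` is a made-up name):

```python
import io
import tokenize

def find_implicit_concat(source):
    """Return (line, col) positions where two string literals sit side by side."""
    hits = []
    prev_was_string = False
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING:
            if prev_was_string:
                # Record where the second literal of the pair starts.
                hits.append(tok.start)
            prev_was_string = True
        elif tok.type not in (tokenize.NL, tokenize.COMMENT):
            # Anything other than an intervening newline-in-brackets or
            # comment breaks the adjacency.
            prev_was_string = False
    return hits

# The missing-comma bug inside a list -- flagged:
print(find_implicit_concat("names = ['abc'\n         'def']\n"))  # [(2, 9)]
# An explicit '+' resets the check, so this is clean:
print(find_implicit_concat("msg = 'abc' + 'def'\n"))  # []
```

A real checker would also want a whitelist for the deliberate long-literal-splitting uses discussed in this thread, but the detection itself is only a few lines.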
> > So I'm back to where I started, which is that if this kind of problem > really bothers anyone, start thinking seriously about the idea of a > standard library linter. Really this should be trivial for all of the linters that already exist. That aside, (and this is not an endorsement for this proposal) but can you not just do long_text = """\ this is left justified \ and this is continued on the same line and this is indented by two spaces """ I'm personally in favor of not allowing the concatenation to be on the same line but allowing it across multiple lines. While linters would be great for this, why not just introduce the SyntaxError since (as has already been demonstrated) some of the concatenation already happens at compile time. From tismer at stackless.com Sat May 11 20:37:12 2013 From: tismer at stackless.com (Christian Tismer) Date: Sat, 11 May 2013 20:37:12 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518E7EB2.30004@egenix.com> References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> Message-ID: <518E8FD8.8030800@stackless.com> On 11.05.13 19:24, M.-A. Lemburg wrote: > On 11.05.2013 19:05, Christian Tismer wrote: >> I think a simple stripping of white-space in >> >> text = s""" >> leftmost column >> two-char indent >> """ >> >> would solve 95 % of common indentation and concatenation cases. >> I don't think provision for merging is needed very often. >> If text occurs deeply nested in code, then it is also quite likely to >> be part of an expression, anyway. >> My major use-case is text constants in a class or function that >> is multiple lines long and should be statically ready to use without >> calling a function. >> >> (here an 's' as a strip prefix, but I'm not sold on that) > This is not a good solution for long lines where you don't want to > have embedded line endings. 
Taken from existing code: > > _litmonth = ('(?P' > 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|' > 'm?r|mae|mrz|mai|okt|dez|' > 'fev|avr|juin|juil|aou|ao?|d?c|' > 'ene|abr|ago|dic|' > 'out' > ')[a-z,\.;]*') > > or > raise errors.DataError( > 'Inconsistent revenue item currency: ' > 'transaction=%r; transaction_position=%r' % > (transaction, transaction_position)) > > We usually try to keep the code line length under 80 chars, > so splitting literals in that way is rather common, esp. in > nested code paths. > Your first example is a regex, which could be used as-is. Your second example is indented five levels deep. That is a coding style which I would propose to write differently for better readability. And if you stick with it, why not use the "+"? I want to support constant strings, which should not be somewhere in the middle of code. Your second example is computed, anyway, not the case that I want to solve. cheers - chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From ncoghlan at gmail.com Sat May 11 20:51:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 04:51:09 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518E780F.5040505@mrabarnett.plus.com> References: <518E780F.5040505@mrabarnett.plus.com> Message-ID: On Sun, May 12, 2013 at 2:55 AM, MRAB wrote: > Do you really need the "!"? String literals can already have a prefix, > such as "r". > > At compile time, the string literal could be preprocessed according to > its prefix (some kind of import hook, working on the AST?). The current > prefixes are "" (plain literal), "r", "b", "u", etc. 1. 
Short prefixes are inherently cryptic (especially single letter ones) 2. The existing prefixes control how the source code is converted to a string, they don't permit conversion to a completely different construct 3. Short prefixes are not extensible and rapidly run into namespacing issues As noted, I prefer not to solve this problem at all (and add a basic lint capability instead). However, if we do try to solve it, then I'd prefer a syntax that adds a general extensible capability rather than one that piles additional complications on the existing string prefix mess. If we support dedent, do we also support merging adjacent whitespace characters into a single string? Do we support splitting a string? Do we support upper case or lower case or taking its length? Two responses make sense to me: accept the status quo (perhaps with linter support), or design and champion a general compile time string processing capability (that doesn't rely on encoding tricks or a custom import hook). Expanding on the already cryptic string prefix system does *not* strike me as a reasonable idea at all. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From dreamingforward at gmail.com Sat May 11 20:52:15 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sat, 11 May 2013 11:52:15 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> Message-ID: >>> Maybe we could turn ... into a "string continuation >>> operator": >>> >>> print("This is example %d of a line that is "... >>> "too long" % example_number) >> >> I think that is an awesome idea. > > How is this any better than + in the same position? 
It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols). It partitions the conceptual space. "+" is a mathematical operator, but strings are not numbers. That's the negative argument for it. The positive, further, argument is that the ellipsis has a long history of being a continuation indicator in text. > By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples? An interesting correlation indeed. -- MarkJ Tacoma, Washington From graffatcolmingov at gmail.com Sat May 11 20:57:49 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Sat, 11 May 2013 14:57:49 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> Message-ID: On Sat, May 11, 2013 at 2:52 PM, Mark Janssen wrote: >>>> Maybe we could turn ... into a "string continuation >>>> operator": >>>> >>>> print("This is example %d of a line that is "... >>>> "too long" % example_number) >>> >>> I think that is an awesome idea. >> >> How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols). > > It partitions the conceptual space. "+" is a mathematical operator, > but strings are not numbers. That's the negative argument for it. > The positive, further, argument is that the ellipsis has a long history > of being a continuation indicator in text. But + is already a supported operation on strings and has been since at least python 2. It is already there and it doesn't require a new dunder method for concatenating with the Ellipsis object.
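For what it's worth, the merging really does happen before any code runs; on CPython this is easy to verify by looking at the constants of a compiled snippet (a minimal sketch):

```python
# Adjacent string literals are merged while the source is being compiled,
# so only the combined constant ever reaches the bytecode.
code = compile("greeting = 'Hello, ' 'world'", "<example>", "exec")
print('Hello, world' in code.co_consts)  # True -- a single folded constant

# Recent CPython releases also fold "'Hello, ' + 'world'" at compile time
# (constant folding in the AST optimizer), so in practice neither spelling
# pays a runtime cost for literal operands.
```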
It's also relatively fast and already performed at compile time. If we're going to remove this implicit concatenation, why do we have to add a fancy new feature that's non-obvious and going to need extra implementation? >> By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples? > > An interesting correlation indeed. Albeit one that is probably unrelated. I use str.format everywhere (mostly because I don't support python 2.5 in most of my work) and I'm against it. I just haven't given examples against it because others have already presented examples that I would have provided. From dreamingforward at gmail.com Sat May 11 21:22:58 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sat, 11 May 2013 12:22:58 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <87obcidrlb.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <87obcidrlb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: > > I think that is an awesome idea. > > Violates TOOWTDI. > > >>> print("This is an" + # traditional explicit operator > ... " %s idea." % ("awesome" if False else "unimpressive")) > This is an unimpressive idea. > >>> But you see you just helped me demonstrate my point: the Python interpreter *itself* uses ... as a line-continuation operator! Also, it won't violate TOOWTDI if the "+" operator is deprecated for strings. Strings are different from numbers anyway, it's an old habit/wart to use "+" for them. *moving out of the way* :)) -- MarkJ Tacoma, Washington From graffatcolmingov at gmail.com Sat May 11 21:27:51 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Sat, 11 May 2013 15:27:51 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <87obcidrlb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sat, May 11, 2013 at 3:22 PM, Mark Janssen wrote: >> > I think that is an awesome idea. >> >> Violates TOOWTDI. >> >> >>> print("This is an" + # traditional explicit operator >> ... " %s idea." % ("awesome" if False else "unimpressive")) >> This is an unimpressive idea. >> >>> > > But you see you just helped me demonstrate my point: the Python > interpreter *itself* uses ... as a line-continuation operater! It also uses it when you define a class or function, should those declarations use Ellipsis everywhere too? (For reference: >>> class A: ... a = 1 ... def __init__(self, **kwargs): ... for k, v in kwargs.items(): ... if k != 'a': ... setattr(self, k, v) ... >>> i = A() But this is getting off-topic and the question is purely rhetorical.) -- Ian From mal at egenix.com Sat May 11 23:14:14 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 11 May 2013 23:14:14 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518E8FD8.8030800@stackless.com> References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> <518E8FD8.8030800@stackless.com> Message-ID: <518EB4A6.10009@egenix.com> On 11.05.2013 20:37, Christian Tismer wrote: > On 11.05.13 19:24, M.-A. Lemburg wrote: >> On 11.05.2013 19:05, Christian Tismer wrote: >>> I think a simple stripping of white-space in >>> >>> text = s""" >>> leftmost column >>> two-char indent >>> """ >>> >>> would solve 95 % of common indentation and concatenation cases. >>> I don't think provision for merging is needed very often. >>> If text occurs deeply nested in code, then it is also quite likely to >>> be part of an expression, anyway. >>> My major use-case is text constants in a class or function that >>> is multiple lines long and should be statically ready to use without >>> calling a function. 
>>> >>> (here an 's' as a strip prefix, but I'm not sold on that) >> This is not a good solution for long lines where you don't want to >> have embedded line endings. Taken from existing code: >> >> _litmonth = ('(?P' >> 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|' >> 'm?r|mae|mrz|mai|okt|dez|' >> 'fev|avr|juin|juil|aou|ao?|d?c|' >> 'ene|abr|ago|dic|' >> 'out' >> ')[a-z,\.;]*') >> >> or >> raise errors.DataError( >> 'Inconsistent revenue item currency: ' >> 'transaction=%r; transaction_position=%r' % >> (transaction, transaction_position)) >> >> We usually try to keep the code line length under 80 chars, >> so splitting literals in that way is rather common, esp. in >> nested code paths. >> > > Your first example is a regex, which could be used as-is. > > Your second example is indented five levels deep. That is a coding > style which I would propose to write differently for better readability. > And if you stick with it, why not use the "+"? > > I want to support constant strings, which should not be somewhere > in the middle of code. Your second example is computed, anyway, > not the case that I want to solve. You're not addressing the main point I was trying to make :-) Triple-quoted strings work for strings that are supposed to have embedded newlines, but they don't provide a good alternative for long strings without embedded newlines. Regarding using '+' in these cases: of course that would be possible, but it clutters up the code, often requires additional parens, it's slower and can lead to other weird errors when forgetting parens, which are not much different than the one Guido mentioned in his original email. In all the years I've been writing Python, I've only very rarely had an issue with missing commas between strings. 
Most cases I ran into were missing commas in lists of tuples, not strings: l = [ 'detect_target_type', (None, Is, '"', +1, 'double_quoted_target') (None, Is, '\'', +1, 'single_quoted_target'), (None, IsIn, separators, 'unquoted_target', 'empty_target'), ] This gives: Traceback (most recent call last): File "", line 4, in TypeError: 'tuple' object is not callable :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 11 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From pjenvey at underboss.org Sat May 11 23:23:45 2013 From: pjenvey at underboss.org (Philip Jenvey) Date: Sat, 11 May 2013 14:23:45 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <26629027-323D-4BE3-A802-C6568A7157D7@underboss.org> Message-ID: <1D54CEDD-BD14-4729-8D3B-F567C3A918C9@underboss.org> On May 10, 2013, at 10:24 PM, Georg Brandl wrote: > Am 11.05.2013 01:43, schrieb Philip Jenvey: >> >> On May 10, 2013, at 1:09 PM, Michael Foord wrote: >> >>> On 10 May 2013 20:16, Antoine Pitrou wrote: >>> >>> I'm rather -1. It's quite convenient and I don't want to add some '+' >>> signs everywhere I use it. 
I'm sure many people also have long string >>> literals out there and will have to endure the pain of a dull task to >>> "fix" their code. >>> >>> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since >>> the "continuation" is on the same line. >>> >>> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines. >> >> Strongly -1 on this proposal, I also use this quite often. > > -1 here. I use it a lot too, and find it very convenient, and while I could > live with the change, I think it should have been made together with the lot > of other syntax changes going to Python 3. Also note that it was already proposed and rejected for Python 3. http://www.python.org/dev/peps/pep-3126 -- Philip Jenvey From ron3200 at gmail.com Sun May 12 00:19:14 2013 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 11 May 2013 17:19:14 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518DD9AD.8030403@canterbury.ac.nz> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518DD310.10100@fastmail.us> <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> <518DD9AD.8030403@canterbury.ac.nz> Message-ID: <518EC3E2.3030906@gmail.com> Greg, I meant to send my reply earlier to the list. On 05/11/2013 12:39 AM, Greg Ewing wrote: >> Also, doesn't this imply that ... is now an operator in some contexts, > > but a literal in others? Could its use as a literal be deprecated? I haven't seen it used that way except in examples. > It would have different meanings in different contexts, yes. > > But I wouldn't think of it as an operator, more as a token > indicating string continuation, in the same way that the > backslash indicates line continuation. Yep, it would be a token that the tokenizer would handle. So it would be handled before anything else just as the line continuation '\' is.
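The backslash already behaves exactly that way today -- it never shows up in the token stream at all, which is easy to confirm with the tokenize module (a quick sketch):

```python
import io
import tokenize

# A backslash-continued statement: the continuation is consumed while the
# physical lines are being joined, so no token is ever emitted for it.
src = "total = 1 + \\\n        2\n"
names = [tokenize.tok_name[tok.type]
         for tok in tokenize.generate_tokens(io.StringIO(src).readline)]
print(names)  # ['NAME', 'OP', 'NUMBER', 'OP', 'NUMBER', 'NEWLINE', 'ENDMARKER']
```

Note there is no NL token between the two NUMBER tokens -- the two physical lines were joined before tokenization proper.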
After the file is tokenized, it is removed and won't interfere with anything else. It could be limited to strings, or expanded to include numbers and possibly other literals. a = "a long text line "... "that is continued "... "on several lines." pi = 3.1415926535... 8979323846... 2643383279 You can't do this with a line continuation '\'. Another option would be to have dedented multi-line string tokens |""" and |'''. Not too different than r""" or b""". s = |"""Multi line string | |paragraph 1 | |paragraph 2 |""" a = |"""\ |a long text line \ |that is continued \ |on several lines.\ |""" The rule for this is, for strings that start with |""" or |''', each following line needs to be preceded with whitespace + '|', until the closing quote is reached. The tokenizer would just find and remove them as it comes across them. Any '|' on a line after the first '|' would be unaffected, so they don't need to be escaped. It's a very explicit syntax. It's very obvious what is part of the string and what isn't. Something like this would end the endless debate on dedents. That alone might be worth it. ;-) I know the | is also a binary 'or' operator, but its use for that is in a different context, so I don't think it would be a problem. Both of these options would be implemented in the tokenizer and are really just tools for formatting source code rather than actual additions or changes to the language. Cheers, Ron From greg.ewing at canterbury.ac.nz Sun May 12 01:46:30 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 12 May 2013 11:46:30 +1200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> Message-ID: <518ED856.10905@canterbury.ac.nz> Someone wrote: > By the way, is it just a coincidence that almost all of the people sticking up > for keeping or replacing implicit concatenation instead of just scrapping it are > using % formatting in their examples? In my case this is because it's the context in which I use this feature most often. -- Greg From greg.ewing at canterbury.ac.nz Sun May 12 01:55:34 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 12 May 2013 11:55:34 +1200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> Message-ID: <518EDA76.2040602@canterbury.ac.nz> Ian Cordasco wrote: > On Sat, May 11, 2013 at 2:52 PM, Mark Janssen wrote: >>It partitions the conceptual space. "+" is a mathematical operator, >>but strings are not numbers. > > But + is already a supported operation on strings I still think about these two kinds of concatenation in different ways, though. When I use implicit concatenation, I don't think in terms of taking two strings and joining them together. I'm just writing a single string literal that happens to span two source lines. I believe that distinguishing them visually helps readability. Using + for both makes things look more complicated than they really are. -- Greg From greg.ewing at canterbury.ac.nz Sun May 12 01:59:17 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 12 May 2013 11:59:17 +1200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> Message-ID: <518EDB55.6020701@canterbury.ac.nz> Ian Cordasco wrote: > But + is already a supported operation on strings and has been since > at least python 2. It is already there and it doesn't require a new > dunder method for concatenating with the Ellipsis object. There would be no dunder method, because it's not a run-time operation. It's a syntax for writing a string literal that spans more than one line. Using it between any two things that are not string literals would be a syntax error. -- Greg From greg.ewing at canterbury.ac.nz Sun May 12 02:11:52 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 12 May 2013 12:11:52 +1200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <87obcidrlb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <518EDE48.4070208@canterbury.ac.nz> Mark Janssen wrote: > Strings are different from numbers anyway, it's an old > habit/wart to use "+" for them. > > *moving out of the way* :)) /me throws a dictionary at Mark Janssen with a bookmark at the entry for "plus", showing that its usage in English is much wider than it is in mathematics. -- Greg From stephen at xemacs.org Sun May 12 02:30:04 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 12 May 2013 09:30:04 +0900 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <87obcidrlb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87mws1doqb.fsf@uwakimon.sk.tsukuba.ac.jp> Mark Janssen writes: > > > I think that is an awesome idea. > > > > Violates TOOWTDI. 
> > > > >>> print("This is an" + # traditional explicit operator > > ... " %s idea." % ("awesome" if False else "unimpressive")) > > This is an unimpressive idea. > > >>> > > But you see you just helped me demonstrate my point: the Python > interpreter *itself* uses ... as a line-continuation operator! No, it doesn't. It's a (physical) line *separator* there. This: >>> "This is a syntax" + File "<stdin>", line 1 "this is a syntax " + ^ SyntaxError: invalid syntax >>> is a syntax error. If "... " were a line continuation, it would be a prompt for the rest of the line, but you never get there. > Also, it won't violate TOOWTDI if the "+" operator is deprecated for > strings. Strings are different from numbers anyway, it's an old > habit/wart to use "+" for them. They're both just mathematical objects that have operations defined on them. Although in math we usually express multiplication by juxtaposition, I personally think EIBTI applies here. Ie, IMO using "+" makes a lot of sense although the precedence argument is a good one (but not good enough for introducing another operator, especially using a symbol that already has a different syntactic meaning). I think it's pretty clear that deprecating compile-time concatenation by juxtaposition would be massively unpopular, so the deprecation should be delayed until there's a truly attractive alternative. I think the various proposals for a dedenting syntax come close, but there remains too much resistance for my taste, and I suspect Guido won't push it. I also agree with those who think that it probably should wait for Python 4, given that it was apparently considered and rejected for Python 3. From solipsis at pitrou.net Sun May 12 02:33:54 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 May 2013 02:33:54 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518EDA76.2040602@canterbury.ac.nz> Message-ID: <20130512023354.1b3728a1@fsol> On Sun, 12 May 2013 11:55:34 +1200 Greg Ewing wrote: > Ian Cordasco wrote: > > On Sat, May 11, 2013 at 2:52 PM, Mark Janssen wrote: > > >>It partitions the conceptual space. "+" is a mathematical operator, > >>but strings are not numbers. > > > > But + is already a supported operation on strings > > I still think about these two kinds of concatenation in > different ways, though. When I use implicit concatenation, > I don't think in terms of taking two strings and joining > them together. I'm just writing a single string literal > that happens to span two source lines. > > I believe that distinguishing them visually helps > readability. Using + for both makes things look more > complicated than they really are. Agreed. Regards Antoine. From stephen at xemacs.org Sun May 12 05:10:21 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 12 May 2013 12:10:21 +0900 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <20130512023354.1b3728a1@fsol> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518EDA76.2040602@canterbury.ac.nz> <20130512023354.1b3728a1@fsol> Message-ID: <87ip2oevvm.fsf@uwakimon.sk.tsukuba.ac.jp> Antoine Pitrou writes: > Greg Ewing wrote: > > I believe that distinguishing them visually helps > > readability. Using + for both makes things look more > > complicated than they really are. > Agreed. In principle, I'm with Guido on this one. TOOWTDI and EIBTI weigh heavily with me, and I have been bitten by the "sequence of strings ends with no comma" bug more than once (though never twice in one day ;-). 
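Spelled out, that missing-comma bug looks like this (a minimal runnable sketch):

```python
# The "sequence of strings ends with no comma" bug: a forgotten comma
# makes two adjacent literals concatenate silently instead of raising
# an error, so the list quietly has one element fewer than intended.
colors = [
    "red",
    "green"  # <- comma forgotten here
    "blue",
]
print(colors)  # ['red', 'greenblue'] -- two elements, not three
```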
Nor do I really care whether concatenation is a runtime or compile-time operation. But vox populi is deafening.... BTW, I see no reason not to optimize "'a' + 'b'", as you can always force runtime evaluation with "''.join(['a','b'])" (which looks insane here, but probably wouldn't in a case where forcing runtime evaluation was useful). From apalala at gmail.com Sun May 12 06:10:49 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sat, 11 May 2013 23:40:49 -0430 Subject: [Python-ideas] Anonymous blocks (again): Message-ID: Hello, I've banged my self against the wall, and there simply isn't a pythonic way to make a context manager iterate over the yielded-to block of code. I've had to resource to this in Grako[*]: def block(): self.rule() self.ast['rules'] = self.last_node closure(block) But what I think would be more pythonic is: closure( def: self.rule() self.ast['rules'] = self.last_node ) Or, better yet (though I know I can't have it): with positive_closure(): self.rule() self.ast['rules'] = self.last_node The thing is that a "closure" needs to call the given block repeatedly, while remaining in control of the context of each invocation. The examples given above would signal the closure to be over with an exception. The convoluted handling of exceptions on context manager's __exit__ make it impossible (for me) to construct a context manager that can call the yield-to block several times. Anyway, the anonymous def syntax, with or without parameters, is a given, and a solution for many qualms about the Python way of things. [*] https://bitbucket.org/apalala/grako -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun May 12 07:10:26 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 15:10:26 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <518EC3E2.3030906@gmail.com> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518DD310.10100@fastmail.us> <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> <518DD9AD.8030403@canterbury.ac.nz> <518EC3E2.3030906@gmail.com> Message-ID: On Sun, May 12, 2013 at 8:19 AM, Ron Adam wrote: > > Greg, I meant to send my reply earlier to the list. > > > > On 05/11/2013 12:39 AM, Greg Ewing wrote: >>> >>> Also, doesn't this imply that ... is now an operator in some contexts, >> >> > but a literal in others? > > > Could it's use as a literal be depreciated? I haven't seen it used in that > except in examples. I take it you don't use Python for multi-dimensional array based programming, then. The ellipsis literal was added at the request of the numeric programming folks, so they had a notation for "all remaining columns" in an index tuple, and it is still used for that today. The only change related to this in Python 3 was to lift the syntactic restriction that limited the literal form to container subscripts. This change eliminated Python 2's discrepancy between defining index tuples directly in the subscript and in saving them to a variable first, or passing them as arguments to a function. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun May 12 07:15:57 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 15:15:57 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 2:10 PM, Juancarlo A?ez wrote: > Hello, > > I've banged my self against the wall, and there simply isn't a pythonic way > to make a context manager iterate over the yielded-to block of code. 
> > I've had to resort to this in Grako[*]: > > def block(): > self.rule() > self.ast['rules'] = self.last_node > closure(block) In current Python, decorator abuse can be a reasonable option: @closure def block(): self.rule() self.ast['rules'] = self.last_node Or, if PEP 403 were accepted and implemented: @in closure(f): def f(): self.rule() self.ast['rules'] = self.last_node (The latter would have the advantage of working with arbitrary expressions) Anyway, if this is a topic that interests you, I strongly recommend reading both PEP 403 and PEP 3150 in full. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun May 12 07:17:32 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 15:17:32 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 3:15 PM, Nick Coghlan wrote: > Or, if PEP 403 were accepted and implemented: > > @in closure(f): > def f(): > self.rule() > self.ast['rules'] = self.last_node Oops, typo in that example (the extra colon on the @in expression): @in closure(f) def f(): self.rule() self.ast['rules'] = self.last_node Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From apalala at gmail.com Sun May 12 08:11:32 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 12 May 2013 01:41:32 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 12:45 AM, Nick Coghlan wrote: > In current Python, decorator abuse can be a reasonable option: > > @closure > def block(): > self.rule() > self.ast['rules'] = self.last_node > But for that to work, you'd still have to call: block() And that would make it: @closure def block(): self.rule() self.ast['rules'] = self.last_node block() Which I think makes little sense to a human reader, at least not in the pythonic way, and less so when compared to my (map/reduce..functional). 
closure(block) Your proposal was the approach I previously used in Grako, and I deprecated it in favor of the currently standing state of things in Python, which is: *If you want an executable block of code you can iterate upon, then define a (non-anonymous) function, and pass it to the iterator.* Cheers. -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeanpierreda at gmail.com Sun May 12 10:18:55 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sun, 12 May 2013 04:18:55 -0400 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 2:11 AM, Juancarlo A?ez wrote: > On Sun, May 12, 2013 at 12:45 AM, Nick Coghlan wrote: >> >> In current Python, decorator abuse can be a reasonable option: >> >> @closure >> def block(): >> self.rule() >> self.ast['rules'] = self.last_node > > > Buf for that to work, you'd still have to call: > > block() > > And that would make it: > > @closure > def block(): > self.rule() > self.ast['rules'] = self.last_node > block() No, not with how closure was defined in the original post. (Otherwise, the code would be closure(block)()). Consider: >>> def closure(f): ... f(); f(); f() ... >>> @closure ... def block(): ... print "Hi" ... Hi Hi Hi -- Devin From apalala at gmail.com Sun May 12 13:56:40 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 12 May 2013 07:26:40 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 3:48 AM, Devin Jeanpierre wrote: > the code would be closure(block)()). > > Consider: > > >>> def closure(f): > ... f(); f(); f() > ... > >>> @closure > ... def block(): > ... print "Hi" > ... > Hi > Hi > Hi > Mmm. Interesting, but unpythonic. A decorator that executes the target right away? 
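For the record, that is exactly what such a decorator does — the def statement itself triggers execution, because the decorator receives the function and calls it before binding the name. A self-contained sketch (run_three_times and block are illustrative names, not library APIs):

```python
results = []

def run_three_times(f):
    # A decorator that runs the decorated function immediately,
    # three times, before (re)binding its name -- i.e. the "def"
    # statement acts as an executable block.
    for _ in range(3):
        results.append(f())
    return f

@run_three_times
def block():
    return "hi"

print(results)  # ['hi', 'hi', 'hi'] -- ran at definition time, no call needed
```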
I also tried: with closure: while True: block But the context-manager design is shortsighted, and it will exit on the first exception it sees, no matter what. I've tried everything, so I'm pretty sure that there's no clean solution in 2.7/3.3. Cheers, -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun May 12 14:06:21 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 12 May 2013 05:06:21 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <518F85BD.5060008@stoneleaf.us> Wow. Judging from the size of this thread one might think you had suggested enumerating the string literals. ;) -- ~Ethan~ From ncoghlan at gmail.com Sun May 12 16:39:05 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 May 2013 00:39:05 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 9:56 PM, Juancarlo A?ez wrote: > But the context-manager design is shortsighted, and it will exit on the > first exception it sees, no matter what. That's not shortsighted, it was a deliberate design decision to *prevent* with statements from being used as a replacement for explicit while and for loops. See PEP 343. > I've tried everything, so I'm pretty sure that there's no clean solution in > 2.7/3.3. Correct. This is why PEP 403 exists. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From apalala at gmail.com Sun May 12 19:22:53 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 12 May 2013 12:52:53 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 10:09 AM, Nick Coghlan wrote: > > I've tried everything, so I'm pretty sure that there's no clean solution > in > > 2.7/3.3. > > Correct. This is why PEP 403 exists. PEP 403 sucks! 
It's a very ill attempt at replacing the need for anonymous blocks, which could be done with syntax very like the current one. ATIAIHTSAT! This thread is closed, AFAIC. I am in peace with the must-be-functions of Python blocks. Smarter people than me will figure things out. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Sun May 12 20:14:14 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 12 May 2013 14:14:14 -0400 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: <518FDBF6.2090606@nedbatchelder.com> On 5/12/2013 1:22 PM, Juancarlo Añez wrote: > > On Sun, May 12, 2013 at 10:09 AM, Nick Coghlan > wrote: > > > I've tried everything, so I'm pretty sure that there's no clean > solution in > > 2.7/3.3. > > Correct. This is why PEP 403 exists. > > > PEP 403 sucks! There's no need for that tone. Everyone here is being respectful, you can be also. --Ned. > > It's a very ill attempt at replacing the need for anonymous blocks, > which could be done with syntax very like the current one. > > ATIAIHTSAT! > > This thread is closed, AFAIC. I am in peace with the must-be-functions > of Python blocks. Smarter people than me will figure things out. > > Cheers, > > -- > Juancarlo *Añez* > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Sun May 12 21:08:35 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 12 May 2013 14:38:35 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <518FDBF6.2090606@nedbatchelder.com> References: <518FDBF6.2090606@nedbatchelder.com> Message-ID: On Sun, May 12, 2013 at 1:44 PM, Ned Batchelder wrote: > There's no need for that tone. 
Everyone here is being respectful, you can > be also. > My apologies. I'm sorry. -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon May 13 01:40:57 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 May 2013 09:40:57 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On 13 May 2013 03:23, "Juancarlo A?ez" wrote: > > > On Sun, May 12, 2013 at 10:09 AM, Nick Coghlan wrote: >> >> > I've tried everything, so I'm pretty sure that there's no clean solution in >> > 2.7/3.3. >> >> Correct. This is why PEP 403 exists. > > > PEP 403 sucks! > > It's a very ill attempt at replacing the need for anonymous blocks, which could be done syntax very like the current one. Anonymous blocks in Ruby depend on the convention that the block is always the last positional argument. Python has no such convention, thus any "block like" solution will require a mechanism that allows the user to tell the interpreter where the trailing callable should be referenced in the preceding simple statement. Earlier versions of PEP 403 used a magic symbol for this, but that ended up being ugly and non-obvious. Thus, I changed it to the current explicit forward reference. For throwaway callbacks, using a short meaningless name like "f" should be sufficiently brief, and in many cases a more meaningful name may be used in order to make the code more self-documenting. Now, do you have any constructive feedback on the PEP that still accounts for Python's lack of a standard location for passing callables to functions, or is this reaction simply a matter of "I don't want to have to type 'f' twice because I don't have to do that in other languages"? Regards, Nick. > > ATIAIHTSAT! > > This thread is closed, AFAIC. I am in peace with the must-be-functions of Python blocks. Smarter people than me will figure things out. 
> > Cheers, > > -- > Juancarlo A?ez -------------- next part -------------- An HTML attachment was scrubbed... URL: From mm at ensoft.co.uk Mon May 13 02:30:44 2013 From: mm at ensoft.co.uk (Martin Morrison) Date: Mon, 13 May 2013 01:30:44 +0100 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On 13 May 2013, at 00:40, Nick Coghlan wrote: > Now, do you have any constructive feedback on the PEP that still accounts for Python's lack of a standard location for passing callables to functions, or is this reaction simply a matter of "I don't want to have to type 'f' twice because I don't have to do that in other languages"? > Moving past the outright negative feedback, and having only just seen the PEP, the proposed syntax did strike me as awkward and unintuitive. Maybe there is some explanation for why decorator-like syntax was used - if so, please do link me so I can read up. What struck me though is that the proposed syntax limits the ability to have multiple "anonymous blocks" within a single statement. Instead, I was thinking some syntax like the following might be nicer: in x = do_something(in_arg, success_hdlr, error_hdlr): def success_hdlr(result): ... # Do something with result def error_hdlr(error): ... # Do something with error That is instead of a decorator-like syntax, make the "in" keyword reusable to introduce a new block, whose "argument" is a statement that can forward reference some names, which are then defined within the block. This allows multiple temporary names to be defined (each in a separate statement within the block). Some further thought is required on whether only def (and maybe class) statements should be allowed within the "in" block. Although I guess there's technically nothing wrong with: in x = y + z: y = 12 z = 30 Other than it's a very verbose way of doing something simple. ;-) But maybe there are more useful examples? Cheers, Martin > Regards, > Nick. 
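For comparison, the effect of the proposed "in" block can already be approximated today by defining the handlers as named functions before the call — do_something and the handler names below are illustrative stand-ins, not a real API:

```python
# Today's spelling of the callback pattern: define the handlers first,
# then pass them in. The "in" proposal would merely let the definitions
# follow the statement that uses them.
def do_something(arg, on_success, on_error):
    try:
        return on_success(arg * 2)
    except Exception as exc:
        return on_error(exc)

def success_hdlr(result):
    return ("ok", result)   # Do something with result

def error_hdlr(error):
    return ("err", error)   # Do something with error

x = do_something(21, success_hdlr, error_hdlr)
print(x)  # ('ok', 42)
```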
-------------- next part -------------- An HTML attachment was scrubbed... URL: From mm at ensoft.co.uk Mon May 13 02:44:15 2013 From: mm at ensoft.co.uk (Martin Morrison) Date: Mon, 13 May 2013 01:44:15 +0100 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: Message-ID: <88B5EEA0-C1E5-4B1C-BF47-1C5273D36EE3@ensoft.co.uk> On 9 May 2013, at 11:29, Piotr Duda wrote: > These also apply for other objects like NamedTuple or mentioned NamedValues. +1 I really like this proposal. In one of my libraries, I actually went one step further with the frame hack and injected the resulting object into the parent namespace in an attempt to avoid the duplicated declaration of the name. The syntax was something like: namedtuple.MyName("foo", "bar") as a bare statement, which would inject "MyName" into the namespace (using the frame hack) as well as getting the right module name. This is obviously very, very ugly. :-) > To solve these problems I propose to add simple syntax that assigns > these attributes to arbitrary object: > def name = expression FWIW, this syntax looks the most obvious to me, as it clearly communicates both the assignment of the return value of the expression to the name, and the fact that name is a local definition (and thus likely to acquire additional properties). Cheers, Martin > other possible forms may be: > def name from expression > class name = expression > class name from expression > name := expression # new operator > > > which would be equivalent to: > _tmp = expression > _tmp.__name__ = 'name' > _tmp.__qualname__ = ... # corresponding qualname > _tmp.__module__ = __name__ > # apply decorators if present > name = _tmp > > with new syntax declaring Enum will look like > def Animals = Enum('dog cat bird') > > as pointed by Larry it may be done using existing syntax in form: > @Enum('dog cat bird') > def Animals(): pass > > but it's ugly, and may be confusing. 
> > > Other examples: > def MyTuple = NamedTuple("a b c d") > def PI = NamedValue(3.1415926) > > > -- > ???????? > ?????? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From mertz at gnosis.cx Mon May 13 03:04:17 2013 From: mertz at gnosis.cx (David Mertz) Date: Sun, 12 May 2013 18:04:17 -0700 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Sun, May 12, 2013 at 5:30 PM, Martin Morrison wrote: > Moving past the outright negative feedback, and having only just seen the > PEP, the proposed syntax did strike me as awkward and unintuitive. Maybe > there is some explanation for why decorator-like syntax was used - if so, > please do link me so I can read up. > I just read PEP 403 myself also. I confess I likewise have trouble getting past the unnatural feeling (to me, at least at first brush) of the decorator syntax. The rejected alternative of the 'given' keyword seems less unnatural. I wonder though why not just use the ML style here. E.g. Spell this: > in x = do_something(in_arg, success_hdlr, error_hdlr): > def success_hdlr(result): > ... # Do something with result > def error_hdlr(error): > ... # Do something with error > As: in x = do_something(in_arg, success_hdlr, error_hdlr, const) let: > def success_hdlr(result): > ... # Do something with result > def error_hdlr(error): > ... # Do something with error > const = 42 > Well, I've reversed the order of in/let from ML, but the keywords are the same. But as Martin points out, I can't see any reason to preclude defining multiple one-off blocks... not even if the name definitions aren't 'def' or 'class' (hence my addition of defining 'const=42' in my slightly expanded version). Yes it's still a pseudo-block, and we do have to do something with scoping. 
But it reads better to me than 'given', and also better than the bare 'in' block introduction without the explicit 'let'. Yours, David... -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Mon May 13 03:14:12 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 12 May 2013 20:44:12 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <30AFAFFD-C84D-47B5-99BF-659FBB30679B@cantab.net> References: <30AFAFFD-C84D-47B5-99BF-659FBB30679B@cantab.net> Message-ID: Although I said I was over this never-ending topic, I thought I might say that: let: def block(): # do stuff in: closure(block) Would be well understood by many in in the Python and functional-language communities. Cheers, On Sun, May 12, 2013 at 7:43 PM, Martin Morrison < martin.morrrison at cantab.net> wrote: > On 13 May 2013, at 00:40, Nick Coghlan wrote: > > Now, do you have any constructive feedback on the PEP that still accounts > for Python's lack of a standard location for passing callables to > functions, or is this reaction simply a matter of "I don't want to have to > type 'f' twice because I don't have to do that in other languages"? > > Moving past the outright negative feedback, and having only just seen the > PEP, the proposed syntax did strike me as awkward and unintuitive. Maybe > there is some explanation for why decorator-like syntax was used - if so, > please do link me so I can read up. > > What struck me though is that the proposed syntax limits the ability to > have multiple "anonymous blocks" within a single statement. 
Instead, I was > thinking some syntax like the following might be nicer: > > in x = do_something(in_arg, success_hdlr, error_hdlr): > def success_hdlr(result): > ... # Do something with result > def error_hdlr(error): > ... # Do something with error > > That is instead of a decorator-like syntax, make the "in" keyword reusable > to introduce a new block, whose "argument" is a statement that can forward > reference some names, which are then defined within the block. This allows > multiple temporary names to be defined (each in a separate statement within > the block). > > Some further thought is required on whether only def (and maybe class) > statements should be allowed within the "in" block. Although I guess > there's technically nothing wrong with: > > in x = y + z: > y = 12 > z = 30 > > Other than it's a very verbose way of doing something simple. ;-) But > maybe there are more useful examples? > > Cheers, > Martin > > Regards, > Nick. > > -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Mon May 13 03:27:07 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Sun, 12 May 2013 21:27:07 -0400 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <88B5EEA0-C1E5-4B1C-BF47-1C5273D36EE3@ensoft.co.uk> References: <88B5EEA0-C1E5-4B1C-BF47-1C5273D36EE3@ensoft.co.uk> Message-ID: Since MacroPy came up, I thought I should chime in (since I made it) Macros could pretty trivially give you any syntax that doesn't blow up in a SyntaxError. that means "def x = y" is out of the question, unless someone changes the python runtime's lexer/parser to make that work. What we have right now: @case class MyName(foo, bar): pass allows the class to have methods (put them in place of the `pass`, just like any other class) and generally looks quite pretty. 
It would be ~50 lines of code to make a macro for: @enum class MyEnum(Red, White, Blue): pass or: @enum class MyEnum(index): Monday(0) Tuesday(1) ' Wednesday(2) ... I would argue that "solving the more general problem" is exactly what macros are about (if you think Macros are sketchy, you should look at the implementation of namedtuple!), but I'm sure many would disagree. I just wanted to clarify what is possible. -Haoyi On Sun, May 12, 2013 at 8:44 PM, Martin Morrison wrote: > On 9 May 2013, at 11:29, Piotr Duda wrote: > > > These also apply for other objects like NamedTuple or mentioned > NamedValues. > > +1 > I really like this proposal. > > In one of my libraries, I actually went one step further with the frame > hack and injected the resulting object into the parent namespace in an > attempt to avoid the duplicated declaration of the name. The syntax was > something like: > > namedtuple.MyName("foo", "bar") > > as a bare statement, which would inject "MyName" into the namespace (using > the frame hack) as well as getting the right module name. This is obviously > very, very ugly. :-) > > > To solve these problems I propose to add simple syntax that assigns > > these attributes to arbitrary object: > > def name = expression > > FWIW, this syntax looks the most obvious to me, as it clearly communicates > both the assignment of the return value of the expression to the name, and > the fact that name is a local definition (and thus likely to acquire > additional properties). > > Cheers, > Martin > > > other possible forms may be: > > def name from expression > > class name = expression > > class name from expression > > name := expression # new operator > > > > > > which would be equivalent for: > > _tmp = expression > > _tmp.__name__ = 'name' > > _tmp.__qualname__ = ... 
# corresponding qualname > > _tmp.__module__ = __name__ > > # apply decorators if present > > name = _tmp > > > > with new syntax declaring Enum will look like > > def Animals = Enum('dog cat bird') > > > > as pointed by Larry it may be done using existing syntax in form: > > @Enum('dog cat bird') > > def Animals(): pass > > > > but it's ugly, and may by confusing. > > > > > > Other examples: > > def MyTuple = NamedTuple("a b c d") > > def PI = NamedValue(3.1415926) > > > > > > -- > > ???????? > > ?????? > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon May 13 03:33:47 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 12 May 2013 18:33:47 -0700 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: Message-ID: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> On May 9, 2013, at 3:29 AM, Piotr Duda wrote: > > Animals = Enum('Animals', 'dog cat bird') > which violates DRY This is a profound misreading of DRY which is all about not repeating big chunks of algorithmic logic. The above code is clear and doesn't require any special syntax. Remember, that is why we got rid of the print-keyword in favor of the print-function (the new print requires no special rules and works like ordinary functions). Also remember that Enum and NamedTuple calls typically only occur once in code. The bulk of the code simply *uses* the declared enumerations or named tuples. In other words, you're inventing new syntax to solve a very unimportant problem. 
In most code, you will save perhaps one single word, but it will come at the expense of an even and odd and unexpected 'def' syntactic magic: old: Animals = Enum('Animals', 'dog cat bird') new: def Animals = Enum('dog cat bird') net savings: 6 characters net loss: complexify Enum, abuse the def syntax, looks weird, ... old: a = Animals.dog; b=Animals.cat new: a = Animals.dog; b=Animals.cat new change where it matters: zero! > These also apply for other objects like NamedTuple or mentioned NamedValues. > > To solve these problems I propose to add simple syntax that assigns > these attributes to arbitrary object: > def name = expression > other possible forms may be: > def name from expression > class name = expression > class name from expression > name := expression # new operator > > > which would be equivalent for: > _tmp = expression > _tmp.__name__ = 'name' > _tmp.__qualname__ = ... # corresponding qualname > _tmp.__module__ = __name__ > # apply decorators if present > name = _tmp > > > > > Other examples: > def MyTuple = NamedTuple("a b c d") > def PI = NamedValue(3.1415926) Sorry, but I think this is yet another terrible idea. People like Python because of its beautiful and intuitive syntax. Why throw that out the window for such an unimportant problem? Raymond From ned at nedbatchelder.com Mon May 13 03:53:52 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 12 May 2013 21:53:52 -0400 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> Message-ID: <519047B0.7060505@nedbatchelder.com> On 5/12/2013 9:33 PM, Raymond Hettinger wrote: > Sorry, but I think this is yet another terrible idea. This seems uncivil to me. You may dislike most of the ideas on this list, and in fact, the vast majority of them will be rejected, but there's no need to be harsh. --Ned. 
From raymond.hettinger at gmail.com Mon May 13 04:09:47 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 12 May 2013 19:09:47 -0700 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <519047B0.7060505@nedbatchelder.com> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <519047B0.7060505@nedbatchelder.com> Message-ID: On May 12, 2013, at 6:53 PM, Ned Batchelder wrote: >> Sorry, but I think this is yet another terrible idea. > > This seems uncivil to me. Really, we can't say we think something is a really bad idea? New wording: "Sorry, but I think this proposal may not be a net positive for the language." Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Mon May 13 04:43:28 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 12 May 2013 22:43:28 -0400 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <519047B0.7060505@nedbatchelder.com> Message-ID: <51905350.1010909@nedbatchelder.com> On 5/12/2013 10:09 PM, Raymond Hettinger wrote: > > On May 12, 2013, at 6:53 PM, Ned Batchelder > wrote: > >>> Sorry, but I think this is yet another terrible idea. >> >> This seems uncivil to me. > > Really, we can't say we think something is a really bad idea? > > New wording: "Sorry, but I think this proposal may not be a net > positive for the language." > > Raymond, I apologize. I probably misread your intent. I certainly didn't mean to make you feel unwelcome. I wanted to be sure the people proposing ideas don't feel unwelcome. --Ned. > Raymond > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Mon May 13 04:53:55 2013 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Mon, 13 May 2013 11:53:55 +0900 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> Martin Morrison writes: > That is instead of a decorator-like syntax, make the "in" keyword > reusable to introduce a new block, whose "argument" is a statement > that can forward reference some names, which are then defined > within the block. This allows multiple temporary names to be > defined (each in a separate statement within the block). This idea and its presumed defects are described (using the "given" syntax of PEP 3150) in the section "Using a nested suite" in PEP 403. > Some further thought is required on whether only def (and maybe > class) statements should be allowed within the "in" block. Although > I guess there's technically nothing wrong with: > > in x = y + z: > y = 12 > z = 30 > > Other than it's a very verbose way of doing something simple. ;-) Violates TOOWTDI according to PEP 403. David Mertz and Juancarlo A?ez riff on the theme: >[Why not spell it something like]: > > in x = do_something(in_arg, success_hdlr, error_hdlr, const) let: > def success_hdlr(result): > ... # Do something with result > def error_hdlr(error): > ... # Do something with error > const = 42 (Note the "let" at the end of the "in" clause.) Python doesn't use redundant keywords for a single construct. "let" is redundant with the following "def"s. On top of that, "let" being a new keyword will kill this syntax, I think. From apalala at gmail.com Mon May 13 05:58:35 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Sun, 12 May 2013 23:28:35 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, May 12, 2013 at 10:23 PM, Stephen J. Turnbull wrote: > Python doesn't use redundant keywords for a single construct. 
> "let" is redundant with the following "def"s. On top of that, "let" > being a new keyword will kill this syntax, I think. > I don't want new syntax (I think I don't). What I want is to be able to invoke a block of code repeatedly, within a context, and in a pythonic way. within closure(): do_this() and_do_that() If it can't be pythonic (clear to the reader), I'm not interested. It's good enough as it is: def block(): do_this() do_that() closure(block) Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Mon May 13 06:30:10 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 00:00:10 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, May 12, 2013 at 11:28 PM, Juancarlo Añez wrote: > What I want is to be able to invoke a block of code repeatedly, within a > context, and in a pythonic way. > > within closure(): > > do_this() > and_do_that() > > Hey! I must say that I'm speaking from a niche perspective. I'm seeking to make automatically generated parsers readable by their creators. I do think that anonymous blocks would be good in broad ways, but my interest in them at this time is quite narrow. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon May 13 06:47:15 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 13 May 2013 14:47:15 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <51907053.5010108@pearwood.info> On 13/05/13 13:58, Juancarlo Añez wrote: > I don't want new syntax (I think I don't). > > What I want is to be able to invoke a block of code repeatedly, within a > context, and in a pythonic way.
Surely that would be: with context(): while condition: # or a for loop block of code goes here If you want something different to this, then I think you do want new syntax. Otherwise, what do you gain beyond what can already be done now? Or am I missing something? -- Steven From ncoghlan at gmail.com Mon May 13 07:17:15 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 May 2013 15:17:15 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <51907053.5010108@pearwood.info> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On Mon, May 13, 2013 at 2:47 PM, Steven D'Aprano wrote: > On 13/05/13 13:58, Juancarlo A?ez wrote: > >> I don't want new syntax (I think I don't). >> >> What I want is to be able to invoke a block of code repeatedly, within a >> context, and in a pythonic way. > > > Surely that would be: > > with context(): > while condition: # or a for loop > block of code goes here > > > If you want something different to this, then I think you do want new > syntax. Otherwise, what do you gain beyond what can already be done now? > > Or am I missing something? Ruby uses anonymous callbacks for things where Python instead uses dedicated syntax: Python -> Ruby decorated function definitions -> callbacks for loops + iterator protocol -> callbacks with statements + context management protocol -> callbacks callbacks -> callbacks (but with much nicer syntax) Blocks are a *really* nice way of doing callbacks, so nice that Ruby just doesn't have some of the concepts Python does - it uses callbacks instead. While I have no real interest in Ruby's use of embedded callbacks inside expressions (or various other pieces of control flow magic that Ruby blocks support), I *do* think their ability to easily supply a full nested callback to a single statement is valuable, and gets to the heart of people's interest in multi-line lambdas in Python. 
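In current Python, the closest approximation of "pass a block to a single statement" is a decorator that immediately consumes the decorated function as a callback (the `calling` helper below is illustrative, not from any library or from the thread):

```python
# A decorator factory that calls the decorated "block" right away with
# fixed arguments, binding the *result* (not the function) to the name.
def calling(*args, **kwargs):
    def consume(block):
        return block(*args, **kwargs)
    return consume

@calling(range(3))
def doubled(numbers):
    return [2 * n for n in numbers]

print(doubled)  # [0, 2, 4]
```

This gets a multi-line "block" into a call site without a lambda, at the cost of the name no longer referring to a function afterwards — one reason the thread keeps circling back to dedicated syntax instead.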
PEP 403 is mostly about adapting that feature to an ecosystem which doesn't have the "the callback is the last positional parameter" convention that the block syntax established for Ruby. Using a forward reference to a class instead lets you have multiple forward references through attribute access. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg.ewing at canterbury.ac.nz Mon May 13 08:25:20 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 13 May 2013 18:25:20 +1200 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <51908750.20603@canterbury.ac.nz> Juancarlo A?ez wrote: > What I want is to be able to invoke a block of code repeatedly, within a > context, and in a pythonic way. > > within closure(): > do_this() > and_do_that() Whenever this kind of thing has been considered before, one of the stumbling blocks has always been how to handle things like this: while something: within closure(): do_this() if moon_is_blue: break and_do_that() This is one of the main reasons that the with-statement ended up being implemented the way it is, instead of by passing an implicit closure. -- Greg From steve at pearwood.info Mon May 13 11:11:21 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 13 May 2013 19:11:21 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: <20130513091121.GA23554@ando> On Mon, May 13, 2013 at 03:17:15PM +1000, Nick Coghlan wrote: > On Mon, May 13, 2013 at 2:47 PM, Steven D'Aprano wrote: > > On 13/05/13 13:58, Juancarlo A?ez wrote: > > > >> I don't want new syntax (I think I don't). > >> > >> What I want is to be able to invoke a block of code repeatedly, within a > >> context, and in a pythonic way. 
> > > > Surely that would be: > > > > with context(): > > while condition: # or a for loop > > block of code goes here > > > > > > If you want something different to this, then I think you do want new > > syntax. Otherwise, what do you gain beyond what can already be done now? > > > > Or am I missing something? > > Ruby uses anonymous callbacks for things where Python instead uses > dedicated syntax: > > Python -> Ruby > > decorated function definitions -> callbacks > for loops + iterator protocol -> callbacks > with statements + context management protocol -> callbacks > callbacks -> callbacks (but with much nicer syntax) > > Blocks are a *really* nice way of doing callbacks, so nice that Ruby > just doesn't have some of the concepts Python does - it uses callbacks > instead. I'm obviously still missing something, because I'm aware of Ruby's blocks, but I don't quite see how they apply to Juancarlo's *specific* use-case, as described above. Unless Juancarlo's use-case is more general than I understood, it seems to me that we don't need blocks, anonymous or otherwise, to "invoke a block of code repeatedly, within a context", in a Pythonic way. Perhaps a concrete (even if toy or made-up) example might help me understand. The only thing I can think of is, if I had a bunch of similar loops inside the same context, where only the body of the loop was different, I might want to factor it out something like this: the_block = {define a block of code, somehow} def do_stuff(block): with context: while condition: {execute the block of code} do_stuff(the_block) do_stuff(another_block) but I think that requires new syntax, and Juancarlo specifically says he doesn't want new syntax. 
-- Steven From lakshmi.vyas at gmail.com Mon May 13 11:22:24 2013 From: lakshmi.vyas at gmail.com (Lakshmi Vyas) Date: Mon, 13 May 2013 14:52:24 +0530 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <20130513091121.GA23554@ando> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <20130513091121.GA23554@ando> Message-ID: <5190B0D0.8030800@gmail.com> > Perhaps a concrete (even if toy or made-up) example might help me > understand. Not sure if this example fits Juancarlo's criterion: Here is a place where I really craved for blocks and resorted to using a context manager + decorators: https://github.com/gitbot/gitbot/blob/master/gitbot/lib/s3.py#L140-L169 The use case is essentially: recursively loop through a folder and push to Amazon S3 evaluating rules for each file / folder. Here is the implementation: https://github.com/lakshmivyas/fswrap/blob/master/fswrap.py#L317-L439 Just removing the need for the decorators would make this pattern completely acceptable *for me*. Thanks Lakshmi Steven D'Aprano wrote: > On Mon, May 13, 2013 at 03:17:15PM +1000, Nick Coghlan wrote: >> On Mon, May 13, 2013 at 2:47 PM, Steven D'Aprano wrote: >>> On 13/05/13 13:58, Juancarlo A?ez wrote: >>> >>>> I don't want new syntax (I think I don't). >>>> >>>> What I want is to be able to invoke a block of code repeatedly, within a >>>> context, and in a pythonic way. >>> Surely that would be: >>> >>> with context(): >>> while condition: # or a for loop >>> block of code goes here >>> >>> >>> If you want something different to this, then I think you do want new >>> syntax. Otherwise, what do you gain beyond what can already be done now? >>> >>> Or am I missing something? 
>> Ruby uses anonymous callbacks for things where Python instead uses >> dedicated syntax: >> >> Python -> Ruby >> >> decorated function definitions -> callbacks >> for loops + iterator protocol -> callbacks >> with statements + context management protocol -> callbacks >> callbacks -> callbacks (but with much nicer syntax) >> >> Blocks are a *really* nice way of doing callbacks, so nice that Ruby >> just doesn't have some of the concepts Python does - it uses callbacks >> instead. > > I'm obviously still missing something, because I'm aware of Ruby's > blocks, but I don't quite see how they apply to Juancarlo's *specific* > use-case, as described above. > > Unless Juancarlo's use-case is more general than I understood, it seems > to me that we don't need blocks, anonymous or otherwise, to "invoke a > block of code repeatedly, within a context", in a Pythonic way. > > Perhaps a concrete (even if toy or made-up) example might help me > understand. The only thing I can think of is, if I had a bunch of > similar loops inside the same context, where only the body of the loop > was different, I might want to factor it out something like this: > > > the_block = {define a block of code, somehow} > > def do_stuff(block): > with context: > while condition: > {execute the block of code} > > > do_stuff(the_block) > do_stuff(another_block) > > > but I think that requires new syntax, and Juancarlo specifically says he > doesn't want new syntax. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Mon May 13 16:57:43 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 13 May 2013 10:57:43 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: On Fri, May 10, 2013 at 1:16 PM, Terry Jan Reedy wrote: > On 5/10/2013 8:52 AM, Todd V. 
Rovito wrote: >> On May 10, 2013, at 5:36 AM, Alexandre Boulay >> wrote: >>> I think that could be a good idea to put colored dots on idle's >>> scroll bar for each def or class created, each got its own color, > I cannot really understand what you are proposing. The scroll bar is for > scrolling, and it has the arrow buttons and the bar itself that would > interfere with placing dots. Furthermore, scroll bars are widgets defined by > tk and as far as I know, IDLE has no control over the detailed appearance. I suspect he is suggesting that the scrollbar represent position in the file in terms of "3 top level classes above, 2 below" instead of just by line count. This sounds straightforward as an overlay graphic. Perhaps even changing the navigation so that clicking on the scrollbar at 1/3 of the way down will move you to 1/3 of the way down the file, instead of "one page up from where you current are." This would no longer be a standard scrollbar, but it might well be better. -jJ From random832 at fastmail.us Mon May 13 17:10:09 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Mon, 13 May 2013 11:10:09 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: <1368457809.26982.140661230099241.02184D85@webmail.messagingengine.com> On Mon, May 13, 2013, at 10:57, Jim Jewett wrote: > Perhaps even changing the navigation so that clicking on the scrollbar > at 1/3 of the way down will move you to 1/3 of the way down the file, > instead of "one page up from where you current are." This would no > longer be a standard scrollbar, but it might well be better. Some platforms' standard scrollbars do this on a shift-click or a middle-click. If you're really talking about designing a custom scrollbar control, you should be looking at what various platforms do. 
-- Random832 From apalala at gmail.com Mon May 13 17:09:49 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 10:39:49 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <51907053.5010108@pearwood.info> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On Mon, May 13, 2013 at 12:17 AM, Steven D'Aprano wrote: > with context(): > while condition: # or a for loop > block of code goes here > > > If you want something different to this, then I think you do want new > syntax. Otherwise, what do you gain beyond what can already be done now? > > Or am I missing something? > It's not obvious from my example, but the idea is that the invoker be able to provide context for _each_ iteration. Cheers, -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Mon May 13 17:19:03 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 13 May 2013 11:19:03 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: <1368457809.26982.140661230099241.02184D85@webmail.messagingengine.com> References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> <1368457809.26982.140661230099241.02184D85@webmail.messagingengine.com> Message-ID: This is just pure bike-shedding, but you should really check out - PyCharm's scrollbar, with its colored dots for errors/etc. - Sublime Text's minimap/scrollbar Both of which are really really nice On Mon, May 13, 2013 at 11:10 AM, wrote: > On Mon, May 13, 2013, at 10:57, Jim Jewett wrote: > > Perhaps even changing the navigation so that clicking on the scrollbar > > at 1/3 of the way down will move you to 1/3 of the way down the file, > > instead of "one page up from where you current are." This would no > > longer be a standard scrollbar, but it might well be better. > > Some platforms' standard scrollbars do this on a shift-click or a > middle-click. 
> > If you're really talking about designing a custom scrollbar control, you > should be looking at what various platforms do. > > -- > Random832 > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ubershmekel at gmail.com Mon May 13 17:20:57 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 13 May 2013 18:20:57 +0300 Subject: [Python-ideas] improve Idle In-Reply-To: References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: On Mon, May 13, 2013 at 5:57 PM, Jim Jewett wrote: > Perhaps even changing the navigation so that clicking on the scrollbar > at 1/3 of the way down will move you to 1/3 of the way down the file, > instead of "one page up from where you current are." This would no > longer be a standard scrollbar, but it might well be better. > Non-standard scroll bar sounds bad to me. Here's how this feature works in Eclipse/pydev http://i.imgur.com/kQrc5n0.png Basically the scrollbar has another small strip next to it. So the locations are clickable, and they bring you to the right place, without breaking the normal scroll bar behavior. Yuval -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From apalala at gmail.com Mon May 13 17:23:37 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 10:53:37 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <5190B0D0.8030800@gmail.com> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <20130513091121.GA23554@ando> <5190B0D0.8030800@gmail.com> Message-ID: On Mon, May 13, 2013 at 4:52 AM, Lakshmi Vyas wrote: > Here is a place where I really craved for blocks and resorted to using a > context manager + decorators: > > https://github.com/gitbot/gitbot/blob/master/gitbot/lib/s3.py#L140-L169 > That is a VERY interesting pattern: with source.walker as walker: def ignore(name): return match_pattern(ignore_patterns, name) @walker.folder_visitor def visit_folder(folder): Make the context be the source of the decorators, and do the iteration on __exit__. This could work for me, but you must admit it is very much twisting context managers' arms to the extreme. with self.closure() as c: @c def _(): match_this() match_that() I'd like the above to be something like (warning: new keyword ahead): within self.closure(): match_this() match_that() A clean, anonymous block. -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Mon May 13 17:40:49 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 11:10:49 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On Mon, May 13, 2013 at 10:39 AM, Juancarlo Añez wrote: > If you want something different to this, then I think you do want new >> syntax. Otherwise, what do you gain beyond what can already be done now? >> >> Or am I missing something? >> > > It's not obvious from my example, but the idea is that the invoker be able > to provide context for _each_ iteration.
> I can explain better. While parsing a closure, the parser knows it should stop iterating because the embedded expression fails to parse midway. At that point, the last iteration must be rolled back (rewind the input stream and discard syntax tree nodes). To roll back just the last iteration, the iterator needs to control the context for each iteration, because it can't predict which will be the last, the one to fail. The above works perfectly defining a function for the embedded expression using a synthetic name, like "c123", and passing it to closure(): def c123(): match_this() match_that() closure(c123) My quest is because the above seems quite unpythonic. Lakshmi suggested this pattern, which I think is an improvement: with closure() as c: @c.exp def expre(): match_this() match_that() What I think would be great is to have the action (closure) precede an anonymous block. closure( def(): match_this() match_that() ) Cheers, -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Mon May 13 17:49:11 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 13 May 2013 11:49:11 -0400 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> Message-ID: On Sun, May 12, 2013 at 9:33 PM, Raymond Hettinger wrote: > On May 9, 2013, at 3:29 AM, Piotr Duda wrote: >> Animals = Enum('Animals', 'dog cat bird') >> which violates DRY > This is a profound misreading of DRY which is all about not repeating > big chunks of algorithmic logic. DRY, like most heuristics, is about making mistakes less likely. Mistakes are likely with huge chunks of repeated logic, because people are inclined to fix things at only one location. 
Mistakes are likely with the above because it is conceptually only one location, but syntactically two -- and doing something different in the second location is a mistake that the compiler won't catch. The problem with >> Animals = Enum('Animals', 'dog cat bird') is that you might accidentally type >> Animals = Enum('Animal', 'dog cat bird') or >> Anmals = Enum('Animals', 'dog cat bird') instead. -jJ From tjreedy at udel.edu Mon May 13 18:28:40 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 13 May 2013 12:28:40 -0400 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On 5/13/2013 11:40 AM, Juancarlo A?ez wrote: > The above works perfectly defining a function for the embedded > expression using a synthetic name, like "c123", and passing it to closure(): > > def c123(): > match_this() > match_that() > > closure(c123) > > > My quest is because the above seems quite unpythonic. I disagree that the above is unpythonic. In Python, functions are objects like everything else. Define them, pass them to functions, like anything else. 'Unpythonic' is treating functions as special, other than that they are called (have a call method). Terry From yselivanov.ml at gmail.com Mon May 13 18:32:34 2013 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 13 May 2013 12:32:34 -0400 Subject: [Python-ideas] external editor commands in python REPL Message-ID: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> Hi all, While working on a repl-like shell for one of our applications, I implemented a buffer-editing mechanism similar to PostgreSQL shell. The idea is that you have special shell commands, starting with '\': - '\e' opens an external editor - '\r' clears the buffer - etc Playing with it for couple of days proved that it makes working with the REPL much more convenient. 
Here's a quick screencast: https://dl.dropboxusercontent.com/u/21052/repl.mov And if anyone wants to try it, here is the hack: https://gist.github.com/1st1/5569467 Run it under python 3: '$ python3 shell.py' I think it would be nice if we can include similar feature to the standard python REPL, as it makes it more convenient and OTOH starting python line with '\' is almost always illegal. - Yury From guido at python.org Mon May 13 18:39:24 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 13 May 2013 09:39:24 -0700 Subject: [Python-ideas] external editor commands in python REPL In-Reply-To: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> References: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> Message-ID: Have you seen IPython? On Mon, May 13, 2013 at 9:32 AM, Yury Selivanov wrote: > Hi all, > > While working on a repl-like shell for one of our applications, > I implemented a buffer-editing mechanism similar to PostgreSQL > shell. The idea is that you have special shell commands, starting > with '\': > > - '\e' opens an external editor > - '\r' clears the buffer > - etc > > Playing with it for couple of days proved that it makes working with > the REPL much more convenient. Here's a quick screencast: > https://dl.dropboxusercontent.com/u/21052/repl.mov > > And if anyone wants to try it, here is the hack: > https://gist.github.com/1st1/5569467 > Run it under python 3: '$ python3 shell.py' > > I think it would be nice if we can include similar feature to the > standard python REPL, as it makes it more convenient and OTOH > starting python line with '\' is almost always illegal. > > - > Yury > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Mon May 13 18:41:34 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 13 May 2013 12:41:34 -0400 Subject: [Python-ideas] improve Idle In-Reply-To: References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: On 5/13/2013 11:20 AM, Yuval Greenfield wrote: > On Mon, May 13, 2013 at 5:57 PM, Jim Jewett > wrote: > > Perhaps even changing the navigation so that clicking on the scrollbar > at 1/3 of the way down will move you to 1/3 of the way down the file, > instead of "one page up from where you current are." This would no > longer be a standard scrollbar, but it might well be better. > > > > Non-standard scroll bar sounds bad to me. Here's how this feature works > in Eclipse/pydev http://i.imgur.com/kQrc5n0.png That reminds me of the merge conflict strips in, for instance kdiff3. > Basically the scrollbar has another small strip next to it. So the > locations are clickable, and they bring you to the right place, without > breaking the normal scroll bar behavior. Assuming that putting dots on a strip beside text is feasible with tkinter, that would be a possible optional extension. There are already proposals for line number strips for files and prompt strips for the shell. Terry From mertz at gnosis.cx Mon May 13 18:43:05 2013 From: mertz at gnosis.cx (David Mertz) Date: Mon, 13 May 2013 09:43:05 -0700 Subject: [Python-ideas] external editor commands in python REPL In-Reply-To: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> References: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> Message-ID: There are several external tools that "compete" in this space already. For example IPython and BPython (take a look at http://wiki.python.org/moin/PythonEditors#Enhanced_Python_shells) for descriptions of other ones also. As well, a good number of text editors have the capability of incorporating a Python shell within them, which is kind of the "same thing inverted". 
Rather than enhance the deliberately simple standard Python shell, I would recommend using one of those 3rd party tools (or developing your own if you see something missing in them). On Mon, May 13, 2013 at 9:32 AM, Yury Selivanov wrote: > Hi all, > > While working on a repl-like shell for one of our applications, > I implemented a buffer-editing mechanism similar to PostgreSQL > shell. The idea is that you have special shell commands, starting > with '\': > > - '\e' opens an external editor > - '\r' clears the buffer > - etc > > Playing with it for couple of days proved that it makes working with > the REPL much more convenient. Here's a quick screencast: > https://dl.dropboxusercontent.com/u/21052/repl.mov > > And if anyone wants to try it, here is the hack: > https://gist.github.com/1st1/5569467 > Run it under python 3: '$ python3 shell.py' > > I think it would be nice if we can include similar feature to the > standard python REPL, as it makes it more convenient and OTOH > starting python line with '\' is almost always illegal. > > - > Yury > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yselivanov.ml at gmail.com Mon May 13 18:44:15 2013 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 13 May 2013 12:44:15 -0400 Subject: [Python-ideas] external editor commands in python REPL In-Reply-To: References: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> Message-ID: <849688B1-4F85-4FD3-9C23-B321BAF57C2D@gmail.com> On 2013-05-13, at 12:39 PM, Guido van Rossum wrote: > Have you seen IPython? Of course. Tried it multiple times, however, can't get used to it. The idea is to make the standard REPL a bit more convenient, because for most users it's probably already has enough features. But writing multiline code in current REPL is a bit harder than it should be. - Yury From guido at python.org Mon May 13 18:55:42 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 13 May 2013 09:55:42 -0700 Subject: [Python-ideas] external editor commands in python REPL In-Reply-To: <849688B1-4F85-4FD3-9C23-B321BAF57C2D@gmail.com> References: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> <849688B1-4F85-4FD3-9C23-B321BAF57C2D@gmail.com> Message-ID: On Mon, May 13, 2013 at 9:44 AM, Yury Selivanov wrote: > On 2013-05-13, at 12:39 PM, Guido van Rossum wrote: > > > Have you seen IPython? > > Of course. Tried it multiple times, however, can't get used to it. > It would have been useful if you had mentioned that in your first post. Work claiming to break new ground should always carefully compare with existing solutions. > The idea is to make the standard REPL a bit more convenient, because > for most users it's probably already has enough features. But writing > multiline code in current REPL is a bit harder than it should be. > Maybe. Then again, there are already other solutions, and if you say you don't like those, well, others do like them, and plenty of others are also perfectly happy with the standard REPL. How long have you worked on your shell.py? 
Do you think that by the time Python 3.4 comes out you won't have added a host of other additional conveniences? Or that maybe you've lost interest, and donated a module to the standard library that no-one cares to maintain? Note that I'm not even debating the specific features you're proposing -- I'm just trying to explain to you that your proposal is not ready for stdlib inclusion, and possibly never will be. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon May 13 19:01:03 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 13 May 2013 10:01:03 -0700 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> Message-ID: On Mon, May 13, 2013 at 8:49 AM, Jim Jewett wrote: > The problem with > > >> Animals = Enum('Animals', 'dog cat bird') > > is that you might accidentally type > > >> Animals = Enum('Animal', 'dog cat bird') > or > >> Anmals = Enum('Animals', 'dog cat bird') > > instead. Sure. But coming up with a syntactic solution for this issue is not easy. So far all the proposals from this thread (and from past threads trying to address the same issues, including PEP 403) look terrible to me -- none of the proposals are more than random permutations of symbols that are currently syntactically invalid are given a fairly random new meaning. So in the mean time please live with the slight redundancy in this case. Next time you may want to try and design syntax so that you won't have to type the same method name twice when you're defining a function and later calling it. 
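The redundancy Jim points at is easy to demonstrate with the `enum` module that ultimately shipped in 3.4 (it post-dates this exchange, so this is an after-the-fact illustration): a mismatch between the bound name and the declared name is completely silent.

```python
from enum import Enum

# The bound name and the declared name disagree -- nothing complains:
Animals = Enum('Animal', 'dog cat bird')

assert Animals.__name__ == 'Animal'               # not 'Animals'
assert Animals.dog.__class__.__name__ == 'Animal' # baked into every member
```

The class-statement form (`class Animals(Enum): ...`) avoids the repetition entirely, which is part of why it is the recommended spelling.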
:-) -- --Guido van Rossum (python.org/~guido) From mrts.pydev at gmail.com Mon May 13 19:02:58 2013 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 13 May 2013 20:02:58 +0300 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: MacroPy is awesome both conceptually and feature-demo-wise - thanks for working on and sharing this! Simple, elegant things are hardest to come by - utilizing module loading hooks for AST transformation seems really natural in hindsight, yet no one did that before. On Wed, May 8, 2013 at 11:04 PM, Haoyi Li wrote: > Just an update for people who are interested, > > The project (https://github.com/lihaoyi/macropy) is more or less done for > now, in its current state as a proof of concept/demo. Almost all of it runs > perfectly on both CPython and PyPy, except for the pattern matcher, which > has some bugs on PyPy we haven't ironed out yet. Jython doesn't work at > all: it seems to handle a number of things about the ast module pretty > differently from either PyPy or CPython. > > We've got a pretty impressive list of feature demos: > > - Quasiquotes, a > quick way to manipulate fragments of a program > - String Interpolation, > a common feature in many languages > - Pyxl, > integrating XML markup into a Python program > - Tracing and Smart > Asserts > - Case Classes, easy Algebraic > Data Types from > Scala > - Pattern Matching from > the Functional Programming world > - LINQ to SQL from C# > - Quick Lambdas from > Scala and Groovy > - Parser Combinators, > inspired by Scala's. > > And we have pushed a release to PyPI (https://pypi.python.org/pypi/MacroPy), > to make it easier for people to download it and mess around. Hopefully > somebody will find this useful in messing around with the Python language! > > Thanks!
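The "module loading hooks for AST transformation" trick reduces to a small core: parse the source, run an `ast.NodeTransformer` over it, compile, and exec into a module object. A real macro system (MacroPy uses PEP 302 for this) registers that step on an import hook so that a plain `import` triggers it; the sketch below skips the hook, and the toy transform is mine, not MacroPy's.

```python
import ast
import types

class FlipAdd(ast.NodeTransformer):
    """Toy 'macro': rewrite every `+` into `-`, to make the step visible."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

def load_transformed(name, source, transformer):
    """Parse, rewrite, compile, and execute `source` as module `name`.

    This is the core a PEP 302 finder/loader would wrap so that
    `import name` performs the rewrite transparently.
    """
    tree = transformer.visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    module = types.ModuleType(name)
    exec(compile(tree, '<%s>' % name, 'exec'), module.__dict__)
    return module

demo = load_transformed('demo', 'def f(x):\n    return x + 1', FlipAdd())
```

After loading, `demo.f(3)` returns `2`: the `x + 1` in the source was rewritten to `x - 1` before compilation.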
> -Haoyi > > > On Sat, Apr 27, 2013 at 11:05 PM, Haoyi Li wrote: > >> I pushed a simple implementation of case classes using >> Macros, as well as a really nice to use parser combinator library. >> The case classes are interesting because they overlap a lot with >> enumerations: auto-generated __str__, __repr__, inheritence via nesting, >> they can have members and methods, etc. >> >> They also show off pretty well how far Python's syntax (and semantic!) >> can be stretched using macros, so if anyone still has some crazy ideas for >> enumerations and wants to prototype them without hacking the CPython >> interpreter, this is your chance! >> >> Thanks! >> -Haoyi >> >> >> On Wed, Apr 24, 2013 at 3:15 PM, Haoyi Li wrote: >> >>> @Jonathan: That would be possible, although I can't say I know how to do >>> it. A naive macro that wraps everything and has a "substitute awaits for >>> yields, wrap them in inlineCallbacks(), and substitute returns for >>> returnValue()s" may work, but I'm guessing it would run into a forest of >>> edge cases where the code isn't so simple (what if you *want* a return? >>> etc.). >>> >>> pdb *should* show the code after macro expansion. Without source maps, >>> I'm not sure there's any way around that, so debugging may be hard. >>> >>> Of course, if the alternative is macros of forking the interpreter, >>> maybe macros is the easier way to do it =) Debugging a buggy custom-forked >>> interpreter probably isn't easy either! >>> >>> >>> On Wed, Apr 24, 2013 at 5:48 PM, Jonathan Slenders >> > wrote: >>> >>>> One use case I have is for Twisted's inlineCallbacks. I forked the >>>> pypy project to implement the await-keyword. 
Basically it transforms: >>>> >>>> def async_function(deferred_param): >>>> a = await deferred_param >>>> b = await some_call(a) >>>> return b >>>> >>>> into: >>>> >>>> @defer.inlineCallbacks >>>> def async_function(deferred_param): >>>> a = yield deferred_param >>>> b = yield some_call(a) >>>> yield defer.returnValue(b) >>>> >>>> >>>> Are such things possible? And if so, what lines of code would pdb show >>>> during introspection of the code? >>>> >>>> It's interesting, but when macros become more complicated, the >>>> debugging of these things can turn out to be really hard, I think. >>>> >>>> >>>> 2013/4/24 Haoyi Li : >>>> > I haven't tested in on various platforms, so hard to say for sure. >>>> MacroPy >>>> > basically relies on a few things: >>>> > >>>> > - exec/eval >>>> > - PEP 302 >>>> > - the ast module >>>> > >>>> > All of these are pretty old pieces of python (almost 10 years old!) >>>> so it's >>>> > not some new-and-fancy functionality. Jython seems to have all of >>>> them, I >>>> > couldn't find any information about PyPy. >>>> > >>>> > When the project is more mature and I have some time, I'll see if I >>>> can get >>>> > it to work cross platform. If anyone wants to fork the repo and try >>>> it out, >>>> > that'd be great too! >>>> > >>>> > -Haoyi >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > On Wed, Apr 24, 2013 at 11:55 AM, Andrew Barnert >>>> wrote: >>>> >> >>>> >> On Apr 24, 2013, at 8:05, Haoyi Li wrote: >>>> >> >>>> >> You actually can get a syntax like that without macros, using >>>> >> stack-introspection, locals-trickery and lots of `eval`. The >>>> question is >>>> >> whether you consider macros more "extreme" than stack-introspection, >>>> >> locals-trickery and `eval`! A JIT compiler will probably be much >>>> happier >>>> >> with macros. 
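Today's `ast` module can express the shape of Jonathan's rewrite directly, because `await` later became real syntax: parse an `async def`, swap each `Await` node for a `Yield`, and demote the function to a plain generator. This is a sketch of the transform's structure only, not Twisted's actual `inlineCallbacks` machinery (which also wraps returns in `returnValue` and applies the decorator).

```python
import ast

class AwaitToYield(ast.NodeTransformer):
    def visit_AsyncFunctionDef(self, node):
        self.generic_visit(node)
        # Demote `async def` to a plain `def` by copying its fields
        # onto a freshly parsed FunctionDef shell.
        shell = ast.parse('def _stub(): pass').body[0]
        shell.name, shell.args, shell.body = node.name, node.args, node.body
        shell.decorator_list = node.decorator_list
        return shell

    def visit_Await(self, node):
        self.generic_visit(node)
        return ast.Yield(value=node.value)   # `await e` -> `yield e`

SOURCE = '''
async def pipeline(x):
    a = await x
    b = await (a + 1)
    return b
'''

tree = ast.fix_missing_locations(AwaitToYield().visit(ast.parse(SOURCE)))
namespace = {}
exec(compile(tree, '<demo>', 'exec'), namespace)
pipeline = namespace['pipeline']    # now an ordinary generator function
```

Driving the result by hand stands in for the Deferred machinery: each `yield` surfaces the awaited expression, and the value sent back resumes the body, with the final `return` carried on `StopIteration.value`.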
>>>> >> >>>> >> >>>> >> That last point makes this approach seem particularly interesting to >>>> me, >>>> >> which makes me wonder: Is your code CPython specific, or does it >>>> also work >>>> >> with PyPy (or Jython or Iron)? While PyPy is obviously a whole lot >>>> easier to >>>> >> mess with in the first place than CPython, having macros at the same >>>> >> language level as your code is just as interesting in both >>>> implementations. >>>> >> >>>> >> >>>> >> On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy >>>> >> wrote: >>>> >>> >>>> >>> On 4/23/2013 11:49 PM, Haoyi Li wrote: >>>> >>>> >>>> >>>> I thought this may be of interest to some people on this list, >>>> even if >>>> >>>> not strictly an "idea". >>>> >>>> >>>> >>>> I'm working on MacroPy , a >>>> little >>>> >>>> >>>> >>>> pure-python library that allows user-defined AST rewrites as part >>>> of the >>>> >>>> import process (using PEP 302). >>>> >>> >>>> >>> >>>> >>> From the readme >>>> >>> ''' >>>> >>> String Interpolation >>>> >>> >>>> >>> a, b = 1, 2 >>>> >>> c = s%"%{a} apple and %{b} bananas" >>>> >>> print c >>>> >>> #1 apple and 2 bananas >>>> >>> ''' >>>> >>> I am a little surprised that you would base a cutting edge >>>> extension on >>>> >>> Py 2. Do you have it working with 3.3 also? >>>> >>> >>>> >>> '''Unlike the normal string interpolation in Python, MacroPy's >>>> string >>>> >>> interpolation allows the programmer to specify the variables to be >>>> >>> interpolated inline inside the string.''' >>>> >>> >>>> >>> Not true as I read that. >>>> >>> >>>> >>> a, b = 1, 2 >>>> >>> print("{a} apple and {b} bananas".format(**locals())) >>>> >>> print("%(a)s apple and %(b)s bananas" % locals()) >>>> >>> #1 apple and 2 bananas >>>> >>> #1 apple and 2 bananas >>>> >>> >>>> >>> I rather like the anon funcs with anon params. That only works when >>>> each >>>> >>> param is only used once in the expression, but that restriction is >>>> the >>>> >>> normal case. 
>>>> >>> >>>> >>> I am interested to see what you do with pattern matching. >>>> >>> >>>> >>> tjr >>>> >>> >>>> >>> _______________________________________________ >>>> >>> Python-ideas mailing list >>>> >>> Python-ideas at python.org >>>> >>> http://mail.python.org/mailman/listinfo/python-ideas >>>> >> >>>> >> >>>> >> _______________________________________________ >>>> >> Python-ideas mailing list >>>> >> Python-ideas at python.org >>>> >> http://mail.python.org/mailman/listinfo/python-ideas >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > Python-ideas mailing list >>>> > Python-ideas at python.org >>>> > http://mail.python.org/mailman/listinfo/python-ideas >>>> > >>>> >>> >>> >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Mon May 13 19:04:51 2013 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 13 May 2013 13:04:51 -0400 Subject: [Python-ideas] external editor commands in python REPL In-Reply-To: References: <15E7904B-67C7-4BB3-860F-6AFA72E2FA35@gmail.com> <849688B1-4F85-4FD3-9C23-B321BAF57C2D@gmail.com> Message-ID: <5045AF87-2D4A-479E-9B45-32101B7E7E14@gmail.com> On 2013-05-13, at 12:55 PM, Guido van Rossum wrote: > On Mon, May 13, 2013 at 9:44 AM, Yury Selivanov wrote: > On 2013-05-13, at 12:39 PM, Guido van Rossum wrote: > > > Have you seen IPython? > > Of course. Tried it multiple times, however, can't get used to it. > > It would have been useful if you had mentioned that in your first post. Work claiming to break new ground should always carefully compare with existing solutions. Right. > > The idea is to make the standard REPL a bit more convenient, because > for most users it's probably already has enough features. 
But writing > multiline code in current REPL is a bit harder than it should be. > > Maybe. Then again, there are already other solutions, and if you say you don't like those, well, others do like them, and plenty of others are also perfectly happy with the standard REPL. > > How long have you worked on your shell.py? Do you think that by the time Python 3.4 comes out you won't have added a host of other additional conveniences? Or that maybe you've lost interest, and donated a module to the standard library that no-one cares to maintain? ~ 1-2 hours ;) It's a quick hack, obviously. As for the features, conveniences etc -- If the python-dev crowd likes the idea I'd be glad to write a PEP to distill the functionality and/or provide a reference implementation. And regarding the maintenance of donated code -- it wouldn't be my first contribution to python ;) Thank you, Yury From g.brandl at gmx.net Mon May 13 19:06:50 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 13 May 2013 19:06:50 +0200 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: Am 13.05.2013 19:02, schrieb Mart S?mermaa: > MacroPy is awesome both conceptually and feature-demo-wise - thanks for working > on and sharing this! > Simple, elegant things are hardest to come by - to utilize module loading hooks > for AST transformation seems really natural from hindsight, except no-one did > that before. 
*cough* https://bitbucket.org/birkenfeld/karnickel Georg From haoyi.sg at gmail.com Mon May 13 19:17:36 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 13 May 2013 13:17:36 -0400 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: We were aware of Karnickel before we started, along with MetaPython ( https://code.google.com/p/metapython/) and Pyxl ( https://github.com/dropbox/pyxl) Apart from being abandoned, neither of the first two really demonstrates any usability (although Pyxl is used quite heavily), which is why we went ahead with MacroPy. On Mon, May 13, 2013 at 1:06 PM, Georg Brandl wrote: > Am 13.05.2013 19:02, schrieb Mart S?mermaa: > > MacroPy is awesome both conceptually and feature-demo-wise - thanks for > working > > on and sharing this! > > Simple, elegant things are hardest to come by - to utilize module > loading hooks > > for AST transformation seems really natural from hindsight, except > no-one did > > that before. > > *cough* > > https://bitbucket.org/birkenfeld/karnickel > > Georg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From apalala at gmail.com Mon May 13 19:23:39 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 12:53:39 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On Mon, May 13, 2013 at 11:58 AM, Terry Jan Reedy wrote: > I disagree that the above is unpythonic. In Python, functions are objects > like everything else. Define them, pass them to functions, like anything > else. 'Unpythonic' is treating functions as special, other than that they > are called (have a call method). I beg to disagree. 
Functions are objects in python, but they get particular treatment. You can do: def f(): pass x = f But you can't do: x = def(): pass Cheers, -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Mon May 13 19:53:17 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 13 May 2013 13:53:17 -0400 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On 5/13/2013 1:23 PM, Juancarlo A?ez wrote: > > On Mon, May 13, 2013 at 11:58 AM, Terry Jan Reedy > > wrote: > > I disagree that the above is unpythonic. In Python, functions are > objects like everything else. Define them, pass them to functions, > like anything else. 'Unpythonic' is treating functions as special, > other than that they are called (have a call method). > > > I beg to disagree. Functions are objects in python, but they get > particular treatment. So do classes, modules, ..., > You can do: > > def f(): > pass > x = f > > But you can't do: > > x = def(): pass So what? Really. There is nothing particular about that. Neither can you do x = class (): pass x = with open(..): x = import itertools x = if True: y = 3 x = The fact that Python is a mixed syntax language with both expressions and statements is fundamental to its design. 
-- Terry Jan Reedy From g.brandl at gmx.net Mon May 13 20:03:23 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 13 May 2013 20:03:23 +0200 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: Am 13.05.2013 19:17, schrieb Haoyi Li: > We were aware of Karnickel before we started, along with MetaPython > (https://code.google.com/p/metapython/) and Pyxl (https://github.com/dropbox/pyxl) > > Apart from being abandoned, neither of the first two really demonstrates any > usability (although Pyxl is used quite heavily), which is why we went ahead with > MacroPy. Sure, I never intended it to be usable :) I was just responding to the claim that nobody did macros-with-import-hooks before. Georg From haoyi.sg at gmail.com Mon May 13 20:11:35 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 13 May 2013 14:11:35 -0400 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: Sure =) Karnickel was one of the things that made us think "yeah, this should be doable". I must confess I never made it to that readme! On Mon, May 13, 2013 at 2:03 PM, Georg Brandl wrote: > Am 13.05.2013 19:17, schrieb Haoyi Li: > > We were aware of Karnickel before we started, along with MetaPython > > (https://code.google.com/p/metapython/) and Pyxl ( > https://github.com/dropbox/pyxl) > > > > Apart from being abandoned, neither of the first two really demonstrates > any > > usability (although Pyxl is used quite heavily), which is why we went > ahead with > > MacroPy. > > Sure, I never intended it to be usable :) I was just responding to the > claim > that nobody did macros-with-import-hooks before. > > Georg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From apalala at gmail.com Mon May 13 20:33:01 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 14:03:01 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On Mon, May 13, 2013 at 1:23 PM, Terry Jan Reedy wrote: > x = class (): pass > x = with open(..): > x = import itertools > x = if True: y = 3 > x = > I can explain. Expressions like "", [], (), {} are called "constructors" in programming language theory, because they "construct" objects and give them an initial state. In Python, functions, methods, and classes are objects too, and their constructors are "class" and "def". But there's an asymmetry in the language in that those two constructors don't produce an assignable value. if, while, with, etc. are statements, not constructors. Other languages make those return values too, but that's not the Python way of things. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Mon May 13 21:10:14 2013 From: bruce at leapyear.org (Bruce Leban) Date: Mon, 13 May 2013 12:10:14 -0700 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On Mon, May 13, 2013 at 11:33 AM, Juancarlo Añez wrote: > In Python, functions, methods, and classes are objects too, and their > constructors are "class" and "def". But there's an asymmetry in the > language in that those two constructors don't produce an assignable value. > Not all functions produce a useful value. [].append(1) [].sort() print(1) Yes, these do return a value since functions always return a value, but they don't return a useful value.
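Bruce's point is concrete in the interpreter: Python's mutator methods return `None` precisely so they cannot be confused with their expression-style counterparts.

```python
nums = [3, 1, 2]

result = nums.sort()          # mutates in place...
assert result is None         # ...and returns nothing chainable
assert nums == [1, 2, 3]

# The expression-style counterpart returns a new, useful value,
# so it composes inside larger expressions:
assert sorted([3, 1, 2]) == [1, 2, 3]
assert sorted([3, 1, 2], reverse=True)[0] == 3
```

This same convention is what blocks JavaScript-style fluent chaining on built-in containers: `nums.sort().reverse()` fails because the first call yields `None`.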
Likewise, class and def have side effects so if they were functions they would probably return None and you would have the same issue that you couldn't usefully use them inside another statement, just like this: x = print(y) Def is not a constructor. It is an assignment statement. def f(x): return x+1 f = lambda x: x+1 are equivalent. Python does not allow either one of these assignment statements to be embedded in another statement. It does allow lambda functions to be embedded in other statements. The issue here is essentially that the def syntax allows more complex function syntax than lambda. And the complaint is that you have to declare a function "out of order" and choose a name for it. This has the same problem: # can't do this case a.b[i]: when 0: pass when 1: pass when 2 ... 4: pass else: pass # can't do this either if (_value = a.b[i]) == 0: pass elif _value == 1: pass elif _value in 2 ... 4: pass else: pass # have to do this _value = a.b[i] if _value == 0: pass elif _value == 1: pass elif _value >= 2 and _value <= 4: pass else: pass This is a much more common scenario than wanting anonymous blocks. I'm not proposing to change this. I'm just pointing out that if you're complaining about not being able to assign a value inside a statement, there are more common cases to look at. --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon May 13 21:21:25 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 13 May 2013 12:21:25 -0700 Subject: [Python-ideas] improve Idle In-Reply-To: References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: <40500C14-F4EE-4D28-91E0-FE34148C8B4F@yahoo.com> It's probably worth looking at a slew of different IDEs (Eclipse, Visual Studio, Xcode, CodeBlocks, etc.)
to survey all the ways this kind of thing is done, before deciding what's the best match for idle. Sent from a random iPhone On May 13, 2013, at 9:41, Terry Jan Reedy wrote: > On 5/13/2013 11:20 AM, Yuval Greenfield wrote: >> On Mon, May 13, 2013 at 5:57 PM, Jim Jewett > > wrote: >> >> Perhaps even changing the navigation so that clicking on the scrollbar >> at 1/3 of the way down will move you to 1/3 of the way down the file, >> instead of "one page up from where you current are." This would no >> longer be a standard scrollbar, but it might well be better. >> >> >> >> Non-standard scroll bar sounds bad to me. Here's how this feature works >> in Eclipse/pydev http://i.imgur.com/kQrc5n0.png > > That reminds me of the merge conflict strips in, for instance kdiff3. > >> Basically the scrollbar has another small strip next to it. So the >> locations are clickable, and they bring you to the right place, without >> breaking the normal scroll bar behavior. > > Assuming that putting dots on a strip beside text is feasible with tkinter, that would be a possible optional extension. There are already proposals for line number strips for files and prompt strips for the shell. > > Terry > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From apalala at gmail.com Mon May 13 21:30:39 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 15:00:39 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: On Mon, May 13, 2013 at 2:40 PM, Bruce Leban wrote: > This is a much more common scenario than wanting anonymous blocks. I'm not > proposing to change this. I'm just pointing out that if you're complaining > about not being able to assign a value inside a statement, there are more > common cases to look at. 
> I'm not complaining. I'm just pointing out that there may be more readable ways to express things in Python than out-of-order. I know that making def and class be expressions would cause enormous problems. Because context managers are limited to their intended purpose, I proposed new syntax that would provide for anonymous blocks in certain places. Instead of having to do: def synthetic1(): self.match('a') def synthetic2(): self.match('b') self.closure(synthetic1) self.closure(synthetic2) Which is as close to cryptic as can bee, it could be: within self.closure(): self.match('a') within self.closure(): self.match('b') Which is, at least, in the right order. Adding the new keyword would allow reusing the parsing for def: within self.closure(): def(): self.match('a') within self.closure(): def(): self.match('b') So anonymous blocks can also take parameters, without the need to make def an assignable expression. Cheers, -- Juancarlo *A?ez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon May 13 21:31:09 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 13 May 2013 21:31:09 +0200 Subject: [Python-ideas] improve Idle References: <96BEF194-FC3D-4F8F-8366-C7ADA1458B27@gmail.com> Message-ID: <20130513213109.1e02cf25@fsol> On Mon, 13 May 2013 10:57:43 -0400 Jim Jewett wrote: > On Fri, May 10, 2013 at 1:16 PM, Terry Jan Reedy wrote: > > On 5/10/2013 8:52 AM, Todd V. Rovito wrote: > >> On May 10, 2013, at 5:36 AM, Alexandre Boulay > >> wrote: > > >>> I think that could be a good idea to put colored dots on idle's > >>> scroll bar for each def or class created, each got its own color, > > > I cannot really understand what you are proposing. The scroll bar is for > > scrolling, and it has the arrow buttons and the bar itself that would > > interfere with placing dots. Furthermore, scroll bars are widgets defined by > > tk and as far as I know, IDLE has no control over the detailed appearance. 
> > I suspect he is suggesting that the scrollbar represent position in > the file in terms of "3 top level classes above, 2 below" instead of > just by line count. This sounds straightforward as an overlay > graphic. > > Perhaps even changing the navigation so that clicking on the scrollbar > at 1/3 of the way down will move you to 1/3 of the way down the file, > instead of "one page up from where you current are." This would no > longer be a standard scrollbar, but it might well be better. I wouldn't want to disturb an Idle discussion, but changing the behaviour of standard widgets is most often a terrible idea (unless perhaps you're an UI wizard). You shouldn't do it. Regards Antoine. From jonathan.eunice at gmail.com Mon May 13 22:19:52 2013 From: jonathan.eunice at gmail.com (Jonathan Eunice) Date: Mon, 13 May 2013 13:19:52 -0700 (PDT) Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: <31e95784-6f48-41a9-a67c-30d58eb354c4@googlegroups.com> That many Python functions don't produce useful values makes it difficult or impossible to do "chained" or "fluent" operations a la JavaScript or Perl. I love that Python avoids "expression soup" and long run-on invocations, but the lack of appreciable support for "fluency" or even a smidgeon of functional style seems to regularly "verticalize" Python code, with several lines required do common things, such as fully construct / initialize / setup an object. Apologize if this seems tangential. To me it seems related to some of the examples in this thread where Python pushes configuration statements/calls oddly after the related code block. On Monday, May 13, 2013 3:10:14 PM UTC-4, Bruce Leban wrote: > > > Not all functions produce a useful value. > > [].append(1) > [].sort() > print(1) > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mm at ensoft.co.uk Mon May 13 22:21:18 2013 From: mm at ensoft.co.uk (Martin Morrison) Date: Mon, 13 May 2013 21:21:18 +0100 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> Message-ID: On 13 May 2013, at 18:01, Guido van Rossum wrote: > On Mon, May 13, 2013 at 8:49 AM, Jim Jewett wrote: >> The problem with >> >>>> Animals = Enum('Animals', 'dog cat bird') >> >> is that you might accidentally type >> >>>> Animals = Enum('Animal', 'dog cat bird') >> or >>>> Anmals = Enum('Animals', 'dog cat bird') >> >> instead. > > Sure. But coming up with a syntactic solution for this issue is not > easy. So far all the proposals from this thread (and from past threads > trying to address the same issues, including PEP 403) look terrible to > me -- none of the proposals are more than random permutations of > symbols that are currently syntactically invalid are given a fairly > random new meaning. Guido, it sounds like you are not completely opposed to the general idea here, but rather find all the proposed syntaxes to be ugly? This also ties in to your comments about the getframe hack in the new Enum implementation - I think everyone fully agrees with your comments about making things easier for everyone, I'm just not so sure that stack introspection is the best solution, irrespective of the difficulty of implementation in other implementations. It just feels too "magic" and implicit - and we all know explicit is better. Bruce Leban just today on another python-ideas thread said something far more clearly than I ever could to explain why the "def Foo = " syntax feels "right" to me; to quote, including his example: Def is not a constructor. It is an assignment statement. def f(x): return x+1 f = lambda x: x+1 are equivalent. With this interpretation of def, it feels perfect. :-) Anyway, just my 2 cents. 
Cheers, Martin > So in the mean time please live with the slight redundancy in this > case. Next time you may want to try and design syntax so that you > won't have to type the same method name twice when you're defining a > function and later calling it. :-) > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From mm at ensoft.co.uk Mon May 13 22:43:00 2013 From: mm at ensoft.co.uk (Martin Morrison) Date: Mon, 13 May 2013 21:43:00 +0100 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <6C352A96-8A28-458E-A4A8-8D976756C19B@ensoft.co.uk> On 13 May 2013, at 03:53, "Stephen J. Turnbull" wrote: > Martin Morrison writes: > >> That is instead of a decorator-like syntax, make the "in" keyword >> reusable to introduce a new block, whose "argument" is a statement >> that can forward reference some names, which are then defined >> within the block. This allows multiple temporary names to be >> defined (each in a separate statement within the block). > > This idea and its presumed defects are described (using the "given" > syntax of PEP 3150) in the section "Using a nested suite" in PEP 403. [Thanks for pointing this out - I seemed to miss the crux of that in my first reading] The general argument against seems to be that it makes implementation difficult, and PEP3150 has (what to me is) very awkward syntax involving leading dots, etc. to try to help the implementation. This feels wrong to me (sacrificing functionality/clarity for ease of implementation - at least during the first pass). 
Also, would it not be possible to implement it something like: in <statement>: <suite> Becomes: _locals = dict() exec(<suite>, globals(), _locals) _composed = MagicDict(write_to=locals(), first_read_from=_locals) exec(<statement>, globals(), _composed) del _locals, _composed (to clarify: MagicDict is some way of composing two dicts so that write operations always go to a specific dict, but read operations first go to another dict. I.e., names are first looked up in the temporary locals created from the suite, then in the 'real' locals, but all new name bindings go into the actual local scope.) There may need to be some special handling for 'nonlocal' if required, but 'global' should Just Work. Also there is no need for any of the statement restrictions discussed in PEP 3150. >> Some further thought is required on whether only def (and maybe >> class) statements should be allowed within the "in" block. Although >> I guess there's technically nothing wrong with: >> >> in x = y + z: >> y = 12 >> z = 30 >> >> Other than it's a very verbose way of doing something simple. ;-) > > Violates TOOWTDI according to PEP 403. Inherently, since the attempt of these PEPs is to reorder some code, they violate TOOWTDI. However, the reordering provides substantial benefits to readability. In that sense, they are very much like decorators (and thus I can see why some attempt was made to align the syntaxes). In any case, TOOWTDI refers to only one *obvious* way to do it (at least obvious if you are Dutch ;-)). > David Mertz and Juancarlo Añez riff on the theme: >> [Why not spell it something like]: >> >> in x = do_something(in_arg, success_hdlr, error_hdlr, const) let: >> def success_hdlr(result): >> ... # Do something with result >> def error_hdlr(error): >> ... # Do something with error >> const = 42 > > (Note the "let" at the end of the "in" clause.) > > Python doesn't use redundant keywords for a single construct. > "let" is redundant with the following "def"s.
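Martin's `MagicDict` fits in a few lines; `write_to` and `first_read_from` are his names, while the implementation is a guess at the intended semantics. Because `exec()` accepts any mapping as its locals, the composition works exactly as his sketch requires.

```python
class MagicDict:
    """A mapping that writes to one dict but reads another first.

    `write_to` and `first_read_from` come from Martin's pseudocode;
    the method bodies are a guess at the intended semantics.
    """
    def __init__(self, write_to, first_read_from):
        self.write_to = write_to
        self.first_read_from = first_read_from

    def __setitem__(self, key, value):
        # All new bindings land in the "real" locals.
        self.write_to[key] = value

    def __getitem__(self, key):
        # Reads prefer the temporary locals from the suite.
        try:
            return self.first_read_from[key]
        except KeyError:
            return self.write_to[key]

    def __delitem__(self, key):
        del self.write_to[key]

# exec() accepts any mapping as its locals namespace:
real_locals = {}
composed = MagicDict(write_to=real_locals, first_read_from={'y': 2})
exec('x = y + z', {'z': 40}, composed)
assert real_locals['x'] == 42   # read y=2 from the suite, z=40 from globals
```

Name lookups inside the `exec` fall through from `first_read_from` to `write_to` to globals, while the assignment to `x` is routed into `real_locals`, which is the two-namespace behaviour the proposal needs.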
> On top of that, "let" being a new keyword will kill this syntax, I think. Agreed. I also prefer a prefix (in my example "in") over the "given" or "let" postfix. It makes it obvious as I approach the construct that it is going to do something different, rather than requiring me to look at the end of the line to see the colon. cf. with, for, while, if - in fact all complex statements. Cheers, Martin From abarnert at yahoo.com Mon May 13 23:07:03 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 13 May 2013 14:07:03 -0700 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <31e95784-6f48-41a9-a67c-30d58eb354c4@googlegroups.com> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <31e95784-6f48-41a9-a67c-30d58eb354c4@googlegroups.com> Message-ID: On May 13, 2013, at 13:19, Jonathan Eunice wrote: > That many Python functions don't produce useful values makes > it difficult or impossible to do "chained" or "fluent" operations a la > JavaScript or Perl. Fluency and a clean expression/statement divide are almost directly contrary goals. Similarly, reducing "vertical" code and making all structure explicit are almost directly contrary goals. So: > I love that Python avoids "expression soup" and long run-on > invocations, but the lack of appreciable support for "fluency" or even > a smidgeon of functional style seems to regularly "verticalize" > Python code If you made Python fluent, you would allow, and maybe even encourage, JS-style "expression soup". It's a tradeoff, and I think Python made the right choice here. I've got a lot of logic that I migrate back and forth between Python and JS, and it's definitely true that a 3-line JS function often becomes a 5-line Python function and vice versa. But the Python function is nevertheless usually faster to read, so I don't think this is a problem. > , with several lines required to do common things, such as > fully construct / initialize / setup an object.
Often the answer to that is that the right API for a Python class isn't the same as the right API for a similar JS prototype. For example, because pythonic style makes much more use of exceptions than typical JS style, you don't need multistage initialization nearly as often. From apalala at gmail.com Mon May 13 23:35:17 2013 From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=) Date: Mon, 13 May 2013 17:05:17 -0430 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: Just a reminder of why I started this discussion. I couldn't find a way to have:

with context():
    do_this()
    do_that()

in which the embedded block would be executed multiple times, through yield. My fault! Lack of understanding about the context manager protocol. It is not possible to yield to the block multiple times because an exception within the block will exit the context no matter what __exit__() does. Adding new syntax is the kill-all way of things, but I think that it may be possible to achieve what I think is reasonable by adding to the context manager protocol. Perhaps I should just settle for something like:

with context() as c:
    while True:
        with c.inner():
            do_this()
            do_that()

But that has the problem that the context manager protocol forces try: finally: to be spread between __enter__() and __exit__(). The best solution I've seen so far doesn't require any changes:

with context() as c:
    @c
    def _():
        do_this()
        do_that()

Make the decorator remember the decorated function, and let __exit__() do the iteration. The only problem with that is that the stack trace for an exception would be weird. That's why I thought of new syntax:

within context() as c:
    do_this()
    do_that()

Python would def the anonymous block, and pass it to a __do__(block) method in the context. That wouldn't allow passing parameters to the anonymous block, but the block can use the context (c), and the variables over which it has visibility.
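For concreteness, the decorator trick described above can be sketched as a working context manager (the class name and the `times` knob are invented for illustration, not a proposed API):

```python
class repeat_context:
    """Sketch: the with-block defines its body as a throwaway function,
    the decorator remembers it, and __exit__() does the iteration."""

    def __init__(self, times):
        self.times = times
        self.block = None

    def __call__(self, func):
        # Used as '@c' inside the with-block: remember the body.
        self.block = func
        return func

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Run the remembered block repeatedly once the with-block ends.
        if exc_type is None and self.block is not None:
            for _ in range(self.times):
                self.block()
        return False

calls = []
with repeat_context(times=2) as c:
    @c
    def _():
        calls.append("ran")

# After the with statement, the block has run twice.
```

An exception raised during the iteration surfaces from __exit__(), which is exactly the weird-stack-trace caveat mentioned above.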
All that said, I'm happy with how things work now, except for the need to come up with synthetic names for what should be anonymous functions. Cheers, -- Juancarlo *Añez* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.eunice at gmail.com Tue May 14 00:00:58 2013 From: jonathan.eunice at gmail.com (Jonathan Eunice) Date: Mon, 13 May 2013 15:00:58 -0700 (PDT) Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <31e95784-6f48-41a9-a67c-30d58eb354c4@googlegroups.com> Message-ID: <991a4fc8-6ba7-45d8-b733-78a78d9b11ad@googlegroups.com> In the real world, I'm not sure that there ever is a truly clean statement / expression divide.

server = DataServe(...)
server.start()

vies with

server = DataServe(...).start()

or

server = DataServe(..., start=True)

And while there are a lot of places where Python tries to make something very much a statement rather than an expression (e.g. print, import, raise, or assert), one doesn't have to go very far to find variations on these that are functional. py3 itself changed its game on print. I myself do not find modest uses of fluency any less clear or explicit, and I believe it can improve the clarity of some logically-combined activities. But I'd certainly agree that the aggressive functional chaining you can find in highly functional languages, or in JS, which I'll caricature as:

d.find(...).filter(...).more(...).last().render()

can be pretty off-putting and opaque. Throw a few functional for-loops in there, as JS often does, and it's downright ugly. I don't want to take the blocks and closures discussion off-track; I'm not advocating hyper-fluency; and I freely admit that a language's omissions and constraints can be easily as important as the features it provides.
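The return-self style under discussion is easy to sketch in Python (the `Query` class here is invented purely for illustration):

```python
class Query:
    """Invented example of the fluent style: each mutator returns
    self so calls can chain, at the cost of hiding the mutation."""

    def __init__(self, items):
        self.items = list(items)

    def filter(self, pred):
        self.items = [x for x in self.items if pred(x)]
        return self  # returning self is what enables the chain

    def map(self, fn):
        self.items = [fn(x) for x in self.items]
        return self

    def to_list(self):
        return self.items

# Fluent spelling:
chained = Query(range(6)).filter(lambda x: x % 2 == 0).map(lambda x: x * 10).to_list()

# Statement-per-line spelling of the same computation:
q = Query(range(6))
q.filter(lambda x: x % 2 == 0)
q.map(lambda x: x * 10)
stepwise = q.to_list()

print(chained)   # [0, 20, 40]
print(stepwise)  # [0, 20, 40]
```

Both spellings compute the same list; the fluent one trades the explicit intermediate name for brevity.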
I'm all about optimizing the macro result, so if having a few more vertical lines is the constraint that makes the total code more comprehensible, c'est la vie. But it does seem, to me at least, that there's a connection with some of the issues people bang into trying to specify block context and the fact that while Python constructors return an object that can be directly used, very few of the update methods do. On Monday, May 13, 2013 5:07:03 PM UTC-4, Andrew Barnert wrote: > > On May 13, 2013, at 13:19, Jonathan Eunice > > wrote: > > > That many Python functions don't produce useful values makes > > it difficult or impossible to do "chained" or "fluent" operations a la > > JavaScript or Perl. > > Fluency and a clean expression/statement divide are almost directly > contrary goals. > > Similarly, reducing "vertical" code and making all structure explicit are > almost directly contrary goals. > > So: > > > I love that Python avoids "expression soup" and long run-on > > invocations, but the lack of appreciable support for "fluency" or even > > a smidgeon of functional style seems to regularly "verticalize" > > Python code > > If you made Python fluent, you would allow, and maybe even encourage, > JS-style "expression soup". It's a tradeoff, and I think Python made the > right choice here. > > I've got a lot of logic that I migrate back and forth between Python and > JS, and it's definitely true that a 3-line JS function often becomes a > 5-line Python function and vice versa. But the Python function is > nevertheless usually faster to read, so I don't think this is a problem. > > > , with several lines required to do common things, such as > > fully construct / initialize / setup an object. > > Often the answer to that is that the right API for a Python class isn't > the same as the right API for a similar JS prototype.
For example, because > pythonic style makes much more use of exceptions than typical JS style, you > don't need multistage initialization nearly as often. > > _______________________________________________ > Python-ideas mailing list > Python... at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.eunice at gmail.com Tue May 14 00:18:21 2013 From: jonathan.eunice at gmail.com (Jonathan Eunice) Date: Mon, 13 May 2013 15:18:21 -0700 (PDT) Subject: [Python-ideas] shutil.trash() In-Reply-To: References: <51603D71.6070706@hardcoded.net> Message-ID: <821de581-0521-4407-8a05-3e2851bf8a60@googlegroups.com> Move-to-trash is a pretty common operation on Windows, Mac, and Linux (when configured for GUI use). Such user-facing OSs account for hundreds of millions or billions of systems. That move-to-trash seems an odd operation compared to `os` is probably more a reflection on how Unix-centric `os` is. `dup`, `dup2`, `chown`, and `execv`? They go back at least to 6th or 7th Edition Unix in the mid-1970s. The Python standard library should be able to express common operations, and updating it to accommodate more recent idioms than POSIX makes sense to me. `shutil` seems the natural home. But practically speaking, `send2trash` doesn't seem complete enough to promote. It seems to use different implementations for Python 2 and 3. I didn't see an empty-trash API. It would need clear semantics in the case that "the trashcan" is not available (e.g. Linux without desktop extensions loaded). And testing is key. It would have to be robustly tested across Python versions and implementations, as well as different OSs, OS versions, and user environment frameworks, under both correct and error conditions. If it were all these things, I can see a natural path into `shutil`. Until then, PyPI is the natural home.
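For reference, the FreeDesktop trash layout such a function would target on Linux can be sketched as follows. This is a hypothetical helper, not send2trash's actual API; real implementations must also handle name collisions, cross-device moves, percent-encoding of the recorded path, and per-volume trash directories:

```python
import os
import tempfile
from datetime import datetime

def move_to_trash(path, trash_dir):
    """Minimal sketch of the FreeDesktop trash layout: move `path` into
    trash_dir/files/ and record a matching .trashinfo entry."""
    files_dir = os.path.join(trash_dir, "files")
    info_dir = os.path.join(trash_dir, "info")
    os.makedirs(files_dir, exist_ok=True)
    os.makedirs(info_dir, exist_ok=True)
    name = os.path.basename(path)
    # Write the info record first, then move the file itself.
    with open(os.path.join(info_dir, name + ".trashinfo"), "w") as f:
        f.write("[Trash Info]\nPath=%s\nDeletionDate=%s\n"
                % (os.path.abspath(path),
                   datetime.now().strftime("%Y-%m-%dT%H:%M:%S")))
    os.rename(path, os.path.join(files_dir, name))

# Demo against a throwaway directory standing in for ~/.local/share/Trash:
tmp = tempfile.mkdtemp()
victim = os.path.join(tmp, "junk.txt")
with open(victim, "w") as f:
    f.write("bye")
move_to_trash(victim, os.path.join(tmp, "Trash"))
trashed = os.path.join(tmp, "Trash", "files", "junk.txt")
```

The hard parts called out above (missing trashcan, empty-trash API, cross-platform behavior) are exactly what this sketch leaves out.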
If the standard library could learn about the trash can, perhaps eventually it could learn about the clipboard! On Saturday, April 6, 2013 6:25:54 PM UTC-4, Gregory P. Smith wrote: > > Is it widely used? > > I think it sounds useful for someone but is the kind of thing that should > be fine as an extension module on PyPI for most people's needs. It seems > like the kind of functionality that would go along with a GUI library. > Other software is unlikely to care about an OSes concept of trash and > simply rm/del/unlink things. > > otherwise, yes, shutil is a reasonable place if it were to be added. > > > > On Sat, Apr 6, 2013 at 8:21 AM, Virgil Dupras > > wrote: > >> Hi all, >> >> A while ago, I've developed this library, send2trash ( >> https://bitbucket.org/hsoft/**send2trash), which can send files to trash on Mac OS X, Windows, and any platform >> that conforms to FreeDesktop. >> >> The current version uses ctypes, but earlier versions were straight C >> modules. >> >> I was wondering if you think this has a place in the stdlib, maybe as >> "shutil.trash()"? >> >> Virgil Dupras >> ______________________________**_________________ >> Python-ideas mailing list >> Python... at python.org >> http://mail.python.org/**mailman/listinfo/python-ideas >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue May 14 02:03:13 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 14 May 2013 12:03:13 +1200 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> Message-ID: <51917F41.3030608@canterbury.ac.nz> Martin Morrison wrote: > Def is not a constructor. It is an assignment statement. It's really some of both. It constructs a function object, and binds it to a name. 
These two operations are entwined, in a sense, because the name being bound to is used as an argument in the construction. We're after a generalisation of this entwined construction-and-assignment. (Astruction? Consignment?) I'm not entirely happy with the current proposal: def name = expr because it doesn't fully entwine. The expr can be a constructor, but doesn't have to be, and even when it is, the construction occurs separately from the assignment. Also, it looks like an ordinary assignment with 'def' stuck in front, which, as Guido points out, seems somewhat random. I'd like to propose something a bit different: def name as expr(arg, ...) which would expand to something like name = expr(arg, ..., __name__ = 'name', __module__ = 'module') For example, def Animal as Enum('cat dog platypus') This reads quite naturally: "define Animal as an Enum with these arguments." Another example based on my own use case: def width as overridable_property("The width of the widget.") (Yes, this is yet another, different use of the word "as", but I don't see anything wrong with that. Small words in English often don't mean much on their own and derive most of their meaning from their context.) -- Greg From ethan at stoneleaf.us Tue May 14 02:14:32 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 13 May 2013 17:14:32 -0700 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <51917F41.3030608@canterbury.ac.nz> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: <519181E8.4090508@stoneleaf.us> On 05/13/2013 05:03 PM, Greg Ewing wrote: > > I'd like to propose something a bit different: > > > def name as expr(arg, ...) 
> > which would expand to something like > > name = expr(arg, ..., __name__ = 'name', __module__ = 'module') > > For example, > > def Animal as Enum('cat dog platypus') > > This reads quite naturally: "define Animal as an > Enum with these arguments." > > Another example based on my own use case: > > def width as overridable_property("The width of the widget.") +1 for the idea > (Yes, this is yet another, different use of the word "as", > but I don't see anything wrong with that. Small words in > English often don't mean much on their own and derive most > of their meaning from their context.) and another +1 :) -- ~Ethan~ From alexander.belopolsky at gmail.com Tue May 14 03:30:07 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 13 May 2013 21:30:07 -0400 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <51917F41.3030608@canterbury.ac.nz> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: On Mon, May 13, 2013 at 8:03 PM, Greg Ewing wrote: > I'd like to propose something a bit different: > > > def name as expr(arg, ...) > Note that in the current uses of "as" the assignment target is on the right:

with expr as name:
import pkg.mod as name

I am not sure "def expr(...) as name" is any better than Greg's original proposal, though. -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Tue May 14 03:36:48 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 14 May 2013 11:36:48 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> Message-ID: <51919530.8050703@pearwood.info> On 14/05/13 03:23, Juancarlo Añez wrote: > On Mon, May 13, 2013 at 11:58 AM, Terry Jan Reedy wrote: >> I disagree that the above is unpythonic.
In Python, functions are objects >> like everything else. Define them, pass them to functions, like anything >> else. 'Unpythonic' is treating functions as special, other than that they >> are called (have a call method). > > > I beg to disagree. Functions are objects in python, but they get particular > treatment. > > You can do: > > def f(): > pass > x = f > > > But you can't do: > > x = def(): pass That has nothing to do with *function objects*, and everything to do with the *keyword* def being a statement. If you avoid "def", you can do this: x = lambda: None or even this: from types import FunctionType x = FunctionType(code, globals, name, argdefs, closure) Creating functions in this way is not exactly convenient, but it is possible. There is a sense in which functions (and classes, and modules) are "different", not because Python chooses to treat them differently, but because they are inherently different. They are complex, almost free-form compound objects, and there is no "nice" syntax for creating them inside an expression. The closest Python comes to is lambda for functions, and that is limited to a single expression. But regardless of the difficulty of creating a function object, once you have one, it is a first class object. Anything you can do with any other object, you can do with a function object. I'm with Terry on this: the code snippet you gave, where a function object is passed to another function, is a standard Python idiom and perfectly pythonic. It's not uncommon to have to create data before you use it, even when you could create it in-place where you use it. E.g. we might choose to write: data = [lots of items here] x = some_value() result = func(x, data) instead of: result = func(some_value(), [lots of items here]) to make the function call more readable, or to keep to some maximum line length, or in order to re-use some of the values. So even when we *can* embed values in a call, often we choose not to. 
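To make the `types.FunctionType` route above concrete, here is a small sketch:

```python
from types import FunctionType

def template(a, b):
    return a + b

# Same code object, rebound as a fresh function object under a new
# name -- no def statement involved in creating `clone`.
clone = FunctionType(template.__code__, globals(), "clone")

print(clone(2, 3))     # 5
print(clone.__name__)  # clone
```

As noted, this works but is hardly convenient compared to a def statement.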
The fact that you don't have a choice when it comes to functions is not a major problem. Even if you could write: result = func(def (): lots of code goes here) you probably shouldn't. -- Steven From steve at pearwood.info Tue May 14 03:42:31 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 14 May 2013 11:42:31 +1000 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> Message-ID: <51919687.9000607@pearwood.info> On 14/05/13 01:49, Jim Jewett wrote: > On Sun, May 12, 2013 at 9:33 PM, Raymond Hettinger > wrote: > >> On May 9, 2013, at 3:29 AM, Piotr Duda wrote: > >>> Animals = Enum('Animals', 'dog cat bird') >>> which violates DRY > >> This is a profound misreading of DRY which is all about not repeating >> big chunks of algorithmic logic. > > DRY, like most heuristics, is about making mistakes less likely. > > Mistakes are likely with huge chunks of repeated logic, because people > are inclined to fix things at only one location. > > Mistakes are likely with the above because it is conceptually only one > location, but syntactically two -- and doing something different in > the second location is a mistake that the compiler won't catch. "Likely"? I think not. If you (generic "you", not you personally) are such a careless coder that you are *likely* to mistype the name in a *single line* like `name = Enum("name", "items...")` then there is probably no help for you. Mistakes happen to the best of us, but they are *rare*. Besides, strictly speaking setting the Enum name different to the name being bound is not necessarily an error. We can, and frequently do, define functions with one name and then bind them to a different name, e.g. in decorators. Having to repeat the name is a wart, but it is a tiny wart, and surely not worth the amount of angst it apparently causes. 
> The problem with > >>> Animals = Enum('Animals', 'dog cat bird') > > is that you might accidentally type > >>> Animals = Enum('Animal', 'dog cat bird') > or >>> Anmals = Enum('Animals', 'dog cat bird') > > instead. In which case, either something will break, and you will fix the broken code, and then it will work, or nothing will break, in which case it almost certainly doesn't matter and you can pretend you did it on purpose. -- Steven From haoyi.sg at gmail.com Tue May 14 03:55:18 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 13 May 2013 21:55:18 -0400 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <51919530.8050703@pearwood.info> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> Message-ID: I do not think expression soup is particularly bad, it's just Javascript's implementation (as usual) that is pretty borked. In Scala, expression chaining lets you do some pretty nice things: memory.take(freePointer) .grouped(10) .map(_.fold("")(_+"\t"+_)) .reduce(_+"\n"+_) This code converts a flat array of ints into pretty 10-ints-per-row heap dumps, and is probably nicer than most things you would write using for loops and variables. In general, method chaining is the same as having an implicit "this" object that you are operating on without needing to specify it (since it gets returned by each method). Apart from saving syntax (less tokens to write) this also reduces the number of possible ways you can do things. 
I mean, if you write this:

memory.take(freePointer)
memory.grouped(10)
memory.map(_.map(_+""))
memory.map(_.reduce(_+"\t"+_))
memory.reduce(_+"\n"+_)

It's about the same; but how many times have you seen code like this:

memory.take(freePointer) // some comment
memory.grouped(10)
unrelatedthing.dostuff()
memory.map(_.map(_+""))
unrelatedthing.domorestuff()
/**
 * SOME BIG COMMENT
 * i am cow
 * hear me moo
 * i weigh twice as much as you
 * and i look good on the barbecue
 */
do_stuff_with_cows()
memory.map(_.reduce(_+"\t"+_))
memory.reduce(_+"\n"+_)

Which makes it a huge pain to figure out what is going on with memory? Method chaining *prevents* this sort of thing from happening in the first place, which is really nice. Even if I try to avoid this, I haven't seen any code base where this hasn't happened in various places, causing endless headaches. On Mon, May 13, 2013 at 9:36 PM, Steven D'Aprano wrote: > On 14/05/13 03:23, Juancarlo Añez wrote: > >> On Mon, May 13, 2013 at 11:58 AM, Terry Jan Reedy >> wrote: >> >> I disagree that the above is unpythonic. In Python, functions are objects >>> like everything else. Define them, pass them to functions, like anything >>> else. 'Unpythonic' is treating functions as special, other than that they >>> are called (have a call method). >>> >> >> >> I beg to disagree. Functions are objects in python, but they get >> particular >> treatment. >> >> You can do: >> >> def f(): >> pass >> x = f >> >> >> But you can't do: >> >> x = def(): pass >> > > > That has nothing to do with *function objects*, and everything to do with > the *keyword* def being a statement. If you avoid "def", you can do this: > > x = lambda: None > > > or even this: > > from types import FunctionType > x = FunctionType(code, globals, name, argdefs, closure) > > > Creating functions in this way is not exactly convenient, but it is > possible.
> > > There is a sense in which functions (and classes, and modules) are > "different", not because Python chooses to treat them differently, but > because they are inherently different. They are complex, almost free-form > compound objects, and there is no "nice" syntax for creating them inside an > expression. The closest Python comes to is lambda for functions, and that > is limited to a single expression. > > But regardless of the difficulty of creating a function object, once you > have one, it is a first class object. Anything you can do with any other > object, you can do with a function object. I'm with Terry on this: the code > snippet you gave, where a function object is passed to another function, is > a standard Python idiom and perfectly pythonic. > > It's not uncommon to have to create data before you use it, even when you > could create it in-place where you use it. E.g. we might choose to write: > > data = [lots of items here] > x = some_value() > result = func(x, data) > > > instead of: > > result = func(some_value(), [lots of items here]) > > > to make the function call more readable, or to keep to some maximum line > length, or in order to re-use some of the values. So even when we *can* > embed values in a call, often we choose not to. The fact that you don't > have a choice when it comes to functions is not a major problem. Even if > you could write: > > result = func(def (): lots of code goes here) > > > you probably shouldn't. > > > > -- > Steven > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Tue May 14 04:20:27 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 13 May 2013 19:20:27 -0700 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <51919687.9000607@pearwood.info> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51919687.9000607@pearwood.info> Message-ID: <51919F6B.5060409@stoneleaf.us> On 05/13/2013 06:42 PM, Steven D'Aprano wrote: > > In which case, either something will break, and you will fix the broken code, and then it will work, or nothing will > break, in which case it almost certainly doesn't matter and you can pretend you did it on purpose. +1 QOTW From stephen at xemacs.org Tue May 14 04:54:49 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 14 May 2013 11:54:49 +0900 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> Message-ID: <8738tqe0ee.fsf@uwakimon.sk.tsukuba.ac.jp> Haoyi Li writes: > Method chaining *prevents* [intermingling unrelated code] from > happening in the first place, which is really nice. Even if I try > to avoid this, I haven't seen any code base where this hasn't > happened in various places, causing endless headaches. There's nothing intrinsically wrong with your preference, but as far as I can see it's un-Pythonic. Specifically, as you mention method chaining is equivalent to carrying along an implicit "this" argument. But in Pythonic code, explicit is better than implicit. For that reason, methods with side effects generally have a useless return value (case in point: .sort()). In other words, making method chaining difficult is a deliberate design decision. You can disagree with that decision, but you *do* have a choice of languages, so Python doesn't *need* to accommodate your preference.
Also (and this is more or less specious, but I don't have enough brain cells today to decide whether it's less or more), many people have difficulty with the functional tools approach you present in lieu of iteration statements. Python does provide those tools, but they're considered power tools to be used only when the expense in maintainability by mere mortals is considered carefully, and decided to be the lesser cost. From haoyi.sg at gmail.com Tue May 14 05:11:03 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 13 May 2013 23:11:03 -0400 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: <8738tqe0ee.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> <8738tqe0ee.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I'm not arguing against "python is just so", I'm arguing against "function chaining is inherently ugly". If people say "we shouldn't do this in python, just because" or "we shouldn't do this in python because it goes against the convention", I'm fine with that. If people say "we shouldn't do this in python, because method chaining is inherently ugly" I will disagree. No need to chase me away from the python community and ask me to use a different language =D On Mon, May 13, 2013 at 10:54 PM, Stephen J. Turnbull wrote: > Haoyi Li writes: > > > Method chaining *prevents* [intermingling unrelated code] from > > happening in the first place, which is really nice. Even if I try > > to avoid this, I haven't seen any code base where this hasn't > > happened in various places, causing endless headaches. > > There's nothing intrinsically wrong with your preference, but as far as > I can see it's un-Pythonic. Specifically, as you mention method > chaining is equivalent to carrying along an implicit "this" argument. > But in Pythonic code, explicit is better than implicit.
For that > reason, methods with side effects generally have a useless return > value (case in point: .sort()). In other words, making method > chaining difficult is a deliberate design decision. You can disagree > with that decision, but you *do* have a choice of languages, so Python > doesn't *need* to accommodate your preference. > > Also (and this is more or less specious, but I don't have enough brain > cells today to decide whether it's less or more), many people have > difficulty with the functional tools approach you present in lieu of > iteration statements. Python does provide those tools, but they're > considered power tools to be used only when the expense in > maintainability by mere mortals is considered carefully, and decided > to be the lesser cost. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at bendarnell.com Tue May 14 05:20:57 2013 From: ben at bendarnell.com (Ben Darnell) Date: Mon, 13 May 2013 23:20:57 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <518D5092.6010605@mrabarnett.plus.com> Message-ID: On Fri, May 10, 2013 at 4:40 PM, Ezio Melotti wrote: > On Fri, May 10, 2013 at 10:54 PM, MRAB wrote: > I also think that forgetting a comma in a list of function args > between two string literal args is quite uncommon, whereas forgetting > it in a sequence of strings (list, set, dict, tuple) is much more > common, so this approach should cover most of the cases. This is my experience as well. When I've run into problems by forgetting a comma it's nearly always been in a list, not in function arguments. (and it's never been between two items on the same line, so the proposal in one of the subthreads here to disallow implicit concatenation only between two strings on the same line wouldn't help much). The problem is that in other languages, a trailing comma is forbidden, while in python it is optional.
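A minimal illustration of the failure mode being discussed:

```python
colors = [
    "red",
    "green"  # missing comma: "green" and "blue" concatenate silently
    "blue",
]
print(colors)  # ['red', 'greenblue'] -- two elements, no error raised
```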
This means that lists like [ 1, 2, 3, ] may or may not have a comma after the third element. The comma is there often enough that you can fall out of the habit of checking for it when you extend the list. The most pythonic solution is therefore to follow the example of the single-element tuple and make the trailing comma mandatory ;) -Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue May 14 05:49:44 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 May 2013 13:49:44 +1000 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <51917F41.3030608@canterbury.ac.nz> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: On Tue, May 14, 2013 at 10:03 AM, Greg Ewing wrote: > Martin Morrison wrote: >> >> Def is not a constructor. It is an assignment statement. > > > It's really some of both. It constructs a function > object, and binds it to a name. These two operations > are entwined, in a sense, because the name being > bound to is used as an argument in the construction. > > We're after a generalisation of this entwined > construction-and-assignment. (Astruction? Consignment?) > I'm not entirely happy with the current proposal: > > def name = expr > > because it doesn't fully entwine. The expr can be a > constructor, but doesn't have to be, and even when it > is, the construction occurs separately from the > assignment. Also, it looks like an ordinary assignment > with 'def' stuck in front, which, as Guido points > out, seems somewhat random. It doesn't seem random to me, and it ties into some conceptual issues I've been pondering for a long time. 
I see this as potentially related to PEP 395 and the way __main__ and pseudo-modules are prone to breaking pickle and exposing implementation details, as well as to the fact we added @functools.wraps to preserve the original location details of decorated functions. It also ties into Guido's rationale for wanting to preserve the way that namedtuple (and now enum) special case simple module level assignments to help ensure that pickling works as expected (see http://bugs.python.org/issue17947#msg189160) What Piotr's proposal crystalised for me is the idea that we really have two different kinds of name binding in Python. I'm going to call them "incidental binding" and "definitive binding". Incidental binding is the way we normally think about variables in Python - it's just a local name that is used to reference an object. Objects know nothing about their incidental bindings, and (aside from function parameters and keyword arguments) the specific name used is typically of no interest outside the scope where the binding happens. For loops, with statements, except clauses, assignment statements - these all create incidental bindings, where the target name and the location of the binding is of no interest to the object being bound. By contrast, definitive bindings are exactly those where the object being bound *cares* about the name and where it happens. The most obvious case where the definitive binding matters is pickle compatibility, because the definitive name is what gets used to retrieve the appropriate code when unpickling. However, it also affects introspection, as the existence of @functools.wraps shows. def statements and class statements are definitive bindings - through the constructor, they set __name__, __qualname__ and __module__ on the bound object based on the location of the statement itself. 
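The frame-introspection special-casing mentioned above looks roughly like this. This is a simplified sketch, not namedtuple's actual code: `make_class` is an invented name, and `sys._getframe` is a CPython implementation detail:

```python
import sys

def make_class(name):
    """Peek at the caller's globals to record the module the new class
    is (presumably) being bound in, so that pickle can later locate it
    by module and name."""
    module = sys._getframe(1).f_globals.get("__name__", "__main__")
    cls = type(name, (), {})
    cls.__module__ = module
    return cls

Point = make_class("Point")
print(Point.__module__)  # the module that called make_class
```

The heuristic only works when the result really is bound at module level under the same name, which is why namedtuple also requires the name to be passed in explicitly.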
@functools.wraps works because it converts a normally definitive binding (the declaration of a wrapper function) into an incidental binding by copying the definitive binding details from the function being wrapped.

What the frame introspection hack in namedtuple and enum achieves is to allow an ordinary module level assignment to be treated as definitive - these APIs require that the name be passed in explicitly, but if the module is omitted they can look it up in the globals of the calling frame (Guido proposes that we provide a nicer API for getting that dynamic context information, and I'm now persuaded that he's right, but that's independent of the underlying conceptual issue).

The reason I'm a big fan of Piotr's idea in general, and the "def NAME = EXPR" syntax in particular, is that I think it takes this currently implicit, nebulous concept and elevates it to the status it deserves: giving an object an incidental label for convenient local reference and defining the *canonical* name for that object are different operations, and it is reasonable to provide an explicit syntax for the latter that is distinct-from-but-related-to the syntax for the former.

Specifically, as Piotr suggested, I would like to see "def NAME = EXPR" become syntactic sugar for something like:

    # We start with an ordinary incidental binding
    NAME = EXPR
    _ref = NAME
    if hasattr(_ref, "__defname__"):
        # We elevate this to a definitive binding by passing
        # __name__, __qualname__ and __module__ to the bound object
        _ref.__defname__("NAME", QUALNAME_PREFIX + ".NAME", __name__)

The beauty of this syntax is that it means if we define __defname__ appropriately on function objects and on type, then we can cleanly separate the "object creation" step from the "name binding" step, allowing functional APIs to exist on an equal playing field with the dedicated syntax.
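The namedtuple status quo described here is easy to demonstrate; note the canonical name spelled twice, which is exactly the redundancy the proposal targets:

```python
import collections

# "Point" appears twice: once as the incidental assignment target and
# once as the definitive name passed to the factory, which stores it on
# the class so that repr() and pickling behave as expected.
Point = collections.namedtuple("Point", ["x", "y"])

print(Point.__name__)   # "Point"
print(Point(1, 2))      # Point(x=1, y=2)
```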
Such a change would also help explain why we *don't* allow arbitrary assignment targets in a def statement - as a definitive name binding, there are additional constraints that don't apply to an incidental binding. > This reads quite naturally: "define Animal as an > Enum with these arguments." Whereas I prefer the idea that the RHS is created as an anonymous object, and then bound to a canonical name. > Another example based on my own use case: > > def width as overridable_property("The width of the widget.") > > (Yes, this is yet another, different use of the word "as", > but I don't see anything wrong with that. Small words in > English often don't mean much on their own and derive most > of their meaning from their context.) Strong -1. We've had this discussion before, and any use of "as" should be to bind to a name on the RHS and the value bound should *not* be the expression on the LHS, but some related object (an imported module, a caught exception, the result of an __enter__ method). If the expression is being bound directly, then the name should appear on the LHS and use "=" as the symbol. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From bruce at leapyear.org Tue May 14 06:40:36 2013 From: bruce at leapyear.org (Bruce Leban) Date: Mon, 13 May 2013 21:40:36 -0700 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: On Mon, May 13, 2013 at 8:49 PM, Nick Coghlan wrote: > What Piotr's proposal crystalised for me is the idea that we really > have two different kinds of name binding in Python. I'm going to call > them "incidental binding" and "definitive binding". > ... > Nice and clear explanation. > if hasattr(_ref, "__defname__"): > ... 
> > The beauty of this syntax is that it means if we define __defname__ > appropriately on function objects and on type, then ... > Why make __defname__ optional? If the author explicitly sticks a def in front of an assignment when they shouldn't, I think that should be an error. Do you really want: def a = 3 to be allowed? --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue May 14 08:32:00 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 May 2013 16:32:00 +1000 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: On Tue, May 14, 2013 at 2:40 PM, Bruce Leban wrote: > > On Mon, May 13, 2013 at 8:49 PM, Nick Coghlan wrote: >> >> What Piotr's proposal crystalised for me is the idea that we really >> have two different kinds of name binding in Python. I'm going to call >> them "incidental binding" and "definitive binding". >> ... > > > Nice and clear explanation. > >> >> if hasattr(_ref, "__defname__"): >> ... > > >> >> The beauty of this syntax is that it means if we define __defname__ >> appropriately on function objects and on type, then ... > > > Why make __defname__ optional? If the author explicitly sticks a def in > front of an assignment when they shouldn't, I think that should be an error. > Do you really want: > > def a = 3 > > > to be allowed? Good point. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue May 14 08:36:27 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 May 2013 16:36:27 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: Message-ID: On Tue, May 14, 2013 at 7:35 AM, Juancarlo A?ez wrote: > Just a reminder of why I started this discussion. > > I couldn't find a way to have: > > with context(): > do_this() > do_that() > > > in which the embedded block would be executed multiple times, through yield. > My fault! Lack of understanding about the context manager protocol. > > It is not possible to yield to the block multiple times because an exception > within the block will exit the context no matter what __exit__() does. Have you considered an iterator that produces context managers rather than the other way around? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg.ewing at canterbury.ac.nz Tue May 14 08:37:03 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 14 May 2013 18:37:03 +1200 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: <5191DB8F.5060603@canterbury.ac.nz> Nick Coghlan wrote: > We've had this discussion before, and any use of "as" > should be to bind to a name on the RHS I don't remember any consensus being reached on this. My opinion is that imposing any such restriction on the use of "as" would be a foolish consistency that rules out a lot of natural-sounding constructs. 
-- Greg From greg.ewing at canterbury.ac.nz Tue May 14 08:43:47 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 14 May 2013 18:43:47 +1200 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: <5191DD23.1080204@canterbury.ac.nz> Bruce Leban wrote: > Do you really want: > > def a = 3 > > to be allowed? But it's not just any old 3, it's the one defined in the fuzzbizz module! And it should print out as "fuzzbizz.3" to make that clear! -- Greg From ncoghlan at gmail.com Tue May 14 08:44:53 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 May 2013 16:44:53 +1000 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <5191DB8F.5060603@canterbury.ac.nz> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> <5191DB8F.5060603@canterbury.ac.nz> Message-ID: On Tue, May 14, 2013 at 4:37 PM, Greg Ewing wrote: > Nick Coghlan wrote: >> >> We've had this discussion before, and any use of "as" >> should be to bind to a name on the RHS > > > I don't remember any consensus being reached on this. > My opinion is that imposing any such restriction on > the use of "as" would be a foolish consistency that > rules out a lot of natural-sounding constructs. As far as I recall, it wasn't consensus, it was Guido saying "No!" to something I suggested and then me agreeing that his objections made sense :) (I forget what I was proposing at the time, though, I just remember it involved a "NAME as EXPR" clause and Guido really didn't like it for the reasons I gave) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Tue May 14 10:51:31 2013 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Tue, 14 May 2013 17:51:31 +0900 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <5191DB8F.5060603@canterbury.ac.nz> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> <5191DB8F.5060603@canterbury.ac.nz> Message-ID: <871u9adjvw.fsf@uwakimon.sk.tsukuba.ac.jp> Greg Ewing writes: > My opinion is that imposing any such restriction on > the use of "as" would be a foolish consistency that > rules out a lot of natural-sounding constructs. Natural language is poorly fitted to be a programming language precisely because everything is possible. Not all natural constructs need to be anointed as Python syntax. It's especially important that constructs' semantics are indicated by their syntax. I suspect that use of both "... NAME as EXPR" and "... EXPR as NAME" would come at a readability cost. We should also remember that there are lots of Python programmers to whom none of the syntax that is natural-sounding to the English- trained ear is particularly mnemonic. The consistent application of a few regular rules of formation and failure to adhere to idiomatic variants is one important reason you can typically distinguish native from non-native writing at a glance. I suspect that catering to this preference for consistency with existing simple rules will make it easier for anybody (regardless of mother tongue) to become fluent in Python. Regardless the decision about use of "NAME as EXPR" syntax, I'm +1 on Nick's explanation that the "def" keyword indicates definitive binding of a name occurs as well as incidental (formal) binding, while its absence means that incidental binding only occurs. For that reason I think the same "=" operator should be used to signify the incidental binding. 
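The __defname__ hook discussed in this thread was proposed but never implemented; a rough sketch of how the hand-desugared form could interact with an ordinary object (class name, attribute choices, and the hook itself are all illustrative assumptions, not real Python semantics):

```python
class Widget:
    """Toy object opting in to the hypothetical definitive-binding hook."""

    def __defname__(self, name, qualname, module):
        # What the proposed "def NAME = EXPR" sugar would call after the
        # ordinary (incidental) assignment has already happened.
        self.__name__ = name
        self.__qualname__ = qualname
        self.__module__ = module

# Hand-desugared form of the hypothetical "def sidebar = Widget()":
sidebar = Widget()          # incidental binding, as with any assignment
_ref = sidebar
if hasattr(_ref, "__defname__"):
    _ref.__defname__("sidebar", "sidebar", __name__)

print(sidebar.__name__)  # "sidebar"
```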
From stefan at drees.name Tue May 14 11:30:39 2013 From: stefan at drees.name (Stefan Drees) Date: Tue, 14 May 2013 11:30:39 +0200 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <871u9adjvw.fsf@uwakimon.sk.tsukuba.ac.jp> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> <5191DB8F.5060603@canterbury.ac.nz> <871u9adjvw.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <5192043F.1080904@drees.name> Stephen J. Turnbull writes: > Greg Ewing writes: > > > My opinion is that imposing any such restriction on > > the use of "as" would be a foolish consistency that > > rules out a lot of natural-sounding constructs. > > Natural language is poorly fitted to be a programming language > precisely because everything is possible. Not all natural constructs > need to be anointed as Python syntax. It's especially important that > constructs' semantics are indicated by their syntax. I suspect that > use of both "... NAME as EXPR" and "... EXPR as NAME" would come at a > readability cost. > > We should also remember that there are lots of Python programmers to > whom none of the syntax that is natural-sounding to the English- > trained ear is particularly mnemonic. The consistent application of a > few regular rules of formation and failure to adhere to idiomatic > variants is one important reason you can typically distinguish native > from non-native writing at a glance. I suspect that catering to this > preference for consistency with existing simple rules will make it > easier for anybody (regardless of mother tongue) to become fluent in > Python. ... Stepping in other peoples shoes and looking through their glasses in my experience does not always produce meaningful perceptions, esp. when core concepts of life - like "nativeness" of language - are involved. 
There may be no excuse for a programmer to not learn the world language English, but preferring simple consistently applied rules will presumably enhance every language in creative use. The message is rarely inside a word or a simple phrase, isn't it?

All the best-native-ltr-but-non-native-English-greetings,
Stefan

From markus at unterwaditzer.net Tue May 14 12:05:56 2013
From: markus at unterwaditzer.net (Markus Unterwaditzer)
Date: Tue, 14 May 2013 12:05:56 +0200
Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects
In-Reply-To: <42bdf52a-1a26-4d8e-93ce-353551b0b091@email.android.com>
References: <42bdf52a-1a26-4d8e-93ce-353551b0b091@email.android.com>
Message-ID: <8ac43bf9-e1b5-44e7-a2c2-78e1bae2f37e@email.android.com>

(For some reason my mail software decided not to cc the list)

-------- Original Message --------
From: Markus Unterwaditzer
Sent: Tue May 14 07:31:33 CEST 2013
To: Greg Ewing
Subject: Re: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects

Greg Ewing wrote:
>I'm not entirely happy with the current proposal:
>
> def name = expr
>
>because it doesn't fully entwine. The expr can be a
>constructor, but doesn't have to be, and even when it
>is, the construction occurs separately from the
>assignment. Also, it looks like an ordinary assignment
>with 'def' stuck in front, which, as Guido points
>out, seems somewhat random.

I don't agree with that, I think it's good the proposed syntax looks similar to an assignment, as the feature is clearly related to assignments.

>I'd like to propose something a bit different:
>
> def name as expr(arg, ...)
>
>which would expand to something like
>
> name = expr(arg, ..., __name__ = 'name', __module__ = 'module')

To me that implies that the __init__ method of a class has to implement explicit support for this feature, and that there's no way to make a standard implementation for object.
I like the proposed __def__ method much more.

-- Markus

From apalala at gmail.com Tue May 14 12:41:26 2013
From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=)
Date: Tue, 14 May 2013 06:11:26 -0430
Subject: [Python-ideas] Anonymous blocks (again):
In-Reply-To:
References:
Message-ID:

Nick,

On Tue, May 14, 2013 at 2:06 AM, Nick Coghlan wrote:
> Have you considered an iterator that produces context managers rather
> than the other way around?

What an interesting idea!

    for c in self.closure():
        with c do:
            match_this()
            match_that()

I don't know yet if it's doable, but it certainly looks good. I'll try. Thanks!

Cheers,

-- Juancarlo *Añez*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From apalala at gmail.com Tue May 14 13:10:04 2013
From: apalala at gmail.com (=?UTF-8?Q?Juancarlo_A=C3=B1ez?=)
Date: Tue, 14 May 2013 06:40:04 -0430
Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To:
References: <518D5092.6010605@mrabarnett.plus.com>
Message-ID:

On Mon, May 13, 2013 at 10:50 PM, Ben Darnell wrote:
> [
>     1,
>     2,
>     3,
> ]

Ouch!

-- Juancarlo *Añez*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mrts.pydev at gmail.com Tue May 14 14:29:05 2013
From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=)
Date: Tue, 14 May 2013 15:29:05 +0300
Subject: [Python-ideas] Macros for Python
In-Reply-To:
References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com>
Message-ID:

I gladly stand corrected :)! Cheers to both of you,
MS

On Mon, May 13, 2013 at 9:11 PM, Haoyi Li wrote:
> Sure =) Karnickel was one of the things that made us think "yeah, this
> should be doable". I must confess I never made it to that readme!
> > > On Mon, May 13, 2013 at 2:03 PM, Georg Brandl wrote: > >> Am 13.05.2013 19:17, schrieb Haoyi Li: >> > We were aware of Karnickel before we started, along with MetaPython >> > (https://code.google.com/p/metapython/) and Pyxl ( >> https://github.com/dropbox/pyxl) >> > >> > Apart from being abandoned, neither of the first two really >> demonstrates any >> > usability (although Pyxl is used quite heavily), which is why we went >> ahead with >> > MacroPy. >> >> Sure, I never intended it to be usable :) I was just responding to the >> claim >> that nobody did macros-with-import-hooks before. >> >> Georg >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsbueno at python.org.br Tue May 14 14:45:28 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Tue, 14 May 2013 09:45:28 -0300 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> Message-ID: On 13 May 2013 22:55, Haoyi Li wrote: > I do not think expression soup is particularly bad, it's just Javascript's > implementation (as usual) that is pretty borked. In Scala, expression > chaining lets you do some pretty nice things: > > > > memory.take(freePointer) > > .grouped(10) > > .map(_.fold("")(_+"\t"+_)) > > .reduce(_+"\n"+_) > I hope you know that if you enjoy this style, Python _is_ for you, and I consider it part of the "multiparadigm" language. You just have to design your methods to always return "self" - or make a class decorator to do so. 
But in case you are using other people's classes, an adaptor for methods that would return "None" is easy to achieve.

I made this example a couple months ago to get it working:

    class Chain:
        def __init__(self, obj, root=None):
            self.__obj = obj

        def __getattr__(self, attr):
            val = getattr(self.__obj, attr)
            if callable(val):
                self.__callable = val
                return self
            return val

        def __call__(self, *args, **kw):
            val = self.__callable(*args, **kw)
            if val is None:
                return self
            return val

Which allows, for example:

>>> a = []
>>> Chain(a).append(5).append(6).append(-1).sort().append(3)
<__main__.Chain object at 0x12b6f50>
>>> a
[-1, 5, 6, 3]

And would work in your example as well, should you have a class with the desired methods.

js
-><-

From tjreedy at udel.edu Tue May 14 17:22:49 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Tue, 14 May 2013 11:22:49 -0400
Subject: [Python-ideas] Anonymous blocks (again):
In-Reply-To:
References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info>
Message-ID:

On 5/14/2013 8:45 AM, Joao S. O. Bueno wrote:
> On 13 May 2013 22:55, Haoyi Li wrote:
>> I do not think expression soup is particularly bad, it's just Javascript's
>> implementation (as usual) that is pretty borked. In Scala, expression
>> chaining lets you do some pretty nice things:
>>
>> memory.take(freePointer)
>>   .grouped(10)
>>   .map(_.fold("")(_+"\t"+_))
>>   .reduce(_+"\n"+_)
>
> I hope you know that if you enjoy this style,
> Python _is_ for you, and I consider it part of the
> "multiparadigm" language.
>
> You just have to design your methods to always return
> "self" - or make a class decorator to do so.
>
> But in case you are using other people's classes
> an adaptor for methosds that woudl return "None"
> is easy to achieve.
> > I made this example a couple months ago to get it working: > > class Chain: > def __init__(self, obj, root=None): > self.__obj = obj > def __getattr__(self, attr): > val = getattr(self.__obj, attr) > if callable(val): > self.__callable = val > return self > return val > def __call__(self, *args, **kw): > val = self.__callable(*args, **kw) > if val is None: > return self > return val None val should not always be converted to self. Consider [None].pop(), where None is the real return, not a placeholder for no return. > Which allows, for example: > >>>> a = [] >>>> Chain(a).append(5).append(6).append(-1).sort().append(3) Which either raises or does the wrong thing for list.pop > <__main__.Chain object at 0x12b6f50> >>>> a > [-1, 5, 6, 3] > > And would work in your example as well, should you have a class with > the desired methods. From greg at krypto.org Tue May 14 18:36:54 2013 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 May 2013 09:36:54 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518EC3E2.3030906@gmail.com> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518DD310.10100@fastmail.us> <884FA241-F99F-47A9-AF15-D9CC7770FC92@yahoo.com> <518DD9AD.8030403@canterbury.ac.nz> <518EC3E2.3030906@gmail.com> Message-ID: On Sat, May 11, 2013 at 3:19 PM, Ron Adam wrote: > > Greg, I meant to send my reply earlier to the list. > > > > On 05/11/2013 12:39 AM, Greg Ewing wrote: > >> Also, doesn't this imply that ... is now an operator in some contexts, >>> >> > but a literal in others? >> > > Could it's use as a literal be depreciated? I haven't seen it used in > that except in examples. > > > > It would have different meanings in different contexts, yes. 
>> >> But I wouldn't think of it as an operator, more as a token >> indicating string continuation, in the same way that the >> backslash indicates line continuation. >> > > Yep, it would be a token that the tokenizer would handle. So it would be > handled before anything else just as the line continuation '\' is. After > the file is tokenized, it is removed and won't interfere with anything else. > > It could be limited to strings, or expanded to include numbers and > possibly other literals. > > a = "a long text line "... > "that is continued "... > "on several lines." > > pi = 3.1415926535... > 8979323846... > 2643383279 > > You can't do this with a line continuation '\'. > > > Another option would be to have dedented multi-line string tokens |""" and > |'''. Not too different than r""" or b""". > > s = |"""Multi line string > | > |paragraph 1 > | > |paragraph 2 > |""" > > a = |"""\ > |a long text line \ > |that is continued \ > |on several lines.\ > |""" > > The rule for this is, for strings that start with |""" or |''', each > following line needs to be proceeded with whitespace + '|', until the > closing quote is reached. The tokenizer would just find and remove them as > it comes across them. Any '|' on a line after the first '|' would be > unaffected, so they don't need to be escaped. > > +1 to adding something like that. i loathe code that uses textwrap.dedent on constants. poor memory and runtime overhead. I was just writing up a response to suggest adding auto-detended multi-line strings to take care of one of the major use cases. I went with a naive d""" approach but I also like your | idea here. though it might cause too many people to want to line up the opening | and the following |s (which isn't necessary at all and is actively harmful for code style if it forces tedious reindentation when refactoring code that alters the length of the lhs before the opening |""") -gps > IT's a very explicit syntax. 
It's very obvious what is part of the string > and what isn't. Something like this would end the endless debate on > dedents. That alone might be worth it. ;-) > > I know the | is also a binary 'or' operator, but it's use for that is in a > different contex, so I don't think it would be a problem. > > Both of these options would be implemented in the tokenizer and are really > just tools to formatting source code rather than actual additions or > changes to the language. > > Cheers, > Ron > > > > > > > > > > > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Tue May 14 18:42:37 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 14 May 2013 09:42:37 -0700 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> Message-ID: On May 13, 2013, at 18:55, Haoyi Li wrote: > I do not think expression soup is particularly bad, it's just Javascript's implementation (as usual) that is pretty borked. In Scala, expression chaining lets you do some pretty nice things: > > > > memory.take(freePointer) > > > .grouped(10) > > > .map(_.fold("")(_+"\t"+_)) > > > .reduce(_+"\n"+_) In Python, all of these are non-mutating functions that return a new value. And you _can_ chain them together. Exactly the same way you would in Lisp, ML, or Haskell, or even JavaScript. What JavaScript and other "fluent" languages add is a way to chain together _mutating_ methods. This hides a fundamental distinction between map and append, or sorted and sort. And that part is the problem. The mutability distinction is closely tied to the expression/assignment distinction. 
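The sorted/sort distinction mentioned above, in concrete form:

```python
data = [3, 1, 2]

# sorted() is an expression: it returns a new list, so it composes and
# chains naturally without touching its argument.
smallest = sorted(data)[0]

# list.sort() is a mutator: it changes the list in place and returns
# None, so it deliberately cannot be chained.
result = data.sort()

print(smallest)  # 1
print(result)    # None
print(data)      # [1, 2, 3]
```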
Notice that modern fluent languages also try to make _everything_ an expression, even for loops. And traditional (lispy) functional languages that don't have statements build the equivalent of the expression/statement distinction on top of mutability (e.g., set special forms). You could point out that Scala is more readable than the Python equivalent, something like this: reduce(lambda ..., map(reduce(... grouped(10, ... And yes, that's a mess. But that's a separate issue. It's not because map doesn't return anything, it's because map isn't a method of list. If your point about extra code creeping into the middle is that chaining methods, as opposed to chaining functions calls, discourages unnecessarily turning expressions into statements... That's an interesting point. But a separate argument from fluency, None-returning mutators, and statements that aren't expressions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Tue May 14 18:57:03 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 14 May 2013 09:57:03 -0700 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> Message-ID: <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> On May 14, 2013, at 8:22, Terry Jan Reedy wrote: > On 5/14/2013 8:45 AM, Joao S. O. Bueno wrote: >> On 13 May 2013 22:55, Haoyi Li wrote: >>> I do not think expression soup is particularly bad, it's just Javascript's >>> implementation (as usual) that is pretty borked. In Scala, expression >>> chaining lets you do some pretty nice things: >>> >>> >>> >>> memory.take(freePointer) >>> >>> .grouped(10) >>> >>> .map(_.fold("")(_+"\t"+_)) >>> >>> .reduce(_+"\n"+_) >> >> >> I hope you know that if you enjoy this style, >> Python _is_ for you, and I consider it part of the >> "multiparadigm" language. 
>> >> You just have to design your methods to always return >> "self" - or make a class decorator to do so. >> >> But in case you are using other people's classes >> an adaptor for methosds that woudl return "None" >> is easy to achieve. >> >> I made this example a couple months ago to get it working: >> >> class Chain: >> def __init__(self, obj, root=None): >> self.__obj = obj >> def __getattr__(self, attr): >> val = getattr(self.__obj, attr) >> if callable(val): >> self.__callable = val >> return self >> return val >> def __call__(self, *args, **kw): >> val = self.__callable(*args, **kw) >> if val is None: >> return self >> return val > > None val should not always be converted to self. Consider [None].pop(), where None is the real return, not a placeholder for no return. I think this gets to a key issue. Chain is relying on the fact that all methods return something useful, or None. But None is something useful. This is a limitation that a language like Haskell doesn't have. You could have a "list of A" type whose methods all return "maybe A", meaning they return "Just something useful" or Nothing. In that case, Nothing is not something useful (but Just Nothing is). The benefit that comes along with this limitation is duck typing. We don't need a monad, type constructors, type deconstruction matching, and function-lifting functions because duck typing gets us 80% of the benefit for no effort. The downside is that we don't get that extra 20%. You don't have to build a lifting function when the need is implicit, but you _can't_ build a lifting function when you explicitly want it. I think that's a good tradeoff, but it is still a tradeoff. And Python's consistency is part of what makes it a good tradeoff. JavaScript has even more flexibility in this area, so in theory it should be even more powerful. But in practice, it's not. 
And that's because it doesn't have consistent rules that define the boundaries--instead of "mutators don't return", it's "some mutators return this, others return the argument, others don't return". So, where Python has problems with the other-20% side of its tradeoff, JavaScript has the same problems with the whole 100%, so it gets a lot less benefit out of the dynamic typing tradeoff. >> Which allows, for example: >> >>>>> a = [] >>>>> Chain(a).append(5).append(6).append(-1).sort().append(3) > > Which either raises or does the wrong thing for list.pop > >> <__main__.Chain object at 0x12b6f50> >>>>> a >> [-1, 5, 6, 3] >> >> And would work in your example as well, should you have a class with >> the desired methods. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From dickinsm at gmail.com Tue May 14 19:24:39 2013 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 14 May 2013 18:24:39 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518E7EB2.30004@egenix.com> References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> Message-ID: On Sat, May 11, 2013 at 6:24 PM, M.-A. Lemburg wrote: > On 11.05.2013 19:05, Christian Tismer wrote: > > I think a simple stripping of white-space in > > > > text = s""" > > leftmost column > > two-char indent > > """ > > > > would solve 95 % of common indentation and concatenation cases. > > > > This is not a good solution for long lines where you don't want to > have embedded line endings. 
> Taken from existing code:
>
> _litmonth = ('(?P<litmonth>'
>              'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|'
>              'mär|mae|mrz|mai|okt|dez|'
>              'fev|avr|juin|juil|aou|aoû|déc|'
>              'ene|abr|ago|dic|'
>              'out'
>              ')[a-z,\.;]*')
>
> or
>
>     raise errors.DataError(
>         'Inconsistent revenue item currency: '
>         'transaction=%r; transaction_position=%r' %
>         (transaction, transaction_position))

Agreed. I use the implicit concatenation a lot for exception messages like the one above; we also tend to keep line length to 80 characters *and* use nice verbose exception messages. I could live with adding the extra '+' characters and parentheses, but I think it would be a net loss of readability.

The _litmonth example looks like a candidate for re.VERBOSE and a triple-quoted string, though.

Mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From haoyi.sg at gmail.com Tue May 14 19:34:39 2013
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Tue, 14 May 2013 13:34:39 -0400
Subject: [Python-ideas] Anonymous blocks (again):
In-Reply-To: <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com>
References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com>
Message-ID:

Andrew, I agree with 100% of what you said; you put it much more clearly than I could.

#soapbox

My point in stirring up all this is that we should not confuse "empirically bad" with "unpythonic". It is fine to reject something just because it's unpythonic, and does not fit into the rest of python's ecosystem, and it's fine to reject something because it is empirically bad. However, it is a mistake to say something (e.g. method chaining) is empirically bad *because* it is unpythonic. I say this because I think it does happen, whether intentionally or subconsciously.
It's easy to blindly chant mantras like "explicit is better than implicit", but that blinds us to the deep and insightful trade-offs in these decisions, one of which is, of course, consistency with the rest of the ecosystem ("pythonicity"). #end soapbox Sorry for hijacking your thread! I think a better implementation for anonymous blocks (however it turns out) would be a wonderful thing; I've also abused decorators to do a lot of these things @sys.meta_path.append @singleton class ImportFinder(object): ... I also think being able to pass multiple anonymous blocks into a function could also greatly reduce unnecessary uses of inheritance; I'm sure everyone has encountered situations where you are inheriting from a class, not because you actually *want* to create objects, but simply because you want to override one or more of the methods that the class has, such that when you call imp.do_stuff(), it will use the overriding methods. In this way, inheritance is often used as a poor substitute for passing in multiple blocks into some function; it's a poor substitute because it is far more powerful than necessary, adding tons of syntactic boilerplate and greatly increasing the number of things that can go wrong, when all you want is a pure function into which you can pass more than one block to customize its behavior. On Tue, May 14, 2013 at 12:57 PM, Andrew Barnert wrote: > On May 14, 2013, at 8:22, Terry Jan Reedy wrote: > > > On 5/14/2013 8:45 AM, Joao S. O. Bueno wrote: > >> On 13 May 2013 22:55, Haoyi Li wrote: > >>> I do not think expression soup is particularly bad, it's just > Javascript's > >>> implementation (as usual) that is pretty borked.
In Scala, expression > >>> chaining lets you do some pretty nice things: > >>> > >>> > >>> > >>> memory.take(freePointer) > >>> > >>> .grouped(10) > >>> > >>> .map(_.fold("")(_+"\t"+_)) > >>> > >>> .reduce(_+"\n"+_) > >> > >> > >> I hope you know that if you enjoy this style, > >> Python _is_ for you, and I consider it part of the > >> "multiparadigm" language. > >> > >> You just have to design your methods to always return > >> "self" - or make a class decorator to do so. > >> > >> But in case you are using other people's classes > >> an adaptor for methosds that woudl return "None" > >> is easy to achieve. > >> > >> I made this example a couple months ago to get it working: > >> > >> class Chain: > >> def __init__(self, obj, root=None): > >> self.__obj = obj > >> def __getattr__(self, attr): > >> val = getattr(self.__obj, attr) > >> if callable(val): > >> self.__callable = val > >> return self > >> return val > >> def __call__(self, *args, **kw): > >> val = self.__callable(*args, **kw) > >> if val is None: > >> return self > >> return val > > > > None val should not always be converted to self. Consider [None].pop(), > where None is the real return, not a placeholder for no return. > > I think this gets to a key issue. > > Chain is relying on the fact that all methods return something useful, or > None. But None is something useful. > > This is a limitation that a language like Haskell doesn't have. You could > have a "list of A" type whose methods all return "maybe A", meaning they > return "Just something useful" or Nothing. In that case, Nothing is not > something useful (but Just Nothing is). > > The benefit that comes along with this limitation is duck typing. We don't > need a monad, type constructors, type deconstruction matching, and > function-lifting functions because duck typing gets us 80% of the benefit > for no effort. The downside is that we don't get that extra 20%. 
You don't > have to build a lifting function when the need is implicit, but you _can't_ > build a lifting function when you explicitly want it. > > I think that's a good tradeoff, but it is still a tradeoff. > > And Python's consistency is part of what makes it a good tradeoff. > JavaScript has even more flexibility in this area, so in theory it should > be even more powerful. But in practice, it's not. And that's because it > doesn't have consistent rules that define the boundaries--instead of > "mutators don't return", it's "some mutators return this, others return the > argument, others don't return". So, where Python has problems with the > other-20% side of its tradeoff, JavaScript has the same problems with the > whole 100%, so it gets a lot less benefit out of the dynamic typing > tradeoff. > > >> Which allows, for example: > >> > >>>>> a = [] > >>>>> Chain(a).append(5).append(6).append(-1).sort().append(3) > > > > Which either raises or does the wrong thing for list.pop > > > >> <__main__.Chain object at 0x12b6f50> > >>>>> a > >> [-1, 5, 6, 3] > >> > >> And would work in your example as well, should you have a class with > >> the desired methods. > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Tue May 14 19:43:39 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 14 May 2013 19:43:39 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> Message-ID: <519277CB.9020903@egenix.com> On 14.05.2013 19:24, Mark Dickinson wrote: > On Sat, May 11, 2013 at 6:24 PM, M.-A. Lemburg wrote: > >> On 11.05.2013 19:05, Christian Tismer wrote: >>> I think a simple stripping of white-space in >>> >>> text = s""" >>> leftmost column >>> two-char indent >>> """ >>> >>> would solve 95 % of common indentation and concatenation cases. >>> >> > >> This is not a good solution for long lines where you don't want to >> have embedded line endings. Taken from existing code: >> >> _litmonth = ('(?P' >> 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|' >> 'm?r|mae|mrz|mai|okt|dez|' >> 'fev|avr|juin|juil|aou|ao?|d?c|' >> 'ene|abr|ago|dic|' >> 'out' >> ')[a-z,\.;]*') >> >> or >> raise errors.DataError( >> 'Inconsistent revenue item currency: ' >> 'transaction=%r; transaction_position=%r' % >> (transaction, transaction_position)) >> > > Agreed. I use the implicit concatenation a lot for exception messages like > the one above; we also tend to keep line length to 80 characters *and* use > nice verbose exception messages. I could live with adding the extra '+' > characters and parentheses, but I think it would be a net loss of > readability. > > The _litmonth example looks like a candidate for re.VERBOSE and a > triple-quoted string, though. It's taken out of context, just to demonstrate some real world example of how long strings are broken down to handy 80 char code lines. The _litmonth variable is used as component to build other REs and those typically also contain (important) whitespace, so re.VERBOSE won't work. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 14 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From dickinsm at gmail.com Tue May 14 19:57:39 2013 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 14 May 2013 18:57:39 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <519277CB.9020903@egenix.com> References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> <519277CB.9020903@egenix.com> Message-ID: On Tue, May 14, 2013 at 6:43 PM, M.-A. Lemburg wrote: > > The _litmonth example looks like a candidate for re.VERBOSE and a > > triple-quoted string, though. > > It's taken out of context, just to demonstrate some real world > example of how long strings are broken down to handy 80 char > code lines. > > The _litmonth variable is used as component to build other REs > and those typically also contain (important) whitespace, > so re.VERBOSE won't work. Ah, okay. Makes sense. Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From zuo at chopin.edu.pl Tue May 14 20:00:40 2013 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Tue, 14 May 2013 20:00:40 +0200 Subject: [Python-ideas] =?utf-8?q?Implicit_string_literal_concatenation_co?= =?utf-8?q?nsidered_harmful=3F?= In-Reply-To: References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> Message-ID: 14.05.2013 19:24, Mark Dickinson wrote: >> ? ? ? ? ? ? ? ? ? ? raise errors.DataError( >> ? ? ? ? ? ? ? ? ? ? ? ? 'Inconsistent revenue item currency: ' >> ? ? ? ? ? ? ? ? ? ? ? ? 
'transaction=%r; transaction_position=%r' % >> ? ? ? ? ? ? ? ? ? ? ? ? (transaction, transaction_position)) > > Agreed. ?I use the implicit concatenation a lot for exception > messages like the one above Me too. But what do you think about: raise errors.DataError( 'Inconsistent revenue item currency: ' c'transaction=%r; transaction_position=%r' % (transaction, transaction_position)) c'...' -- for explicit string (c)ontinuation or (c)oncatenation. Regards. *j From solipsis at pitrou.net Tue May 14 20:05:01 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 May 2013 20:05:01 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> Message-ID: <20130514200501.6de4d767@fsol> On Sat, 11 May 2013 19:24:02 +0200 "M.-A. Lemburg" wrote: > > This is not a good solution for long lines where you don't want to > have embedded line endings. Taken from existing code: > > _litmonth = ('(?P' > 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|' > 'm?r|mae|mrz|mai|okt|dez|' > 'fev|avr|juin|juil|aou|ao?|d?c|' > 'ene|abr|ago|dic|' > 'out' > ')[a-z,\.;]*') For the record, I know this isn't the point of your message, but you're probably missing 'f?v' (accented) above :-) Regards Antoine. From barry at python.org Tue May 14 22:45:55 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 14 May 2013 16:45:55 -0400 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: <20130514164555.65094bde@anarchist> On May 14, 2013, at 01:49 PM, Nick Coghlan wrote: >What Piotr's proposal crystalised for me is the idea that we really >have two different kinds of name binding in Python. I'm going to call >them "incidental binding" and "definitive binding". Perhaps "definitional" is another way to describe the latter. 
class statements and def statements both define a new object and bind that new object to a name. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From tjreedy at udel.edu Tue May 14 23:24:47 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Tue, 14 May 2013 17:24:47 -0400 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <87d2svegjg.fsf@uwakimon.sk.tsukuba.ac.jp> <51907053.5010108@pearwood.info> <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> Message-ID: On 5/14/2013 1:34 PM, Haoyi Li wrote: > Andrew, I agree with 100% of what you said; you put it much more clearly > than I could. Ditto. > #soapbox > My point in stirring up all this is that we should not confuse > "empirically bad" with "unpythonic". It is fine to reject something just > because it's unpythonic, and does not fit into the rest of python's > ecosystem, and it's fine to reject something because it is empirically bad. > > However, it is a mistake to say something (e.g. method chaining) is > empirically bad *because* it is unpythonic. I would almost say that doing so is unpythonic ;-). Method chaining in itself *is* pythonic when done, for instance, with the subset of string methods that return a string. I am sure that one can find things like f.read().strip().lower() in the stdlib. This gets to Andrew's points. As for your soapbox issue: People sometimes misuse Tim Peters' Zen of Python points. He wrote them to stimulate thought, not to serve as a substitute for thought, and certainly not to be a pile of mudballs to be used to chase people away. I regard 'pythonic' versus 'unpythonic' much the same way. Misused, but with a proper use. I occasionally use it as a summary lead-in followed by an explanation ('I think this is unpythonic in that ...').
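For concreteness, here is a trimmed, runnable sketch of the Chain wrapper Joao posted upthread, with the caveat raised earlier in the thread spelled out: the None-return heuristic misfires on methods such as list.pop() that can legitimately return None. This is an illustrative adaptation, not the exact code from the original message.

```python
class Chain:
    """Fluent-call wrapper: methods that return None yield the Chain
    itself, so mutator calls can be strung together."""

    def __init__(self, obj):
        self._obj = obj

    def __getattr__(self, attr):
        val = getattr(self._obj, attr)
        if not callable(val):
            return val

        def call(*args, **kw):
            result = val(*args, **kw)
            # Heuristic: treat None as "no return value". This is wrong
            # for methods like list.pop() whose real result may be None.
            return self if result is None else result

        return call


a = []
Chain(a).append(5).append(6).append(-1).sort().append(3)
print(a)  # [-1, 5, 6, 3]
```

Note that Chain([None]).pop() returns the Chain object rather than the popped None, which is exactly the corner case Terry points out.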
And by the way, as for your macro module: I like that Python can be and is used to do 'crazy' things, even if I think something is too crazy for the stdlib. (I also notice that standards change. Metaclasses were originally a crazy hack. If I remember correctly, they were formally supported as part of unifying types and classes. 10 years later, we are just adding the 1st stdlib module use thereof.) -- Terry Jan Reedy From jonathan.eunice at gmail.com Tue May 14 21:53:53 2013 From: jonathan.eunice at gmail.com (Jonathan Eunice) Date: Tue, 14 May 2013 12:53:53 -0700 (PDT) Subject: [Python-ideas] Let's be more orderly! Message-ID: Python's nicely evolved its standard data structures. Bringing set into the core, adding OrderedDict (and friends), and establishing collections ABCs: all these up-level common facilities, broadly improving program clarity and correctness. I'd like Python to take the next step, especially regarding ordering. Using a compatible, separate implementation for OrderedDict is a fine way to gracefully extend the language, but it leaves ordering only half-accommodated. Consider: OrderedDict(a=2, b=3, c=7) yields: OrderedDict([('a', 2), ('c', 7), ('b', 3)]) The items are immediately disordered, having been jumbled passing through a conventional dict. One can initialize using a list of items, of course, but that reminds me of the line from NetHack: "You enter what seems to be an older, more primitive world." Everyone rightly accepts doing a bit more specification for truly "extra" data structure features (persistence, say). And if falling back to lists of tuples is the best that can be done for ordered structures, well, okay. We'll live with it. But from an app developer's point of view, ordering is a basic, essential property. It seems like something that should be gracefully accommodated, as a built-in, rather than as "an extra" or something that requires falling back to Late Medieval Python. kwargs arrived in 1.4, back in 1996, right?
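The jumbling described above is easy to demonstrate; a sketch, with the hindsight note that PEP 468 made keyword-argument order guaranteed from CPython 3.6, so the keyword form is ordered on modern interpreters:

```python
from collections import OrderedDict

# On CPython <= 3.5, keyword arguments passed through an ordinary
# dict, so the call-site order of a=2, b=3, c=7 could be lost here:
maybe_jumbled = OrderedDict(a=2, b=3, c=7)

# The order-preserving spelling that works on any version:
ordered = OrderedDict([('a', 2), ('b', 3), ('c', 7)])
print(list(ordered))  # ['a', 'b', 'c']
```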
So I propose that kwargs, at least, default to an ordered mapping rather than a pure hash mapping. Ideally, {...} literals would also be ordered. I suspect this will be an unpopular idea among implementers, for whom unordered dict is a pervasive and long-optimized tool. But this is a correctness, or at least a consistency, issue. I don't see any elegant alternative way to initialize ordered data structures unless the modern Python initialization idiom(s), esp. kwargs, themselves observe order. Historically, sort features were usually unstable because that's easier to implement and faster to run. Over time, stable sort has become the norm, either as an option (e.g. GNU's sort --stable, Perl's use sort 'stable' as of 5.8) or implicitly (e.g. Python's sorted, as of 2.2). Over time, getting better results proved more broadly important than getting near-correct results faster; and both by code optimization and system improvement, the associated time or space cost of stable ordering was mooted. I'd like the same to happen for Python mappings. It's my understanding that Ruby 1.9 has recently made this shift. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed May 15 00:23:26 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 14 May 2013 15:23:26 -0700 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: Message-ID: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> On May 14, 2013, at 12:53, Jonathan Eunice wrote: > Using a compatible, separate implementation for OrderedDict is a fine way to gracefully extend the language, but it leaves ordering only half-accommodated. Consider: > > OrderedDict(a=2, b=3, c=7) If your proposal is to replace dict with OrderedDict, I think you need at least one use case besides OrderedDict's constructor. > But from an app developer's point of view, ordering is a basic, essential property. 
> There are plenty of things that are basic, essential properties of a particular type, but there is very little that's a basic, essential property of _all_ types. Surely you wouldn't suggest that a complex number should remember whether you specified the real or imaginary component first. So your argument is that order of insertion is a basic property of _mappings_ in particular. And I think blist.sorteddict, trie.Trie, etc. are good arguments against even that assertion. It's not just about performance; it's about correctness. Insertion order is not a fundamental property of mappings. If you're just suggesting that collections.abc should grow an OrderedMapping, and/or that kwargs should be an OrderedDict, either or both might be reasonable. But if you're suggesting that Mapping and dict should both become ordered, I disagree with both. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Wed May 15 00:43:10 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 May 2013 18:43:10 -0400 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <51919687.9000607@pearwood.info> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51919687.9000607@pearwood.info> Message-ID: On Mon, May 13, 2013 at 9:42 PM, Steven D'Aprano wrote: > On 14/05/13 01:49, Jim Jewett wrote: >> On Sun, May 12, 2013 at 9:33 PM, Raymond Hettinger >> wrote: >>> On May 9, 2013, at 3:29 AM, Piotr Duda wrote: >>>> Animals = Enum('Animals', 'dog cat bird') >>>> which violates DRY >>> This is a profound misreading of DRY which is all about not repeating >>> big chunks of algorithmic logic. >> DRY, like most heuristics, is about making mistakes less likely. >> Mistakes are likely with huge chunks of repeated logic, because people >> are inclined to fix things at only one location. 
>> Mistakes are likely with the above because it is conceptually only one >> location, but syntactically two -- and doing something different in >> the second location is a mistake that the compiler won't catch. > "Likely"? I think not. > If you (generic "you", not you personally) are such a careless coder that > you are *likely* to mistype the name in a *single line* like `name = > Enum("name", "items...")` then there is probably no help for you. Mistakes > happen to the best of us, but they are *rare*. Likely relative to other my other mistakes, yes. > Besides, strictly speaking setting the Enum name different to the name being > bound is not necessarily an error. We can, and frequently do, define > functions with one name and then bind them to a different name, e.g. in > decorators. If the name didn't need to match, then you could just as well use strings and retype them everywhere. It would be caught (or not) in testing ... Wanting to ensure that typos don't slip in -- even to documentation-only sections -- is much of the motivation for enums, as well as for the recurrent calls for a make statement. >> is that you might accidentally type ... > >>> Anmals = Enum('Animals', 'dog cat bird') > In which case, either something will break, and you will fix the broken > code, and then it will work, or nothing will break, in which case it almost > certainly doesn't matter and you can pretend you did it on purpose. I regularly maintain Java (and formerly C) code in which a name was misspelled, or even just spelled oddly.* The code works, so long as other code is sufficiently consistent, but it is a lot harder to maintain. (That said, *most* of the extra problems are from grep not finding the code, which wouldn't be a problem for this *particular* case -- at least not until some other tool started using the class name for pickling or documentation or something.) * "oddly" can even mean "correctly", if the project otherwise uses abbreviations. 
-jJ From steve at pearwood.info Wed May 15 00:47:36 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 15 May 2013 08:47:36 +1000 Subject: [Python-ideas] Anonymous blocks (again): In-Reply-To: References: <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> Message-ID: <20130514224735.GA9103@ando> On Tue, May 14, 2013 at 05:24:47PM -0400, Terry Jan Reedy wrote: > As for your soapbox issue: People sometimes misuse Tim Peter's Zen of > Python points. He wrote them to stimulate thought, not serve as a > substitute for thought, and certainly not be a pile of mudballs to be > used to chase people away. +1000 -- Steven From steve at pearwood.info Wed May 15 01:19:32 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 15 May 2013 09:19:32 +1000 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51919687.9000607@pearwood.info> Message-ID: <20130514231932.GB9103@ando> On Tue, May 14, 2013 at 06:43:10PM -0400, Jim Jewett wrote: > On Mon, May 13, 2013 at 9:42 PM, Steven D'Aprano wrote: > > On 14/05/13 01:49, Jim Jewett wrote: > >> On Sun, May 12, 2013 at 9:33 PM, Raymond Hettinger > >> wrote: > >>> On May 9, 2013, at 3:29 AM, Piotr Duda wrote: > > >>>> Animals = Enum('Animals', 'dog cat bird') > >>>> which violates DRY > > >>> This is a profound misreading of DRY which is all about not repeating > >>> big chunks of algorithmic logic. > > >> DRY, like most heuristics, is about making mistakes less likely. > > >> Mistakes are likely with huge chunks of repeated logic, because people > >> are inclined to fix things at only one location. > > >> Mistakes are likely with the above because it is conceptually only one > >> location, but syntactically two -- and doing something different in > >> the second location is a mistake that the compiler won't catch. > > > "Likely"? I think not. 
> > > If you (generic "you", not you personally) are such a careless coder that > > you are *likely* to mistype the name in a *single line* like `name = > > Enum("name", "items...")` then there is probably no help for you. Mistakes > > happen to the best of us, but they are *rare*. > > Likely relative to other my other mistakes, yes. I see what you did there :-) But seriously, *in my experience*, mere typos of names are not usually critical errors. They are usually discovered and fixed immediately when the code raises a NameError. So I have little (not zero, but *little*) concern about the risk of typos like: Animals = Enum("Animal", "cow dog cat sheep") Chances are that you will detect this immediately you try to run the code, when Animal.cow raises NameError. Either way though, it's an easy error to fix, and an easy mistake for a linter to check. So why I acknowledge that *in principle* this is a weakness of Python that can lead to bugs, it is my opinion that *in practice* it is close enough to harmless that there's no reason to rush into a sub-optimal fix for it. > > Besides, strictly speaking setting the Enum name different to the name being > > bound is not necessarily an error. We can, and frequently do, define > > functions with one name and then bind them to a different name, e.g. in > > decorators. > > If the name didn't need to match, then you could just as well use strings and > retype them everywhere. It would be caught (or not) in testing ... The advantage of having Animal.cow, Animal.dog etc. is to standardise on the cow and dog parts, not the Animal part. Animal is mostly just the container, it is the enums that you usually care about. For many purposes, we won't even care about the container. We'll do something like this: Directions = Enum("Directions", "UP DOWN LEFT RIGHT") globals().update(Directions.members()) (have I got the name "members" right?) and then always refer to UP, DOWN etc. directly. 
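For reference, a runnable version of that snippet against the enum module as it landed in Python 3.4 (PEP 435): the attribute is spelled __members__, and it preserves definition order.

```python
from enum import Enum

Directions = Enum("Directions", "UP DOWN LEFT RIGHT")
# __members__ is an ordered mapping of member name -> member
globals().update(Directions.__members__)

print(UP is Directions.UP)            # True
print(list(Directions.__members__))   # ['UP', 'DOWN', 'LEFT', 'RIGHT']
```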
If we don't care about pickling the enums, and do care about "namespace pollution", we might even `del Directions` and be done with it. > Wanting to ensure that typos don't slip in -- even to documentation-only > sections -- is much of the motivation for enums, as well as for the recurrent > calls for a make statement. I agree. Having to type the class name twice is a wart. But it's not a big one. More of a pimple. -- Steven From steve at pearwood.info Wed May 15 01:34:45 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 15 May 2013 09:34:45 +1000 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: Message-ID: <20130514233445.GC9103@ando> On Tue, May 14, 2013 at 12:53:53PM -0700, Jonathan Eunice wrote: > But from an app developer?s point of view, ordering is a basic, essential > property. [...] > So I propose that kwargs, at least, default to an ordered mapping rather > than a pure hash mapping. Speak for yourself. I don't think it is, and while having a fixed order is sometimes useful, often it is not necessary. Thinking about my code, I cannot think of even one function or method which would get a benefit from having kwargs be ordered. Frankly, with the exception of OrderedDict itself, if your functions would like to treat kwargs args differently based on their order, e.g. func(a=1, b=2) vs func(b=2, a=1), I think your design may be broken. Keeping things ordered imposes a performance cost. I think you would need to demonstrate that the advantage of having kwargs be an ordered dict for the cases where it matters outweighs the cost for the cases where it doesn't matter. If somebody demonstrates that the cost of shifting to an ordered dict is minimal, and the advantage is non-trivial, then and only then would I support the idea. > Historically, sort features were usually unstable because that?s easier to > implement and faster to run. Over time, stable sort has become the norm, I don't think that is a particularly good analogy. 
Stable sorting is intuitively correct. Treating keyword args differently according to their order is intuitively the wrong thing to do, at least most of the time. -- Steven From ethan at stoneleaf.us Wed May 15 01:24:36 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 16:24:36 -0700 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: <20130514231932.GB9103@ando> References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51919687.9000607@pearwood.info> <20130514231932.GB9103@ando> Message-ID: <5192C7B4.6070709@stoneleaf.us> On 05/14/2013 04:19 PM, Steven D'Aprano wrote: > > For many purposes, we won't even care about the container. We'll do > something like this: > > Directions = Enum("Directions", "UP DOWN LEFT RIGHT") > globals().update(Directions.members()) > > (have I got the name "members" right?) Almost. It's __members__. -- ~Ethan~ From timothy.c.delaney at gmail.com Wed May 15 02:36:38 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Wed, 15 May 2013 10:36:38 +1000 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <20130514233445.GC9103@ando> References: <20130514233445.GC9103@ando> Message-ID: On 15 May 2013 09:34, Steven D'Aprano wrote: > > I don't think that is a particularly good analogy. Stable sorting is > intuitively correct. Treating keyword args differently according to > their order is intuitively the wrong thing to do, at least most of the > time. > The argument *for* an ordered kwargs however is that same one that was used for Enums iterating in definition order by default - it's an ordering that can't be recovered once it's lost. However, it's not a property that I think is absolutely necessary for kwargs and we shouldn't lose performance to gain that property, but there have been times when I would have liked it. 
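A toy pure-Python sketch of a mapping with that property, i.e. iteration in insertion order (illustrative only; the real proposals did this inside the C dict implementation, and this naive version pays a list scan on deletion):

```python
class OrderDict(dict):
    """Toy dict subclass that iterates in insertion order."""

    def __init__(self, *args, **kw):
        super().__init__()
        self._order = []
        for k, v in dict(*args, **kw).items():
            self[k] = v

    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._order.remove(key)

    def __iter__(self):
        return iter(self._order)


d = OrderDict()
d['b'] = 1
d['a'] = 2
d['c'] = 3
print(list(d))  # ['b', 'a', 'c']
```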
Barry created a new dict implementation a while back that as a side-effect retained insertion order so long as no keys were removed. That would be suitable IMO for kwargs as a guarantee - definition order so long as nothing has been removed. It was discussed and there was the suggestion to actively break this functionality in order to prevent people relying on it. I'm not sure what the end result of the discussion was off the top of my head. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Wed May 15 02:41:29 2013 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 May 2013 17:41:29 -0700 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <20130514233445.GC9103@ando> Message-ID: On Tue, May 14, 2013 at 5:36 PM, Tim Delaney wrote: > On 15 May 2013 09:34, Steven D'Aprano wrote: > >> >> I don't think that is a particularly good analogy. Stable sorting is >> intuitively correct. Treating keyword args differently according to >> their order is intuitively the wrong thing to do, at least most of the >> time. >> > > The argument *for* an ordered kwargs however is that same one that was > used for Enums iterating in definition order by default - it's an ordering > that can't be recovered once it's lost. > > However, it's not a property that I think is absolutely necessary for > kwargs and we shouldn't lose performance to gain that property, but there > have been times when I would have liked it. > > Barry created a new dict implementation a while back that as a side-effect > retained insertion order so long as no keys were removed. That would be > suitable IMO for kwargs as a guarantee - definition order so long as > nothing has been removed. It was discussed and there was the suggestion to > actively break this functionality in order to prevent people relying on it. > I'm not sure what the end result of the discussion was off the top of my > head. 
> There was also some conversation at the pycon sprints this year about if keyword arguments could use an ordered dict or not but I wasn't paying enough attention to that to be able to give a summary of what was discussed. My gut feeling is that it'd add overhead even though I would find it useful at times. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Wed May 15 02:45:20 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 May 2013 20:45:20 -0400 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <20130514233445.GC9103@ando> References: <20130514233445.GC9103@ando> Message-ID: On Tue, May 14, 2013 at 7:34 PM, Steven D'Aprano wrote: > On Tue, May 14, 2013 at 12:53:53PM -0700, Jonathan Eunice wrote: >> But from an app developer?s point of view, ordering is a basic, essential >> property. > Speak for yourself. I don't think it is, and while having a fixed order > is sometimes useful, often it is not necessary. The fact that it is often useful -- if only for debugging and testing -- can make it seem like a basic property that shouldn't be sacrificed without a reason. Think of the contortions that dict code (prior to the Denial Of Service scare) went through to maintain a stable (albeit arbitrary) order. I also suspect I'm not the only one who looks at the self.(var) = (var) of an __init__ function and feels that the arguments are really more of an association-list, so that creating a map was just wasted work. I do NOT propose to fix this code smell in the general case, though. > Frankly, with the exception of OrderedDict itself, if your functions > would like to treat kwargs args differently based on their order, e.g. > func(a=1, b=2) vs func(b=2, a=1), I think your design may be broken. I agree. But the line between "broken" and "easier to debug" isn't always bright. (That said, if kwargs in particular were essentially ordered, I would want to allow repeats, as do the web mappings. 
I'm not sure that would be a net positive for readability.) >> Historically, sort features were usually unstable because that's easier to >> implement Not really, for the more obvious algorithms. But those aren't the fastest. >> and faster to run. Over time, stable sort has become the norm, > Stable sorting is intuitively correct. My intuition is that if two objects are equal, it shouldn't matter what order they come in. Preference for a stable sort only comes after lots of experience with data flows involving (or abusing) multi-step sorts. -jJ From donspauldingii at gmail.com Wed May 15 03:57:43 2013 From: donspauldingii at gmail.com (Don Spaulding) Date: Tue, 14 May 2013 20:57:43 -0500 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> Message-ID: On Tue, May 14, 2013 at 5:23 PM, Andrew Barnert wrote: > On May 14, 2013, at 12:53, Jonathan Eunice > wrote: > > Using a compatible, separate implementation for OrderedDict is a fine way > to gracefully extend the language, but it leaves ordering only > half-accommodated. Consider: > > OrderedDict(a=2, b=3, c=7) > > If your proposal is to replace dict with OrderedDict, I think you need at > least one use case besides OrderedDict's constructor. > I don't understand the dismissal of OrderedDict.__init__ as an invalid use case. It would be a substantial usability improvement to special-case OrderedDict at compile-time purely to get the ability to instantiate odict literals (not that I'm suggesting that). In the interest of moving the discussion forward, I've had a few use cases along these lines. Let's say I want to create simple HTML elements by hand:

    def create_element(tag, text='', **attributes):
        attrs = ['{}="{}"'.format(k, v) for k, v in attributes.items()]
        return "<{0} {1}>{2}".format(tag, ' '.join(attrs), text)

    print(create_element('img', alt="Some cool stuff.", src="coolstuff.jpg"))
    <img alt="Some cool stuff." src="coolstuff.jpg">Some cool stuff.
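The only order-preserving alternative available to Don here gives up keyword syntax entirely, passing the attributes as an explicit sequence of pairs. A minimal runnable sketch of that variant (the closing tag is my own addition for illustration, since the archive scrubbed the original HTML output):

```python
def create_element(tag, attrs, text=''):
    # attrs is a list of (name, value) pairs, so the caller's order survives
    rendered = ' '.join('{}="{}"'.format(k, v) for k, v in attrs)
    return '<{0} {1}>{2}</{0}>'.format(tag, rendered, text)

html = create_element('img', [('alt', 'Some cool stuff.'), ('src', 'coolstuff.jpg')])
# -> '<img alt="Some cool stuff." src="coolstuff.jpg"></img>'
```

The order is now reliable, but at the cost of the keyword-argument call style the function originally had.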
In Python today, if I want the attributes to retain the order in which they appear as arguments, the function has to be changed in such a way that all calling code is forced to look like some variation on this theme:

    >>> create_element('img', [('alt', 'Some cool stuff.'), ('src', 'coolstuff.jpg')])

It's not that it's impossible to do, it's that dict-based APIs would benefit from the function being able to decide on its own whether or not it cared about the order of arguments. Having to express a kwargs-based or plain-old-dict-based function as a list-of-2-tuples function is... uncool. ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed May 15 04:07:30 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 19:07:30 -0700 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> Message-ID: <5192EDE2.2050402@stoneleaf.us> On 05/14/2013 06:57 PM, Don Spaulding wrote: > On Tue, May 14, 2013 at 5:23 PM, Andrew Barnert > wrote: > > On May 14, 2013, at 12:53, Jonathan Eunice > wrote: > >> Using a compatible, separate implementation for |OrderedDict| is a fine way to gracefully extend the language, but >> it leaves ordering only half-accommodated. Consider: >> >> OrderedDict(a=2, b=3, c=7) >> > If your proposal is to replace dict with OrderedDict, I think you need at least one use case besides OrderedDict's > constructor. > > > I don't understand the dismissal of OrderedDict.__init__ as an invalid use case. It's not being dismissed, but it's only one. There are thousands of functions using **kwds that simply don't care about the order. Should they all pay the performance price so that some tiny fraction can benefit? While it is correctly said that if performance is a Big Deal you shouldn't be using Python, we also are not interested in making it slower without a really good reason.
-- ~Ethan~ From dholth at gmail.com Wed May 15 04:38:33 2013 From: dholth at gmail.com (Daniel Holth) Date: Tue, 14 May 2013 22:38:33 -0400 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <5192EDE2.2050402@stoneleaf.us> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <5192EDE2.2050402@stoneleaf.us> Message-ID: On Tue, May 14, 2013 at 10:07 PM, Ethan Furman wrote: > On 05/14/2013 06:57 PM, Don Spaulding wrote: >> >> On Tue, May 14, 2013 at 5:23 PM, Andrew Barnert > > wrote: >> >> >> On May 14, 2013, at 12:53, Jonathan Eunice > > wrote: >> >>> Using a compatible, separate implementation for |OrderedDict| is a >>> fine way to gracefully extend the language, but >>> it leaves ordering only half-accomodated. Consider: >>> >>> OrderedDict(a=2, b=3, c=7) >>> >> If your proposal is to replace dict with OrderedDict, I think you need >> at least one use case besides OrderedDict's >> constructor. >> >> >> I don't understand the dismissal of OrderedDict.__init__ as an invalid use >> case. > > > It's not being dismissed, but it's only one. There are thousands of > functions using **kwds that simply don't care about the order. Should they > all pay the performance price so that some tiny fraction can benefit? > > While it is correctly said that if performance is a Big Deal you shouldn't > be using Python, we also are not interested in making it slower without a > really good reason. > > -- > ~Ethan~ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas I'm not convinced that the performance argument is valid. There are clever ways to optimize ordered dicts. It would be quite a change to the language. From donspauldingii at gmail.com Wed May 15 04:56:13 2013 From: donspauldingii at gmail.com (Don Spaulding) Date: Tue, 14 May 2013 21:56:13 -0500 Subject: [Python-ideas] Let's be more orderly! 
In-Reply-To: <5192EDE2.2050402@stoneleaf.us> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <5192EDE2.2050402@stoneleaf.us> Message-ID: On Tue, May 14, 2013 at 9:07 PM, Ethan Furman wrote: > On 05/14/2013 06:57 PM, Don Spaulding wrote: > >> On Tue, May 14, 2013 at 5:23 PM, Andrew Barnert > abarnert at yahoo.com>> wrote: >> >> >> On May 14, 2013, at 12:53, Jonathan Eunice > jonathan.eunice at gmail.**com >> wrote: >> >> Using a compatible, separate implementation for |OrderedDict| is a >>> fine way to gracefully extend the language, but >>> it leaves ordering only half-accomodated. Consider: >>> >>> OrderedDict(a=2, b=3, c=7) >>> >>> If your proposal is to replace dict with OrderedDict, I think you >> need at least one use case besides OrderedDict's >> constructor. >> >> >> I don't understand the dismissal of OrderedDict.__init__ as an invalid >> use case. >> > > It's not being dismissed, but it's only one. There are thousands of > functions using **kwds that simply don't care about the order. Should they > all pay the performance price so that some tiny fraction can benefit? > > While it is correctly said that if performance is a Big Deal you shouldn't > be using Python, we also are not interested in making it slower without a > really good reason. > I'm of the opinion that the status quo is "fast, but kinda wrong". Ideally we'd have a "fast, and correct" implementation, but I'd settle for a "negligibly slower, but still correct" implementation. Nobody wants Python 3-dot-next to be slower, but how much slower are we really talking about given that Raymond Hettinger recently proposed[0] a plain-old-dict implementation that uses less space, performs better, and as an unintended side-effect just happens to maintain its initial order? Aside from the performance impact, isn't any code that relies on any current ordering behavior of **kwargs broken by design? 
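Don's closing question, whether code relying on **kwargs ordering is broken by design, hinges on the fact that order is simply not part of a plain dict's contract, while for OrderedDict it is part of the mapping's value. A small illustration of that semantic difference:

```python
from collections import OrderedDict

a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])

# Same key/value pairs, different order: unequal as OrderedDicts...
assert a != b
# ...but equal once compared as order-blind plain dicts.
assert dict(a) == dict(b)
# Iteration follows insertion order, which is the whole point.
assert list(a) == ['x', 'y']
```

Code that silently depends on iteration order of a plain dict is relying on a property the type never promised.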
[0]: http://mail.python.org/pipermail/python-dev/2012-December/123028.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Wed May 15 05:00:19 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 15 May 2013 12:00:19 +0900 Subject: [Python-ideas] [Spam] Re: Anonymous blocks (again): In-Reply-To: <20130514224735.GA9103@ando> References: <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> <20130514224735.GA9103@ando> Message-ID: <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > On Tue, May 14, 2013 at 05:24:47PM -0400, Terry Jan Reedy wrote: > > As for your soapbox issue: Which is, strictly speaking, off-topic, since he disclaims the intent to advocate a change to Python, let alone propose a concrete change. His post on method chaining was thoughtful, but really, it's hardly fair to criticize anyone for confusing a-Pythonicity of method chaining with some inherent flaw in that style. This list is for advocating changes to Python, and I don't see how my post could be construed in that context as claiming the method-chaining style is inherently flawed, vs. inappropriate for further support in Python. > > People sometimes misuse Tim Peter's Zen of Python points. He > > wrote them to stimulate thought, not serve as a substitute for > > thought, and certainly not be a pile of mudballs to be used to > > chase people away. > > +1000 As the person who cited "Pythonicity" and several points of the Zen in this thread, I would appreciate instruction as to how I "misused" the terms or "substituted mudballs for thought", rather than simply being bashed for criticizing someone else's (admittedly thoughtful) post. And especially not being bashed at a multiplication factor of 1000. @Haoyi: I'm sorry. I do apologize for the implication that anybody should "go away". I should have been more careful with pronouns. 
The "you" who likes method chaining was intended to be different from the "you" who "*do* have the choice of languages". But I could have, and should have, avoided writing "you" in the second case. You are clearly making useful contributions in code as well as in discussion, and I'd appreciate you staying around for a while. A long while. The intended point, restated, was "If method chaining were not well- supported in other languages, it would be arguable that this might be a good innovation for Python even though it's not positively Pythonic. But it is well-supported elsewhere, so there is no *need* for it in Python." I can see how one might disagree with that, but I hope noone finds it offensive. From timothy.c.delaney at gmail.com Wed May 15 05:08:04 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Wed, 15 May 2013 13:08:04 +1000 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <5192EDE2.2050402@stoneleaf.us> Message-ID: On 15 May 2013 12:56, Don Spaulding wrote: > Raymond Hettinger recently proposed[0] a plain-old-dict implementation > that uses less space, performs better, and as an unintended side-effect > just happens to maintain its initial order? > > [0]: http://mail.python.org/pipermail/python-dev/2012-December/123028.html > Whoops - that's what I was referring to, but misattributed it to Barry. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed May 15 04:54:57 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 19:54:57 -0700 Subject: [Python-ideas] Let's be more orderly! 
In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <5192EDE2.2050402@stoneleaf.us> Message-ID: <5192F901.2090100@stoneleaf.us> On 05/14/2013 07:38 PM, Daniel Holth wrote: > On Tue, May 14, 2013 at 10:07 PM, Ethan Furman wrote: >> On 05/14/2013 06:57 PM, Don Spaulding wrote: >>> >>> On Tue, May 14, 2013 at 5:23 PM, Andrew Barnert >> > wrote: >>> >>> >>> On May 14, 2013, at 12:53, Jonathan Eunice >> > wrote: >>> >>>> Using a compatible, separate implementation for |OrderedDict| is a >>>> fine way to gracefully extend the language, but >>>> it leaves ordering only half-accomodated. Consider: >>>> >>>> OrderedDict(a=2, b=3, c=7) >>>> >>> If your proposal is to replace dict with OrderedDict, I think you need >>> at least one use case besides OrderedDict's >>> constructor. >>> >>> >>> I don't understand the dismissal of OrderedDict.__init__ as an invalid use >>> case. >> >> >> It's not being dismissed, but it's only one. There are thousands of >> functions using **kwds that simply don't care about the order. Should they >> all pay the performance price so that some tiny fraction can benefit? >> >> While it is correctly said that if performance is a Big Deal you shouldn't >> be using Python, we also are not interested in making it slower without a >> really good reason. > > I'm not convinced that the performance argument is valid. There are > clever ways to optimize ordered dicts. It would be quite a change to > the language. Best way to find out is branch and try it. Then we'll have hard numbers instead of lots of hand waving and opinions. ;) If any performance hit is negligible I would certainly be interested in having them. 
-- ~Ethan~ From haoyi.sg at gmail.com Wed May 15 06:26:09 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 15 May 2013 00:26:09 -0400 Subject: [Python-ideas] [Spam] Re: Anonymous blocks (again): In-Reply-To: <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> References: <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> <20130514224735.GA9103@ando> <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: No offense taken =) On Tue, May 14, 2013 at 11:00 PM, Stephen J. Turnbull wrote: > Steven D'Aprano writes: > > On Tue, May 14, 2013 at 05:24:47PM -0400, Terry Jan Reedy wrote: > > > > As for your soapbox issue: > > Which is, strictly speaking, off-topic, since he disclaims the intent > to advocate a change to Python, let alone propose a concrete change. > > His post on method chaining was thoughtful, but really, it's hardly > fair to criticize anyone for confusing a-Pythonicity of method > chaining with some inherent flaw in that style. This list is for > advocating changes to Python, and I don't see how my post could be > construed in that context as claiming the method-chaining style is > inherently flawed, vs. inappropriate for further support in Python. > > > > People sometimes misuse Tim Peter's Zen of Python points. He > > > wrote them to stimulate thought, not serve as a substitute for > > > thought, and certainly not be a pile of mudballs to be used to > > > chase people away. > > > > +1000 > > As the person who cited "Pythonicity" and several points of the Zen in > this thread, I would appreciate instruction as to how I "misused" the > terms or "substituted mudballs for thought", rather than simply being > bashed for criticizing someone else's (admittedly thoughtful) post. > And especially not being bashed at a multiplication factor of 1000. > > @Haoyi: I'm sorry. I do apologize for the implication that anybody > should "go away". I should have been more careful with pronouns. 
The > "you" who likes method chaining was intended to be different from the > "you" who "*do* have the choice of languages". But I could have, and > should have, avoided writing "you" in the second case. You are > clearly making useful contributions in code as well as in discussion, > and I'd appreciate you staying around for a while. A long while. > > The intended point, restated, was "If method chaining were not well- > supported in other languages, it would be arguable that this might be > a good innovation for Python even though it's not positively Pythonic. > But it is well-supported elsewhere, so there is no *need* for it in > Python." I can see how one might disagree with that, but I hope noone > finds it offensive. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Wed May 15 07:15:05 2013 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 15 May 2013 00:15:05 -0500 Subject: [Python-ideas] Syntax for easy binding __name__, __module__, __qualname__ to arbitrary objects In-Reply-To: References: <6BAEC806-2BDF-4D58-8C5D-9D8258A55FCD@gmail.com> <51917F41.3030608@canterbury.ac.nz> Message-ID: <519319D9.1060009@gmail.com> On 05/13/2013 10:49 PM, Nick Coghlan wrote: > What Piotr's proposal crystalised for me is the idea that we really > have two different kinds of name binding in Python. I'm going to call > them "incidental binding" and "definitive binding". As far as conceptual aspects of python go, I like to think of the ':' as a associate operator. (*) This works both in function definitions, class definitions, and dictionary literals. The part before the colon is associated to the part after the colon. Then def, class, and {}'s, determine the specific type of association being made. So they are associate modifiers. 
In the case of class and def, it's associating a code object created from the following block of code. And in dictionary literals, it associates a key/object pair. An '=' is a simple name/object association with restrictions on the name, the object must already exist, and the associated pair is put in the current name space. Associate modifiers for '=' are 'global' and 'non_local'. There are a number of ways you could expand on this theme. One is to create additional modifiers such as 'enum', or combine the '=' with the ':' to get ':=' and '=:'. Possibly ':=' could be an enum, and maybe '=:' would be a lambda or local named block of code. I don't particularly like reusing def. I've always thought of 'def' as being short for 'define function' and 'class' for being short for 'define class'. So 'def foo(x): code_block' spells "define function 'foo' that takes args (x), and associate it to code_block. Where 'associate' is the ':'. Reusing either def or class to do something different doesn't seem correct to me. I think it would make python more confusing to those just starting out. But Probably the above consistencies were never thought out in this way, or maybe they were at one time and there was never a need to express it publicly. But I kind of like how it fits together conceptually even if it's just my own way of looking at it. Cheers, Ron (* Except in slicing syntax, which isn't related at all.) From steve at pearwood.info Wed May 15 07:33:56 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 15 May 2013 15:33:56 +1000 Subject: [Python-ideas] [Spam] Re: Anonymous blocks (again): In-Reply-To: <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> References: <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> <20130514224735.GA9103@ando> <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20130515053355.GD9103@ando> On Wed, May 15, 2013 at 12:00:19PM +0900, Stephen J. 
Turnbull wrote: > Steven D'Aprano writes: > > On Tue, May 14, 2013 at 05:24:47PM -0400, Terry Jan Reedy wrote: [...] > > > People sometimes misuse Tim Peter's Zen of Python points. He > > > wrote them to stimulate thought, not serve as a substitute for > > > thought, and certainly not be a pile of mudballs to be used to > > > chase people away. > > > > +1000 > > As the person who cited "Pythonicity" and several points of the Zen in > this thread, I would appreciate instruction as to how I "misused" the > terms or "substituted mudballs for thought", rather than simply being > bashed for criticizing someone else's (admittedly thoughtful) post. > And especially not being bashed at a multiplication factor of 1000. I'm sorry, I did not intend my agreement to be read as a criticism of you. To be perfectly honest, I may not have even read your earlier emails. (These threads tend to be long, and my time is not unlimited.) I was agreeing with Terry as a general point. I too see far too many people throwing out misapplied references to the Zen, or as an knee-jerk way to avoid thinking about a problem. (Especially "Only One Way", which isn't even in the Zen.) I'm not going to name names, because (1) I don't remember specific examples, and (2) even if I did, it wouldn't be productive to shame people for misapplying the Zen long after the fact. Hell, it's quite likely that I have been one of those people, I know that sometimes I react conservatively to some suggestions, perhaps *too* conservatively. So I'm sorry that you read my agreement as a criticism of your comments, it was not intended that way, it was just me being enthusiastic to agree with Terry's reminder that we all should avoid using the Zen to avoid thought. 
-- Steven From tjreedy at udel.edu Wed May 15 10:36:19 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Wed, 15 May 2013 04:36:19 -0400 Subject: [Python-ideas] [Spam] Re: Anonymous blocks (again): In-Reply-To: <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> References: <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> <20130514224735.GA9103@ando> <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Stephen, the post by Haoyi that I responded to neither mentioned you nor quoted you. Nothing I wrote in response to that post was specifically aimed at you. 'Some people' meant some people, here, on python-list, occasionally pydev, blogs, stackoverflow, ... . Steven D'Aprano, who also frequents python-list, has seen many of the same posts that stimulated my comment, and obviously had a similar reaction. If you are labelling my post spam, I disagree, but let's drop it. Terry From haoyi.sg at gmail.com Wed May 15 12:48:37 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 15 May 2013 06:48:37 -0400 Subject: [Python-ideas] [Spam] Re: Anonymous blocks (again): In-Reply-To: References: <51919530.8050703@pearwood.info> <5B302063-98CD-4885-AE03-9586F508BDFC@yahoo.com> <20130514224735.GA9103@ando> <87wqr1c5h8.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I would be lying if I said I wasn't personally turned off a bit by Stephen Turnbull's post, and I think most of us were, so let's not beat around the bush saying "but *technically*... we didn't mention his *name*..." The context was clear enough. It was more-or-less in reference to what he said even if we didn't say so explicitly. I think it is clear to all of us that it was his post that sparked off my #soapboxing, and we can talk about the reasons why it can be interpreted as chasing people away (I don't think that's up for debate), whether he meant it that way or not.
It's probably to do with the "python doesn't need you, go use another language" motif, but I won't do any further literary analysis on the sentences. But he's apologized, so any personal offense that was taken has now been discarded =) On Wed, May 15, 2013 at 4:36 AM, Terry Jan Reedy wrote: > Stephen, the post by Haoyi that I responded to neither mentioned you nor > quoted you. Nothing I wrote in response to that post was specifically aimed > at you. 'Some people' meant some people, here, on python-list, occasionaly > pydev, blogs, stackoverflow, ... . Steven D'Aprano, who also frequents > python-list, has seen many of the same posts that stimulated my comment, > and obviously had a similar reaction. > > If you are labelling my post spam, I disagree. but lets drop it. > > Terry > > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tismer at stackless.com Wed May 15 14:18:35 2013 From: tismer at stackless.com (Christian Tismer) Date: Wed, 15 May 2013 14:18:35 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <518EDA76.2040602@canterbury.ac.nz> References: <20130510211613.53f7649d@fsol> <20130510213707.0df3f992@fsol> <518D9444.20200@canterbury.ac.nz> <94A4F714-D071-4FFB-9774-B94E1CA3A691@yahoo.com> <518EDA76.2040602@canterbury.ac.nz> Message-ID: <51937D1B.2010901@stackless.com> On 12.05.13 01:55, Greg Ewing wrote: > Ian Cordasco wrote: >> On Sat, May 11, 2013 at 2:52 PM, Mark Janssen >> wrote: > >>> It partitions the conceptual space. "+" is a mathematical operator, >>> but strings are not numbers. >> >> But + is already a supported operation on strings > > I still think about these two kinds of concatenation in > different ways, though. 
When I use implicit concatenation, > I don't think in terms of taking two strings and joining > them together. I'm just writing a single string literal > that happens to span two source lines. > > I believe that distinguishing them visually helps > readability. Using + for both makes things look more > complicated than they really are. > Thinking more about this, yes I see that "+" is really different for various reasons, when you just want to write a long string. "+" involves precedence rules, which is actually too much. Writing continuation lines with '\' is much less convenient, because you cannot insert comments. What I still don't like is the pure absence of anything that makes the concatenation more visible. So I'm searching for different ways to denote concatenating of subsequent strings. Or to put it the other way round: We also can see it as ways to denote the _interruption_ of a string. Thinking out loud... A string is built, then we break its construction into pieces that are glued together by the parser. Hmm, this sounds again more like triple-quoted strings. Still searching... -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From random832 at fastmail.us Wed May 15 17:20:12 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 15 May 2013 11:20:12 -0400 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> Message-ID: <1368631212.21852.140661231419973.59838C14@webmail.messagingengine.com> On Tue, May 14, 2013, at 21:57, Don Spaulding wrote: > I don't understand the dismissal of OrderedDict.__init__ as an invalid > use > case. 
It would be a substantial usability improvement to special-case > OrderedDict at compile-time purely to get the ability to instantiate > odict > literals (not that I'm suggesting that). Maybe we should be talking about literals. OrderedDict(a=3, b=3, c=7) is not and never will be a literal. From donspauldingii at gmail.com Wed May 15 18:04:47 2013 From: donspauldingii at gmail.com (Don Spaulding) Date: Wed, 15 May 2013 11:04:47 -0500 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <1368631212.21852.140661231419973.59838C14@webmail.messagingengine.com> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368631212.21852.140661231419973.59838C14@webmail.messagingengine.com> Message-ID: On Wed, May 15, 2013 at 10:20 AM, wrote: > On Tue, May 14, 2013, at 21:57, Don Spaulding wrote: > > I don't understand the dismissal of OrderedDict.__init__ as an invalid > > use > > case. It would be a substantial usability improvement to special-case > > OrderedDict at compile-time purely to get the ability to instantiate > > odict > > literals (not that I'm suggesting that). > > Maybe we should be talking about literals. OrderedDict(a=3, b=3, c=7) is > not and never will be a literal. > Forgive my misuse of the term 'literal' here. I meant only to say that anywhere you currently use a plain-old-dictionary literal, there's no way to easily switch it to a value which preserves its order. For example: foo = { 'b': 1, 'a': 2 } Has to turn into something far less appealing: foo = OrderedDict() foo['b'] = 1 foo['a'] = 2 # or... foo = OrderedDict([ ('b', 1), ('a', 2) ]) Even if the only ordering change that was made was to magically give OrderedDict.__init__ its **kwargs in order, it would clean up these instances, which I initially referred to as literals. 
foo = OrderedDict( b=1, a=2 ) I was explicitly not advocating that change, just noting that the OrderedDict.__init__ use case is a perfect example of how this would be used to enable something that currently isn't possible without, IMO, extra noise in the definition of dicts of this nature. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Wed May 15 18:27:46 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Wed, 15 May 2013 18:27:46 +0200 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368631212.21852.140661231419973.59838C14@webmail.messagingengine.com> Message-ID: 2013/5/15 Don Spaulding > Even if the only ordering change that was made was to magically give > OrderedDict.__init__ its **kwargs in order, it would clean up these > instances, which I initially referred to as literals. > > foo = OrderedDict( > b=1, > a=2 > ) > Since PEP3115, classes can __prepare__ a custom dict: from collections import OrderedDict class OrderedDictBuilder(type): @classmethod def __prepare__(metacls, name, bases): return OrderedDict() def __new__(cls, name, bases, classdict): del classdict['__module__'] # ugh return classdict Then we can (ab)use the Class syntax to preserve the order! class foo(metaclass=OrderedDictBuilder): b = 1 a = 2 assert repr(foo) == "OrderedDict([('b', 1), ('a', 2)])" There is probably a way to get rid of the "metaclass=" part. I'm not sure to like it, though. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed May 15 18:40:09 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 15 May 2013 09:40:09 -0700 Subject: [Python-ideas] Let's be more orderly! 
In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368631212.21852.140661231419973.59838C14@webmail.messagingengine.com> Message-ID: <5193BA69.2040708@stoneleaf.us> On 05/15/2013 09:27 AM, Amaury Forgeot d'Arc wrote: > 2013/5/15 Don Spaulding > > > Even if the only ordering change that was made was to magically give OrderedDict.__init__ its **kwargs in order, it > would clean up these instances, which I initially referred to as literals. > > foo = OrderedDict( > b=1, > a=2 > ) > > > Since PEP3115, classes can __prepare__ a custom dict: > > from collections import OrderedDict > class OrderedDictBuilder(type): > @classmethod > def __prepare__(metacls, name, bases): > return OrderedDict() > def __new__(cls, name, bases, classdict): > del classdict['__module__'] # ugh > return classdict > > Then we can (ab)use the Class syntax to preserve the order! > > class foo(metaclass=OrderedDictBuilder): > b = 1 > a = 2 > > assert repr(foo) == "OrderedDict([('b', 1), ('a', 2)])" > > There is probably a way to get rid of the "metaclass=" part. > I'm not sure to like it, though. class OrderedDict(metaclass=OrderedDictBuilder): pass class foo(OrderedDict): b = 1 a = 2 -- ~Ethan~ From ethan at stoneleaf.us Wed May 15 19:12:32 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 15 May 2013 10:12:32 -0700 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <5193BA69.2040708@stoneleaf.us> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368631212.21852.140661231419973.59838C14@webmail.messagingengine.com> <5193BA69.2040708@stoneleaf.us> Message-ID: <5193C200.8010809@stoneleaf.us> On 05/15/2013 09:40 AM, Ethan Furman wrote: > On 05/15/2013 09:27 AM, Amaury Forgeot d'Arc wrote: >> >> There is probably a way to get rid of the "metaclass=" part. >> I'm not sure to like it, though. 
> > class OrderedDict(metaclass=OrderedDictBuilder): > pass > > class foo(OrderedDict): > b = 1 > a = 2 Argh, please disregard -- I should have read more carefully. -- ~Ethan~ From abarnert at yahoo.com Wed May 15 20:01:55 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 15 May 2013 11:01:55 -0700 (PDT) Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> Message-ID: <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> From: Don Spaulding Sent: Tuesday, May 14, 2013 6:57 PM >On Tue, May 14, 2013 at 5:23 PM, Andrew Barnert wrote: > >On May 14, 2013, at 12:53, Jonathan Eunice wrote: >> >>Using a compatible, separate implementation for?OrderedDict?is a fine way to gracefully extend the language, but it leaves ordering only half-accomodated. Consider: >>>OrderedDict(a=2, b=3, c=7) >>If your proposal is to replace dict with OrderedDict, I think you need at least one use case besides OrderedDict's constructor. > >I don't understand the dismissal of OrderedDict.__init__ as an invalid use case.? It would be a substantial usability improvement to special-case OrderedDict at compile-time purely to get the ability to instantiate odict literals (not that I'm suggesting that). I'm not dismissing it. If it were one of multiple varied use cases, it would contribute to the argument. But if the _only_ use case is something this special (we can't pass an OrderedDict to it, because that's obviously circular), it's not a good argument for a change with wide-ranging effects. >In the interest of moving the discussion forward, I've had a few use cases along these lines.? Let's say I want to create simple HTML elements by hand: > >? ? def create_element(tag, text='', **attributes): >? ??? ? attrs = ['{}="{}"'.format(k,v) for k, v in attributes.items()] >? ??? ? return "<{0} {1}>{2}".format(tag, ' '.join(attrs), text) >??? >? ? print(create_element('img', alt="Some cool stuff.", src="coolstuff.jpg")) >? 
>    <img alt="Some cool stuff." src="coolstuff.jpg">

Well, HTML explicitly assigns no meaning to the order of attributes. And I think this is a symptom of a larger problem. Every month, half a dozen people come to StackOverflow asking how to get an ordered dictionary. Most of them are asking because they want to preserve the order of JSON objects -- which, again, is explicitly defined as unordered. If code relies on the order of HTML attributes, or JSON object members, it's wrong, and it's going to break, and it's better to find that out early.

All that being said, sometimes HTML and JSON are read by humans as well as by software, at least for debugging purposes, and sometimes it's more readable with a specific (or at least consistent and predictable) order. So, the suggestion is definitely not without merit. For some tests, you'll want the order scrambled to make sure you're not incorrectly relying on order, but, on the other hand, for debugging the output, you'll want it ordered to make it more readable.

>It's not that it's impossible to do, it's that dict-based APIs would benefit from the function being able to decide on its own whether or not it cared about the order of arguments. Having to express a kwargs-based or plain-old-dict-based function as a list-of-2-tuples function is... uncool. ;-)

This is an interesting idea. If there were a way for the function to decide what type is used for creating its kwargs, you could do all kinds of cool things -- have that switch I just mentioned that you could turn on or off for different kinds of testing, or preserve order in "debug mode" but leave it arbitrary and as fast as possible in "production mode", or take a blist.sorteddict if you're intending to stash it and use it as the starting point for a blist.sorteddict anyway, or whatever. And it wouldn't affect the 99% of functions that don't care.

The syntax seems pretty obvious:

    def kwargs(mapping_constructor):
        def deco(fn):
            fn.kwargs_mapping_constructor = mapping_constructor
            return fn
        return deco

    @kwargs(OrderedDict)
    def foo(a, b, *args, **kwargs):
        pass

Handling this at the calling site is a bit harder, but still not that hard. And this even solves the special problem of OrderedDict seemingly needing an OrderedDict: just give it a mapping_constructor that creates a list of tuples.

From donspauldingii at gmail.com Wed May 15 21:35:23 2013
From: donspauldingii at gmail.com (Don Spaulding)
Date: Wed, 15 May 2013 14:35:23 -0500
Subject: [Python-ideas] Let's be more orderly!
In-Reply-To: <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com>
References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com>
Message-ID:

On Wed, May 15, 2013 at 1:01 PM, Andrew Barnert wrote:
> From: Don Spaulding
> Sent: Tuesday, May 14, 2013 6:57 PM
>
> >In the interest of moving the discussion forward, I've had a few use
> >cases along these lines. Let's say I want to create simple HTML elements
> >by hand:
> >
> >    def create_element(tag, text='', **attributes):
> >        attrs = ['{}="{}"'.format(k, v) for k, v in attributes.items()]
> >        return "<{0} {1}>{2}".format(tag, ' '.join(attrs), text)
> >
> >    print(create_element('img', alt="Some cool stuff.", src="coolstuff.jpg"))
>
> Well, HTML explicitly assigns no meaning to the order of attributes. And I
> think this is a symptom of a larger problem. Every month, half a dozen
> people come to StackOverflow asking how to get an ordered dictionary. Most
> of them are asking because they want to preserve the order of JSON
> objects -- which, again, is explicitly defined as unordered. If code relies on
> the order of HTML attributes, or JSON object members, it's wrong, and it's
> going to break, and it's better to find that out early.
>
Yes, I'm aware that HTML and JSON are explicit about the fact that order should not matter to parsers.
But just because I know that, and you know that, doesn't mean that the person in charge of developing the XML-based or JSON-based web service I'm trying to write a wrapper for knows that. Twice now I've encountered poorly-written web services that have choked on something like:

    modifyStuff
    user_123456

...with an error to the effect of "Cannot modifyStuff without specifying user credentials". So someone else has built a system around an XML parser that doesn't know that sibling elements aren't guaranteed to appear in any particular order. Obviously the best fix is for them to use a different parser, but my point is that there's no fix available to my function short of writing all calling locations into a list-of-tuples format. My function doesn't care about the order per se, it just doesn't want to *change the order* of the input while it's generating the output.

As another example, the most recent instance where I've wanted to upgrade a regular dict to an odict was when discovering a bug in a coworker's lookup table. It was a table that mapped an employee_type string to a Django queryset to be searched for a particular user_id. Consider this lookup function:

    EMPLOYEE_TYPES = {
        'agent': Agent.objects.all(),
        'staff': Staff.objects.all(),
        'associate': Associate.objects.all()
    }

    def get_employee_type(user_id):
        for typ, queryset in EMPLOYEE_TYPES.items():
            if queryset.filter(user_id=user_id).exists():
                return typ

The bug we hit was because we cared about checking each queryset in the order they were specified. My coworker knows that dicts are unordered, but it slipped his mind while writing this code. Regardless, once you find a bug like this, what are your options for fixing it to process the lookups in a specific order? The first thing you think of is, "Oh, I just need to use an OrderedDict." Well, technically yes, except there's no convenient way to instantiate an OrderedDict with more than one element at a time.
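(A minimal runnable sketch of that OrderedDict fix, with plain strings standing in for the Django querysets, which are made up here:)

```python
from collections import OrderedDict

# Keyword arguments don't reliably preserve order, so the dependable
# fix is to build the table from a list of (key, value) pairs:
EMPLOYEE_TYPES = OrderedDict([
    ('agent', 'Agent queryset'),        # stand-in for Agent.objects.all()
    ('staff', 'Staff queryset'),        # stand-in for Staff.objects.all()
    ('associate', 'Associate queryset'),
])

# Iteration now follows insertion order:
print(list(EMPLOYEE_TYPES))  # ['agent', 'staff', 'associate']
```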
So now you're back to rewriting calling sites into order-preserving lists-of-tuples again. Which is why I think the OrderedDict.__init__ case is in and of itself compelling. ;-) >It's not that it's impossible to do, it's that dict-based API's would > benefit from the function being able to decide on its own whether or not it > cared about the order of arguments. Having to express a kwargs-based or > plain-old-dict-based function as a list-of-2-tuples function is... uncool. > ;-) > > > This is an interesting idea. If there were a way for the function to > decide what type is used for creating its kwargs, you could do all kinds of > cool things?have that switch you could turn on or off I just mentioned for > different kinds of testing, or preserve order in "debug mode" but leave it > arbitrary and as fast as possible in "production mode", or take a > blist.sorteddict if you're intending to stash it and use it as the starting > point for a blist.sorteddict anyway, or whatever. And it wouldn't affect > the 99% of functions that don't care. > > The syntax seems pretty obvious: > > def kwargs(mapping_constructor): > def deco(fn): > fn.kwargs_mapping_constructor = mapping_constructor > return fn > return deco > > @kwargs(OrderedDict) > def foo(a, b, *args, **kwargs): > pass > That's an interesting concept. It would certainly address the most common need I see for better OrderedDict support in the language. > Handling this at the calling site is a bit harder, but still not that hard. > > I don't see how this would require changes to the calling site. Can you elaborate? -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Wed May 15 22:01:41 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 15 May 2013 16:01:41 -0400 Subject: [Python-ideas] Let's be more orderly! 
In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <1368648101.644.140661231533045.2B2BC19A@webmail.messagingengine.com> Hang on, what? Now, maybe that's true of your _particular_ XML spec. And this certainly seems symptomatic of a larger problem (e.g. firing off an event as soon as the action tag closes, rather than parsing the whole document). But XML elements are _not_ unordered. On Wed, May 15, 2013, at 15:35, Don Spaulding wrote: > Twice > now I've encountered poorly-written web services that have choked on > something like: > > > modifyStuff > user_123456 > > > ...with an error to the effect of "Cannot modifyStuff without specifying > user credentials". So someone else has built a system around an XML > parser > that doesn't know that sibling elements aren't guaranteed to appear in > any > particular order. -- Random832 (top-posted because my reply and your message aren't guaranteed to appear in any particular order.) From donspauldingii at gmail.com Wed May 15 22:36:10 2013 From: donspauldingii at gmail.com (Don Spaulding) Date: Wed, 15 May 2013 15:36:10 -0500 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <1368648101.644.140661231533045.2B2BC19A@webmail.messagingengine.com> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368648101.644.140661231533045.2B2BC19A@webmail.messagingengine.com> Message-ID: On Wed, May 15, 2013 at 3:01 PM, wrote: > Hang on, what? > > Now, maybe that's true of your _particular_ XML spec. And this certainly > seems symptomatic of a larger problem (e.g. firing off an event as soon > as the action tag closes, rather than parsing the whole document). But > XML elements are _not_ unordered. > Hmm, when I ran across this particular problem, I recall seeing somewhere that the preservation of element order in the spec is undefined. 
And indeed it appears that while attributes are called explicitly as not having significant order, the spec doesn't actually weigh in on element order one way or another. However, regardless of what's in the spec, it would seem that everyone just assumes element order to be significant anyway, so it doesn't really matter. http://lists.xml.org/archives/xml-dev/200101/msg00841.html > On Wed, May 15, 2013, at 15:35, Don Spaulding wrote: > > Twice > > now I've encountered poorly-written web services that have choked on > > something like: > > > > > > modifyStuff > > user_123456 > > > > > > ...with an error to the effect of "Cannot modifyStuff without specifying > > user credentials". So someone else has built a system around an XML > > parser > > that doesn't know that sibling elements aren't guaranteed to appear in > > any > > particular order. > > -- > Random832 > (top-posted because my reply and your message aren't guaranteed to > appear in any particular order.) > >>> OrderedDict(I='see', what='you', did='there') OrderedDict([('did', 'there'), ('I', 'see'), ('what', 'you')]) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Andy.Henshaw at gtri.gatech.edu Wed May 15 22:43:41 2013 From: Andy.Henshaw at gtri.gatech.edu (Henshaw, Andy) Date: Wed, 15 May 2013 20:43:41 +0000 Subject: [Python-ideas] sqlite3 Message-ID: Very minor nitpick, but can we deprecate the name "sqlite3" in favor of "sqlite"? I'm guessing that the "sqlite3" name is derived from the example program that can manage SQLite files. However, it appears that the proper name of the database system is SQLite (or "sqlite"). When I first started using this module, I remember wondering if there was another version that I should be using. -- Andy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dholth at gmail.com Wed May 15 23:06:32 2013 From: dholth at gmail.com (Daniel Holth) Date: Wed, 15 May 2013 17:06:32 -0400 Subject: [Python-ideas] sqlite3 In-Reply-To: References: Message-ID: The database's command line tool is also called sqlite3. Version 2 was just called sqlite. On Wed, May 15, 2013 at 4:43 PM, Henshaw, Andy wrote: > Very minor nitpick, but can we deprecate the name ?sqlite3? in favor of > ?sqlite?? I?m guessing that the ?sqlite3? name is derived from the example > program that can manage SQLite files. However, it appears that the proper > name of the database system is SQLite (or ?sqlite?). When I first started > using this module, I remember wondering if there was another version that I > should be using. > > > > -- Andy > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From random832 at fastmail.us Wed May 15 23:08:02 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 15 May 2013 17:08:02 -0400 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368648101.644.140661231533045.2B2BC19A@webmail.messagingengine.com> Message-ID: <1368652082.18684.140661231560157.6AF05C96@webmail.messagingengine.com> On Wed, May 15, 2013, at 16:36, Don Spaulding wrote: > Hmm, when I ran across this particular problem, I recall seeing somewhere > that the preservation of element order in the spec is undefined. And > indeed it appears that while attributes are called explicitly as not > having > significant order, the spec doesn't actually weigh in on element order > one > way or another. However, regardless of what's in the spec, it would seem > that everyone just assumes element order to be significant anyway, so it > doesn't really matter. Just remember, HTML is an XML dialect. 
If assumption leads to the conclusion that paragraphs could be randomly shuffled around on a page, it's probably wrong. Now, _specific_ XML specs (i.e. the one that specifically defines "request" for the web service you were using) _can_ be defined in a way that doesn't care about order, and many do, but it's not something about XML itself. From abarnert at yahoo.com Wed May 15 23:27:36 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 15 May 2013 14:27:36 -0700 (PDT) Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> From: Don Spaulding Sent: Wednesday, May 15, 2013 12:35 PM > >On Wed, May 15, 2013 at 1:01 PM, Andrew Barnert wrote: > >From: Don Spaulding >>Sent: Tuesday, May 14, 2013 6:57 PM >> >>>In the interest of moving the discussion forward, I've had a few use cases along these lines.? Let's say I want to create simple HTML elements by hand: >> >>> >>>? ? def create_element(tag, text='', **attributes): >>>? ??? ? attrs = ['{}="{}"'.format(k,v) for k, v in attributes.items()] >>>? ??? ? return "<{0} {1}>{2}".format(tag, ' '.join(attrs), text) >>>??? >>>? ? print(create_element('img', alt="Some cool stuff.", src="coolstuff.jpg")) >>>? ? Some cool stuff. >> >>Well, HTML explicitly assigns no meaning to the order of attributes.?And I think this is a symptom of a larger problem. Every month, half a dozen people come to StackOverflow asking how to get an ordered dictionary. Most of them are asking because they want to preserve the order of JSON objects?which, again, is explicitly defined as unordered.?If code relies on the order of HTML attributes, or JSON object members, it's wrong, and it's going to break, and it's better to find that out early. >> >Yes, I'm aware that HTML and JSON are explicit about the fact that order should not matter to parsers.? 
But just because I know that, and you know that, doesn't mean that the person in charge of developing the XML-based or JSON-based web service I'm trying to write a wrapper for knows that.? Twice now I've encountered poorly-written web services that have choked on something like: I suppose when the other side is poorly-written and out of your control, that's also a legitimate use for ordering, along with human readability for debugging. (Or maybe it's the same case?the human brain cares about order even when you tell it not to, and that's out of your control?) But I think it's another case where maybe it _shouldn't_ be on by default. Explicitly asking for an OrderedDict is a great way of signaling that someone cares about order, whether or not they should, right? >The first thing you think of is, "Oh, I just need to use an OrderedDict.".? Well, technically yes, except there's no convenient way to instantiate an OrderedDict with more than one element at a time.? So now you're back to rewriting calling sites into order-preserving lists-of-tuples again.? Which is why I think the OrderedDict.__init__ case is in and of itself compelling. ? ;-) But if the OrderedDict.__init__ case were the only good case, coming up with some other way to create OrderedDict objects might be a better solution than changing kwargs.?And if the OrderedDict solution automatically solved all of the other cases, that would _also_ mean that solving OrderedDict is what matters, not solving kwargs. You've already given cases that you could solve with Python as it is today, if only you had a good OrderedDict constructor. And, even for the cases that you _can't_ solve today, most of the obvious potential solutions will only work if OrderedDict is a solved problem, because they rely on OrderedDict. odict literals are an obvious example of that. So is my mapping_constructor idea. 
If everyone uses @kwargs(OrderedDict), then OrderedDict has to use @kwargs(_HackyOrderedDictBuilder), which is presumably some class that abuses the mapping protocol by wrapping custom __getitem__ and __setitem__ calls around a list or something.

Or consider this small change to the rules for passing **kwargs. Currently, Python guarantees to build a new dict-like object out of anything you pass, then update it. What if Python instead guaranteed to build a new mapping of the same type (e.g., via copy.copy), then update it in order? Then you could just do this:

    create_element('img', alt="Some cool stuff.", src="coolstuff.jpg", **OrderedDict())

Or take that last change, and also change the syntax to allow specifying default values for *args and **kwargs. Then:

    def create_element(tag, text='', **attributes=OrderedDict()):

And so on. There are tons of possible designs out there that cannot possibly be used for OrderedDict.__init__, but which are trivial for every other use case assuming that OrderedDict.__init__ has already been solved. That's why giving OrderedDict.__init__ as the primary use case is a mistake.

>>The syntax seems pretty obvious:
>>
>>    def kwargs(mapping_constructor):
>>        def deco(fn):
>>            fn.kwargs_mapping_constructor = mapping_constructor
>>            return fn
>>        return deco
>>
>>    @kwargs(OrderedDict)
>>    def foo(a, b, *args, **kwargs):
>>        pass
>
>That's an interesting concept. It would certainly address the most common need I see for better OrderedDict support in the language.
>
>>Handling this at the calling site is a bit harder, but still not that hard.
>
>I don't see how this would require changes to the calling site. Can you elaborate?
Sorry, I think I wasn't clear enough here. For you, as a Python coder, the only change is in defining functions, not calling them. But for the interpreter, there's obviously a change in CALL_FUNCTION (and friends) or somewhere nearby -- wherever it builds a dict out of the keyword arguments that don't match named parameters, it instead has to look up and use the mapping constructor. I meant to talk about the interpreter level, but it ended up sounding like I was talking about the user level.

Anyway, it looks like the simplest implementation in CPython is about 5 one-liner changes in ext_do_call (http://hg.python.org/cpython/file/3.3/Python/ceval.c#l4294) and update_keyword_args (http://hg.python.org/cpython/file/3.3/Python/ceval.c#l4171). In PyPy, if I remember correctly, it would be a 1-liner change in the standard argument factory function. I don't know about other implementations, but I doubt they'd be much worse.

Thinking about the implementation raises some points about the interface. CPython (with the simplest changes) will always call your constructor with no parameters, and then set the items one by one. So, maybe don't require any more than empty-construction, __setitem__, and __getitem__, instead of a fancy constructor and the full MutableMapping protocol. Alternatively, PyPy's argument factory is already more flexible; maybe require that as part of the language?

From abarnert at yahoo.com Wed May 15 23:31:03 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 15 May 2013 14:31:03 -0700 (PDT)
Subject: [Python-ideas] Let's be more orderly!
In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368648101.644.140661231533045.2B2BC19A@webmail.messagingengine.com>
Message-ID: <1368653463.17995.YahooMailNeo@web184701.mail.ne1.yahoo.com>

random832 is right, XML elements are ordered.
But this is a tangent.?HTML attributes, JSON objects, and plenty of other types are _not_ ordered. So, even if your example is bad, you can trivially sub it into an equivalent example that's good. >________________________________ > From: Don Spaulding >To: random832 at fastmail.us >Cc: python-ideas >Sent: Wednesday, May 15, 2013 1:36 PM >Subject: Re: [Python-ideas] Let's be more orderly! > > > > > > > > >On Wed, May 15, 2013 at 3:01 PM, wrote: > >Hang on, what? >> >>Now, maybe that's true of your _particular_ XML spec. And this certainly >>seems symptomatic of a larger problem (e.g. firing off an event as soon >>as the action tag closes, rather than parsing the whole document). But >>XML elements are _not_ unordered. >> > > >Hmm, when I ran across this particular problem, I recall seeing somewhere that the preservation of element order in the spec is undefined.? And indeed it appears that while attributes are called explicitly as not having significant order, the spec doesn't actually weigh in on element order one way or another.? However, regardless of what's in the spec, it would seem that everyone just assumes element order to be significant anyway, so it doesn't really matter. > >http://lists.xml.org/archives/xml-dev/200101/msg00841.html > > > >>On Wed, May 15, 2013, at 15:35, Don Spaulding wrote: >>> Twice >>> now I've encountered poorly-written web services that have choked on >>> something like: >>> >>> >>> ? modifyStuff >>> ? user_123456 >>> >>> >>> ?...with an error to the effect of "Cannot modifyStuff without specifying >>> user credentials". ?So someone else has built a system around an XML >>> parser >>> that doesn't know that sibling elements aren't guaranteed to appear in >>> any >>> particular order. >> >>-- >>Random832 >>(top-posted because my reply and your message aren't guaranteed to >>appear in any particular order.) >> >? 
>>>> OrderedDict(I='see', what='you', did='there') >OrderedDict([('did', 'there'), ('I', 'see'), ('what', 'you')]) >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >http://mail.python.org/mailman/listinfo/python-ideas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at 2sn.net Thu May 16 00:01:38 2013 From: python at 2sn.net (Alexander Heger) Date: Thu, 16 May 2013 08:01:38 +1000 Subject: [Python-ideas] sqlite3 In-Reply-To: References: Message-ID: The same way the movie was called "The Madness of King George" and not the proper " ... King George III" because the American audience would wonder whether they missed Parts I and II. On Thu, May 16, 2013 at 6:43 AM, Henshaw, Andy wrote: > Very minor nitpick, but can we deprecate the name ?sqlite3? in favor of > ?sqlite?? I?m guessing that the ?sqlite3? name is derived from the example > program that can manage SQLite files. However, it appears that the proper > name of the database system is SQLite (or ?sqlite?). When I first started > using this module, I remember wondering if there was another version that I > should be using. > > > > -- Andy > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From stephen at xemacs.org Thu May 16 03:08:07 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 16 May 2013 10:08:07 +0900 Subject: [Python-ideas] sqlite3 In-Reply-To: References: Message-ID: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> Henshaw, Andy writes: > Very minor nitpick, but can we deprecate the name "sqlite3" in > favor of "sqlite"? It's probably not a great idea. Two of the OS distributions I have installed (Debian and MacPorts) have both sqlite2 and sqlite3. 
Another (Gentoo) doesn't provide sqlite2 at all AFAICS, but is inconsistent about naming: some ebuild names (including the main library itself) call it "sqlite", but equally many refer to "sqlite3". sqlite3 may be improper, but it seems to be the unambiguous name. From abarnert at yahoo.com Thu May 16 03:57:10 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 15 May 2013 18:57:10 -0700 Subject: [Python-ideas] sqlite3 In-Reply-To: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <6C71BDDC-6AFD-4258-87D2-0554D231E324@yahoo.com> The SQLite naming scheme shouldn't be that baffling, since it's pretty much PEP 394 s/python/sqlite/g. Sent from a random iPhone On May 15, 2013, at 18:08, "Stephen J. Turnbull" wrote: > Henshaw, Andy writes: > >> Very minor nitpick, but can we deprecate the name "sqlite3" in >> favor of "sqlite"? > > It's probably not a great idea. Two of the OS distributions I have > installed (Debian and MacPorts) have both sqlite2 and sqlite3. > Another (Gentoo) doesn't provide sqlite2 at all AFAICS, but is > inconsistent about naming: some ebuild names (including the main > library itself) call it "sqlite", but equally many refer to "sqlite3". > > sqlite3 may be improper, but it seems to be the unambiguous name. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From greg.ewing at canterbury.ac.nz Thu May 16 04:26:46 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 May 2013 14:26:46 +1200 Subject: [Python-ideas] Let's be more orderly! 
In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <519443E6.2040406@canterbury.ac.nz> On 16/05/13 07:35, Don Spaulding wrote: > So someone else has built a system around an XML parser that > doesn't know that sibling elements aren't guaranteed to appear in any > particular order. Are you *sure* those elements aren't required to appear in a particular order? It depends on how the DTD is written. The parser may actually be doing the right thing based on the DTD it was given or based on. From http://www.w3schools.com/dtd/dtd_elements.asp: Elements with Children (sequences) Elements with one or more children are declared with the name of the children elements inside parentheses: When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. -- Greg From cs at zip.com.au Thu May 16 04:24:36 2013 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 16 May 2013 12:24:36 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <20130516022436.GA86816@cskk.homeip.net> On 14May2013 20:00, Jan Kaliszewski wrote: | 14.05.2013 19:24, Mark Dickinson wrote: | | >>? ? ? ? ? ? ? ? ? ? raise errors.DataError( | >>? ? ? ? ? ? ? ? ? ? ? ? 'Inconsistent revenue item currency: ' | >>? ? ? ? ? ? ? ? ? ? ? ? 'transaction=%r; transaction_position=%r' % | >>? ? ? ? ? ? ? ? ? ? ? ? (transaction, transaction_position)) | > | >Agreed. ?I use the implicit concatenation a lot for exception | >messages like the one above | | Me too. | | But what do you think about: | | raise errors.DataError( | 'Inconsistent revenue item currency: ' | c'transaction=%r; transaction_position=%r' % | (transaction, transaction_position)) | | c'...' -- for explicit string (c)ontinuation or (c)oncatenation. I'm -1 on it myself. 
I'd expect c'' to act like b'' or u'' or r'': making a "string"-ish thing in a special way. But c'' doesn't; the nearest analog is r'' but c'' goes _backwards_. I much prefer: + 'foo' over c'foo' The former already works and is perfectly clear about what it's doing. The "c" does not do it any better and is easier to miss, visually. Cheers, -- Cameron Simpson On the contrary of what you may think, your hacker is fully aware of your company's dress code. He is fully aware of the fact that it doesn't help him to do his job. - Gregory Hosler From abarnert at yahoo.com Thu May 16 04:54:10 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 15 May 2013 19:54:10 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <20130516022436.GA86816@cskk.homeip.net> References: <20130516022436.GA86816@cskk.homeip.net> Message-ID: On May 15, 2013, at 19:24, Cameron Simpson wrote: > I much prefer: > + 'foo' > over > c'foo' I agree, but this doesn't solve the precedence problem that everyone keeps bringing up. Summarizing (more in hopes that someone will correct me if I've missed something important than to help you or anyone else...): Implicit concatenation is bad because you often use it accidentally when you intended a comma. A rule only allowing implicit concatenation on separate lines doesn't help because both legit and accidental uses are usually on separate lines. There's no way a compiler or linter could help, because there's no programmatic way to distinguish good from bad uses: log("long log message with {} " "and {}", "one arg" "and another") Using + doesn't work because of operator precedence vs. % and .: print("long log message with {} " + "and {}".format("one arg", "and another")) Using an explicit dedent or similar method call doesn't work because the performance is unacceptable. Automatically optimizing the dedent call at compile time doesn't work because sometimes you need it to be at run time. 
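(To make the precedence point concrete, here's a sketch with made-up format arguments; with `+`, the `.format` call binds only to the second literal:)

```python
# Implicit concatenation joins the literals first, then formats the whole string:
implicit = ("value is {} " "and {}").format(1, 2)

# With +, .format binds to the second literal only, so the first {} survives
# unformatted and the extra argument is silently ignored:
explicit = "value is {} " + "and {}".format(1, 2)

print(implicit)  # value is 1 and 2
print(explicit)  # value is {} and 1
```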
Assuming all of those givens are true, it seems inescapable that either we need some new syntax, or we have to just accept the problem. From steve at pearwood.info Thu May 16 05:16:01 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 16 May 2013 13:16:01 +1000 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <51944F71.9080507@pearwood.info> On 16/05/13 07:27, Andrew Barnert wrote: >> The first thing you think of is, "Oh, I just need to use an OrderedDict.". Well, technically yes, except there's no convenient way to instantiate an OrderedDict with more than one element at a time. There's not *that* much difference between writing: OrderedDict(a=1, b=2, c=3) # doesn't work as expected and OrderedDict([('a', 1), ('b', 2), ('c', 3)]) that justifies creating OrderedDicts the hard way: d = OrderedDict() d['a'] = 1 d['b'] = 2 d['c'] = 3 as earlier suggested. Yes, it would be nice to use the first version, but it is a Nice To Have, not a Must Have. > So now you're back to rewriting calling sites into order-preserving lists-of-tuples again. Which is why I think the OrderedDict.__init__ case is in and of itself compelling. ;-) OrderedDicts are important, but they aren't important enough to impose their requirements on the entire language. * I've already mentioned the risk of performance costs. Most applications of dicts do not care about order. Imposing performance costs on those applications in order to satisfy a few that do is probably a bad trade off, unless those costs are trivial. * We're not just talking about CPython here. 
Anything that is part of the language must be applicable to all Python implementations, not just the big four (CPython, Jython, IronPython, PyPy) but all the little ones as well. Even if CPython adopts Raymond Hettinger's dict optimization that keeps order as a side-effect, do we really want to make that a language requirement? (I'm not saying that we should, or shouldn't, but only that the stakes are bigger than just CPython.) What's important is not just the magnitude of the changes necessary to make kwargs ordered, but the possible implementations that may be ruled out. It is possible for a language to over-specify features as well as under-specify, and we should be cautious about doing so. [...] > Or consider this small change to the rules for passing **kwargs. Currently, Python guarantees to build a new dict-like object out of anything you pass, then update it. What if Python instead guaranteed to build a new mapping of the same type (e.g., via copy.copy), then update it in order? Then you could just do this: > > create_element('img', alt="Some cool stuff.", src="coolstuff.jpg", **OrderedDict()) I can't help but feel that if order of keyword arguments is important, you should take an ordered dict as an explicit argument rather than accept keyword arguments. Given: def create_element(tag, alt, src): pass even if kwargs become ordered in some way, how will your create_element function distinguish between these two calls? create_element('img', alt='something', src='something.jpg') create_element('img', src='something.jpg', alt='something') I don't believe it can. Hence, when order is important, you cannot use keyword arguments to provide arguments *even if kwargs are ordered*. But if you write your function like this: def create_element(tag, mapping): pass and call it like this: create_element('img', OrderedDict([('alt', 'something'), ('src', 'something.jpg')])) then you can get order for free. 
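Steven's mapping-based alternative can be sketched end to end; `create_element` here is a minimal hypothetical illustration, not a real API. Note that on 2013-era CPython the `OrderedDict(alt=..., src=...)` form could not guarantee order, because keyword arguments arrived as a plain unordered dict, so the sequence-of-pairs spelling was the reliable one:

```python
from collections import OrderedDict

def create_element(tag, mapping):
    # Emit attributes in whatever order the mapping yields them.
    attrs = ' '.join('{}="{}"'.format(k, v) for k, v in mapping.items())
    return '<{} {}>'.format(tag, attrs)

# A sequence of (key, value) pairs preserves order by construction:
elem = create_element('img', OrderedDict([('alt', 'something'),
                                          ('src', 'something.jpg')]))
assert elem == '<img alt="something" src="something.jpg">'
```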
Yes, it's a little less convenient to use a list of tuples than nice keyword syntax, but that's a solution that doesn't impose any costs on code that doesn't care about ordering. For what it's worth, I'm +0 on specifying that dicts must keep creation order unless items are deleted. I'm -1 on making OrderedDict the default dictionary type. -- Steven From steve at pearwood.info Thu May 16 05:24:28 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 16 May 2013 13:24:28 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130516022436.GA86816@cskk.homeip.net> Message-ID: <5194516C.5020709@pearwood.info> On 16/05/13 12:54, Andrew Barnert wrote: > Summarizing (more in hopes that someone will correct me if I've missed something important than to help you or anyone else...): > > Implicit concatenation is bad because you often use it accidentally when you intended a comma. For some definition of "often". If I've ever made this error, it was so long ago, and so trivially fixed, that I don't remember it. > There's no way a compiler or linter could help, because there's no programmatic way to distinguish good from bad uses: Of course they can *help*. Linters can flag the use of implicit concatenation, and leave it up to the user to decide. That's helping. If you're like me, and use implicit concatenation frequently with few or no problems, then you'll configure the linter to skip the warning. If you're one of the people who rarely or never uses it deliberately, or you work for Google where it goes against their in-house style guide, then you'll tell the linter to treat it as an error. I think that this is the sort of issue that linters are designed to solve. -- Steven From haoyi.sg at gmail.com Thu May 16 05:50:51 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 15 May 2013 23:50:51 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <5194516C.5020709@pearwood.info> References: <20130516022436.GA86816@cskk.homeip.net> <5194516C.5020709@pearwood.info> Message-ID: To be fair, I've made this mistake twice in the last week. It's trivially fixed once you find it, but a missing , is pretty small and easy to miss! It caused a fair amount of head scratching. I don't believe in the "kick it to the linter" solution, since that's basically a non-solution (don't know if it should be good or bad, so let someone else decide!). Until we get some @I_Know_What_Im_Doing decorator so that, in the source code, we can tell the linter to ignore things, it's just going to pop up every time and get in people's way and add to the lint-spam that accompanies most major projects. Somewhat unrelated, but have any linters managed to solve this issue, whether by storing a fine-grained stuff-to-be-ignored list in in-code pragmas or in a separate .linter_ignored file that somehow works while line numbers are constantly changing and such? On Wed, May 15, 2013 at 11:24 PM, Steven D'Aprano wrote: > On 16/05/13 12:54, Andrew Barnert wrote: > > Summarizing (more in hopes that someone will correct me if I've missed >> something important than to help you or anyone else...): >> >> Implicit concatenation is bad because you often use it accidentally when >> you intended a comma. >> > > For some definition of "often". > > If I've ever made this error, it was so long ago, and so trivially fixed, > that I don't remember it. > > > There's no way a compiler or linter could help, because there's no >> programmatic way to distinguish good from bad uses: >> > > > Of course they can *help*. Linters can flag the use of implicit > concatenation, and leave it up to the user to decide. That's helping. > > If you're like me, and use implicit concatenation frequently with few or > no problems, then you'll configure the linter to skip the warning.
If > you're one of the people who rarely or never uses it deliberately, or you > work for Google where it goes against their in-house style guide, then > you'll tell the linter to treat it as an error. > > > I think that this is the sort of issue that linters are designed to solve. > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wuwei23 at gmail.com Thu May 16 07:01:04 2013 From: wuwei23 at gmail.com (alex23) Date: Wed, 15 May 2013 22:01:04 -0700 (PDT) Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130516022436.GA86816@cskk.homeip.net> <5194516C.5020709@pearwood.info> Message-ID: On May 16, 1:50 pm, Haoyi Li wrote: > I don't believe in the "kick it to the linter" solution, since that's > basically a non-solution (don't know if it should be good or bad, so let > someone else decide!). No, it's a "let the developer decide for themselves whether it's an issue" solution. > Until we get some @I_Know_What_Im_Doing decorator so > that, in the source code, we can tell the linter to ignore things, it's > just going to pop up every time and get in peoples way http://docs.pylint.org/faq.html#message-control From markus at unterwaditzer.net Thu May 16 07:18:04 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Thu, 16 May 2013 07:18:04 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <3db40634-e1bc-418c-b705-9b02c2e6bcb0@email.android.com> Guido van Rossum wrote: >I just spent a few minutes staring at a bug caused by a missing comma >-- I got a mysterious argument count error because instead of foo('a', >'b') I had written foo('a' 'b').
> >This is a fairly common mistake, and IIRC at Google we even had a lint >rule against this (there was also a Python dialect used for some >specific purpose where this was explicitly forbidden). > >Now, with modern compiler technology, we can (and in fact do) evaluate >compile-time string literal concatenation with the '+' operator, so >there's really no reason to support 'a' 'b' any more. (The reason was >always rather flimsy; I copied it from C but the reason why it's >needed there doesn't really apply to Python, as it is mostly useful >inside macros.) > >Would it be reasonable to start deprecating this and eventually remove >it from the language? Not sure why nobody mentioned it yet, maybe it's obviously not helping in this situation, but... What if such multi-line strings have to have their own set of parens around them? Valid: do_foo( ("foo" "bar"), "baz" ) Invalid: do_foo( "foo" "bar", "baz" ) -- Markus (from phone) From abarnert at yahoo.com Thu May 16 07:51:52 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 15 May 2013 22:51:52 -0700 (PDT) Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <51944F71.9080507@pearwood.info> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> Message-ID: <1368683512.98116.YahooMailNeo@web184706.mail.ne1.yahoo.com> From: Steven D'Aprano Sent: Wednesday, May 15, 2013 8:16 PM > On 16/05/13 07:27, Andrew Barnert wrote: > >>> The first thing you think of is, "Oh, I just need to use an > OrderedDict.". Well, technically yes, except there's no convenient way > to instantiate an OrderedDict with more than one element at a time. > > There's not *that* much difference between writing: You're quoting me quoting someone else (Don Spaulding) here.
The problem may be that I'm using the horrible Yahoo webmail client, which is especially bad at indenting replies to rich-text emails, and therefore it's hard for you to tell what's going on? But I think this led to some confusion farther down. >> Or consider this small change to the rules for passing **kwargs. Currently, > Python guarantees to build a new dict-like object out of anything you pass, then > update it. What if Python instead guaranteed to build a new mapping of the same > type (e.g., via copy.copy), then update it in order? Then you could just do > this: >> >> create_element('img', alt="Some cool stuff.", > src="coolstuff.jpg", **OrderedDict()) > > I can't help but feel that if order of keyword arguments is important, you > should take an ordered dict as an explicit argument rather than accept keyword > arguments. I tossed out as wide a variety of solutions as I could come up with, to show that almost anything you come up with either works only if it doesn't have to work for OrderedDict.__init__, or at least gets a lot easier if it doesn't have to work for OrderedDict.__init__. The one you're replying to is the last, and probably worst, of those spitballed ideas. I certainly wasn't proposing that we actually do it. Anyway, my point is this: If the goal is to solve ordered kwargs, don't try to make that solution work for OrderedDict.__init__ (so we can use OrderedDict as part of the solution). Alternatively, if the goal is to improve OrderedDict construction, don't try to do so by solving ordered kwargs.
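One of those spitballs — requiring `**` to preserve the mapping type — really would be a semantic change: the `**` mechanism always hands the callee a plain dict, regardless of what the caller unpacks. A quick check:

```python
from collections import OrderedDict

def probe(**kwargs):
    # Report what type the ** mechanism actually delivered.
    return type(kwargs)

# The OrderedDict is copied into a plain dict before the call;
# its type does not survive ** unpacking.
assert probe(**OrderedDict([('a', 1), ('b', 2)])) is dict
```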
To be clear, going over my spitballed ideas and those earlier in the thread: I'm -0 on having a map constructor attribute/slot for functions, -0.5 on a PyPy-style argument factory attribute/slot, -0 on adding odict literals with some new syntax, -1 on adding odict literals if they look like Python 3.3 OrderedDict constructor calls, -1 on requiring kwargs to preserve the type it's handed, -1 on allowing default values for *args and **kwargs, -1 on making OrderedDict the default dictionary type, -1 on making it the type for kwargs, -0.5 on specifying that dicts must keep creation order unless items are deleted, -1 for making that change in CPython without specifying it as part of the language. From abarnert at yahoo.com Thu May 16 07:54:53 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 15 May 2013 22:54:53 -0700 (PDT) Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <3db40634-e1bc-418c-b705-9b02c2e6bcb0@email.android.com> References: <3db40634-e1bc-418c-b705-9b02c2e6bcb0@email.android.com> Message-ID: <1368683693.27670.YahooMailNeo@web184703.mail.ne1.yahoo.com> From: Markus Unterwaditzer Sent: Wednesday, May 15, 2013 10:18 PM > Not sure why nobody mentioned it yet, maybe it's obviously not helping in > this situation, but... > > What if such multi-line strings have to have their own set of parens around > them? > > Valid: > do_foo( > ("foo" > "bar"), > "baz" > ) > > Invalid: > do_foo( > "foo" > "bar", > "baz" > ) As I understand it, the main reason people didn't like Guido's suggestion of "just use +" was that (because of operator precedence) they'd sometimes have to add parentheses that are unnecessary today. So, I'm betting it will be just as unpopular with the same people. Personally, I don't dislike it. But then I don't dislike the "just use +" answer either.
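The precedence objection behind "just use +" is concrete: implicit concatenation happens at tokenization time, before any operator applies, whereas `+` binds more loosely than attribute access, so a trailing `.format()` attaches to the last literal only:

```python
# Implicit concatenation: .format() sees the whole template.
good = "message with {} " "and {}".format("one", "two")
assert good == "message with one and two"

# With '+', .format() applies only to the second literal (str.format
# silently ignores extra positional arguments), so the first
# placeholder is never filled:
bad = "message with {} " + "and {}".format("one", "two")
assert bad == "message with {} and one"
```

The fix is the extra parentheses the thread keeps mentioning: `("message with {} " + "and {}").format(...)`.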
From abarnert at yahoo.com Thu May 16 08:06:10 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 15 May 2013 23:06:10 -0700 (PDT) Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <5194516C.5020709@pearwood.info> References: <20130516022436.GA86816@cskk.homeip.net> <5194516C.5020709@pearwood.info> Message-ID: <1368684370.87363.YahooMailNeo@web184705.mail.ne1.yahoo.com> From: Steven D'Aprano To: python-ideas at python.org >> Implicit concatenation is bad because you often use it accidentally when > you intended a comma. > > For some definition of "often". Well, yes. But Guido says he makes this mistake often, and others agree with him, and the whole discussion wouldn't have come up if it weren't a problem. So, we're still left with the conclusion: >> There's no way a compiler or linter could help, because there's no > programmatic way to distinguish good from bad uses: > > Of course they can *help*. Linters can flag the use of implicit concatenation, > and leave it up to the user to decide. That's helping. You're right; let me rephrase. There's no way a compiler could help, and a linter can mitigate but not solve the problem. Which means the conclusion is actually: > "Assuming all of those givens are true, it seems inescapable that either we need some new syntax, or we have to just accept the problem" ... (with some help from linters). I should also clarify that "accept the problem" could either mean "ban implicit concatenation" (as Guido initially suggested) or "leave implicit concatenation alone", so it's really 3 choices, not 2. Does that sound fair now? From bruce at leapyear.org Thu May 16 08:11:54 2013 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 15 May 2013 23:11:54 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: <1368683693.27670.YahooMailNeo@web184703.mail.ne1.yahoo.com> References: <3db40634-e1bc-418c-b705-9b02c2e6bcb0@email.android.com> <1368683693.27670.YahooMailNeo@web184703.mail.ne1.yahoo.com> Message-ID: On May 15, 2013 10:57 PM, "Andrew Barnert" wrote: > As I understand it, the main reason people didn't like Guido's suggestion of "just use +" was that (because of operator precedence) they'd sometimes have to add parentheses that are unnecessary today. So, I'm betting it will be just as unpopular with the same people. The difference between requiring parens around implicit concatenation and around uses of + is that leaving the parens out would be a syntax error in the first case, but would silently do the wrong thing in the second. --- Bruce (from my phone) -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Thu May 16 09:08:28 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 16 May 2013 10:08:28 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: 10.05.13 21:48, Guido van Rossum wrote: > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b'). > > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). > > Now, with modern compiler technology, we can (and in fact do) evaluate > compile-time string literal concatenation with the '+' operator, so > there's really no reason to support 'a' 'b' any more. (The reason was > always rather flimsy; I copied it from C but the reason why it's > needed there doesn't really apply to Python, as it is mostly useful > inside macros.)
As was said before, the '+' operator has lower priority than the '%' operator and attribute access, i.e. it requires parentheses in some cases. However parentheses introduce noise and can cause other types of errors. In all cases only multiline implicit string literal concatenation causes problems. What if we forbid implicit string literal concatenation only between string literals on different physical lines? A deliberate string literal concatenation can be made with explicit line joining. raise ValueError('Type names and field names must be valid '\ 'identifiers: %r' % name) raise ValueError('{} not bottom-level directory in '\ '{!r}'.format(_PYCACHE, path)) ignore_patterns = ( 'Function "%s" not defined.' % breakpoint, "warning: no loadable sections found in added symbol-file"\ " system-supplied DSO", "warning: Unable to find libthread_db matching"\ " inferior's thread library, thread debugging will"\ " not be available.", "warning: Cannot initialize thread debugging"\ " library: Debugger service failed", 'warning: Could not load shared library symbols for '\ 'linux-vdso.so', 'warning: Could not load shared library symbols for '\ 'linux-gate.so', 'Do you need "set solib-search-path" or '\ '"set sysroot"?', ) I think this introduces less noise than the '+' operator or other proposed alternatives. From storchaka at gmail.com Thu May 16 09:20:00 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 16 May 2013 10:20:00 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: 10.05.13 21:48, Guido van Rossum wrote: > I just spent a few minutes staring at a bug caused by a missing comma > -- I got a mysterious argument count error because instead of foo('a', > 'b') I had written foo('a' 'b').
> > This is a fairly common mistake, and IIRC at Google we even had a lint > rule against this (there was also a Python dialect used for some > specific purpose where this was explicitly forbidden). Could you please run this lint rule against the Python sources? I found at least one bug in Tools/scripts/abitype.py: typeslots = [ 'tp_name', 'tp_basicsize', ... 'tp_subclasses', 'tp_weaklist', 'tp_del' 'tp_version_tag' ] http://bugs.python.org/issue17993 From Andy.Henshaw at gtri.gatech.edu Thu May 16 14:18:44 2013 From: Andy.Henshaw at gtri.gatech.edu (Henshaw, Andy) Date: Thu, 16 May 2013 12:18:44 +0000 Subject: [Python-ideas] sqlite3 In-Reply-To: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: From: Stephen J. Turnbull [mailto:stephen at xemacs.org] > Henshaw, Andy writes: >> Very minor nitpick, but can we deprecate the name "sqlite3" in > favor of "sqlite"? > It's probably not a great idea. Two of the OS distributions I have installed > (Debian and MacPorts) have both sqlite2 and sqlite3. > Another (Gentoo) doesn't provide sqlite2 at all AFAICS, but is inconsistent about naming: > some ebuild names (including the main library itself) call it "sqlite", but equally > many refer to "sqlite3". > sqlite3 may be improper, but it seems to be the unambiguous name. But, Python only ships one sqlite module. If there were two versions of the module and the developer needed to choose, then it would make sense. Why isn't there a tkinter8 to indicate with which version of tcl/tk the module is designed to operate? It seems like we're just propagating an awkward, vestigial, implementation detail for no good reason. From mal at egenix.com Thu May 16 14:30:39 2013 From: mal at egenix.com (M.-A.
Lemburg) Date: Thu, 16 May 2013 14:30:39 +0200 Subject: [Python-ideas] sqlite3 In-Reply-To: References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <5194D16F.5060408@egenix.com> On 16.05.2013 14:18, Henshaw, Andy wrote: > From: Stephen J. Turnbull [mailto:stephen at xemacs.org] >> Henshaw, Andy writes: > >>> Very minor nitpick, but can we deprecate the name "sqlite3" in > favor of "sqlite"? > >> It's probably not a great idea. Two of the OS distributions I have installed >> (Debian and MacPorts) have both sqlite2 and sqlite3. >> Another (Gentoo) doesn't provide sqlite2 at all AFAICS, but is inconsistent about naming: >> some ebuild names (including the main library itself) call it "sqlite", but equally >> many refer to "sqlite3". > >> sqlite3 may be improper, but it seems to be the unambiguous name. > > But, Python only ships one sqlite module. If there were two versions of the module > and the developer needed to choose, then it would make sense. Why isn't > there a tkinter8 to indicate with which version of tcl/tk the module is > designed to operate? It seems like we're just propagating an awkward, vestigial, > implementation detail for no good reason. sqlite3 requires version 3.x of sqlite. It is not compatible with version 2.x of sqlite due to API changes in sqlite, hence the version number in the name. Here's the thread discussing the addition: http://mail.python.org/pipermail/python-dev/2006-March/062905.html -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 16 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... 
http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From Andy.Henshaw at gtri.gatech.edu Thu May 16 15:41:28 2013 From: Andy.Henshaw at gtri.gatech.edu (Henshaw, Andy) Date: Thu, 16 May 2013 13:41:28 +0000 Subject: [Python-ideas] sqlite3 In-Reply-To: <5194D16F.5060408@egenix.com> References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> <5194D16F.5060408@egenix.com> Message-ID: From: M.-A. Lemburg [mailto:mal at egenix.com] >On 16.05.2013 14:18, Henshaw, Andy wrote: >> From: Stephen J. Turnbull [mailto:stephen at xemacs.org] >> Henshaw, Andy writes: ... >> >> But, Python only ships one sqlite module. If there were two versions >> of the module and the developer needed to choose, then it would make >> sense. Why isn't there a tkinter8 to indicate with which version of >> tcl/tk the module is designed to operate? It seems like we're just >> propagating an awkward, vestigial, implementation detail for no good reason. > sqlite3 requires version 3.x of sqlite. It is not compatible with version 2.x > of sqlite due to API changes in sqlite, hence the version number in the name. > Here's the thread discussing the addition: > http://mail.python.org/pipermail/python-dev/2006-March/062905.html Interesting. It appears that they took the first name suggested, although there were multiple suggestions of just "sqlite" variants (e.g., db.sqlite or database.sqlite). Really, this is not important enough to continue arguing about, so I intend that this will be my last post on the subject. However, I don't think that my point has been addressed. Python only ships one version of the module, so there is no important reason to append a version number to the module name. 
It's a convention not done for other modules that wrap libraries, excepting maybe bzip2(bz2) and MD5, but those are the formal names (according to Wikipedia), and it should be deprecated for sqlite. From ethan at stoneleaf.us Thu May 16 15:50:23 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 16 May 2013 06:50:23 -0700 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <51944F71.9080507@pearwood.info> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> Message-ID: <5194E41F.6000006@stoneleaf.us> On 05/15/2013 08:16 PM, Steven D'Aprano wrote: > > I don't believe it can. Hence, when order is important, you cannot use keyword arguments to provide arguments *even if > kwargs are ordered*. But if you write your function like this: > > def create_element(tag, mapping): > pass > > and call it like this: > > create_element('img', OrderedDict([('alt', 'something'), ('src', 'something.jpg')])) > > then you can get order for free. Yes, it's a little less convenient to use a list of tuples than nice keyword syntax, > but that's a solution that doesn't impose any costs on code that doesn't care about ordering. Which 'free' are you talking about? Because if the solution requires extra typing and extra visual clutter, it's not free. 
-- ~Ethan~ From g.rodola at gmail.com Thu May 16 15:51:06 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Thu, 16 May 2013 15:51:06 +0200 Subject: [Python-ideas] Provide a more informative message on socket's bind() and connect*() Message-ID: Consider the following: >>> import socket >>> s = socket.socket() >>> addr = ('', 8080) >>> s.bind(addr) Traceback (most recent call last): File "foo.py", line 29, in s.bind(addr) File "/usr/lib/python2.7/socket.py", line 224, in meth return getattr(self._sock,name)(*args) socket.error: [Errno 98] Address already in use Problem here is that the information about the address passed to bind() is lost. While playing with Tulip I noticed some effort was put into providing a more informative message: https://code.google.com/p/tulip/source/browse/tulip/base_events.py?spec=svne05013a5516da73c97d9a11ece79283839e41bd0&r=f400984a064869a9326f5159ce7f6182087cb926#292 https://code.google.com/p/tulip/source/browse/tulip/base_events.py?spec=svne05013a5516da73c97d9a11ece79283839e41bd0&r=f400984a064869a9326f5159ce7f6182087cb926#449 I thought that maybe it makes sense to do that straight into the socket module by either overriding bind() (and also connect() and connect_ex()) or perhaps by providing a brand new PyErr_SetFromErrnoWithMsgObject() which can also be used elsewhere. Thoughts? --- Giampaolo -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Thu May 16 16:40:07 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 16 May 2013 15:40:07 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: References: Message-ID: <5194EFC7.4060409@mrabarnett.plus.com> On 16/05/2013 08:08, Serhiy Storchaka wrote: > 10.05.13 21:48, Guido van Rossum wrote: >> I just spent a few minutes staring at a bug caused by a missing comma >> -- I got a mysterious argument count error because instead of foo('a', >> 'b') I had written foo('a' 'b'). >> >> This is a fairly common mistake, and IIRC at Google we even had a lint >> rule against this (there was also a Python dialect used for some >> specific purpose where this was explicitly forbidden). >> >> Now, with modern compiler technology, we can (and in fact do) evaluate >> compile-time string literal concatenation with the '+' operator, so >> there's really no reason to support 'a' 'b' any more. (The reason was >> always rather flimsy; I copied it from C but the reason why it's >> needed there doesn't really apply to Python, as it is mostly useful >> inside macros.) > > As was said before the '+' operator has less priority than the '%' > operator and an attribute access, i.e. it requires parenthesis in some > cases. However parenthesis introduce a noise and can cause other types > of errors. > [snip] I wonder whether we could use ".". Or would that be too confusing? > In all cases only multiline implicit string literal concatenation cause > problem. What if forbid implicit string literal concatenation only > between string literals on different physical lines? A deliberate string > literal concatenation can be made with explicit line joining. > > raise ValueError('Type names and field names must be valid '\ > 'identifiers: %r' % name) > raise ValueError('Type names and field names must be valid ' . 'identifiers: %r' % name) > > raise ValueError('{} not bottom-level directory in '\ > '{!r}'.format(_PYCACHE, path)) > raise ValueError('{} not bottom-level directory in ' .
'{!r}'.format(_PYCACHE, path)) From stefan at drees.name Thu May 16 16:43:14 2013 From: stefan at drees.name (Stefan Drees) Date: Thu, 16 May 2013 16:43:14 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <5194EFC7.4060409@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> Message-ID: <5194F082.7050100@drees.name> On 16.05.13 16:40, MRAB wrote: > On 16/05/2013 08:08, Serhiy Storchaka wrote: >> 10.05.13 21:48, Guido van Rossum wrote: >>> [snip + snip] >> > [snip] > I wonder whether we could use ".". Or would that be too confusing? ... that is interesting (with respect to php -> python porting :-) I will take a seat and wait for the thread to evolve based on your dot. All the best, Stefan. From brett at python.org Thu May 16 16:43:03 2013 From: brett at python.org (Brett Cannon) Date: Thu, 16 May 2013 10:43:03 -0400 Subject: [Python-ideas] sqlite3 In-Reply-To: References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> <5194D16F.5060408@egenix.com> Message-ID: On Thu, May 16, 2013 at 9:41 AM, Henshaw, Andy wrote: > From: M.-A. Lemburg [mailto:mal at egenix.com] >>On 16.05.2013 14:18, Henshaw, Andy wrote: >>> From: Stephen J. Turnbull [mailto:stephen at xemacs.org] >>> Henshaw, Andy writes: > ... >>> >>> But, Python only ships one sqlite module. If there were two versions >>> of the module and the developer needed to choose, then it would make >>> sense. Why isn't there a tkinter8 to indicate with which version of >>> tcl/tk the module is designed to operate? It seems like we're just >>> propagating an awkward, vestigial, implementation detail for no good reason. > >> sqlite3 requires version 3.x of sqlite. It is not compatible with version 2.x >> of sqlite due to API changes in sqlite, hence the version number in the name. > >> Here's the thread discussing the addition: >> http://mail.python.org/pipermail/python-dev/2006-March/062905.html > > Interesting.
It appears that they took the first name suggested, although > there were multiple suggestions of just "sqlite" variants (e.g., db.sqlite or > database.sqlite). > > Really, this is not important enough to continue arguing about, so I intend > that this will be my last post on the subject. However, I don't think that my > point has been addressed. Python only ships one version of the module, so > there is no important reason to append a version number to the module name. > It's a convention not done for other modules that wrap libraries, excepting maybe > bzip2 (bz2) and MD5, but those are the formal names (according to Wikipedia), and > it should be deprecated for sqlite. Just so people have a glimpse of how a decision like this is made, I'm going to quickly reply and then consider the topic closed. If we go with the assumption that the name "sqlite3" is sub-optimal compared to "sqlite", taking the effort to deprecate the old name and switch to a new one is not at all worth it at this point. If we deprecated the name then every Python program *in the world* that used that module would need updating at some point (or we at least need to make that assumption). That is a massive undertaking when looked at in an aggregate fashion, just to make a name fit a project name more than a project version/format name. If some confusion over the name had been found, then during the Python 2/3 switch we could have changed it so that 2to3 could have handled the name change like the urllib changes, etc. But since there is no reported confusion at any level high enough to even warrant that headache for code that tries to be source-compatible between Python 2 and 3, the module name probably won't change unless sqlite 4 came out, was not backwards-compatible, and we decided to support it in the stdlib.
From rosuav at gmail.com Thu May 16 16:44:07 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 17 May 2013 00:44:07 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <5194EFC7.4060409@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> Message-ID: On Fri, May 17, 2013 at 12:40 AM, MRAB wrote: > On 16/05/2013 08:08, Serhiy Storchaka wrote: >> As was said before the '+' operator has less priority than the '%' >> operator and an attribute access, i.e. it requires parenthesis in some >> cases. However parenthesis introduce a noise and can cause other types >> of errors. >> > I wonder whether we could use ".". Or would that be too confusing? And I apologized for borrowing an idea from bash. Taking an idea from PHP?!? Seriously, I don't think another operator is needed. If it's not going to be the implicit concatenation by abuttal, + or \ will carry the matter. But I share the opinion of several here: implicit concatenation is not as bad as the alternatives. ChrisA From mal at egenix.com Thu May 16 16:44:56 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 16 May 2013 16:44:56 +0200 Subject: [Python-ideas] sqlite3 In-Reply-To: References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> <5194D16F.5060408@egenix.com> Message-ID: <5194F0E8.6000201@egenix.com> On 16.05.2013 15:41, Henshaw, Andy wrote: > From: M.-A. Lemburg [mailto:mal at egenix.com] >> On 16.05.2013 14:18, Henshaw, Andy wrote: >>> From: Stephen J. Turnbull [mailto:stephen at xemacs.org] >>> Henshaw, Andy writes: > ... >>> >>> But, Python only ships one sqlite module. If there were two versions >>> of the module and the developer needed to choose, then it would make >>> sense. Why isn't there a tkinter8 to indicate with which version of >>> tcl/tk the module is designed to operate? It seems like we're just >>> propagating an awkward, vestigial, implementation detail for no good reason. > >> sqlite3 requires version 3.x of sqlite. 
It is not compatible with version 2.x >> of sqlite due to API changes in sqlite, hence the version number in the name. > >> Here's the thread discussing the addition: >> http://mail.python.org/pipermail/python-dev/2006-March/062905.html > > Interesting. It appears that they took the first name suggested, although > there were multiple suggestions of just "sqlite" variants (e.g., db.sqlite or > database.sqlite). > > Really, this is not important enough to continue arguing about, so I intend > that this will be my last post on the subject. However, I don't think that my > point has been addressed. Python only ships one version of the module, so > there is no important reason to append a version number to the module name. > It's a convention not done for other modules that wrap libraries, excepting maybe > bzip2(bz2) and MD5, but those are the formal names (according to Wikipedia), and > it should be deprecated for sqlite. Perhaps I wasn't clear enough: the "3" in the name originates from the SQLite library version. The Python module in the stdlib is not compatible with SQLite version 2 and it's well possible that it won't work with a future SQLite API version. At the time, this was needed, since most systems by default had SQLite version 2 installed. Too late to change now. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 16 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From songofacandy at gmail.com Thu May 16 16:50:14 2013 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 16 May 2013 22:50:14 +0800 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <5194EFC7.4060409@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> Message-ID: I have some PHP experience. Using dot for string concatenation causes a readability hazard. func("foo", "bar". "bazz", "spam". "egg") On Thu, May 16, 2013 at 10:40 PM, MRAB wrote: > On 16/05/2013 08:08, Serhiy Storchaka wrote: > >> 10.05.13 21:48, Guido van Rossum wrote: >> >>> I just spent a few minutes staring at a bug caused by a missing comma >>> -- I got a mysterious argument count error because instead of foo('a', >>> 'b') I had written foo('a' 'b'). >>> >>> This is a fairly common mistake, and IIRC at Google we even had a lint >>> rule against this (there was also a Python dialect used for some >>> specific purpose where this was explicitly forbidden). >>> >>> Now, with modern compiler technology, we can (and in fact do) evaluate >>> compile-time string literal concatenation with the '+' operator, so >>> there's really no reason to support 'a' 'b' any more. (The reason was >>> always rather flimsy; I copied it from C but the reason why it's >>> needed there doesn't really apply to Python, as it is mostly useful >>> inside macros.) >>> >> >> As was said before the '+' operator has less priority than the '%' >> operator and an attribute access, i.e. it requires parenthesis in some >> cases. However parenthesis introduce a noise and can cause other types >> of errors. >> >> [snip] > I wonder whether we could use ".". Or would that be too confusing? > > > In all cases only multiline implicit string literal concatenation cause
What if forbid implicit string literal concatenation only >> between string literals on different physical lines? A deliberate string >> literal concatenation can be made with explicit line joining. >> >> raise ValueError('Type names and field names must be valid '\ >> 'identifiers: %r' % name) >> >> raise ValueError('Type names and field names must be valid ' . > > 'identifiers: %r' % name) > >> >> raise ValueError('{} not bottom-level directory in '\ >> '{!r}'.format(_PYCACHE, path)) >> >> raise ValueError('{} not bottom-level directory in ' . > '{!r}'.format(_PYCACHE, path)) > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Thu May 16 17:00:00 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 16 May 2013 16:00:00 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> Message-ID: <5194F470.2030007@mrabarnett.plus.com> On 16/05/2013 15:44, Chris Angelico wrote: > On Fri, May 17, 2013 at 12:40 AM, MRAB wrote: >> On 16/05/2013 08:08, Serhiy Storchaka wrote: >>> As was said before the '+' operator has less priority than the '%' >>> operator and an attribute access, i.e. it requires parenthesis in some >>> cases. However parenthesis introduce a noise and can cause other types >>> of errors. >>> >> I wonder whether we could use ".". Or would that be too confusing? > > And I apologized for borrowing an idea from bash. Taking an idea from PHP?!? > It has high precendence as far as the parser is concerned. I know that Perl uses it. I haven't looked at PHP (I hear bad things about it! :-)). > Seriously, I don't think another operator is needed. 
If it's not going > to be the implicit concatenation by abuttal, + or \ will carry the > matter. But I share the opinion of several here: implicit > concatenation is not as bad as the alternatives. > It wouldn't be an operator as such. From abarnert at yahoo.com Thu May 16 17:57:48 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 16 May 2013 08:57:48 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <5194F470.2030007@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> Message-ID: On May 16, 2013, at 8:00, MRAB wrote: > On 16/05/2013 15:44, Chris Angelico wrote: >> On Fri, May 17, 2013 at 12:40 AM, MRAB wrote: >>> On 16/05/2013 08:08, Serhiy Storchaka wrote: >>>> As was said before the '+' operator has less priority than the '%' >>>> operator and an attribute access, i.e. it requires parenthesis in some >>>> cases. However parenthesis introduce a noise and can cause other types >>>> of errors. >>> I wonder whether we could use ".". Or would that be too confusing? >> And I apologized for borrowing an idea from bash. Taking an idea from PHP?!? > It has high precendence as far as the parser is concerned. > > I know that Perl uses it. I haven't looked at PHP (I hear bad things > about it! :-)). > >> Seriously, I don't think another operator is needed. If it's not going >> to be the implicit concatenation by abuttal, + or \ will carry the >> matter. But I share the opinion of several here: implicit >> concatenation is not as bad as the alternatives. > It wouldn't be an operator as such Of course in php, perl, and every other language that uses dot for string concatenation, it _is_ an operator, so this will end up confusing the very people who initially find it comforting. And this means the parser has to figure out whether you mean dot for attribute access or dot for concatenation. That's not exactly a _hard_ problem, but it's not _trivial_. 
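Andrew's point can be checked against current Python, where a dot after a string literal already means attribute access; a concatenation dot would have to be disambiguated from this:

```python
# In today's grammar, a dot after a string literal is attribute access,
# even with whitespace around the dot:
bound = "Hello" . upper     # the bound method str.upper, not concatenation
assert callable(bound)
assert bound() == "HELLO"

# Under the "." proposal the parser would have to decide, from context,
# whether this dot is a lookup or a concatenation.
```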
And then there's the fact that the "precedence" is different depending on which meaning the dot gets. Remember that what you're trying to solve is the problem that member-dot and % both have higher precedence than +. From python at mrabarnett.plus.com Thu May 16 18:23:02 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 16 May 2013 17:23:02 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> Message-ID: <519507E6.4030701@mrabarnett.plus.com> On 16/05/2013 16:57, Andrew Barnert wrote: > On May 16, 2013, at 8:00, MRAB wrote: > >> On 16/05/2013 15:44, Chris Angelico wrote: >>> On Fri, May 17, 2013 at 12:40 AM, MRAB >>> wrote: >>>> On 16/05/2013 08:08, Serhiy Storchaka wrote: >>>>> As was said before the '+' operator has less priority than >>>>> the '%' operator and an attribute access, i.e. it requires >>>>> parenthesis in some cases. However parenthesis introduce a >>>>> noise and can cause other types of errors. >>>> I wonder whether we could use ".". Or would that be too >>>> confusing? > >>> And I apologized for borrowing an idea from bash. Taking an idea >>> from PHP?!? >> It has high precendence as far as the parser is concerned. >> >> I know that Perl uses it. I haven't looked at PHP (I hear bad >> things about it! :-)). >> >>> Seriously, I don't think another operator is needed. If it's not >>> going to be the implicit concatenation by abuttal, + or \ will >>> carry the matter. But I share the opinion of several here: >>> implicit concatenation is not as bad as the alternatives. >> It wouldn't be an operator as such > > Of course in php, perl, and every other language that uses dot for > string concatenation, it _is_ an operator, so this will end up > confusing the very people who initially find it comforting. 
> > And this means the parser has to figure out whether you mean dot for > attribute access or dot for concatenation. That's not exactly a > _hard_ problem, but it's not _trivial_. > > And then there's the fact that the "precedence" is different > depending on which meaning the dot gets. Remember that what you're > trying to solve is the problem that member-dot and % both have higher > precedence than +. > I thought the problem we were trying to solve was that "+" has a lower precedence than "%" and attribute/method access, so implicit concatenation that's followed by "%" or ".format" can't be replaced by "+" without adding extra parentheses. From tismer at stackless.com Thu May 16 18:57:29 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 16 May 2013 18:57:29 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <519507E6.4030701@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> Message-ID: <51950FF9.1000803@stackless.com> On 16.05.13 18:23, MRAB wrote: > On 16/05/2013 16:57, Andrew Barnert wrote: >> On May 16, 2013, at 8:00, MRAB wrote: >> >>> On 16/05/2013 15:44, Chris Angelico wrote: >>>> On Fri, May 17, 2013 at 12:40 AM, MRAB >>>> wrote: >>>>> On 16/05/2013 08:08, Serhiy Storchaka wrote: >>>>>> As was said before the '+' operator has less priority than >>>>>> the '%' operator and an attribute access, i.e. it requires >>>>>> parenthesis in some cases. However parenthesis introduce a >>>>>> noise and can cause other types of errors. >>>>> I wonder whether we could use ".". Or would that be too >>>>> confusing? >> >>>> And I apologized for borrowing an idea from bash. Taking an idea >>>> from PHP?!? >>> It has high precendence as far as the parser is concerned. >>> >>> I know that Perl uses it. I haven't looked at PHP (I hear bad >>> things about it! :-)). 
>>> >>>> Seriously, I don't think another operator is needed. If it's not >>>> going to be the implicit concatenation by abuttal, + or \ will >>>> carry the matter. But I share the opinion of several here: >>>> implicit concatenation is not as bad as the alternatives. >>> It wouldn't be an operator as such >> >> Of course in php, perl, and every other language that uses dot for >> string concatenation, it _is_ an operator, so this will end up >> confusing the very people who initially find it comforting. >> >> And this means the parser has to figure out whether you mean dot for >> attribute access or dot for concatenation. That's not exactly a >> _hard_ problem, but it's not _trivial_. >> >> And then there's the fact that the "precedence" is different >> depending on which meaning the dot gets. Remember that what you're >> trying to solve is the problem that member-dot and % both have higher >> precedence than +. >> > I thought the problem we were trying to solve was that "+" has a lower > precedence than "%" and attribute/method access, so implicit > concatenation that's followed by "%" or ".format" can't be replaced by > "+" without adding extra parentheses. I think the "." is a nice idea at first sight, but might become confusing in the end because what we actually need is a simple to use notation for the scanner/parser that denotes a continuation line, and _not_ an operator. Now, what about this? long_line = "the beginning and the"& # comments are ok " continuation of a string" The "&" is not a valid operator on strings and looks pretty much like gluing parts together. It is better than the "\" that just escapes the newline and cannot take comments. I would even enforce that the ampersand be on the same line. cheers - chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 
121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From bruce at leapyear.org Thu May 16 19:26:15 2013 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 16 May 2013 10:26:15 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <51950FF9.1000803@stackless.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> Message-ID: On Thu, May 16, 2013 at 9:57 AM, Christian Tismer wrote: > The "&" is not a valid operator on strings and looks pretty much like > gluing parts together. It is better than the "\" that just escapes the > newline > and cannot take comments. > I don't like something that is a standard operator becoming special syntax. While it's true that string & string is not valid, it's not the case that string & ... is not valid. I dislike dot for the same reason. It's confusing that these would do different things: 'abc' & 'def' ('abc') & 'def' I like the \ idea because it's clearly syntax and not an operator, but the fact that it doesn't work with comments is annoying since one reason to break a string is to insert comments. I don't like that spaces after the \ are not allowed because trailing spaces are invisible to me but not to the parser. So what if the rule for trailing \ was changed to: The \ continuation character may be followed by white space and a comment. If a comment is present, there must be at least one whitespace character between the \ and the comment. 
That is: x = [ # THIS WOULD BE ALLOWED 'abc' \ 'def' \ # not the python keyword 'ghi' ] x = [ # THIS WOULD BE AN ERROR 'abc' \ 'def' # a comment but no continuation \ 'ghi' ] One thing I like about using \ is that it already works (aside from my proposed comment change). So anyone wanting to write forward/backward-compatible code can just add the \s now. If you want to start enforcing the restriction, just use from __future__ import explicit_string_continuation. --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at drees.name Thu May 16 19:27:06 2013 From: stefan at drees.name (Stefan Drees) Date: Thu, 16 May 2013 19:27:06 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <51950FF9.1000803@stackless.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> Message-ID: <519516EA.8070701@drees.name> On 16.05.13 18:57, Christian Tismer wrote: > On 16.05.13 18:23, MRAB wrote: >> On 16/05/2013 16:57, Andrew Barnert wrote: >>> On May 16, 2013, at 8:00, MRAB wrote: >>> >>>> On 16/05/2013 15:44, Chris Angelico wrote: >>>>> On Fri, May 17, 2013 at 12:40 AM, MRAB >>>>> wrote: >>>>>> On 16/05/2013 08:08, Serhiy Storchaka wrote: >>>>>>> As was said before the '+' operator has less priority than >>>>>>> the '%' operator and an attribute access, i.e. it requires >>>>>>> parenthesis in some cases. However parenthesis introduce a >>>>>>> noise and can cause other types of errors. >>>>>> I wonder whether we could use ".". Or would that be too >>>>>> confusing? >>> >>>>> And I apologized for borrowing an idea from bash. Taking an idea >>>>> from PHP?!? >>>> It has high precendence as far as the parser is concerned. >>>> >>>> I know that Perl uses it. 
I haven't looked at PHP (I hear bad >>>> things about it! :-)). >>>> >>>>> Seriously, I don't think another operator is needed. If it's not >>>>> going to be the implicit concatenation by abuttal, + or \ will >>>>> carry the matter. But I share the opinion of several here: >>>>> implicit concatenation is not as bad as the alternatives. >>>> It wouldn't be an operator as such >>> >>> Of course in php, perl, and every other language that uses dot for >>> string concatenation, it _is_ an operator, so this will end up >>> confusing the very people who initially find it comforting. >>> >>> And this means the parser has to figure out whether you mean dot for >>> attribute access or dot for concatenation. That's not exactly a >>> _hard_ problem, but it's not _trivial_. >>> >>> And then there's the fact that the "precedence" is different >>> depending on which meaning the dot gets. Remember that what you're >>> trying to solve is the problem that member-dot and % both have higher >>> precedence than +. >>> >> I thought the problem we were trying to solve was that "+" has a lower >> precedence than "%" and attribute/method access, so implicit >> concatenation that's followed by "%" or ".format" can't be replaced by >> "+" without adding extra parentheses. > > I think the "." is a nice idea at first sight, but might become confusing > in the end because what we actually need is a simple to use notation > for the scanner/parser that denotes a continuation line, and _not_ an > operator. > > Now, what about this? > > long_line = "the beginning and the"& # comments are ok > " continuation of a string" > > The "&" is not a valid operator on strings and looks pretty much like > gluing parts together. It is better than the "\" that just escapes the > newline > and cannot take comments. > I would even enforce that the ampersand be on the same line. 'a bitwise or :-?'& ' why not ...' in php the dot (.) 
is so abundantly used for staying within the line width limits, I often also insert it instead of a plus (+) when switching to python and the other way around. All the best, Stefan From python at mrabarnett.plus.com Thu May 16 19:38:18 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 16 May 2013 18:38:18 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> Message-ID: <5195198A.3040000@mrabarnett.plus.com> On 16/05/2013 18:26, Bruce Leban wrote: > > On Thu, May 16, 2013 at 9:57 AM, Christian Tismer > wrote: > > The "&" is not a valid operator on strings and looks pretty much like > gluing parts together. It is better than the "\" that just escapes > the newline > and cannot take comments. > > > I don't like something that is a standard operator becoming special > syntax. While it's true that string & string is not valid, it's not the > case that string & ... is not valid. I dislike dot for the same reason. > It's confusing that these would do different things: > > 'abc' & 'def' > ('abc') & 'def' > > I like the \ idea because it's clearly syntax and not an operator, but > the fact that it doesn't work with comments is annoying since one reason > to break a string is to insert comments. I don't like that spaces after > the \ are not allowed because trailing spaces are invisible to me but > not to the parser. So what if the rule for trailing \ was changed to: > > The \ continuation character may be followed by white space and a > comment. If a comment is present, there must be at least one > whitespace character between the \ and the comment. > > Why do you say """there must be at least one whitespace character between the \ and the comment"""? 
> That is: > > x = [ # THIS WOULD BE ALLOWED > 'abc' \ > 'def' \ # not the python keyword > 'ghi' > ] > > x = [ # THIS WOULD BE AN ERROR > 'abc' \ > 'def' # a comment but no continuation \ > 'ghi' > ] > > One thing I like about using \ is that it already works (aside from my > proposed comment change). So anyone wanting to write > forward/backward-compatible code can just add the \s now. If you want to > start enforcing the restriction, just use from __future__ import > explicit_string_continuation. > From tismer at stackless.com Thu May 16 20:07:22 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 16 May 2013 20:07:22 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> Message-ID: <5195205A.4050400@stackless.com> Hey Bruce! On 16.05.13 19:26, Bruce Leban wrote: > > On Thu, May 16, 2013 at 9:57 AM, Christian Tismer > > wrote: > > The "&" is not a valid operator on strings and looks pretty much like > gluing parts together. It is better than the "\" that just escapes > the newline > and cannot take comments. > > > I don't like something that is a standard operator becoming special > syntax. While it's true that string & string is not valid, it's not > the case that string & ... is not valid. I dislike dot for the same > reason. It's confusing that these would do different things: > > 'abc' & 'def' > ('abc') & 'def' > > I like the \ idea because it's clearly syntax and not an operator, but > the fact that it doesn't work with comments is annoying since one > reason to break a string is to insert comments. I don't like that > spaces after the \ are not allowed because trailing spaces are > invisible to me but not to the parser. 
So what if the rule for > trailing \ was changed to: > > The \ continuation character may be followed by white space and a > comment. If a comment is present, there must be at least one > whitespace character between the \ and the comment. > > > That is: > > x = [ # THIS WOULD BE ALLOWED > 'abc' \ > 'def' \ # not the python keyword > 'ghi' > ] > > x = [ # THIS WOULD BE AN ERROR > 'abc' \ > 'def' # a comment but no continuation \ > 'ghi' > ] > > One thing I like about using \ is that it already works (aside from my > proposed comment change). So anyone wanting to write > forward/backward-compatible code can just add the \s now. If you want > to start enforcing the restriction, just use from __future__ import > explicit_string_continuation. Right, that's a good one! Although I hate the backslash from bad experience with windows. But actually the most reason that I always hated to use "\" for continuation lines is its strict behavior that does not allow any white space after it. Hey, it would be great if that proposal makes it ! cheers - chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Thu May 16 20:14:54 2013 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 16 May 2013 11:14:54 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <5195198A.3040000@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> Message-ID: On Thu, May 16, 2013 at 10:38 AM, MRAB wrote: > Why do you say """there must be at least one whitespace character > between the \ and the comment"""? > Two reasons: (1) make the backslash more likely to stand out visually (and we can't require a space before it) (2) \# looks like it might be an escape sequence of some sort while I don't think \ # does, making this friendlier to readers. I'm not passionate about that detail if the rest of the proposal flies. --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu May 16 20:20:47 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 17 May 2013 04:20:47 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> Message-ID: On Fri, May 17, 2013 at 4:14 AM, Bruce Leban wrote: > > On Thu, May 16, 2013 at 10:38 AM, MRAB wrote: >> >> Why do you say """there must be at least one whitespace character >> between the \ and the comment"""? > > > Two reasons: > > (1) make the backslash more likely to stand out visually (and we can't > require a space before it) > > (2) \# looks like it might be an escape sequence of some sort while I don't > think \ # does, making this friendlier to readers. > > I'm not passionate about that detail if the rest of the proposal flies. 
Spin that off as a separate thread, I think the change to the backslash rules stands alone. I would support it; allowing a line-continuation backslash to be followed by a comment is a Good Thing imo. ChrisA From ron3200 at gmail.com Thu May 16 20:28:11 2013 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 16 May 2013 13:28:11 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <5195253B.9060001@gmail.com> On 05/16/2013 02:08 AM, Serhiy Storchaka wrote: > In all cases only multiline implicit string literal concatenation cause > problem. What if forbid implicit string literal concatenation only between > string literals on different physical lines? A deliberate string literal > concatenation can be made with explicit line joining. And it already works. It might be a good PEP8 recommendation. > ignore_patterns = ( > 'Function "%s" not defined.' % breakpoint, > "warning: no loadable sections found in added symbol-file"\ > " system-supplied DSO", > "warning: Unable to find libthread_db matching"\ > " inferior's thread library, thread debugging will"\ > " not be available.", > "warning: Cannot initialize thread debugging"\ > " library: Debugger service failed", > 'warning: Could not load shared library symbols for '\ > 'linux-vdso.so', > 'warning: Could not load shared library symbols for '\ > 'linux-gate.so', > 'Do you need "set solib-search-path" or '\ > '"set sysroot"?', > ) In this example, the lines tend to run together visually, and the '\' competes with the comma. But these have more to do with style than syntax and can be improved by indenting the continued lines. I think the line continuation '\' character would also make a good explicit string literal concatenation character. It's already limited to only work across sequential lines as well. 
Cheers, Ron From bruce at leapyear.org Thu May 16 20:41:38 2013 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 16 May 2013 11:41:38 -0700 Subject: [Python-ideas] Allowing comments after line continuations Message-ID: At Chris Angelico's suggestion, starting another thread on this: The \ line continuation does not allow comments yet statements that span multiple lines may need internal comments. Also spaces after the \ are not allowed but trailing spaces are invisible to the reader but not to the parser. If you use parenthesis for continuation then you can add comments but there are cases where parenthesis don't work, for example, before in a with statement, as well as the current discussion of using \ to make implicit string concatenation explicit. So I propose adopting this rule for trailing \ continuation: The \ continuation character may be followed by white space and a comment. If a comment is present, there must be at least one whitespace character between the \ and the comment. That is: x = y + \ # comment allowed here z with a as x, \ # comment here may be useful b as y, \ # or here c as z: \ # or here pass x = y + # syntax error z Two reasons for requiring a space after the backslash: (1) make the backslash more likely to stand out visually (and we can't require a space before it) (2) \# looks like it might be an escape sequence of some sort while I don't think \ # does, making this friendlier to readers. I'm not passionate about that detail if the rest of the proposal flies. --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Thu May 16 20:51:13 2013 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 16 May 2013 13:51:13 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <5195198A.3040000@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> Message-ID: <51952AA1.7010708@gmail.com> On 05/16/2013 12:38 PM, MRAB wrote: >> I like the \ idea because it's clearly syntax and not an operator, but >> the fact that it doesn't work with comments is annoying since one reason >> to break a string is to insert comments. I don't like that spaces after >> the \ are not allowed because trailing spaces are invisible to me but >> not to the parser. So what if the rule for trailing \ was changed to: >> >> The \ continuation character may be followed by white space and a >> comment. If a comment is present, there must be at least one >> whitespace character between the \ and the comment. >> >> > Why do you say """there must be at least one whitespace character > between the \ and the comment"""? I'd like comments after a line continuation also. There is an issue with it in strings. The tokenizer uses the '\'+'\n' as a line continuation, rather than a single '\'. By doing that, it can handle line continuations on any line exactly the same. >>> "This is a backslash \, and this\ ... line is continued also." 'This is a backslash \\, and this line is continued also.' The \ is also used as a string escape sequence character also. Outside of strings the '\' anywhere except at the end of a line is an error. So we can do that without any issues with previous code. But we need to not change it's behaviour between quotes. Cheers Ron From jsbueno at python.org.br Thu May 16 21:03:53 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Thu, 16 May 2013 16:03:53 -0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> Message-ID: On 16 May 2013 12:57, Andrew Barnert wrote: > And this means the parser has to figure out whether you mean dot for attribute access or dot for concatenation. That's not exactly a _hard_ problem, but it's not _trivial_. If you say it is not hard for the parser, ok - but it seems impossible for humans: upper = " World" print ("Hello". upper) From python at mrabarnett.plus.com Thu May 16 21:11:35 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 16 May 2013 20:11:35 +0100 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: <51952F67.4060302@mrabarnett.plus.com> On 16/05/2013 19:41, Bruce Leban wrote: > At Chris Angelico's suggestion, starting another thread on this: > > The \ line continuation does not allow comments yet statements that span > multiple lines may need internal comments. Also spaces after the \ are > not allowed but trailing spaces are invisible to the reader but not to > the parser. If you use parenthesis for continuation then you can add > comments but there are cases where parenthesis don't work, for example, > before in a with statement, as well as the current discussion of using \ > to make implicit string concatenation explicit. So I propose adopting > this rule for trailing \ continuation: > > The \ continuation character may be followed by white space and a > comment. If a comment is present, there must be at least one > whitespace character between the \ and the comment.
> > > That is: > > x = y + \ # comment allowed here > z > > with a as x, \ # comment here may be useful > b as y, \ # or here > c as z: \ # or here > pass > > x = y + # syntax error > z > > Two reasons for requiring a space after the backslash: > > (1) make the backslash more likely to stand out visually (and we can't > require a space before it) > > (2) \# looks like it might be an escape sequence of some sort while I > don't think \ # does, making this friendlier to readers. > You don't get escape sequences outside strings, so I'd be inclined not to insist that it be followed by a space, although it could be suggested as good style. > I'm not passionate about that detail if the rest of the proposal flies. > +1 From mertz at gnosis.cx Thu May 16 21:18:46 2013 From: mertz at gnosis.cx (David Mertz) Date: Thu, 16 May 2013 12:18:46 -0700 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: <51952F67.4060302@mrabarnett.plus.com> References: <51952F67.4060302@mrabarnett.plus.com> Message-ID: +1000 On Thu, May 16, 2013 at 12:11 PM, MRAB wrote: > On 16/05/2013 19:41, Bruce Leban wrote: > >> At Chris Angelico's suggestion, starting another thread on this: >> >> The \ line continuation does not allow comments yet statements that span >> multiple lines may need internal comments. Also spaces after the \ are >> not allowed but trailing spaces are invisible to the reader but not to >> the parser. If you use parenthesis for continuation then you can add >> comments but there are cases where parenthesis don't work, for example, >> before in a with statement, as well as the current discussion of using \ >> to make implicit string concatenation explicit. So I propose adopting >> this rule for trailing \ continuation: >> >> The \ continuation character may be followed by white space and a >> comment. If a comment is present, there must be at least one >> whitespace character between the \ and the comment. 
>> >> >> That is: >> >> x = y + \ # comment allowed here >> z >> >> with a as x, \ # comment here may be useful >> b as y, \ # or here >> c as z: \ # or here >> pass >> >> x = y + # syntax error >> z >> >> Two reasons for requiring a space after the backslash: >> >> (1) make the backslash more likely to stand out visually (and we can't >> require a space before it) >> >> (2) \# looks like it might be an escape sequence of some sort while I >> don't think \ # does, making this friendlier to readers. >> >> You don't get escape sequences outside strings, so I'd be inclined not > to insist that it be followed by a space, although it could be > suggested as good style. > > > I'm not passionate about that detail if the rest of the proposal flies. >> >> +1 > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Thu May 16 21:29:11 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 16 May 2013 20:29:11 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> Message-ID: <51953387.9070500@mrabarnett.plus.com> On 16/05/2013 20:03, Joao S. O. Bueno wrote: > On 16 May 2013 12:57, Andrew Barnert wrote: >> And this means the parser has to figure out whether you mean dot for attribute access or dot for concatenation. That's not exactly a _hard_ problem, but it's not _trivial_. 
> > If you say it is not hard for the parser, ok - but it seems > impossible for humans: > > upper = " World" > print ("Hello". upper) > That's attribute access. The suggestion was to use it in place of implicit string concatenation, which occurs only between string _literals_: print ("Hello" . " World") and is currently illegal ("SyntaxError: invalid syntax"). From abarnert at yahoo.com Thu May 16 22:51:40 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 16 May 2013 13:51:40 -0700 (PDT) Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <519507E6.4030701@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> Message-ID: <1368737500.52226.YahooMailNeo@web184706.mail.ne1.yahoo.com> From: MRAB Sent: Thursday, May 16, 2013 9:23 AM > On 16/05/2013 16:57, Andrew Barnert wrote: >> And then there's the fact that the "precedence" is different >> depending on which meaning the dot gets. Remember that what you're >> trying to solve is the problem that member-dot and % both have higher >> precedence than +. >> > I thought the problem we were trying to solve was that "+" has a lower > precedence than "%" and attribute/method access, so implicit > concatenation that's followed by "%" or ".format" > can't be replaced by > "+" without adding extra parentheses. I was talking about the fact that Guido's 'Just use "+"' suggestion is insufficient, because it requires adding extra parentheses. Therefore, the problem we're trying to solve is 'member-dot and % both have higher precedence than +.' Your '"+" has a lower precedence than "%" and attribute/method access' means the exact same thing, just stated in the opposite order. So? I think I'm missing your point.
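A runnable check of the distinction MRAB draws (variable names are from Joao's snippet): under today's grammar the dot binds to the literal as attribute access, while a dot between two string literals does not parse at all, which is what would leave that slot free for concatenation.

```python
upper = " World"

# Attribute access: the dot binds "Hello" to the built-in str.upper
# method, ignoring the local variable named `upper` entirely.
method = "Hello". upper
print(method())  # HELLO

# A dot between two string literals is currently a SyntaxError.
try:
    compile('"Hello" . " World"', "<demo>", "eval")
    dot_concat_parses = True
except SyntaxError:
    dot_concat_parses = False
print(dot_concat_parses)  # False
```

So a "literal . literal" concatenation rule would not collide with any currently valid program, which is the heart of MRAB's point; Andrew's objection is about readers, not the parser.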
From abarnert at yahoo.com Thu May 16 22:55:34 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 16 May 2013 13:55:34 -0700 (PDT) Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> Message-ID: <1368737734.9377.YahooMailNeo@web184704.mail.ne1.yahoo.com> From: Joao S. O. Bueno Sent: Thursday, May 16, 2013 12:03 PM > On 16 May 2013 12:57, Andrew Barnert wrote: >> And this means the parser has to figure out whether you mean dot for > attribute access or dot for concatenation. That's not exactly a _hard_ > problem, but it's not _trivial_. > > If you say it is not hard for the parser, ok - but it seems > impossible for humans: > > upper = " World" > print ("Hello". upper) Given a rule like "it's only concatenation if both arguments are string literals", a sufficiently complex parser, or a sufficiently knowledgeable human, can figure out that this is attribute access. So it's clearly not impossible. But it's also not trivial. And that's my point. It makes the code harder to read for both parsers and humans, which is a significant tradeoff. If the benefit is high enough, it might be worth it anyway, but I don't know that it is. From abarnert at yahoo.com Thu May 16 22:58:11 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 16 May 2013 13:58:11 -0700 (PDT) Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: <1368737891.87772.YahooMailNeo@web184704.mail.ne1.yahoo.com> From: Bruce Leban Sent: Thursday, May 16, 2013 11:41 AM >The \ continuation character may be followed by white space and a comment. This seems clean and obvious once you learn it, and it will be easy for novices to learn, and it won't affect any existing (working) code.
So, if this is enough to solve the string concatenation problem to everyone's satisfaction without any other changes, I'm definitely +1 on it.? Otherwise, I guess +0. From mertz at gnosis.cx Thu May 16 23:07:50 2013 From: mertz at gnosis.cx (David Mertz) Date: Thu, 16 May 2013 14:07:50 -0700 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: <1368737891.87772.YahooMailNeo@web184704.mail.ne1.yahoo.com> References: <1368737891.87772.YahooMailNeo@web184704.mail.ne1.yahoo.com> Message-ID: I feel like this change would only help modestly with the string concatenation issue. I just want it because... well, I've frequently wished it were there in working code that has nothing to do with string concatenation... and usually wound up using superfluous and less clear extra parentheses where continuation lines would be nicer. On Thu, May 16, 2013 at 1:58 PM, Andrew Barnert wrote: > From: Bruce Leban > Sent: Thursday, May 16, 2013 11:41 AM > > > >The \ continuation character may be followed by white space and a comment. > > This seems clean and obvious once you learn it, and it will be easy for > novices to learn, and it won't affect any existing (working) code. > > > So, if this is enough to solve the string concatenation problem to > everyone's satisfaction without any other changes, I'm definitely +1 on it. > > Otherwise, I guess +0. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jimjjewett at gmail.com Thu May 16 23:44:50 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 16 May 2013 17:44:50 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> Message-ID: On Thu, May 16, 2013 at 1:26 PM, Bruce Leban wrote: >... So what if the rule for trailing \ was changed to: > > The \ continuation character may be followed by white space and a comment. > If a comment is present, there must be at least one whitespace character > between the \ and the comment. YES!!! Even ignoring string concatenation, this would be a huge win. Limiting implicit string concatenation to "same logical line" or even "adjacent physical lines joined by a line-continuation '\'-character" *might* be even better. -jJ From dave at krondo.com Thu May 16 23:55:32 2013 From: dave at krondo.com (Dave Peticolas) Date: Thu, 16 May 2013 14:55:32 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> Message-ID: 2013/5/16 Jim Jewett > On Thu, May 16, 2013 at 1:26 PM, Bruce Leban wrote: > > >... So what if the rule for trailing \ was changed to: > > > > The \ continuation character may be followed by white space and a > comment. > > If a comment is present, there must be at least one whitespace character > > between the \ and the comment. > > YES!!! Even ignoring string concatenation, this would be a huge win. > > Limiting implicit string concatenation to "same logical line" or even > "adjacent physical lines joined by a line-continuation '\'-character" > *might* be even better. I think the latter would almost make it explicit string concatenation, no? 
That sounds like one of the cleanest solutions so far. -- --Dave Peticolas From tjreedy at udel.edu Thu May 16 23:55:48 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Thu, 16 May 2013 17:55:48 -0400 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: On 5/16/2013 2:41 PM, Bruce Leban wrote: > The \ line continuation does not allow comments yet statements that span > multiple lines may need internal comments. In a string, \ escapes the immediate next character. This idea of \ at the *end* of a line, just before the newline character, is that it escapes the newline, *just as it does within a string*. >>> 'abd\ ed' 'abded' >>> 1 +\ 2 3 In both cases, one would typically have a syntax error without the \. >>> 'abc SyntaxError: EOL while scanning string literal >>> a + SyntaxError: invalid syntax I think changing the correspondence is a bad idea. A code line is a string (but not a str object) that is fed to the interpreter. Besides which, end of line \ is generally discouraged and nearly always not needed, as openers other than ' and ", namely {, [, and (, enable line continuation without \. So if you want comments on each line, add parens. > with a as x, \ # comment here may be useful > b as y, \ # or here > c as z: \ # or here > pass This is one of the very few cases where bracketing is not allowed. I do not remember why. The PEP might say why not. See how: >>> from itertools import (chain, # very helpful count) # also helpful Here () is allowed precisely for line continuation. > x = y + # syntax error > z x = (1 + # not a syntax error 2) # and more comment > Two reasons for requiring a space after the backslash: > > (1) make the backslash more likely to stand out visually (and we can't > require a space before it) I think it stands out better by itself at the end.
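Terry's parenthesized examples, reassembled into runnable form (the archive flattened his line breaks), plus a check that a comment after a backslash really is rejected today:

```python
# Parentheses already allow a comment on every continued line:
from itertools import (chain,  # very helpful
                       count)  # also helpful

x = (1 +  # not a syntax error
     2)   # and more comment

# By contrast, a comment after a backslash continuation is a SyntaxError,
# which is exactly the restriction Bruce's proposal would lift.
src = "y = 1 + \\  # comment\n    2\n"
try:
    compile(src, "<demo>", "exec")
    comment_after_backslash_ok = True
except SyntaxError:
    comment_after_backslash_ok = False

print(x, comment_after_backslash_ok)  # 3 False
```

This is why the parenthesized forms are the usual recommendation: they are the one continuation mechanism that already composes with comments.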
Terry Jan Reedy From jimjjewett at gmail.com Fri May 17 00:17:55 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 16 May 2013 18:17:55 -0400 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: On Thu, May 16, 2013 at 5:55 PM, Terry Jan Reedy wrote: > On 5/16/2013 2:41 PM, Bruce Leban wrote: >> (1) make the backslash more likely to stand out visually (and we can't >> require a space before it) > I think it stands out better by itself at the end. Except that it makes invisible whitespace have a magical effect. Which -- at least for me -- is the primary reason *why* \-continuation is bad. Allowing whitespace (including comments) after the line-continuation would remove that gotcha. -jJ From python at mrabarnett.plus.com Fri May 17 00:19:38 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 16 May 2013 23:19:38 +0100 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <1368737500.52226.YahooMailNeo@web184706.mail.ne1.yahoo.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <1368737500.52226.YahooMailNeo@web184706.mail.ne1.yahoo.com> Message-ID: <51955B7A.3050308@mrabarnett.plus.com> On 16/05/2013 21:51, Andrew Barnert wrote: > From: MRAB > > Sent: Thursday, May 16, 2013 9:23 AM > >> On 16/05/2013 16:57, Andrew Barnert wrote: > >>> And then there's the fact that the "precedence" is different >>> depending on which meaning the dot gets. Remember that what >>> you're trying to solve is the problem that member-dot and % both >>> have higher precedence than +. >>> >> I thought the problem we were trying to solve was that "+" has a >> lower precedence than "%" and attribute/method access, so implicit >> concatenation that's followed by "%" or ".format" can't be replaced >> by "+" without adding extra parentheses. 
> > I was talking about the fact that Guido's 'Just use "+"' suggestion > is insufficient, because it requires adding extra parentheses. > Therefore, the problem we're trying to solve is 'member-dot and % > both have higher precedence than +.' Your '"+" has a lower precedence > than "%" and attribute/method access' means the exact same thing, > just stated in the opposite order. > > So? I think I'm missing your point. > You said """there's the fact that the "precedence" is different depending on which meaning the dot gets""". My point was that "." between string literals (which is currently a syntax error) would indicate concatenation of those literals, but there would be no change in precedence; it wouldn't replace "+". From greg.ewing at canterbury.ac.nz Fri May 17 00:26:58 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 17 May 2013 10:26:58 +1200 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: <51955D32.6060507@canterbury.ac.nz> Terry Jan Reedy wrote: > In a string, \ escapes the immediate next character. This idea of \ at > the *end* of a line, just before the newline character, is that it > escapes the newline, *just as is does within a string*. That's how it currently works, but that doesn't mean it's how it *should* work. Especially if it leads to counter-intuitive and less-than-useful behaviour, which IMO it does. In between tokens, we expect whitespace to be treated flexibly, in the sense that wherever one whitespace character is allowed, we can substitute more than one. Line continuation with backslash currently breaks that expectation.
-- Greg From tismer at stackless.com Fri May 17 00:46:41 2013 From: tismer at stackless.com (Christian Tismer) Date: Fri, 17 May 2013 00:46:41 +0200 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: <51955D32.6060507@canterbury.ac.nz> References: <51955D32.6060507@canterbury.ac.nz> Message-ID: <519561D1.8060002@stackless.com> On 17.05.13 00:26, Greg Ewing wrote: > Terry Jan Reedy wrote: >> In a string, \ escapes the immediate next character. This idea of \ >> at the *end* of a line, just before the newline character, is that it >> escapes the newline, *just as is does within a string*. > > That's how it currently works, but that doesn't mean it's > how it *should* work. Especially if it leads to counter- > intuitive and less-than-useful behaviour, which IMO it > does. > > In between tokens, we expect whitespace to be treated > flexibly, in the sense that wherever one whitespace > character is allowed, we can substitute more than one. > Line continuation with backslash currently breaks that > expectation. > I would appreciate it very much if "\" were more intelligent. So intelligent, that I don't want to avoid it, but want to use it! So let us make it ignore white-space and allow comments, and I'll be more than happy with an ugly, but really useful back-slash ! + 9**(9**9) -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From tjreedy at udel.edu Fri May 17 02:04:26 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Thu, 16 May 2013 20:04:26 -0400 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: <51955D32.6060507@canterbury.ac.nz> References: <51955D32.6060507@canterbury.ac.nz> Message-ID: On 5/16/2013 6:26 PM, Greg Ewing wrote: > Terry Jan Reedy wrote: >> In a string, \ escapes the immediate next character. This idea of \ at >> the *end* of a line, just before the newline character, is that it >> escapes the newline, *just as is does within a string*. > > That's how it currently works, but that doesn't mean it's > how it *should* work. My point is that there is a logical consistency that makes the current behavior consistent to me. > Especially if it leads to counter-intuitive To me, having the \ below escape the newline that occurs 60 characters later is 'counter-intuitive'. a + \ # a very long comment that seems to go on and on forever The \ where it is looks to me like a stray typo and a bug. I would be less surprised if the code below worked, so my counter-proposal is that \ escaping of newline work after comments. >>> 1 + # current behavior \ SyntaxError: invalid syntax >>> 1 + # proposed behavior \ 2 3 >>> > and less-than-useful behaviour, which IMO it does. Useful is a different issue. My counter-proposal meets the goal of mixing comments with line-continuation. Terry From python at mrabarnett.plus.com Fri May 17 02:38:37 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 17 May 2013 01:38:37 +0100 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: <51955D32.6060507@canterbury.ac.nz> Message-ID: <51957C0D.204@mrabarnett.plus.com> On 17/05/2013 01:04, Terry Jan Reedy wrote: > On 5/16/2013 6:26 PM, Greg Ewing wrote: >> Terry Jan Reedy wrote: >>> In a string, \ escapes the immediate next character.
This idea of \ at >>> the *end* of a line, just before the newline character, is that it >>> escapes the newline, *just as is does within a string*. >> >> That's how it currently works, but that doesn't mean it's >> how it *should* work. > > My point is that there is a logical consistency that makes the current > behavior consistent to me. > >> Especially if it leads to counter-intuitive > > To me, having the \ below escape the newline that occurs 60 characters > later is 'counter-intuitive'. > > a + \ # a very long comment that seems to go on and on forever > > The \ where it is looks to me like a stray typo and a bug. I would be > less surprised if the code below worked, so my counter-proposal is that > \ escaping of newline work after comments. > > >>> 1 + # current behavior \ > SyntaxError: invalid syntax > > >>> 1 + # proposed behavior \ > 2 > 3 > >>> > > > and less-than-useful behaviour, which IMO it does. > > Useful is a different issue. My counte-proposal meets the goal of mixing > comments with line-continuation. > So you want \ to have a special meaning at the end of a comment? I don't even like the current behaviour of \ at the end of a raw string literal! :-) If it _did_ have a special meaning, I would've expected it to indicate that the _comment itself_ is continued onto the next line. -1 From haoyi.sg at gmail.com Fri May 17 03:20:41 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 16 May 2013 21:20:41 -0400 Subject: [Python-ideas] MacroPy Final Report, Request for Feedback Message-ID: Hey All, I'd like to announce the release of MacroPy 0.1.7 ( https://github.com/lihaoyi/macropy). 
MacroPy is an implementation of Syntactic Macros in the Python Programming Language, which we used to implement a pretty impressive list of demo macros: - Case Classes , easy Algebraic Data Types from Scala - Pattern Matching from the Functional Programming world - Tail-call Optimization - String Interpolation, a common feature, and Pyxl, which is basically XML interpolation. - Tracing and Smart Asserts, from every programmer's wildest dreams. - PINQ to SQLAlchemy, a clone of LINQ to SQL from C# - Quick Lambdas from Scala and Groovy - Parser Combinators, inspired by Scala's - JS Snippets , cross compiling snippets of Python into equivalent Javascript Along with a really nice readme which serves both as a demonstration of the capabilities of macros , and also as an introduction to the macro-writing process . This is probably going to be the last release that bundles all these demos together; future releases will begin properly breaking up the various macros into their own projects, and nicely polishing the core infrastructure into a solid foundation for others to build upon. We think that this is a pretty cool project, and can serve as a easy way for people to prototype modifications to the python language. We're looking forward to your feedback and welcome contributions =) -Haoyi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbvsmo at gmail.com Fri May 17 03:29:35 2013 From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=) Date: Thu, 16 May 2013 22:29:35 -0300 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: <51957C0D.204@mrabarnett.plus.com> References: <51955D32.6060507@canterbury.ac.nz> <51957C0D.204@mrabarnett.plus.com> Message-ID: +1 I always wanted that. -- Jo?o Bernardo -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Fri May 17 04:05:33 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 16 May 2013 19:05:33 -0700 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: <51955D32.6060507@canterbury.ac.nz> Message-ID: <5195906D.3040108@stoneleaf.us> On 05/16/2013 05:04 PM, Terry Jan Reedy wrote: > On 5/16/2013 6:26 PM, Greg Ewing wrote: >> Terry Jan Reedy wrote: >>> In a string, \ escapes the immediate next character. This idea of \ at >>> the *end* of a line, just before the newline character, is that it >>> escapes the newline, *just as is does within a string*. >> >> That's how it currently works, but that doesn't mean it's >> how it *should* work. > > My point is that there is a logical consistency that makes the current behavior consistent to me. > >> Especially if it leads to counter-intuitive > > To me, having the \ below escape the newline that occurs 60 characters later is 'counter-intuitive'. > > a + \ # a very long comment that seems to go on and on forever > > The \ where it is looks to me like a stray typo and a bug. I would be less surprised if the code below worked, so my > counter-proposal is that \ escaping of newline work after comments. > >--> 1 + # current behavior \ > SyntaxError: invalid syntax > >--> 1 + # proposed behavior \ > 2 > 3 I would rather see Greg's proposal; a backslash is already semi-magical with it's escaping property, so I think it makes more sense to have a backslash be able to interrupt an expression than a comment... although, having said that, brackets allow comments in the middle... I still like Greg's color better. 
;) -- ~Ethan~ From bruce at leapyear.org Fri May 17 04:44:31 2013 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 16 May 2013 19:44:31 -0700 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: <51955D32.6060507@canterbury.ac.nz> Message-ID: On May 16, 2013 5:05 PM, "Terry Jan Reedy" wrote: > To me, having the \ below escape the newline that occurs 60 characters later is 'counter-intuitive'. > > a + \ # a very long comment that seems to go on and on forever > > The \ where it is looks to me like a stray typo and a bug. I would be less surprised if the code below worked, so my counter-proposal is that \ escaping of newline work after comments. > > >>> 1 + # current behavior \ > SyntaxError: invalid syntax > > >>> 1 + # proposed behavior \ > 2 > 3 > > >>> > > > and less-than-useful behaviour, which IMO it does. > > Useful is a different issue. My counte-proposal meets the goal of mixing comments with line-continuation. My objection to this is that it changes meaning of current code while my proposal doesn't. It also changes rule that everything after # is ignored. Simple example: x = y, # \ z = 1, 2 Admittedly contrived but I spent no time trying to get a less-contrived example. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryan at ryanhiebert.com Fri May 17 04:50:26 2013 From: ryan at ryanhiebert.com (Ryan Hiebert) Date: Thu, 16 May 2013 19:50:26 -0700 (PDT) Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: <1368759025128.f8963fd8@Nodemailer> I'd like it if comments and/or white space could follow the \. Trailing white space after one is always difficult to notice, and the comment after the line continuation is a compelling use-case. Together, we'll have a bit less error prone line continuation syntax. ? 
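Bruce's contrived example can be made concrete. Under today's rules the backslash inside the comment is inert, so these are two independent statements; under the counter-proposal, a trailing `\` inside a comment would splice the two lines into the single chained assignment `x = y, z = 1, 2`, silently changing the program's meaning.

```python
y = 0

# Under current rules, everything after '#' is ignored, backslash included,
# so this is an ordinary comment and the next line is a separate statement.
x = y,  # \
z = 1, 2

print(x, z)  # (0,) (1, 2)
```

The danger Bruce points at is precisely that the same source text would bind different values to `x` and `z` under the comment-continuation rule.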
Sent from Mailbox for iPhone On Thu, May 16, 2013 at 7:45 PM, Bruce Leban wrote: > On May 16, 2013 5:05 PM, "Terry Jan Reedy" wrote: >> To me, having the \ below escape the newline that occurs 60 characters > later is 'counter-intuitive'. >> >> a + \ # a very long comment that seems to go on and on forever >> >> The \ where it is looks to me like a stray typo and a bug. I would be > less surprised if the code below worked, so my counter-proposal is that \ > escaping of newline work after comments. >> >> >>> 1 + # current behavior \ >> SyntaxError: invalid syntax >> >> >>> 1 + # proposed behavior \ >> 2 >> 3 >> >> >>> >> >> > and less-than-useful behaviour, which IMO it does. >> >> Useful is a different issue. My counte-proposal meets the goal of mixing > comments with line-continuation. > My objection to this is that it changes meaning of current code while my > proposal doesn't. It also changes rule that everything after # is ignored. > Simple example: > x = y, # \ > z = 1, 2 > Admittedly contrived but I spent no time trying to get a less-contrived > example. > --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Fri May 17 05:00:01 2013 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 16 May 2013 22:00:01 -0500 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: <51955D32.6060507@canterbury.ac.nz> Message-ID: <51959D31.5090800@gmail.com> On 05/16/2013 07:04 PM, Terry Jan Reedy wrote: > On 5/16/2013 6:26 PM, Greg Ewing wrote: >> Terry Jan Reedy wrote: >>> In a string, \ escapes the immediate next character. This idea of \ at >>> the *end* of a line, just before the newline character, is that it >>> escapes the newline, *just as is does within a string*. It only escapes pairs it knows about. If the \ is followed by a character that isn't an escape sequence it knows, then it is just a slash. >>> "This \ is just a backslash." 'This \\ is just a backslash.' 
It would be easier if this wasn't the current behaviour. >> That's how it currently works, but that doesn't mean it's >> how it *should* work. > > My point is that there is a logical consistency that makes the current > behavior consistent to me. > >> Especially if it leads to counter-intuitive > > To me, having the \ below escape the newline that occurs 60 characters > later is 'counter-intuitive'. > > a + \ # a very long comment that seems to go on and on forever > > The \ where it is looks to me like a stray typo and a bug. This is a matter of style. It's why I put more space in front of the # if there is room. Ron From mal at egenix.com Fri May 17 09:30:50 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 17 May 2013 09:30:50 +0200 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: <5195DCAA.9000605@egenix.com> On 16.05.2013 20:41, Bruce Leban wrote: > At Chris Angelico's suggestion, starting another thread on this: > > The \ line continuation does not allow comments yet statements that span > multiple lines may need internal comments. Also spaces after the \ are not > allowed but trailing spaces are invisible to the reader but not to the > parser. If you use parentheses for continuation then you can add comments > but there are cases where parentheses don't work, for example, in a > with statement, as well as the current discussion of using \ to make > implicit string concatenation explicit. So I propose adopting this rule for > trailing \ continuation: > > The \ continuation character may be followed by white space and a comment. > If a comment is present, there must be at least one whitespace character > between the \ and the comment.
> > > That is: > > x = y + \ # comment allowed here > z > > with a as x, \ # comment here may be useful > b as y, \ # or here > c as z: \ # or here > pass > > x = y + # syntax error > z > > Two reasons for requiring a space after the backslash: > > (1) make the backslash more likely to stand out visually (and we can't > require a space before it) > > (2) \# looks like it might be an escape sequence of some sort while I don't > think \ # does, making this friendlier to readers. > > I'm not passionate about that detail if the rest of the proposal flies. I'm -1 on making the backslash more attractive to use :-) In most use cases, you can create much more readable code by using parentheses, which easily allow spanning statements and expressions across multiple lines. Those read better, work better in editors (automatic indentation) and are less error-prone than the backslash. If there are common use cases left which can currently not be handled by parens, I'd be +1 on fixing those. Your "with" example would be one such case, since the following currently gives a SyntaxError: with (a as x, # comment here may be useful b as y, # or here c as z): # or here pass -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 17 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From storchaka at gmail.com Fri May 17 10:03:16 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 17 May 2013 11:03:16 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> Message-ID: On 16.05.13 20:26, Bruce Leban wrote: > I like the \ idea because it's clearly syntax and not an operator, but > the fact that it doesn't work with comments is annoying since one reason > to break a string is to insert comments. I don't like that spaces after > the \ are not allowed because trailing spaces are invisible to me but > not to the parser. So what if the rule for trailing \ was changed to: > > The \ continuation character may be followed by white space and a > comment. If a comment is present, there must be at least one > whitespace character between the \ and the comment. It's not needed. You could just use the "+" operator if you want to insert comments. Or the verbose mode of regexes. And it works right now. From vernondcole at gmail.com Fri May 17 10:35:20 2013 From: vernondcole at gmail.com (Vernon D. Cole) Date: Fri, 17 May 2013 02:35:20 -0600 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? Message-ID: Terry Jan Reedy wrote: > > To me, having the \ below escape the newline that occurs 60 characters > later is 'counter-intuitive'. > > a + \ # a very long comment that seems to go on and on forever > > It appears that my intuition is far different. I distinctly remember that when I was first learning Python and read that you _cannot_ put a comment after a line continuation -- my comment was: "What?? Why not?"
Followed by: "Oh, well, if _that_ is the only thing wrong with the language I will probably use it a lot." +1 for \ # Can I vote more than once? On the other hand, implicit string literal concatenation is so obscure that, when I really needed it a week or two ago, I went back to the documentation to make sure that it was really part of Python, and not some other syntax that I was remembering. (Sometimes I have trouble keeping my Grandchildren sorted out, too.) I could not locate it in the docs and so solved the problem another way. This discussion has helped restore my faith in my memory. +1 for deprecating it -- in Python 4. Mark it as a bad code smell as soon as there is an alternative. Do we need an _explicit_ string literal concatenation operator? Yes, we do, in order to deprecate the implicit, and as we all know: "Explicit is better than implicit." +1 What should that operator be? '+' is obvious. Too obvious. I would always wonder, somewhere deep down inside my soul: did the compiler _really_ optimize the expression, or is it being evaluated at run time, every time it passes through the loop? I would avoid using it in practice for that reason alone. I don't like the use of a dot, because, at my age, it is getting pretty hard to tell them apart from a comma. Besides, '.' is already pretty busy being an attribute marker and the marker which differentiates a float from an int. My favorite candidate so far is the humble, underused ellipsis. Does it even _have_ an operator precedence? I don't know, because in ten years of Python coding I have never used it. I have used "if someFeature is NotImplemented:" and I think that reads pretty well. The ellipsis, on the other hand, I have never found a use for. I know it exists, but... And, as you see, a native English speaker indicates that there is something else missing, to be filled in later, when an ellipsis appears at the end of something. Not all good ideas come from the Dutch ... sometimes they come from the Antipodes.
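Vernon's wondering about whether the compiler really optimizes a '+' of literals is checkable. The snippet below is an editor's sketch, not part of the original mail: in CPython the optimizer folds an addition of two string literals at compile time, so the joined string is a single constant and nothing is re-evaluated in a loop; other Python implementations are free to behave differently.

```python
# Compile an expression of two string literals joined with '+' and
# inspect the constants stored in the resulting code object.
code = compile('"Hello, " + "world"', "<example>", "eval")

# In CPython the folded string appears as a single constant here.
print(code.co_consts)

# Evaluating the code object yields that one string.
print(eval(code))  # Hello, world
```

Note the folding only applies to constant operands; if either side is a name rather than a literal, the concatenation happens at run time.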
+1 for "..." -- Vernon Cole -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri May 17 10:45:21 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 17 May 2013 18:45:21 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: On Fri, May 17, 2013 at 6:35 PM, Vernon D. Cole wrote: > My favorite candidate so far is the humble, under used, ellipses. Does it > even _have_ an operator precedence? I don't know, because in ten years of > Python coding I have never used it. I have used "if someFeature is > NotImplemented:" and I think that reads pretty well. Ellipses, on the other > hand, I have never found a use for. I know it exists, but... It's not an operator, it's an operand. It doesn't have an entry on the precedence table for the same reason that None, 5, and "Hello" don't. ChrisA From steve at pearwood.info Fri May 17 10:49:14 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 17 May 2013 18:49:14 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <5195253B.9060001@gmail.com> References: <5195253B.9060001@gmail.com> Message-ID: <5195EF0A.5090002@pearwood.info> On 17/05/13 04:28, Ron Adam wrote: > I think the line continuation '\' character would also make a good explicit string literal concatenation character. It's already limited to only work across sequential lines as well. Concatenating strings on the same line is a legitimate thing to do, because you can mix different quoting types. In my opinion, there's no really clean way to build string literals containing multiple types of quotation marks, but implicit concatenation works and is useful. Here's a contrived example: s = "'Aren't you supposed to be " '"working"?' "', he asked with a wink." 
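Steven's example can be checked directly. This sketch is an editor's addition, not part of the original message: the three adjacent, differently-quoted literals collapse into one string at compile time, identical to the run-time '+' version.

```python
# Implicit concatenation of three differently-quoted literals:
s = "'Aren't you supposed to be " '"working"?' "', he asked with a wink."

# The explicit run-time equivalent:
t = "'Aren't you supposed to be " + '"working"?' + "', he asked with a wink."

print(s)
print(s == t)  # True
```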
Arguments about whether that is uglier than using backslashes to /dev/null please :-) -- Steven From solipsis at pitrou.net Fri May 17 10:51:45 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 17 May 2013 10:51:45 +0200 Subject: [Python-ideas] Allowing comments after line continuations References: <5195DCAA.9000605@egenix.com> Message-ID: <20130517105145.66ed7a2c@pitrou.net> On Fri, 17 May 2013 09:30:50 +0200, "M.-A. Lemburg" wrote: > > I'm -1 on making the backslash more attractive to use :-) > > In most use cases, you can create much more readable code > by using parentheses, which easily allow spanning statements > and expressions across multiple lines. > > Those read better, work better in editors (automatic indentation) > and are less error-prone than the backslash. > > If there are common use cases left which can currently not be > handled by parens, I'd be +1 on fixing those. I kind of agree with Marc-André. Regards Antoine. From steve at pearwood.info Fri May 17 10:59:38 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 17 May 2013 18:59:38 +1000 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: <51955D32.6060507@canterbury.ac.nz> Message-ID: <5195F17A.2000002@pearwood.info> On 17/05/13 10:04, Terry Jan Reedy wrote: > To me, having the \ below escape the newline that occurs 60 characters later is 'counter-intuitive'. > > a + \ # a very long comment that seems to go on and on forever You're misreading it, in my (not-so-)humble opinion :-) The backslash should not be interpreted as an escape, since escapes are only meaningful inside string literals. It should be interpreted as an instruction to the parser, telling it to treat the next line as a continuation of the current line. Anything following the \ needs to be ignored, so the only things which are legal after the backslash should be things which would be ignored anyway, namely whitespace or comments.
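The current rule is exactly that strict: the backslash must be the very last character before the newline. A quick demonstration via compile() (an editor's sketch, not from the original message):

```python
# A backslash that is the very last character continues the line:
good = "x = 1 + \\\n    2\n"
ns = {}
exec(compile(good, "<demo>", "exec"), ns)
print(ns["x"])  # 3

# A single space after the backslash is rejected at compile time:
bad = "x = 1 + \\ \n    2\n"
try:
    compile(bad, "<demo>", "exec")
except SyntaxError as e:
    print("SyntaxError:", e.msg)
```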
Putting the backslash at the end of the comment is ruled out by the requirement that # comments out everything until the end of the line. Before line continuations within brackets was introduced, I would have cared much more about this issue, but now I'm finding it hard to care. I'm only +0 on allowing comments after \ and +0.5 on allowing whitespace. -- Steven From steve at pearwood.info Fri May 17 11:01:52 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 17 May 2013 19:01:52 +1000 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <1368683512.98116.YahooMailNeo@web184706.mail.ne1.yahoo.com> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> <1368683512.98116.YahooMailNeo@web184706.mail.ne1.yahoo.com> Message-ID: <5195F200.5030108@pearwood.info> On 16/05/13 15:51, Andrew Barnert wrote: > From: Steven D'Aprano > > Sent: Wednesday, May 15, 2013 8:16 PM > >> On 16/05/13 07:27, Andrew Barnert wrote: >> >>>> The first thing you think of is, "Oh, I just need to use an >> OrderedDict.". Well, technically yes, except there's no convenient way >> to instantiate an OrderedDict with more than one element at a time. >> >> There's not *that* much difference between writing: > > You're quoting me quoting someone else (Don Spaulding) here. The problem may be that I'm using the horrible Yahoo webmail client, which is especially bad at indenting replies to rich-text emails, and therefore it's hard for you to tell what's going on? But I think this led to some confusion farther down. Ah, I knew that, but I lost the attribution to Don. Sorry about that. -- Steven From steve at pearwood.info Fri May 17 11:07:58 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 17 May 2013 19:07:58 +1000 Subject: [Python-ideas] Let's be more orderly! 
In-Reply-To: <5194E41F.6000006@stoneleaf.us> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> <5194E41F.6000006@stoneleaf.us> Message-ID: <5195F36E.3010905@pearwood.info> On 16/05/13 23:50, Ethan Furman wrote: > On 05/15/2013 08:16 PM, Steven D'Aprano wrote: >> >> I don't believe it can. Hence, when order is important, you cannot use keyword arguments to provide arguments *even if >> kwargs are ordered*. But if you write your function like this: >> >> def create_element(tag, mapping): >> pass >> >> and call it like this: >> >> create_element('img', OrderedDict([('alt', 'something'), ('src', 'something.jpg')])) >> >> then you can get order for free. Yes, it's a little less convenient to use a list of tuples than nice keyword syntax, >> but that's a solution that doesn't impose any costs on code that doesn't care about ordering. > > Which 'free' are you talking about? Because if the solution requires extra typing and extra visual clutter, it's not free. Free like a puppy :-) You make a good point. Perhaps "free" was a bad choice of words. Rather, let me say that if you need ordered keyword arguments, you can have them *right now* without waiting for the day when you can drop support for everything older than Python 3.4 (or whatever version gives you order-preserving kwargs). -- Steven From tismer at stackless.com Fri May 17 11:32:41 2013 From: tismer at stackless.com (Christian Tismer) Date: Fri, 17 May 2013 11:32:41 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> Message-ID: <5195F939.5030305@stackless.com> On 16.05.13 20:20, Chris Angelico wrote: > On Fri, May 17, 2013 at 4:14 AM, Bruce Leban wrote: >> On Thu, May 16, 2013 at 10:38 AM, MRAB wrote: >>> Why do you say """there must be at least one whitespace character >>> between the \ and the comment"""? >> >> Two reasons: >> >> (1) make the backslash more likely to stand out visually (and we can't >> require a space before it) >> >> (2) \# looks like it might be an escape sequence of some sort while I don't >> think \ # does, making this friendlier to readers. >> >> I'm not passionate about that detail if the rest of the proposal flies. > Spin that off as a separate thread, I think the change to the > backslash rules stands alone. I would support it; allowing a > line-continuation backslash to be followed by a comment is a Good > Thing imo. > I don't think these matters should be discussed in separate threads. We came from Guido's proposal to remove implicit string concatenation. In that context, some people argued that there should be no new ".", "&" or whatever operator rules, but better handling of the unbeloved backslash. I think both can and should be treated together. Doing so, I come to repeat this proposal: - implicit string concatenation becomes deprecated - the backslash will allow comments, as proposed by Bruce - continuation of a string on the next line will later enforce the backslash. So repeating Bruce's example, the following would be allowed: x = [ # THIS WOULD BE ALLOWED 'abc' \ 'def' \ # not the python keyword 'ghi' ] And this would be an error: x = [ # THIS WOULD BE AN ERROR 'abc' \ 'def' # a comment but no continuation \ 'ghi' ] '\' would become kind of a line glue operator that becomes needed to merge the strings. 
I don't think that parentheses are superior for that. Parentheses are for expressions and they suggest expressions. Avoiding parentheses where they don't group parts of expressions is imo a good thing. The reason why Python has grown the recommendation to use parentheses comes more from the absence of a good alternative. cheers - chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From storchaka at gmail.com Fri May 17 11:37:02 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 17 May 2013 12:37:02 +0300 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: On 16.05.13 21:41, Bruce Leban wrote: > At Chris Angelico's suggestion, starting another thread on this: > > The \ line continuation does not allow comments yet statements that span > multiple lines may need internal comments. Also spaces after the \ are > not allowed but trailing spaces are invisible to the reader but not to > the parser. If you use parentheses for continuation then you can add > comments but there are cases where parentheses don't work, for example, > in a with statement, as well as the current discussion of using \ > to make implicit string concatenation explicit. So I propose adopting > this rule for trailing \ continuation: > > The \ continuation character may be followed by white space and a > comment. If a comment is present, there must be at least one > whitespace character between the \ and the comment. I'm strong -1 on "\" line continuation working after a comment.
This is backward incompatible, counter-intuitive (the expected meaning is that a comment continues on the next line) and error-prone (placing "#" at the start of a line is used just to temporarily exclude some fragment of code). I'm -0.5 on allowing comments after "\" line continuation. This complicates the parser (and the one in the human mind) and looks contrary to all other languages which use "\" for line continuation. I'm only -0.1 on allowing spaces after "\" line continuation. Since "\ " causes a SyntaxError at compile time, it is not an issue. And trailing whitespace should be avoided in any case, after "\" or not. From vernondcole at gmail.com Fri May 17 11:37:17 2013 From: vernondcole at gmail.com (Vernon D. Cole) Date: Fri, 17 May 2013 03:37:17 -0600 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? Message-ID: On Fri, 17 May 2013 18:49:14 +1000 Steven D'Aprano wrote: > > Concatenating strings on the same line is a legitimate thing to do, > because you can mix different quoting types. In my opinion, there's no > really clean way to build string literals containing multiple types of > quotation marks, but implicit concatenation works and is useful. Here's a > contrived example: > > > s = "'Aren't you supposed to be " '"working"?' "', he asked with a wink." > > True. But I had to paste your example into the interpreter to figure out where the literals ended and started. s = "'Aren't you supposed to be " ... '"working"?' ... "', he asked with a wink." Is easier to read and understand. So is: s = "'Aren't you supposed to be " + '"working"?' + "', he asked with a wink." Which works today, if you don't mind doing it at run time. You can also use: s = """'Aren't you supposed to be "working"? ', he asked with a wink.""" Which also works fine, although the triple-double single-single combination is a bit frightening to look at. -- Vernon Cole -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jsbueno at python.org.br Fri May 17 11:55:20 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 17 May 2013 06:55:20 -0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <51953387.9070500@mrabarnett.plus.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <51953387.9070500@mrabarnett.plus.com> Message-ID: On 16 May 2013 16:29, MRAB wrote: > On 16/05/2013 20:03, Joao S. O. Bueno wrote: >> >> On 16 May 2013 12:57, Andrew Barnert wrote: >>> >>> And this means the parser has to figure out whether you mean dot for >>> attribute access or dot for concatenation. That's not exactly a _hard_ >>> problem, but it's not _trivial_. >> >> >> If you say it is not hard for the parser, OK - but it seems >> impossible for humans: >> >> upper = " World" >> print ("Hello". upper) >> > That's attribute access. But you are suggesting it should be string concatenation. It is already in use for attribute access, as you can see - and one writing a program, or reading one, should not have to be thinking """ah - but here I can't use the "." because I am concatenating a string in a variable, not a literal string""" > > The suggestion was to use it in place of implicit string concatenation, > which occurs only between string _literals_: > > print ("Hello" . " World") > > and is currently illegal ("SyntaxError: invalid syntax"). What is that? One thing that works in a way for literals and in another way for expressions? Sorry, but there is only one word for this: Insanity! From rosuav at gmail.com Fri May 17 13:09:53 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 17 May 2013 21:09:53 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <51953387.9070500@mrabarnett.plus.com> Message-ID: On Fri, May 17, 2013 at 7:55 PM, Joao S. O.
Bueno wrote: > On 16 May 2013 16:29, MRAB wrote: >> The suggestion was to use it in place of implicit string concatenation, >> which occurs only between string _literals_: >> >> print ("Hello" . " World") >> >> and is currently illegal ("SyntaxError: invalid syntax"). > > What is that? One thing that works in a way for literals and > in another way for expressions? > Sorry, but there is only one word for this: Insanity! One of the things I love about Python is that a "thing" can be used in the same ways whether it's from a literal, a variable/name lookup, a function return value, a class member, an instance member, etc, etc, etc. (Sometimes this requires strange magic, like member function calling, but you still have the principle that "a=foo.bar(quux)" and "_=foo.bar; a=_(quux)" do the same thing.) So anything that makes str.str mean something weird gets a -1 from me. The proposals involving ellipsis have at least the virtue that it's clearly a syntactic element and not an operator, but I suspect the syntax will be more problematic than useful. If it looks like an operator, it should BE an operator. ChrisA From steve at pearwood.info Fri May 17 13:41:44 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 17 May 2013 21:41:44 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <5195F939.5030305@stackless.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> Message-ID: <51961778.6080905@pearwood.info> On 17/05/13 19:32, Christian Tismer wrote: > On 16.05.13 20:20, Chris Angelico wrote: >> On Fri, May 17, 2013 at 4:14 AM, Bruce Leban wrote: >>> I'm not passionate about that detail if the rest of the proposal flies. >> >> Spin that off as a separate thread, I think the change to the >> backslash rules stands alone.
I would support it; allowing a >> line-continuation backslash to be followed by a comment is a Good >> Thing imo. >> > > I don't think these matters should be discussed in separate threads. They clearly should be in different threads. Line continuation is orthogonal to string continuation. You can have string concatenation on a single line: s = "Label:\t" r"Data containing \ backslashes" And you can have line continuations not involving strings: result = math.sin(23*theta) + cos(17*theta) - \ sin(3*theta**2)*cos(5*theta**3) Since the two things under discussion are independent, they should be discussed in different threads. > - implicit string concatenation becomes deprecated -1 Implicit string concatenation is useful, and used by many people without problems. > - the backslash will allow comments, as proposed by Bruce +0 It's not really that important these days. If you want comments, use brackets to group a multi-line expression. > - continuation of a string on the next line will later enforce the backslash. I don't understand what this sentence means. > So repeating Bruce's example, the following would be allowed: > > x = [ # THIS WOULD BE ALLOWED > 'abc' \ > 'def' \ # not the python keyword > 'ghi' > ] The backslashes are redundant, since the square brackets already enable a multi-line expression. > And this would be an error: > > x = [ # THIS WOULD BE AN ERROR > 'abc' \ > 'def' # a comment but no continuation \ > 'ghi' > ] > > '\' would become kind of a line glue operator that becomes > needed to merge the strings. -1 since there are uses for concatenating strings on a single line. > I don't think that parentheses are superior for that. > Parentheses are for expressions and they suggest expressions. > Avoiding parentheses where they don't group parts of expressions > is imo a good thing. I don't understand this objection, since the parentheses are being used to group an expression. And they are being used to group expressions. 
> The reason why Python has grown the recommendation to use parentheses > comes more from the absence of a good alternative. Maybe so, but now that we have multi-line expressions inside brackets, the need for an alternative is much reduced. -- Steven From ron3200 at gmail.com Fri May 17 16:14:39 2013 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 17 May 2013 09:14:39 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <51961778.6080905@pearwood.info> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> Message-ID: <51963B4F.7090307@gmail.com> On 05/17/2013 06:41 AM, Steven D'Aprano wrote: > On 17/05/13 19:32, Christian Tismer wrote: >> On 16.05.13 20:20, Chris Angelico wrote: >>> On Fri, May 17, 2013 at 4:14 AM, Bruce Leban >>> wrote: > >>>> I'm not passionate about that detail if the rest of the proposal flies. >>> >>> Spin that off as a separate thread, I think the change to the >>> backslash rules stands alone. I would support it; allowing a >>> line-continuation backslash to be followed by a comment is a Good >>> Thing imo. >>> >> >> I don't think these matters should be discussed in separate threads. > > They clearly should be in different threads. Line continuation is > orthogonal to string continuation. You can have string concatenation on a > single line: > > s = "Label:\t" r"Data containing \ backslashes" Can you think of, or find an example of two adjacent strings on the same line that can't be written as a single string? s = "Label:\t Data containing \ backslashes" I'm curious about how much of a problem not having implicit string concatenations really is? 
> And you can have line continuations not involving strings: > > result = math.sin(23*theta) + cos(17*theta) - \ > sin(3*theta**2)*cos(5*theta**3) > > > Since the two things under discussion are independent, they should be > discussed in different threads. > > >> - implicit string concatenation becomes deprecated > > -1 > > Implicit string concatenation is useful, and used by many people without > problems. This is why they are trying to find an explicit alternative. >> - the backslash will allow comments, as proposed by Bruce > > +0 > > It's not really that important these days. If you want comments, use > brackets to group a multi-line expression. > > >> - continuation of a string on the next line will later enforce the >> backslash. > > I don't understand what this sentence means. > > >> So repeating Bruce's example, the following would be allowed: >> >> x = [ # THIS WOULD BE ALLOWED >> 'abc' \ >> 'def' \ # not the python keyword >> 'ghi' >> ] > > The backslashes are redundant, since the square brackets already enable a > multi-line expression. But it is also a source of errors which appears to happen often enough, or is annoying enough, to be worth changing. Guido's example was a situation where a comma was left out and two strings were joined inside a list without an error message. If you accidentally put a comma in a multi line expression inside parentheses, it becomes a tuple without an error message. >>> ('abc' ... 'def', ... 'ghi') ('abcdef', 'ghi') By removing implicit string concatenations, an error can be raised in some of these situations. The fact that these errors are silent and may not be noticed until a programs actually used is an important part of this. Or even worse, not noticed at all! >> '\' would become kind of a line glue operator that becomes >> needed to merge the strings. > > -1 since there are uses for concatenating strings on a single line. Guido's suggestion is just to live with using a '+'. 
His point was that any extra overhead wouldn't be that harmful as literal string concatenations tend to be in initiation parts of programs. But the + has a lower precedence than %. Which is inconvenient. >> I don't think that parentheses are superior for that. >> Parentheses are for expressions and they suggest expressions. >> Avoiding parentheses where they don't group parts of expressions >> is imo a good thing. I agree with this. Especially if the expression being grouped has parentheses inside it. > I don't understand this objection, since the parentheses are being used to > group an expression. > And they are being used to group expressions. It's also a matter of reducing errors. I think it improves readability as well. Which also reduces errors. >> The reason why Python has grown the recommendation to use parentheses >> comes more from the absence of a good alternative. > > Maybe so, but now that we have multi-line expressions inside brackets, the > need for an alternative is much reduced. If you use braces, you get a one item list as the result. Parentheses are used to change an expressions order of evaluation as well. Ron From ckaynor at zindagigames.com Fri May 17 19:07:49 2013 From: ckaynor at zindagigames.com (Chris Kaynor) Date: Fri, 17 May 2013 10:07:49 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <51953387.9070500@mrabarnett.plus.com> Message-ID: On Fri, May 17, 2013 at 4:09 AM, Chris Angelico wrote: > On Fri, May 17, 2013 at 7:55 PM, Joao S. O. Bueno > wrote: > > On 16 May 2013 16:29, MRAB wrote: > >> The suggestion was to use it in place of implicit string concatenation, > >> which occurs only between string _literals_: > >> > >> print ("Hello" . " World") > >> > >> and is currently illegal ("SyntaxError: invalid syntax"). > > > > What is that? 
One thing that works in a way for literals and > > in another way for expressions? > > Sorry, but there is only one word for this: Insanity! > > One of the things I love about Python is that a "thing" can be used in > the same ways whether it's from a literal, a variable/name lookup, a > function return value, a class member, an instance member, etc, etc, > etc. (Sometimes this requires strange magic, like member function > calling, but you still have the principle that "a=foo.bar(quux)" and > "_=foo.bar; a=_(quux)" do the same thing.) So anything that makes > str.str mean something weird gets a -1 from me. The proposals > involving ellipsis have at least the virtue that it's clearly a > syntactic element and not an operator, but I suspect the syntax will > be more problematic than useful. > > If it looks like an operator, it should BE an operator. Just to point out that the "." is already overloaded in some cases in Python. Take a look at this literal: 1.2 Surely, that should mean the 2 attribute of the integer 1, correct? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri May 17 19:12:57 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 18 May 2013 03:12:57 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <51953387.9070500@mrabarnett.plus.com> Message-ID: On Sat, May 18, 2013 at 3:07 AM, Chris Kaynor wrote: > On Fri, May 17, 2013 at 4:09 AM, Chris Angelico wrote: >> >> On Fri, May 17, 2013 at 7:55 PM, Joao S. O. Bueno >> wrote: >> > On 16 May 2013 16:29, MRAB wrote: >> >> The suggestion was to use it in place of implicit string concatenation, >> >> which occurs only between string _literals_: >> >> >> >> print ("Hello" . " World") >> >> >> >> and is currently illegal ("SyntaxError: invalid syntax"). >> > >> > What is that?
One thing that works in a way for literals and >> > in another way for expressions? >> > Sorry, but there is only one word for this: Insanity! >> >> One of the things I love about Python is that a "thing" can be used in >> the same ways whether it's from a literal, a variable/name lookup, a >> function return value, a class member, an instance member, etc, etc, >> etc. (Sometimes this requires strange magic, like member function >> calling, but you still have the principle that "a=foo.bar(quux)" and >> "_=foo.bar; a=_(quux)" do the same thing.) So anything that makes >> str.str mean something weird gets a -1 from me. The proposals >> involving ellipsis have at least the virtue that it's clearly a >> syntactic element and not an operator, but I suspect the syntax will >> be more problematic than useful. >> >> If it looks like an operator, it should BE an operator. > > > Just to point out that the "." is already overloaded in some cases in > Python. Take a look at this literal: 1.2 > Surely, that should mean the 2 attribute of the integer 1, correct? Ahh, true. Good point. I guess literals follow slightly different rules. Still, I don't like the idea of: "hello" . "world" not being an operator. Removing all the whitespace doesn't help, since this notation is specifically about line continuation. ChrisA From jimjjewett at gmail.com Fri May 17 20:23:33 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 17 May 2013 14:23:33 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <51961778.6080905@pearwood.info> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> Message-ID: On Fri, May 17, 2013 at 7:41 AM, Steven D'Aprano wrote: > They clearly should be in different threads.
Line continuation is orthogonal > to string continuation. You can have string concatenation on a single line: In theory. In practice, the times when I'm having trouble fitting something onto a single line *and* cannot find a good place to break it (using parens), the problem almost always involves a string. And the number of times I needed to concatenate two strings on the same line (but wasn't willing to use a +) has been ... only when when a seemingly arbitrary syntax restriction requires a literal string -- basically, when writing a docstring. > On 17/05/13 19:32, Christian Tismer wrote: >> - continuation of a string on the next line will later enforce the >> backslash. > I don't understand what this sentence means. Today, (if you're not writing a docstring) you can write "abcd" "efgh" and it magically turns into "abcdefgh". He proposes that -- eventually -- you would have to write "abcd" \ "efgh" so that the \ would be an explicit indicator that you were continuing the line, and hadn't just forgotten a comma. > -1 since there are uses for concatenating strings on a single line. I understand "create a string demonstrating all the quoting conventions". I don't understand why an explicit + is so bad in that case. Nor do I understand what would be so horrible about breaking the physical line there. So the only use I know about is docstrings. And maybe that should be fixed there, instead. -jJ From rurpy at yahoo.com Fri May 17 19:14:15 2013 From: rurpy at yahoo.com (rurpy at yahoo.com) Date: Fri, 17 May 2013 10:14:15 -0700 (PDT) Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <5195EF0A.5090002@pearwood.info> References: <5195253B.9060001@gmail.com> <5195EF0A.5090002@pearwood.info> Message-ID: <01811487-6cc6-4c0d-9ab1-a81c804330bb@googlegroups.com> On 05/17/2013 02:49 AM, Steven D'Aprano wrote: > On 17/05/13 04:28, Ron Adam wrote: > >> I think the line continuation '\' character would also make a good >> explicit string literal concatenation character. It's already >> limited to only work across sequential lines as well. > > Concatenating strings on the same line is a legitimate thing to do, > because you can mix different quoting types. In my opinion, there's > no really clean way to build string literals containing multiple > types of quotation marks, but implicit concatenation works and is > useful. Here's a contrived example: > > > s = "'Aren't you supposed to be " '"working"?' "', he asked with a > wink." And here's a non-contrived one (almost) verbatim from working code: pattern = '[^\uFF1B\u30FB\u3001' r'+:=.,\/\[\]\t\r\n]+' '[\#\uFF03]+' In Python 2 this had been: pattern = ur'[^\uFF1B\u30FB\u3001+:=.,\/\[\]\t\r\n]+[\#\uFF03]+' but was changed to first form above due to Python 3's removal of lexical evaluation of \u literals in raw strings (see http://bugs.python.org/issue14973). Obviously the concatenation could have been done with the + operator but I felt the given form was clearer than trying to visually get whether any particular "+" was inside or outside of a string. There are other more complex regex with more +'s and my preference is to adopt a particular form I can use for most/all such rather than to tweak forms based on a particular string's content. 
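A runnable sketch of the point above: adjacent literals with different prefixes are merged at compile time, so each piece can use whichever quoting rules suit it. The pattern below is a slightly simplified version of the one quoted above (the `\#` escape is written as a plain `#` to avoid the invalid-escape warning):

```python
import re

# Adjacent literals with different prefixes are merged at compile time.
pattern = ('[^\uFF1B\u30FB\u3001'      # plain literal: \uXXXX escapes are evaluated
           r'+:=.,\/\[\]\t\r\n]+'      # raw literal: backslashes pass through to re
           '[#\uFF03]+')               # plain literal again

# The implicit form is identical to explicit '+' concatenation of the pieces.
assert pattern == ('[^\uFF1B\u30FB\u3001' + r'+:=.,\/\[\]\t\r\n]+' + '[#\uFF03]+')

# And the joined pieces compile as one regular expression.
rx = re.compile(pattern)
```

The point of the implicit form here is visual: every `+` that appears on the line is part of a string, not an operator between strings.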
I am assuming this discussion is regarding a possible Python 4 feature -- adjacent string literal concatenation has been documented behavior of Python going back to at least version 1.4.0 (the earliest doc available on python.org): "2.4.1.1 String literal concatenation Multiple adjacent string literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same as their concatenation..." I also have been using adjacent string literal concatenation in the "help" parameters of argparse calls as standard practice for many years, albeit on separate lines. From tjreedy at udel.edu Fri May 17 22:06:30 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Fri, 17 May 2013 16:06:30 -0400 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: <5195F17A.2000002@pearwood.info> References: <51955D32.6060507@canterbury.ac.nz> <5195F17A.2000002@pearwood.info> Message-ID: On 5/17/2013 4:59 AM, Steven D'Aprano wrote: > On 17/05/13 10:04, Terry Jan Reedy wrote: > >> To me, having the \ below escape the newline that occurs 60 characters >> later is 'counter-intuitive'. >> >> a + \ # a very long comment that seems to go on and on forever > You're misreading it, in my (not-so-)humble opinion :-) Yes, it is a rather imperious opinion ;-) > The backslash should not be interpreted as an escape, since escapes are > only meaningful inside string literals. 'Escape' means 'ignore the normal meaning of the following character'. That is exactly what \ means. 'Escape' had that meaning long before there were Python string literals. Ditto for the use of \ as an escape character, as in regular expressions. Regular expressions are typically not quoted, and the fact that they are in Python code, to first turn them into string objects rather than pattern objects, is a nuisance that led to the r prefix hack.
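The three escape contexts in play in this argument can be checked directly — backslash-newline outside a string, backslash-newline inside a string, and the r prefix that keeps backslashes for the regex engine:

```python
import re

# 1. Outside a string, backslash-newline continues the logical line.
total = 1 + \
        2
assert total == 3

# 2. Inside a string literal, backslash-newline is likewise swallowed.
s = "abc\
def"
assert s == "abcdef"

# 3. The r prefix suppresses Python-level escapes, so the regex engine
#    receives the backslash itself.
assert len("\t") == 1       # Python interprets the escape: one tab character
assert len(r"\t") == 2      # raw: backslash + 't' reach the regex engine
assert re.match(r"\d+", "42").group() == "42"
```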
Do you really think Guido just coincidentally chose \ to escape newline, ignorant of its two-decade history in Unix? Anyway, this is all moot unless the syntax is changed in a way that forces a different interpretation. I don't think that is needed. Terry From nepenthesdev at gmail.com Fri May 17 22:33:29 2013 From: nepenthesdev at gmail.com (Markus) Date: Fri, 17 May 2013 22:33:29 +0200 Subject: [Python-ideas] sqlite3 In-Reply-To: <5194F0E8.6000201@egenix.com> References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> <5194D16F.5060408@egenix.com> <5194F0E8.6000201@egenix.com> Message-ID: Please consider that sqlite will come in version 4 some day - http://sqlite.org/src4/doc/trunk/www/index.wiki -- Markus From tjreedy at udel.edu Fri May 17 22:34:54 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Fri, 17 May 2013 16:34:54 -0400 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: Message-ID: On 5/17/2013 5:37 AM, Serhiy Storchaka wrote: > I'm only -0.1 on allowing spaces after "\" line continuation. While "\ " > causes SyntaxError at compile time it is not an issue. And trailing > whitespaces should be avoided in any cases, after "\" or not. Python never requires trailing whitespace, so there is never a need for trailing white space (except possibly within a multi-line string) and therefore no need (with the exception above) of getting into the habit of adding whitespace. Decent programming editors should have a means to strip trailing whitespace (Idle does)*. Run that (a good habit) and 'xys\ ' is fixed. The Python repository now rejects (new) code with trailing whitespace. Idle's Strip Trailing Whitespace does so on all lines, even if part of a multiline string. That may or may not be what one wants. To avoid the stripping, append '\n\' to the line -- which also makes the whitespace visible and the intention clear.
s = '''abd \n\ efg''' print(s) # produces abd efg (move cursor to detect space after d) -- Terry Jan Reedy From mal at egenix.com Fri May 17 22:44:47 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 17 May 2013 22:44:47 +0200 Subject: [Python-ideas] sqlite3 In-Reply-To: References: <87li7fd954.fsf@uwakimon.sk.tsukuba.ac.jp> <5194D16F.5060408@egenix.com> <5194F0E8.6000201@egenix.com> Message-ID: <519696BF.7040900@egenix.com> On 17.05.2013 22:33, Markus wrote: > Please consider sqlite will come in version 4 some day - > http://sqlite.org/src4/doc/trunk/www/index.wiki Right, and that will need a new Python module (or at least one that also supports SQLite 4): http://sqlite.org/src4/doc/trunk/www/porting.wiki According to the wiki, it is intended to be used as alternative to SQLite 3, not as replacement: http://sqlite.org/src4/doc/trunk/www/design.wiki -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 17 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From bruce at leapyear.org Fri May 17 23:04:08 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 17 May 2013 14:04:08 -0700 Subject: [Python-ideas] Allowing comments after line continuations In-Reply-To: References: <51955D32.6060507@canterbury.ac.nz> <5195F17A.2000002@pearwood.info> Message-ID: On Fri, May 17, 2013 at 1:06 PM, Terry Jan Reedy wrote: > 'Escape' means 'ignore the normal meaning of the following character'. > \ is not only an escape character as you define it. Sometimes it means the exact opposite of that: the following character has special meaning, e.g., \n \t etc. And it is also not the case that \ only applies to the single following character. \123 is a four-character escape sequence, \u1234 is a six-character sequence and \U12345678 is a ten-character sequence! (And Python is not the only language that recognizes long escape sequences.) So in this case, while the current escape sequence is \, the new proposed one is a backslash followed by optional whitespace and an optional comment. Or writing this in the style used in the Python docs: "\" whitespace-other-than-newline* [ "#" anything-but-newline* ] newline I understand if you disagree with the proposal. But I don't think an argument that it is fundamentally ill-defined and ignorant of history is valid. --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security From rurpy at yahoo.com Fri May 17 23:41:34 2013 From: rurpy at yahoo.com (rurpy at yahoo.com) Date: Fri, 17 May 2013 14:41:34 -0700 (PDT) Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: <51963B4F.7090307@gmail.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> Message-ID: <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> On Friday, May 17, 2013 8:14:39 AM UTC-6, Ron Adam wrote: > On 05/17/2013 06:41 AM, Steven D'Aprano wrote: > > They clearly should be in different threads. Line continuation is > > orthogonal to string continuation. You can have string concatenation on > a > > single line: > > > > s = "Label:\t" r"Data containing \ backslashes" > > Can you think of, or find an example of two adjacent strings on the same > line that can't be written as a single string? > > s = "Label:\t Data containing \ backslashes" > > I'm curious about how much of a problem not having implicit string > concatenations really is? > "Can't" is an unrealistically high bar but I posted a real example at http://mail.python.org/pipermail/python-ideas/2013-May/020847.html that is *better* written IMO as adjacently-concatenated string literals. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ron3200 at gmail.com Sat May 18 20:16:29 2013 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 18 May 2013 13:16:29 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> Message-ID: <5197C57D.1000807@gmail.com> On 05/17/2013 04:41 PM, rurpy at yahoo.com wrote: > > On Friday, May 17, 2013 8:14:39 AM UTC-6, Ron Adam wrote: > > On 05/17/2013 06:41 AM, Steven D'Aprano wrote: > > They clearly should be in different threads. Line continuation is > > orthogonal to string continuation. You can have string concatenation > on a > > single line: > > > > s = "Label:\t" r"Data containing \ backslashes" > > Can you think of, or find an example of two adjacent strings on the same > line that can't be written as a single string? > > s = "Label:\t Data containing \ backslashes" > > I'm curious about how much of a problem not having implicit string > concatenations really is? > > > "Can't" is an unrealistically high bar but I posted a real example at > http://mail.python.org/pipermail/python-ideas/2013-May/020847.html > that is *better* written IMO as adjacently-concatenated string literals. If we didn't have implicit string concatenation, I'd probably write it with each part on a separate line to make it easier to read. pattern = '[^\uFF1B\u30FB\u3001' \ + r'+:=.,\/\[\]\t\r\n]+' \ + '[\#\uFF03]+' I think in this case the strings are joined at compile time as Guido suggested in his post. You could also write it as...
pattern = ('[^\uFF1B\u30FB\u3001' + r'+:=.,\/\[\]\t\r\n]+' + '[\#\uFF03]+') If implicit string concatenation is removed, it would be nice if there was an explicit replacement for it. There is a strong consensus for doing it, but there isn't strong consensus on how to do it. About line continuations: Line continuations are a related issue to string concatenations because they are used together fairly often. The line continuation behaviour is a bit quarky, but not in any critical way. There has even been a PEP to remove it in Python 3, but it was rejected for not having enough support. People do use it, so it would be better if it was improved rather than removed. As noted in other messages, the line continuation is copied from C, which I think originally came from the 'Make' utility. (I'm not positive on that) In C and Make, the \+newline pair is replaced with a space. Python just removes both the \+newline and keeps track of whether or not it's in a string. Look in tokenize.c for this. As for the *not too important* quarkyness: >>> 'abc' \ 'efg' File "<stdin>", line 1 'abc' \ 'efg' ^ SyntaxError: unexpected character after line continuation character This error implies that the '\' by itself is a line continuation token even though it's not followed by a newline. Otherwise you would get the same SyntaxError you get when you use any other symbol in an invalid way. This was probably done either because it was easy to do, and/or because a better error message is more helpful. Trailing white space results in the same error. This happens enough to be annoying. It is confusing to some people why the compiler can recognise the line continuation *character*, but can't figure out that the white space after it is not important. >>> # comment 1\ ... comment 2 File "<stdin>", line 2 comment 2 ^ SyntaxError: invalid syntax This just shows that comments are parsed before line continuations are considered. Or to put it another way.. the '\' is part of the comment.
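The two quirks above can be reproduced with compile(), which makes the behaviour easy to verify without a REPL (a sketch; exact error messages vary by Python version):

```python
# Backslash immediately before the newline: the line continues, and the two
# literals are implicitly concatenated.
ok = "x = 'abc' \\\n'efg'\n"
ns = {}
exec(compile(ok, "<demo>", "exec"), ns)
assert ns["x"] == "abcefg"

# A space after the backslash triggers the "unexpected character after
# line continuation character" error described above.
try:
    compile("x = 'abc' \\ \n'efg'\n", "<demo>", "exec")
except SyntaxError:
    pass
else:
    raise AssertionError("expected SyntaxError")

# A backslash at the end of a comment is part of the comment, so the next
# line is parsed on its own and fails.
try:
    compile("# comment 1\\\ncomment 2\n", "<demo>", "exec")
except SyntaxError:
    pass
else:
    raise AssertionError("expected SyntaxError")
```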
That isn't the case in C or Make. You can continue a comment on the next line with a line continuation. Nothing wrong with this, but it shows the line continuations in Python aren't exact copies of the line continuation in C. There are perfectly good reasons why the compiler does what it does in each of these cases. I think the little things like this together have contributed to the feeling that line continuations are bad and should be avoided. The discussed (and implied) options: There are a number of options that have been discussed but those haven't really been clearly spelled out so the discussion has been kind of out of focus. This seems like an overly detailed list, but the discussion has touched on pretty much all of these things. I think the goal should be to find the most cohesive combination for Python 4 and/or just go with B alone.

A. Do nothing.
B. Remove implicit concatenation. (We could stop here, anything after this can be done later.)
C. Remove explicit line continuations. (See options below.)
D. Add a new explicit string concatenation token.
E. Reuse the \ as an explicit string concatenation. (with C)
F. Make an exception for implicit string concatenations only after a line continuation. (with B)
G. Make an exception for line continuations if a line ends with an explicit string concatenation. (With C and (D or E))
H. Change the line continuation character from \+newline to just \.
I. Allow implicit line continuations if a line ends with an operator that expects to be continued, like a comma inside parentheses already does. (With C)

Option H has some interesting possibilities. It pretty much is a complete replacement for the current escaped newline continuation, so how it works, and what constraints it has, would need to be discussed. It's the option that would allow white space and comments after a line continuation character. Option I is interesting because it's already there inside of parentheses, and other containers.
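Option I is essentially how bracketed expressions already behave today — an open bracket suspends the significance of the newline until its match is found, so no backslash is needed:

```python
# Inside ( ), [ ] or { }, newlines are not statement terminators.
total = (1 +
         2 +
         3)
items = [
    'red',
    'green',
]
mapping = {
    'a': 1,
    'b': 2,
}
assert total == 6
assert items == ['red', 'green']
assert mapping == {'a': 1, 'b': 2}
```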
It's just that I haven't seen it described as an implicit line continuation before. It is my feeling that we can't change the escaped newline within strings. That needs to be how it is, and it should be documented as a string feature, rather than a general line continuation token. So if line continuations outside of strings are removed, escaped newlines inside of strings will still work. There are so many possibilities here, that the only thing I'm sure of right now is to go ahead and start the process of removing implicit string concatenations (Option B), and then consider everything else as separate issues in that context. Cheers, Ron From steve at pearwood.info Sat May 18 22:35:29 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 19 May 2013 06:35:29 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <51963B4F.7090307@gmail.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> Message-ID: <5197E611.80608@pearwood.info> On 18/05/13 00:14, Ron Adam wrote: > > On 05/17/2013 06:41 AM, Steven D'Aprano wrote: >> On 17/05/13 19:32, Christian Tismer wrote: [...] > Guido's example was a situation where a comma was left out and two strings were joined inside a list without an error message. Actually, no, his error was inside a function call, and he was getting a TypeError of one too few arguments: [quote] I got a mysterious argument count error because instead of foo('a', 'b') I had written foo('a' 'b'). [end quote] > If you accidentally put a comma in a multi line expression inside parentheses, it becomes a tuple without an error message. > >>>> ('abc' > ... 'def', > ...
'ghi') > ('abcdef', 'ghi') I think that in a realistic example, this sort of error is less likely than it might appear from such a trivial example. Normally you don't just create a string and do nothing with it. Here's an example from my own code: standardMsg = ( "actual and expected sequences differ in length; expected %d" " items but found %d." % (len(expected), len(actual)) ) msg = self._formatMessage(msg, standardMsg) If I were to accidentally insert an unwanted comma in the middle of the concatenation, I would find out immediately. Python performs very little compile-time checking for you, and that's a virtue. The cost of this is that if you type something you didn't want, Python will do it for you regardless, and you won't find out until you try to use it. The solution is that when typing up repetitive code, you have to be a little more vigilant in Python than you would need to be in some other languages, because Python won't protect you from certain types of typo: list_of_floats = [1.2345, 2.3456, 3,4567, 4.5678] Python will not warn you that you have two ints where you expected one float. I've made this mistake, and then spent inordinate amounts of time not noticing the comma, but I still don't have much sympathy with the view that it is the responsibility of the language to protect me from this sort of typo. [...] > Guido's suggestion is just to live with using a '+'. His point was that any extra overhead wouldn't be that harmful as literal string concatenations tend to be in initialization parts of programs. I think I have found the fatal problem with that suggestion: it rules out using concatenation in docstrings at all. py> def test(): ... """Doc strings """ + "must be literals." ... py> test.__doc__ is None True The equivalent with implicit concatenation works as expected. [...] >>> The reason why Python has grown the recommendation to use parentheses >>> comes more from the absence of a good alternative.
>> >> Maybe so, but now that we have multi-line expressions inside brackets, the >> need for an alternative is much reduced. > > If you use braces, you get a one item list as the result. > > Parentheses are used to change an expressions order of evaluation as well. Just for the record, I am from Australia. Like in the UK, when we talk about "brackets", we mean *any* type of bracket, whether round, square or curly. Or as Americans may say, parentheses, brackets, braces. So when I say that we have multi-line expressions inside brackets, I'm referring to the fact that all three of ( [ and { act as explicit line continuations up to their matching closing bracket. -- Steven From steve at pearwood.info Sat May 18 22:58:05 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 19 May 2013 06:58:05 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: <5197C57D.1000807@gmail.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> Message-ID: <5197EB5D.7000305@pearwood.info> On 19/05/13 04:16, Ron Adam wrote: > If implicit string concatenation is removed, it would be nice if there was an explicit replacement for it. There is a strong consensus for doing it, I don't think there is. From what I have seen, there have been nearly as many people objecting to the proposed removal as there have been people supporting it, it is only that some of the people supporting the removal are more vocal, proposing alternative after alternative, none of which are particularly nice. Single dot, ellipsis, yet another string prefix c'', forced backslashes, ampersand... Have I missed any? 
> but there isn't strong consensus on how to do it. > About line continuations: > > Line continuations are a related issue to string concatenations because they are used together fairly often. They might be related, but they are orthogonal. We could change one, or the other, or both, or neither. There are virtues to changing the behaviour of \ line concatenation independent of any changes made to strings. > The line continuation behaviour is a bit quarky, Do you mean "quirky"? Quarky would mean "like quark(s)", which could refer to something being like a type of German cream cheese, or possibly like fundamental subatomic particles that make up protons and neutrons. > There are a number of options that have been discussed but those haven't really been clearly spelled out so the discussion has been kind of out of focus. This seems like an overly detailed list, but the discussion has touched on pretty much all of these things. I think the goal should be to find the most cohesive combination for Python 4 and/or just go with B alone. > > > A. Do nothing. > > B. Remove implicit concatenation. > > (We could stop here, anything after this can be done later.) We can't just "remove implicit concatenation", because that will break code which is currently working perfectly. And probably it will break more working code than it will fix unnoticed broken code. So removal requires a deprecation schedule: deprecate for at least one release. The conservative approach is: * mark as deprecated in the docs in 3.4; * raise a deprecated warning in 3.5; * remove in 3.6. or even later. Any removal of functionality leads to code churn: people will be forced to change code that works now because it will stop working in the future. That's a serious cost even when there are clear and obvious benefits to the removal. 
-- Steven From ron3200 at gmail.com Sun May 19 00:20:55 2013 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 18 May 2013 17:20:55 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: <5197EB5D.7000305@pearwood.info> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> Message-ID: <5197FEC7.8080101@gmail.com> On 05/18/2013 03:58 PM, Steven D'Aprano wrote: > On 19/05/13 04:16, Ron Adam wrote: > >> If implicit string concatenation is removed, it would be nice if there >> was an explicit replacement for it. There is a strong consensus for >> doing it, > > I don't think there is. From what I have seen, there have been nearly as > many people objecting to the proposed removal as there have been people > supporting it, ... Correct, there isn't a very strong consensus for the removal. But the discussion has been focused more on a replacement than on a eventual future removal down the road. If it was to be removed (as I said), there is a strong consensus for some sort of an explicit variation to replace it. But there isn't any agreement on how to do that. The discussion is split between not removing it and removing it with some sort of replacement. We need to know how many people are ok with removing it even if a replacement is not found. (It doesn't mean one won't be found.) it is only that some of the people supporting the removal > are more vocal, proposing alternative after alternative, none of which are > particularly nice. Single dot, ellipsis, yet another string prefix c'', > forced backslashes, ampersand... Have I missed any? 
>> but there isn't strong consensus on how to do it. > >> About line continuations: >> >> Line continuations are a related issue to string concatenations because >> they are used together fairly often. > > They might be related, but they are orthogonal. We could change one, or the > other, or both, or neither. There are virtues to changing the behaviour of > \ line concatenation independent of any changes made to strings. I agree. >> The line continuation behaviour is a bit quarky, > > Do you mean "quirky"? Quarky would mean "like quark(s)", which could refer > to something being like a type of German cream cheese, or possibly like > fundamental subatomic particles that make up protons and neutrons. LOL.. Yes quirky. Definitely not the cheese. ;-) >> There are a number of options that have been discussed but those haven't >> really been clearly spelled out so the discussion has been kind of out of >> focus. This seems like an overly detailed list, but the discussion has >> touched on pretty much all of these things. I think the goal should be >> to find the most cohesive combination for Python 4 and/or just go with B >> alone. >> >> >> A. Do nothing. >> >> B. Remove implicit concatenation. >> >> (We could stop here, anything after this can be done later.) > > > We can't just "remove implicit concatenation", because that will break code > which is currently working perfectly. And probably it will break more > working code than it will fix unnoticed broken code. > > So removal requires a deprecation schedule: deprecate for at least one > release. The conservative approach is: > > * mark as deprecated in the docs in 3.4; > > * raise a deprecated warning in 3.5; > > * remove in 3.6. Correct, and is why I believe we should start the process... with the intention of doing it in python 4 or possibly earlier if there is support for doing it sooner. (I had put that in, but it got edited out.) Any way, this is my vote. > or even later. 
> Any removal of functionality leads to code churn: people > will be forced to change code that works now because it will stop working > in the future. That's a serious cost even when there are clear and obvious > benefits to the removal. I agree with this also. I think starting the process now and deprecating it sooner rather than later would help reduce the code churn down the road. It will also help focus any future discussions of the additional features in the light of implicit concatenation being removed. Cheers, Ron From haoyi.sg at gmail.com Sun May 19 04:13:12 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Sat, 18 May 2013 22:13:12 -0400 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: <5195F36E.3010905@pearwood.info> References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> <5194E41F.6000006@stoneleaf.us> <5195F36E.3010905@pearwood.info> Message-ID: Forgive me if this has been mentioned before (I don't think it has) but how about an option somehow to take the list of **kwargs as an association list? I am approaching this from the point of view of "why am I putting everything into a hashmap just to iterate over it later", as you can see in the way the namedtuple constructor is implemented: http://docs.python.org/2/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields This may be rather out there, and I'm not sure if it'll speed things up much, but I'm guessing iterating over an assoc list is faster than iterating over anything else. Building an assoc list is also probably faster than building anything else, and it's also the most easily convertible (either to OrderedDict or unordered dict) since it preserves all information.
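The association-list idea above can be illustrated with a plain list of (key, value) pairs. This is only a sketch of the data structure's properties — order-preserving iteration and lossless conversion to either mapping flavour — not the hypothetical ordered-**kwargs machinery itself:

```python
from collections import OrderedDict

# An association list: just an ordered list of (key, value) pairs.
pairs = [('x', 1), ('y', 2), ('z', 3)]

# Iterating it is plain list traversal -- order is list order, no hashing.
keys = [k for k, _ in pairs]

# It converts losslessly to either mapping flavour.
ordered = OrderedDict(pairs)
plain = dict(pairs)

print(keys)           # ['x', 'y', 'z']
print(list(ordered))  # ['x', 'y', 'z']
```

Converting the other way (mapping back to pairs) also works via `list(ordered.items())`, which is what "preserves all information" means here.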
-Haoyi On Fri, May 17, 2013 at 5:07 AM, Steven D'Aprano wrote: > On 16/05/13 23:50, Ethan Furman wrote: > >> On 05/15/2013 08:16 PM, Steven D'Aprano wrote: >> >>> >>> I don't believe it can. Hence, when order is important, you cannot use >>> keyword arguments to provide arguments *even if >>> kwargs are ordered*. But if you write your function like this: >>> >>> def create_element(tag, mapping): >>> pass >>> >>> and call it like this: >>> >>> create_element('img', OrderedDict([('alt', 'something'), ('src', >>> 'something.jpg')])) >>> >>> then you can get order for free. Yes, it's a little less convenient to >>> use a list of tuples than nice keyword syntax, >>> but that's a solution that doesn't impose any costs on code that doesn't >>> care about ordering. >>> >> >> Which 'free' are you talking about? Because if the solution requires >> extra typing and extra visual clutter, it's not free. >> > > Free like a puppy :-) > > You make a good point. Perhaps "free" was a bad choice of words. Rather, > let me say that if you need ordered keyword arguments, you can have them > *right now* without waiting for the day when you can drop support for > everything older than Python 3.4 (or whatever version gives you > order-preserving kwargs). > > > > -- > Steven > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed...
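Steven's suggestion above runs today on any Python with `OrderedDict`. The quoted thread only shows `pass` as the body, so the rendering logic below is an assumed stand-in purely for illustration — it just emits the attributes in whatever order the mapping yields them:

```python
from collections import OrderedDict

def create_element(tag, mapping):
    # Assumed illustrative body: render attributes in mapping order.
    attrs = ' '.join('{}="{}"'.format(k, v) for k, v in mapping.items())
    return '<{} {}>'.format(tag, attrs)

elem = create_element('img', OrderedDict([('alt', 'something'),
                                          ('src', 'something.jpg')]))
print(elem)  # <img alt="something" src="something.jpg">
```

The point of the list-of-tuples spelling is that the caller, not the dict's hash order, controls attribute order.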
URL: From dreamingforward at gmail.com Sun May 19 20:58:11 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sun, 19 May 2013 11:58:11 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: <5197EB5D.7000305@pearwood.info> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> Message-ID: > We can't just "remove implicit concatenation", because that will break code > which is currently working perfectly. And probably it will break more > working code than it will fix unnoticed broken code. Really? Isn't the number of programs breaking roughly equal to 2, perhaps less? MarkJ Tacoma, Washington From ned at nedbatchelder.com Sun May 19 21:23:29 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 19 May 2013 15:23:29 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> Message-ID: <519926B1.30700@nedbatchelder.com> On 5/19/2013 2:58 PM, Mark Janssen wrote: >> We can't just "remove implicit concatenation", because that will break code >> which is currently working perfectly. And probably it will break more >> working code than it will fix unnoticed broken code. > Really? 
Isn't the number of programs breaking roughly equal to 2, perhaps less? Interesting, how did you get that number? --Ned. > MarkJ > Tacoma, Washington > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ncoghlan at gmail.com Mon May 20 00:33:50 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 08:33:50 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: <519926B1.30700@nedbatchelder.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> <519926B1.30700@nedbatchelder.com> Message-ID: On 20 May 2013 05:24, "Ned Batchelder" wrote: > > > On 5/19/2013 2:58 PM, Mark Janssen wrote: >>> >>> We can't just "remove implicit concatenation", because that will break code >>> which is currently working perfectly. And probably it will break more >>> working code than it will fix unnoticed broken code. >> >> Really? Isn't the number of programs breaking roughly equal to 2, perhaps less? > > > Interesting, how did you get that number? If it's based on the contents of these threads, be aware that at least one core developer (me) and probably more have already mostly tuned out on the grounds that the feature is obviously in wide enough use that changing it will break the world without adequate gain. We don't even have to speculate on what others might be doing, we know it would break *our* code. For example, porting Fedora to Python 3 is already going to be a pain. 
Breaking implicit string concatenation would be yet another road block making that transition more difficult. Cheers, Nick. > > --Ned. > > >> MarkJ >> Tacoma, Washington >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Mon May 20 14:39:05 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 20 May 2013 15:39:05 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> Message-ID: 19.05.13 21:58, Mark Janssen ???????(??): >> We can't just "remove implicit concatenation", because that will break code >> which is currently working perfectly. And probably it will break more >> working code than it will fix unnoticed broken code. > > Really? Isn't the number of programs breaking roughly equal to 2, perhaps less? One is Python interpreter itself. What is other one? 
From storchaka at gmail.com Mon May 20 14:40:35 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 20 May 2013 15:40:35 +0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> <519926B1.30700@nedbatchelder.com> Message-ID: 20.05.13 01:33, Nick Coghlan ???????(??): > For example, porting Fedora to Python 3 is already going to be a pain. > Breaking implicit string concatenation would be yet another road block > making that transition more difficult. It will be a good cause for people to use Python 3 (but not Python 4). From jsbueno at python.org.br Mon May 20 15:20:53 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Mon, 20 May 2013 10:20:53 -0300 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> Message-ID: On 19 May 2013 15:58, Mark Janssen wrote: >> We can't just "remove implicit concatenation", because that will break code >> which is currently working perfectly. And probably it will break more >> working code than it will fix unnoticed broken code. > > Really? 
> Isn't the number of programs breaking roughly equal to 2, perhaps less? Actually, I find this wording somewhat offensive. I have to make use of this feature to code in long log strings quite often: as in human-readable long strings that can't have an arbitrary amount of whitespace inside (not the case for embedded SQL/HTML snippets), and yet have to be indented along with the code. That is why my only other e-mail on this thread is about adding some syntax for auto-dedenting multiline strings. Don't get me wrong, I dislike auto-concatenation as much as the next guy - typing a new set of \" \" on each line sometimes makes me wonder if I should stop and code a plug-in for that on my editor - but currently it is the only way of making "pretty enterprise code" with long strings, short of even more verbose calls to "dedent" or explicit concatenation (which would not save typing the \" \" anyway, and would just add even more typing). But if you have an ok way of adding a long human-readable string into code with less typing and correct indentation, with the existing syntax, I'd like to know how you do it. That would be better than saying "only 2 programs use this". js -><- > > MarkJ > Tacoma, Washington > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From ethan at stoneleaf.us Mon May 20 15:26:37 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 20 May 2013 06:26:37 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: Message-ID: <519A248D.10305@stoneleaf.us> On 05/10/2013 05:36 PM, Michael Mitchell wrote: > On Fri, May 10, 2013 at 7:08 PM, Alexander Belopolsky wrote: > > Does this earn a point? > > x = (+ 'foo\n' > > + 'bar\n' + 'baz\n' > ) > > > Plus doesn't make sense as a unary operator on strings. > > x = ('foo\n' + > 'bar\n' + > 'baz\n' + > '') > > This would work.
Except your last line is now an empty string, and still with no trailing +. -- ~Ethan~ From rosuav at gmail.com Mon May 20 18:21:26 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 21 May 2013 02:21:26 +1000 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> Message-ID: On Mon, May 20, 2013 at 10:39 PM, Serhiy Storchaka wrote: > 19.05.13 21:58, Mark Janssen ???????(??): > >>> We can't just "remove implicit concatenation", because that will break >>> code >>> which is currently working perfectly. And probably it will break more >>> working code than it will fix unnoticed broken code. >> >> >> Really? Isn't the number of programs breaking roughly equal to 2, perhaps >> less? > > > One is Python interpreter itself. What is other one? And the other, with apologies to WS Gilbert, isn't. But it really doesn't matter. As long as that number is greater than zero, changing this will be a problem. I've not seen a single suggestion that doesn't have downsides as annoying as implicit concat's. In the absence of a *strong* alternative, I would be against any sort of change; why break code if the replacement is hardly better than the current? 
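The mistake that started this whole thread is easy to reproduce: drop a comma in a literal list and the two neighbouring literals silently fuse into one string.

```python
# A missing comma after 'bar' -- the intent was three items.
colors = ['foo', 'bar' 'baz']
print(len(colors))  # 2, not 3
print(colors)       # ['foo', 'barbaz']

# The explicit '+' spelling produces the same value, but the operator
# at least makes the join visible to the reader.
explicit = ['foo', 'bar' + 'baz']
print(colors == explicit)  # True
```

No error is raised in either case, which is exactly why the bug is hard to notice.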
ChrisA From ron3200 at gmail.com Mon May 20 18:46:28 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 20 May 2013 11:46:28 -0500 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> <519926B1.30700@nedbatchelder.com> Message-ID: <519A5364.9060204@gmail.com> On 05/19/2013 05:33 PM, Nick Coghlan wrote: > If it's based on the contents of these threads, be aware that at least one > core developer (me) and probably more have already mostly tuned out on the > grounds that the feature is obviously in wide enough use that changing it > will break the world without adequate gain. We don't even have to speculate > on what others might be doing, we know it would break *our* code. Ok, so is it your opinion that, in order to remove implicit string joining, an explicit replacement must be put in at the same time? > For example, porting Fedora to Python 3 is already going to be a pain. > Breaking implicit string concatenation would be yet another road block > making that transition more difficult. This sounds more like a general request to not make any changes, rather than something about the specific item itself. To be clear, this is going to need a long removal schedule. Nothing will actually be removed before 3.7 or later. Maybe two years from now? How about this: First, let's please differentiate string continuation from string concatenation. A string continuation would be a pre-run-time alteration. A string concatenation would be a run-time operation.
Documenting them that way will help make them easier to discuss and teach to new users. Redefine a line continuation character to be strictly a \+\n sequence. That removes the "character after line continuation" errors because a '\' without a newline after it isn't technically a line continuation character. Then use a '\' that is not at the end of a line as the explicit string continuation character. This should be easy to do also. We could add this in sooner rather than later. I don't think it would be a difficult patch, and I also don't think it would break anything. Implicit string continuations could be deprecated at the same time, with the recommendation to start using the more explicit variation. *But do not remove implicit string continuations until Python 4.0.* String continuations are a similar concept to line continuations, so the reuse of '\' for it is an easy concept to learn and remember. It's also easy to explain. This does not change a '\' used inside a string. String escape codes have their own rules. Examples:

foo('a' 'b'):   # This won't cause an error until Python 4.0

x = 'foo\n' \ 'bar\n' \ 'baz\n'

x = ( 'foo\n'   # easy to see trailing commas here.
      \ 'bar\n'
      \ 'baz\n'
    )

x = 'foo\n' \
    \ 'bar\n' \
    \ 'baz\n'

If we allow \+newline to work as both a string continuation and line continuation, this could be...

x = 'foo\n' \
    'bar\n' \
    'baz\n'

This is probably the least disruptive way to do this, and the '\' as a string continuation is consistent with the \+\n as a line continuation. A final note ... I think we can easily allow comments after line continuations if there is no space between the '\' and the '#'.

x = 'foo\n' \# This comment is removed.
    'bar\n' \# The new-line at the end is not removed.
    'baz\n'

If the tokenizer finds a '\' followed by a '#', it could remove the comment, back up one, and continue. What would happen is the \+comment+\n would be converted to \+\n. No space can be between the '\' and '#' for this to work.
Seems like this should already work, but the current check for an invalid character after a line continuation raises an error before this can happen. Cheers, Ron From dreamingforward at gmail.com Mon May 20 19:12:22 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 20 May 2013 10:12:22 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: <519926B1.30700@nedbatchelder.com> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> <519926B1.30700@nedbatchelder.com> Message-ID: >> Really? Isn't the number of programs breaking roughly equal to 2, perhaps >> less? > > Interesting, how did you get that number? I was making a joke using "unreasonable precision", but I would like to actually see more than that (meaning: I don't think there is) in the standard library. There just isn't much, if at all, of a programmatic reason to use such a construct. It's 1) more typing, 2) a highly improbable sequence to have worked accidentally for the programmer, 3) it doesn't really satisfy any conceptual separation that I can envision (putting two string literals on the same line? what possible purpose?) And this is the point -- it's more likely a programmer error. Really, I have a hard time believing that the number of programs that would break is larger than a handful. And to fix it is a no-brainer. Mark From ethan at stoneleaf.us Mon May 20 19:07:49 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 20 May 2013 10:07:49 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful?
In-Reply-To: <518E8FD8.8030800@stackless.com> References: <518E7A41.60903@stackless.com> <518E7EB2.30004@egenix.com> <518E8FD8.8030800@stackless.com> Message-ID: <519A5865.5020405@stoneleaf.us> On 05/11/2013 11:37 AM, Christian Tismer wrote: > On 11.05.13 19:24, M.-A. Lemburg wrote: >> On 11.05.2013 19:05, Christian Tismer wrote: >>> I think a simple stripping of white-space in >>> >>> text = s""" >>> leftmost column >>> two-char indent >>> """ >>> >>> would solve 95 % of common indentation and concatenation cases. >>> I don't think provision for merging is needed very often. >>> If text occurs deeply nested in code, then it is also quite likely to >>> be part of an expression, anyway. >>> My major use-case is text constants in a class or function that >>> is multiple lines long and should be statically ready to use without >>> calling a function. >>> >>> (here an 's' as a strip prefix, but I'm not sold on that) >> This is not a good solution for long lines where you don't want to >> have embedded line endings. Taken from existing code:
>>
>> _litmonth = ('(?P'
>> 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|'
>> 'mär|mae|mrz|mai|okt|dez|'
>> 'fev|avr|juin|juil|aou|aoû|déc|'
>> 'ene|abr|ago|dic|'
>> 'out'
>> ')[a-z,\.;]*')
>>
>> or
>> raise errors.DataError(
>> 'Inconsistent revenue item currency: '
>> 'transaction=%r; transaction_position=%r' %
>> (transaction, transaction_position))
>> > > Your first example is a regex, which could be used as-is. If implicit string concatenation goes away, how can the regex be used as-is? > Your second example is indented five levels deep. That is a coding > style which I would propose to write differently for better readability. > And if you stick with it, why not use the "+"? > > I want to support constant strings, which should not be somewhere > in the middle of code. Your second example is computed, anyway, > not the case that I want to solve. You may not want to solve it, but it needs solving if ISC goes away.
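For the regex case above, the two spellings are interchangeable today. A minimal sketch with a simplified pattern — the month list is shortened, and the group name `litmonth` is a guess, since the real name was lost when the archive stripped the angle brackets from `(?P`:

```python
import re

# Implicit concatenation, shaped like the quoted snippet.  The month list
# is shortened and the group name 'litmonth' is an assumption.
implicit = ('(?P<litmonth>'
            'jan|feb|mar|apr|may|jun'
            ')[a-z,.;]*')

# The same pattern spelled with explicit '+' joins: the result is
# character-for-character identical.
explicit = ('(?P<litmonth>' +
            'jan|feb|mar|apr|may|jun' +
            ')[a-z,.;]*')

print(implicit == explicit)  # True
print(re.match(implicit, 'january').group('litmonth'))  # jan
```

So any removal of implicit concatenation would leave such regexes mechanically convertible to the `+` form, at the cost of the extra operators on every line.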
-- ~Ethan~ From ethan at stoneleaf.us Mon May 20 19:13:00 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 20 May 2013 10:13:00 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: References: <20130516022436.GA86816@cskk.homeip.net> Message-ID: <519A599C.6020603@stoneleaf.us> On 05/15/2013 07:54 PM, Andrew Barnert wrote: > > Implicit concatenation is bad because you often use it accidentally when you intended a comma. I don't think anybody has said they get bit often, just that's it can be painful when they do. I forget the comma once or twice a year -- I'm willing to pay that bit of pain for the convenience. So, yeah, I'm reversing my vote to -1 unless something equally simple and easy on the eyes is developed. -- ~Ethan~ From g.brandl at gmx.net Mon May 20 19:43:39 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 20 May 2013 19:43:39 +0200 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> <519926B1.30700@nedbatchelder.com> Message-ID: Am 20.05.2013 00:33, schrieb Nick Coghlan: > > On 20 May 2013 05:24, "Ned Batchelder" > > wrote: >> >> >> On 5/19/2013 2:58 PM, Mark Janssen wrote: >>>> >>>> We can't just "remove implicit concatenation", because that will break code >>>> which is currently working perfectly. And probably it will break more >>>> working code than it will fix unnoticed broken code. >>> >>> Really? Isn't the number of programs breaking roughly equal to 2, perhaps less? >> >> >> Interesting, how did you get that number? 
> > If it's based on the contents of these threads, be aware that at least one core > developer (me) and probably more have already mostly tuned out on the grounds > that the feature is obviously in wide enough use that changing it will break the > world without adequate gain. We don't even have to speculate on what others > might be doing, we know it would break *our* code. Yep. I just look at this thread every now and then to marvel at the absurdly complicated ideas people come up with to replace something straightforward :) Georg From ethan at stoneleaf.us Mon May 20 19:22:17 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 20 May 2013 10:22:17 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <1368684370.87363.YahooMailNeo@web184705.mail.ne1.yahoo.com> References: <20130516022436.GA86816@cskk.homeip.net> <5194516C.5020709@pearwood.info> <1368684370.87363.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <519A5BC9.3030503@stoneleaf.us> On 05/15/2013 11:06 PM, Andrew Barnert wrote: > From: Steven D'Aprano >> Andrew Barnert wrote: >>> >>> Implicit concatenation is bad because you often use it accidentally when >> you intended a comma. >> >> For some definition of "often". > > Well, yes. But Guido says he makes this mistake often, and others agree with him, and the whole discussion wouldn't have come up if it weren't a problem. So, we're still left with the conclusion: Actually, Guido said: > > This is a fairly common mistake [...] Which I understood to mean, "we all make this mistake," not necessarily that we all make this mistake often. -- ~Ethan~ From jimjjewett at gmail.com Tue May 21 06:07:33 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 21 May 2013 00:07:33 -0400 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? 
In-Reply-To: <5197E611.80608@pearwood.info> References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <5197E611.80608@pearwood.info> Message-ID: On Sat, May 18, 2013 at 4:35 PM, Steven D'Aprano wrote: > ... Normally you don't just create > a string and do nothing with it. I do. Or, rather, I assign it to a temp name, and then use that temp name in the next line -- temp variables seems less ugly than line continuations. > I think I have found the fatal problem with that suggestion: it rules out > using concatenation in docstrings at all. I think a better solution would be to loosen the requirements for docstrings. Reasonably harmless proposals include (1) Always use the first expression, if it isn't a statement. (2) str( < the above > ) (3) Special treatment for __doc__, such as __doc__ = ... -jJ From jsbueno at python.org.br Tue May 21 14:07:29 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Tue, 21 May 2013 09:07:29 -0300 Subject: [Python-ideas] New str whitespace cleaning method? Message-ID: What about an STR method for just formatting whitespace? I propose starting with something that would just replace all occurrences of white-space sequences with a single white-space, and another mode that does the same, but preserves newlines. That way multiline strings could be used in a straight way to enter any desired construct. 
I know of textwrap.dedent - but it is not quite the same behavior, and this is a case where having a method call postfixed to the string has clear advantages over a function call with the same string - since the function name would have to be placed between the "string destination" - which denotes its purpose - and the string text itself: Ex.:

log.warn(dedent("""Hello dear sir,
I am sorry to inform you
the spanish inquisition
has arrived"""))

Against:

log.warn("""Hello dear sir,
I am sorry to inform you
the spanish inquisition
has arrived""".lint())

That would put everything on the same line as per my proposal - calling "lint(newlines=True)" would preserve the line breaks, and maybe ".lint(strict=True)" would simply strip all whitespace before and after any newline (as opposed to reducing it to a single white space). In time: I have no strong preference on such a method's name, but it should be a short name, like "lint" and not "space_lint()". Justification: With the discussions going on about deprecating implicit string concatenation, we are struggling around good alternatives for entering long strings in code - with less typing, more readability, and more control over the final string. (I just came across this code that has to mix in javascript snippets, and - definitely, implicit concatenation does not make me happy - still, there is no clear way out among the proposals so far: https://github.com/collective/collective.z3cform.datetimewidget/blob/master/src/collective/z3cform/datetimewidget/widget_date.py The poor options of the current state are clearly visible there: for up to 4 lines, the author keeps to multi-line implicit concatenation - for more lines than that, he just gives up and goes for a multiline string - with all the spacing issues on the resulting string.
(Imagine if they were Python snippets instead.) ) From storchaka at gmail.com Tue May 21 14:52:30 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 21 May 2013 15:52:30 +0300 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: 21.05.13 15:07, Joao S. O. Bueno wrote: > What about an STR method for just formatting whitespace? > > I propose starting with something that would just replace all occurrences of > white-space sequences with a single white-space, ' '.join(text.split()) > and another mode that does the same, but preserves newlines. '\n'.join(' '.join(line.split()) for line in text.split('\n')) From ron3200 at gmail.com Tue May 21 16:13:42 2013 From: ron3200 at gmail.com (Ron Adam) Date: Tue, 21 May 2013 09:13:42 -0500 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: <519B8116.5080908@gmail.com> On 05/21/2013 07:07 AM, Joao S. O. Bueno wrote: > What about an STR method for just formatting whitespace? > > I propose starting with something that would just replace all occurrences of > white-space sequences with a single white-space, and another mode that > does the same, but preserves newlines. > > That way multiline strings could be used in a straight way to enter > any desired construct. > > I know of textwrap.dedent - but it is not quite the same behavior, and this is a > case where having a method call postfixed to the string has clear advantages > over a function call with the same string - since the function name > would have to be placed between the "string destination" - which > denotes its purpose, and the string text itself: > Ex.: > > log.warn(dedent("""Hello dear sir, > I am sorry to inform you > the spanish inquisition > has arrived""")) > > Against: > log.warn("""Hello dear sir, > I am sorry to inform you > the spanish inquisition > has arrived""".lint()) If the text can be reformatted then use the fill function from the textwrap module.
result = fill(text, width=40) # whitespace cleanup + wrap text. Just make the width longer than the string to not wrap it. I prefer to assign the text to a name first.

message = \
"""
Hello dear sir,
I am sorry to inform you
the spanish inquisition
has arrived
"""

log.warn(fill(message, width=40))

-Ron From jsbueno at python.org.br Tue May 21 16:18:45 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Tue, 21 May 2013 11:18:45 -0300 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: On 21 May 2013 09:52, Serhiy Storchaka wrote: > 21.05.13 15:07, Joao S. O. Bueno wrote: > >> What about an STR method for just formatting whitespace? >> >> I propose starting with something that would just replace all occurrences >> of >> white-space sequences with a single white-space, > > > ' '.join(text.split()) Indeed, this describes the intended effect - but for "day to day" coding, it implies changing the code that goes

message("my long string "
        "bla bla bla "
        )

for

message(" ".join("""my long string
bla bla bla""".split()))

Which does not exactly fit my concept of a readable and simple way to enter long strings. > > >> and another mode that does the same, but preserves newlines. > > > '\n'.join(' '.join(line.split()) for line in text.split('\n')) > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From storchaka at gmail.com Tue May 21 17:19:40 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 21 May 2013 18:19:40 +0300 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: 21.05.13 17:18, Joao S. O. Bueno wrote: > On 21 May 2013 09:52, Serhiy Storchaka wrote: >> 21.05.13 15:07, Joao S. O. Bueno wrote: >>> What about an STR method for just formatting whitespace?
>>> I propose starting with something that would just replace all occurrences of >>> white-space sequences with a single white-space, >> ' '.join(text.split()) > > Indeed, this describes the intended effect - > but for "day to day" coding, it implies changing the code that goes > > message("my long string " > "bla bla bla " > ) > > for > > message (" ".join("""my long string > bla bla bla""".split())) > > Which does not exactly fit my concept of a readable and simple way to > enter long strings. Of course you can define a one-line function. def STR(text): return ' '.join(str.split(text)) From abarnert at yahoo.com Tue May 21 17:52:42 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 21 May 2013 08:52:42 -0700 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: On May 21, 2013, at 8:19, Serhiy Storchaka wrote: > 21.05.13 17:18, Joao S. O. Bueno wrote: >> On 21 May 2013 09:52, Serhiy Storchaka wrote: >>> 21.05.13 15:07, Joao S. O. Bueno wrote: >>>> What about an STR method for just formatting whitespace? >>>> I propose starting with something that would just replace all occurrences of >>>> white-space sequences with a single white-space, >>> ' '.join(text.split()) >> >> Indeed, this describes the intended effect - >> but for "day to day" coding, it implies changing the code that goes >> >> message("my long string " >> "bla bla bla " >> ) >> >> for >> >> message (" ".join("""my long string >> bla bla bla""".split())) >> >> Which does not exactly fit my concept of a readable and simple way to >> enter long strings. > > Of course you can define a one-line function. > > def STR(text): return ' '.join(str.split(text)) If you read his example, he's clearly looking for a method on the str class, not a free function (presumably to avoid having to add even more stuff to the left of the string that would force you to indent it even further and therefore potentially wrap to even more lines).
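[Editor's note: Joao's desired spelling can be tried today with a tiny str subclass. This is only a sketch of the idea under discussion -- `Str`, `clean`, and `cleanlines` are made-up names, not a proposed or existing API.]

```python
class Str(str):
    def clean(self):
        # Collapse every run of whitespace into a single space.
        return ' '.join(self.split())

    def cleanlines(self):
        # Same, but keep the line breaks intact.
        return '\n'.join(' '.join(line.split()) for line in self.split('\n'))

msg = Str("""my long string
             bla   bla\tbla""")
print(msg.clean())       # my long string bla bla bla
print(msg.cleanlines())  # my long string / bla bla bla (two lines)
```

The method then chains at the call site the way the proposal wants, e.g. log.warn(Str("""...""").clean()).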
From abarnert at yahoo.com Tue May 21 17:58:14 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 21 May 2013 08:58:14 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful? In-Reply-To: <519A5BC9.3030503@stoneleaf.us> References: <20130516022436.GA86816@cskk.homeip.net> <5194516C.5020709@pearwood.info> <1368684370.87363.YahooMailNeo@web184705.mail.ne1.yahoo.com> <519A5BC9.3030503@stoneleaf.us> Message-ID: On May 20, 2013, at 10:22, Ethan Furman wrote: > On 05/15/2013 11:06 PM, Andrew Barnert wrote: >> From: Steven D'Aprano >>> Andrew Barnert wrote: >>>> >>>> Implicit concatenation is bad because you often use it accidentally when >>> you intended a comma. >>> >>> For some definition of "often". >> >> Well, yes. But Guido says he makes this mistake often, and others agree with him, and the whole discussion wouldn't have come up if it weren't a problem. So, we're still left with the conclusion: > > Actually, Guido said: >> >> This is a fairly common mistake [...] > > Which I understood to mean, "we all make this mistake," not necessarily that we all make this mistake often. If your point is that Guido didn't think we make the mistake often enough that it's a problem worth solving, that's clearly not true, or he wouldn't have suggested changing the language. If you just want to rewrite the summary as "Implicit concatenation is bad because you use it accidentally when you intended a comma often enough to cause problems" instead of just "often", fine. But how does that change anything meaningful? From abarnert at yahoo.com Tue May 21 18:40:34 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 21 May 2013 09:40:34 -0700 Subject: [Python-ideas] Let's be more orderly! 
In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> <5194E41F.6000006@stoneleaf.us> <5195F36E.3010905@pearwood.info> Message-ID: On May 18, 2013, at 19:13, Haoyi Li wrote: > Forgive me if this has been mentioned before (i don't think it has) but how about an option somehow to take the list of **kwargs as an association-list? The question is, how would you _specify_ that option? The best way I can think of is a function attribute, with a decorator to set the attribute. Similar to my earlier suggestion for a function attribute that takes a constructor callable. Your idea is simpler conceptually, but it's not much simpler to use, and it's actually more complicated in implementation. The existing function calling machinery explicitly uses mapping functionality, at least in CPython and PyPy. Not that it would be _hard_ to rewrite it around a sequence instead, but it would still be harder than not doing so. > I am approaching this from a point of view of "why am I putting everything into a hashmap just to iterate over it later", as you can see in the way the namedtuple constructor is implemented: > > http://docs.python.org/2/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields > > This may be rather out-there, and I'm not sure if it'll speed things up much, but I'm guessing iterating over an assoc list is faster than iterating over anything else. Building an assoc list is also probably faster than building anything else and it's also the most easily convertible (either to OrderedDict or unordered dict) since it preserves all information. But you're forgetting that the all existing kwargs code would get slower if we first built a list of pairs and then constructed a dict from it, as would any new code that wants to do lookup by name. 
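[Editor's note: the trade-off Andrew describes can be made concrete with a toy sketch. Nothing below is CPython internals; `via_dict` and `assoc_call` are invented names. The point is that a list of (name, value) pairs preserves call order, while the dict that ** unpacking actually builds gives fast name lookup but (before Python 3.6) no ordering guarantee.]

```python
from collections import OrderedDict

def via_dict(**kwargs):
    # What a function really receives today: a plain dict.
    return kwargs

def assoc_call(*pairs):
    # The association-list alternative: keep the (name, value) pairs as-is.
    return list(pairs)

pairs = assoc_call(("b", 2), ("a", 1))
print(pairs)               # [('b', 2), ('a', 1)] -- call order preserved
print(OrderedDict(pairs))  # convertible without losing order
print(dict(pairs)["a"])    # 1 -- name lookup still possible, at extra cost
```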
So, you're slowing down the 90% case to speed up the 10% case. Also, the existing functionality is something like this pseudocode: kwargs = dict(starstarargs) for arg, val in zip(namedargs, namedvals): if arg not in f.paramnames: kwargs[arg] = val (I linked to the actual CPython and PyPy code earlier in the thread.) So, if performance actually matters, presumably you're going to hash the names anyway to do that in check, at which point the biggest cost of using a dict is already incurred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Tue May 21 19:37:22 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 21 May 2013 13:37:22 -0400 Subject: [Python-ideas] OrderedDict literals In-Reply-To: <1360789370.24461.140661191087829.377CE510@webmail.messagingengine.com> References: <510D20B6.5040705@pearwood.info> <1360789370.24461.140661191087829.377CE510@webmail.messagingengine.com> Message-ID: <1369157842.13112.140661233889733.0E019C5C@webmail.messagingengine.com> Since the _real_ issue seems to be wanting an easy way to construct an OrderedDict, I thought the earlier discussion about frozenset literals might be relevant. On Wed, Feb 13, 2013, at 17:02, random832 at fastmail.us wrote: > On Sat, Feb 2, 2013, at 9:20, Steven D'Aprano wrote: > > Unfortunately the proposal to use f{ ... } for frozen sets cannot work > > within the constraints of Python's lexer: > > > > http://mail.python.org/pipermail/python-3000/2008-January/011838.html > > > > Unfortunately we're running out of useful, easy to enter symbols for > > literals. Until such time (Python4000 perhaps, or more likely Python5000) > > as we can use a rich set of Unicode literals, I don't think there is any > > clear way to have a frozenset literal. 
> > I was going to post about not being sure what the objection is (if it's > multiple tokens, let it be multiple tokens - the contents are multiple > tokens anyway - and saying it would block a future syntax extension > doesn't seem like a reasonable objection to a proposed syntax > extension), but I had a new idea so I'll post that instead: > > { as frozenset { ... } } > > The sequence "{ as" can't occur (to my knowledge) anywhere now. So, the > thing after it is a keyword in that context (and only that context, > otherwise "frozenset" remains an identifier naming an ordinary builtin) > and specifies what kind of literal the following sequence is. You could > also extend it to alternate forms for some other builtin types - for > example { as bytes [1, 2, 3, 4, 5] } instead of b"\x1\x2\x3\x4\x5". > Or... { as set { } } > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Random832 From ethan at stoneleaf.us Wed May 22 00:39:33 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 21 May 2013 15:39:33 -0700 Subject: [Python-ideas] Implicit string literal concatenation considered harmful (options) In-Reply-To: References: <5194EFC7.4060409@mrabarnett.plus.com> <5194F470.2030007@mrabarnett.plus.com> <519507E6.4030701@mrabarnett.plus.com> <51950FF9.1000803@stackless.com> <5195198A.3040000@mrabarnett.plus.com> <5195F939.5030305@stackless.com> <51961778.6080905@pearwood.info> <51963B4F.7090307@gmail.com> <1f84a343-a0fd-49de-80c4-7551c07155d6@googlegroups.com> <5197C57D.1000807@gmail.com> <5197EB5D.7000305@pearwood.info> <519926B1.30700@nedbatchelder.com> Message-ID: <519BF7A5.9010201@stoneleaf.us> On 05/20/2013 10:12 AM, Mark Janssen wrote: >>> Really? Isn't the number of programs breaking roughly equal to 2, perhaps >>> less? >> >> Interesting, how did you get that number? 
> > I was making a joke using "unreasonable precision", but I would like > to actually see more than that (meaning: I don't think there is) in > the standard library. There just isn't much, if at all, of a > programmatic reason to use such a construct. It's 1) more typing, 2) > a highly improbable sequence that accidentally worked for the programmer, > 3) it doesn't really satisfy any conceptual separation that I can > envision (putting two string literals on the same line? what possible > purpose?) > > And this is the point -- it's more likely a programmer error. Really, > I have a hard time believing that the number of programs that would > break being larger than a handful. And to fix it is a no-brainer. On the same line is probably rare, I agree. On different lines it is very common. Much more common than the number of errors generated by the forgotten comma. -- ~Ethan~ From ron3200 at gmail.com Wed May 22 06:13:04 2013 From: ron3200 at gmail.com (Ron Adam) Date: Tue, 21 May 2013 23:13:04 -0500 Subject: [Python-ideas] Line continuations with comments Message-ID: There were a few people who liked the idea of having comments after a line continuation. I was able to make a small patch that removed some of the restrictions on the '\' for testing some ideas which does the following. * Allow a line to continue on the same line. * Skips comments before checking for the new line after a back slash. Here are some examples... These are technically the same. >>> 'aaa' \ ... 'bbb' \ ... 'ccc' 'aaabbbccc' >>> 'aaa' \ 'bbb' \ 'ccc' 'aaabbbccc' Yes, there isn't much need for this, but I wanted to see if it would work and if the test suite passes. It does. ;-) You can put a comment after a line continuation. >>> 'aaa' \# one ... 'bbb' \# two ... 'ccc' # three 'aaabbbccc' Works with expressions too. >>> result = \ ... + 111 \# A ... + 222 \# B ... + 333 \# C ...
+ 444 # D >>> result 1110 But if it has a space between the \ and the #, the line is continued on the same line instead of the following line. >>> 'aaa' \ #comment 'aaa' The reason \# works, but not \ #, is when the comment comes directly after the back slash, it's removed and leaves a (backslash + new-line) pair. Removing the white space before the new line check caused some errors in the test suite. I haven't figured out why yet. So this doesn't do that for now. Currently you get this if you try any of these examples. >>> 'abc' \#comment File "", line 1 'abc' \#comment ^ SyntaxError: unexpected character after line continuation character Only one of pythons tests fail, and I don't think it's related. test test_urllib2_localnet failed See the diff below if you want to play with it. It's not big. Cheers, Ron diff -r 155e6fb309f5 Parser/tokenizer.c --- a/Parser/tokenizer.c Tue May 21 21:02:04 2013 +0200 +++ b/Parser/tokenizer.c Tue May 21 22:10:31 2013 -0500 @@ -1391,18 +1391,31 @@ again: tok->start = NULL; + + c = tok_nextc(tok); + + /* Check if continuing line */ + if (tok->cont_line == 1 && c == '\n') { + tok->cont_line = 0; + c = tok_nextc(tok); + } + /* Skip spaces */ - do { + while (c == ' ' || c == '\t' || c == '\014') { c = tok_nextc(tok); - } while (c == ' ' || c == '\t' || c == '\014'); + tok->cont_line = 0; + } /* Set start of current token */ tok->start = tok->cur - 1; /* Skip comment */ - if (c == '#') + if (c == '#') { while (c != EOF && c != '\n') c = tok_nextc(tok); + tok_backup(tok, c); + goto again; + } /* Check for EOF and errors now */ if (c == EOF) { @@ -1641,12 +1654,6 @@ /* Line continuation */ if (c == '\\') { - c = tok_nextc(tok); - if (c != '\n') { - tok->done = E_LINECONT; - tok->cur = tok->inp; - return ERRORTOKEN; - } tok->cont_line = 1; goto again; /* Read next line */ } From mikegraham at gmail.com Wed May 22 15:41:51 2013 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 22 May 2013 09:41:51 -0400 Subject: [Python-ideas] Line 
continuations with comments In-Reply-To: References: Message-ID: Backslash line continuations are mostly to be avoided and making a change like this would seem to a. make them slightly less obvious when they are used, b. increase their use in cases where there aren't even long lines of code involved, and c. seem to encourage their use in general. It seems to me that using parentheses is an already-existing, somewhat-better way to do what you're doing in your examples. Mike On Wed, May 22, 2013 at 12:13 AM, Ron Adam wrote: > > There were a few people who liked the idea of having comments after a line > continuation. > > I was able to make a small patch that removed the some of the restrictions > on the '\' for testing some ideas which does the following. > > * Allow a line to continue on the same line. > > * Skips comments before checking for the new line > after a back slash. > > > Here are some examples... > > > These are technically the same. > > >>> 'aaa' \ > ... 'bbb' \ > ... 'ccc' > 'aaabbbccc' > > >>> 'aaa' \ 'bbb' \ 'ccc' > 'aaabbbccc' > > Yes there is't much need for this, but I wanted to see if it would work > and if the test suit passes. It does. ;-) > > > You can put a comment after a line continuation. > > >>> 'aaa' \# one > ... 'bbb' \# two > ... 'ccc' # three > 'aaabbbccc' > > > Works with expressions too. > > >>> result = \ > ... + 111 \# A > ... + 222 \# B > ... + 333 \# C > ... + 444 # D > >>> result > 1110 > > > But if it has a space between the \ and the #, the line is continued on > the same line instead of the following line. > > >>> 'aaa' \ #comment > 'aaa' > > The reason \# works, but not \ #, is when the comment comes directly after > the back slash, it's removed and leaves a (backslash + new-line) pair. > > Removing the white space before the new line check caused some errors in > the test suite. I haven't figured out why yet. So this doesn't do that > for now. > > > > Currently you get this if you try any of these examples. 
> > >>> 'abc' \#comment > File "", line 1 > 'abc' \#comment > ^ > SyntaxError: unexpected character after line continuation character > > > > > Only one of pythons tests fail, and I don't think it's related. > > test test_urllib2_localnet failed > > > > See the diff below if you want to play with it. It's not big. > > Cheers, > Ron > > > > > diff -r 155e6fb309f5 Parser/tokenizer.c > --- a/Parser/tokenizer.c Tue May 21 21:02:04 2013 +0200 > +++ b/Parser/tokenizer.c Tue May 21 22:10:31 2013 -0500 > @@ -1391,18 +1391,31 @@ > > again: > tok->start = NULL; > + > + c = tok_nextc(tok); > + > + /* Check if continuing line */ > + if (tok->cont_line == 1 && c == '\n') { > + tok->cont_line = 0; > + c = tok_nextc(tok); > + } > + > /* Skip spaces */ > - do { > + while (c == ' ' || c == '\t' || c == '\014') { > c = tok_nextc(tok); > - } while (c == ' ' || c == '\t' || c == '\014'); > + tok->cont_line = 0; > + } > > /* Set start of current token */ > tok->start = tok->cur - 1; > > /* Skip comment */ > - if (c == '#') > + if (c == '#') { > while (c != EOF && c != '\n') > c = tok_nextc(tok); > + tok_backup(tok, c); > + goto again; > + } > > /* Check for EOF and errors now */ > if (c == EOF) { > @@ -1641,12 +1654,6 @@ > > /* Line continuation */ > if (c == '\\') { > - c = tok_nextc(tok); > - if (c != '\n') { > - tok->done = E_LINECONT; > - tok->cur = tok->inp; > - return ERRORTOKEN; > - } > tok->cont_line = 1; > goto again; /* Read next line */ > } > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jeanpierreda at gmail.com Wed May 22 16:11:41 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 22 May 2013 10:11:41 -0400 Subject: [Python-ideas] Line continuations with comments In-Reply-To: References: Message-ID: On Wed, May 22, 2013 at 9:41 AM, Mike Graham wrote: > Backslash line continuations are mostly to be avoided and making a change > like this would seem to [snip: mostly make them more usable] The way I see it, either one believes that backslashes belong in Python -- and therefore they should be made as useful as possible -- or that they do not -- and therefore they should be crippled. But if they don't belong in Python, they shouldn't be crippled, rather, they shouldn't even exist. A compromise should at least be internally consistent. -- Devin From mikegraham at gmail.com Wed May 22 16:31:15 2013 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 22 May 2013 10:31:15 -0400 Subject: [Python-ideas] Line continuations with comments In-Reply-To: References: Message-ID: On Wed, May 22, 2013 at 10:11 AM, Devin Jeanpierre wrote: > The way I see it, either one believes that backslashes belong in > Python -- and therefore they should be made as useful as possible -- > or that they do not -- and therefore they should be crippled. But if > they don't belong in Python, they shouldn't be crippled, rather, they > shouldn't even exist. > > A compromise should at least be internally consistent. > With changes a while back making backslash continuations never strictly necessary, it seems like officially declaring backslash deprecated might be reasonable. They are discouraged by PEP8 and other style guides and at this point they violate the "There should be one-- and preferably only one --obvious way to do it." principle. Mike -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From random832 at fastmail.us Wed May 22 16:54:02 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 22 May 2013 10:54:02 -0400 Subject: [Python-ideas] Line continuations with comments In-Reply-To: References: Message-ID: <1369234442.31182.140661234304333.4A2983F4@webmail.messagingengine.com> On Wed, May 22, 2013, at 10:31, Mike Graham wrote: > With changes a while back making backslash continuations never strictly > necessary, Someone pointed out an example a while back of them being necessary (except for being vacuously unnecessary because you can just make an arbitrarily long physical line): multiple values in a with statement. From bruce at leapyear.org Wed May 22 18:02:16 2013 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 22 May 2013 09:02:16 -0700 Subject: [Python-ideas] Line continuations with comments In-Reply-To: References: Message-ID: On Wed, May 22, 2013 at 6:41 AM, Mike Graham wrote: > Backslash line continuations are mostly to be avoided and making a change > like this would seem to > > a. make them slightly less obvious when they are used, > b. increase their use in cases where there aren't even long lines of code > involved, and > c. seem to encourage their use in general. > > It seems to me that using parentheses is an already-existing, > somewhat-better way to do what you're doing in your examples. > > Your last statement seems to be missing the point of the larger discussion. Yes, parentheses can be used in most cases where someone might use \ continuation. There seems to be strong sentiment to *not* remove \ continuation. Given that, is allowing comments after a continuation a reasonable change? I think so. Notwithstanding that, these discussions are moving away from Guido's original comment about being bitten by implicit continuation of strings and not moving towards consensus. Let me throw in a few facts: 1) There are bugs caused by unintended implicit string concatenation.
2) Using + as it exists now is not a drop-in replacement for implicit string concatenation as it is a run-time operation and has a different precedence than the implicit concatenation. 3) There are programs that use implicit string concatenation that will need to be fixed if the feature is removed. 4) There are programs that use \ continuation that will need to be fixed if the feature is removed. 5) Explicit is better than implicit. Personally, I would endorse deprecating and eventually removing implicit string concatenation and adding a syntax (not an operator) for explicit run-time string concatenation. The use of \ continuation as that syntax seems to me like a reasonable choice if we assume that this feature isn't going away. In particular, it works today so it's easy to start using it and linters can look for it. However, it's pointless to bikeshed the choice of syntax if there's no consensus that there should be an explicit syntax in the first place. On Tue, May 21, 2013 at 9:13 PM, Ron Adam wrote: > I was able to make a small patch that removed the some of the restrictions > on the '\' for testing some ideas which does the following. > > * Allow a line to continue on the same line. > > Aside from serving as a marker for explicit string concatenation, this means that I can freely sprinkle in backslashes anywhere I want, like this: foo \ = \ bar + \ \ 3 That seems like a bad idea. > > > The reason \# works, but not \ #, is when the comment comes directly after > the back slash, it's removed and leaves a (backslash + new-line) pair. > Whether we require there be no space between \ and # or, conversely, require there be at least one whitespace character should not be based on the relative ease of patching the current code. Personally, I would prefer the latter as I believe requiring a space after \ will increase readability. 
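[Editor's note: Bruce's fact (1) and the parenthesized continuation that the thread keeps contrasting with backslashes can both be shown in a few runnable lines; the values below mirror Ron's 111/222/333/444 example.]

```python
# Fact (1): a forgotten comma silently concatenates adjacent string literals.
fruits = ["apple",
          "banana"   # <-- missing comma: no error, the two items just merge
          "cherry"]
print(len(fruits))   # 2, not 3
print(fruits[1])     # bananacherry

# Parentheses already allow per-line comments on a continued expression,
# with no backslash and no syntax change:
total = (111    # A
         + 222  # B
         + 333  # C
         + 444) # D
print(total)    # 1110
```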
--- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed May 22 18:05:10 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 22 May 2013 09:05:10 -0700 Subject: [Python-ideas] Line continuations with comments In-Reply-To: <1369234442.31182.140661234304333.4A2983F4@webmail.messagingengine.com> References: <1369234442.31182.140661234304333.4A2983F4@webmail.messagingengine.com> Message-ID: <462CBED1-4900-4A4C-9E88-6368FDFB102F@yahoo.com> On May 22, 2013, at 7:54, random832 at fastmail.us wrote: > On Wed, May 22, 2013, at 10:31, Mike Graham wrote: >> With changes a while back making backslash continuations never strictly >> necessary, > > Someone pointed out an example a while back of them being necessary > (except for being vacuously unnecessary because you can just make an > arbitrarily long physical line): multiple values in a with statement. Well, you can always break within each expression, adding extra parens within the expression if necessary. But often, this looks even worse than backslashes. with closing(NSWhyDoesAppleUseSuchLongNames( NSLongNamedConstant)) as thing1, closing( NSAnotherLongFunction( NSAnotherLongConstant)) as thing2 I may have got this wrong because I'm typing on a phone, but you get the idea. This is legal, and it follows PEP8, and it doesn't require a backslash. But it's horrible. Another alternative is to bind all those silly PyObjC names to shorter names--but then, when you're reading the code, you can't immediately tell what it does. I don't _always_ have to use backslashes in code using ridiculous names like this to make it readable, but sometimes they are the best solution. So, even though I don't know that they're _necessary_, I still don't want backslashes deprecated. But I also don't want them expanded. 
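[Editor's note: random832's with-statement case, sketched with a stand-in context manager -- `managed` and `events` are invented for illustration. At the time of this thread, splitting a long list of context managers required either one long physical line or a backslash; parenthesizing the whole manager list only became officially supported much later, in Python 3.10.]

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed(name):
    events.append("enter " + name)
    yield name
    events.append("exit " + name)

# Splitting multiple context managers across lines needs a backslash here:
with managed("a") as a, \
     managed("b") as b:
    events.append("body " + a + b)

print(events)  # enter a, enter b, body ab, exit b, exit a
```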
From rosuav at gmail.com Wed May 22 18:08:58 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 23 May 2013 02:08:58 +1000 Subject: [Python-ideas] Line continuations with comments In-Reply-To: References: Message-ID: On Thu, May 23, 2013 at 12:31 AM, Mike Graham wrote: > On Wed, May 22, 2013 at 10:11 AM, Devin Jeanpierre > wrote: >> >> The way I see it, either one believes that backslashes belong in >> Python -- and therefore they should be made as useful as possible -- >> or that they do not -- and therefore they should be crippled. But if >> they don't belong in Python, they shouldn't be crippled, rather, they >> shouldn't even exist. >> >> A compromise should at least be internally consistent. > > > With changes a while back making backslash continuations never strictly > necessary, it seems like officially declaring backslash deprecated might be > reasonable. They are discouraged by PEP8 and other style guides and at this > point they violate the "There should be one-- and preferably only one > --obvious way to do it." principle. Maybe the backslash should be considered on par with goto - some use it occasionally and reckon it's vital, others never use it and consider it superfluous. It's not (usually) a problem to have it in the language, it's not deprecated, but style guides advise against its use. ChrisA From ron3200 at gmail.com Thu May 23 01:15:59 2013 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 22 May 2013 18:15:59 -0500 Subject: [Python-ideas] Line continuations with comments In-Reply-To: References: Message-ID: <519D51AF.1080505@gmail.com> On 05/22/2013 11:02 AM, Bruce Leban wrote: > > On Wed, May 22, 2013 at 6:41 AM, Mike Graham > > wrote: > > Backslash line continuations are mostly to be avoided and making a > change like this would seem to > > a. make them slightly less obvious when they are used, > b. increase their use in cases where there aren't even long lines of > code involved, and > c. seem to encourage their use in general. 
> > It seems to me that using parentheses is an already-existing, > somewhat-better way to do what you're doing in your examples. > > Your last statement seems to be missing the point of the larger discussion. > Yes, parentheses can be used in most cases where someone might use \ > continuation. There seems to be strong sentiment to *not* remove \ > continuation. Given that, is allowing comments after a continuation a > reasonable change? I think so. > > Notwithstanding that, these discussions are moving away from Guido's > original comment about being bitten by implicit continuation of strings and > not moving towards consensus. Let me throw in a few facts: > > 1) There are bugs caused by unintended implicit string concatenation. There was one found in Python's own library recently where a missing comma caused an unintentional implicit concatenation. It's fixed now, but it's not clear how long it's been there. > 2) Using + as it exists now is not a drop-in replacement for implicit > string concatenation as it is a run-time operation and has a different > precedence than the implicit concatenation. > 3) There are programs that use implicit string concatenation that will need > to be fixed if the feature is removed. > 4) There are programs that use \ continuation that will need to be fixed if > the feature is removed. > 5) Explicit is better than implicit. Agree on all counts. > Personally, I would endorse deprecating and eventually removing implicit > string concatenation and adding a syntax (not an operator) for explicit > run-time string concatenation. The use of \ continuation as that syntax > seems to me like a reasonable choice if we assume that this feature isn't > going away. In particular, it works today so it's easy to start using it > and linters can look for it. I agree here too. I'm still looking for the location in Python's source code where adjacent strings are joined. The docs say this ...
----------- Note that this feature is defined at the syntactical level, but implemented at compile time. The '+' operator must be used to concatenate string expressions at run time. Also note that literal concatenation can use different quoting styles for each component (even mixing raw strings and triple quoted strings). ----------- I haven't found either the syntactic definition, or the compile time implementation yet. >However, it's pointless to bikeshed the choice > of syntax if there's no consensus that there should be an explicit syntax > in the first place. It seems the consensus may be dependent on what the syntax might be. > On Tue, May 21, 2013 at 9:13 PM, Ron Adam > > wrote: > > I was able to make a small patch that removed the some of the > restrictions on the '\' for testing some ideas which does the following. > > * Allow a line to continue on the same line. > > > Aside from serving as a marker for explicit string concatenation, this > means that I can freely sprinkle in backslashes anywhere I want, like this: > > foo \ = \ bar + \ \ 3 > > That seems like a bad idea. Yes, but just don't do that. ;-) And don't do this either. (works now) foo \ = \ bar + \ \ 3 Or this. (also works now) foo = ( (bar + ( ( 3)) ) ) I'm not sure why some people dislike the back slash so much as a continuation tool. Python is the language that avoids using {braces around blocks}, so you'd think it would be the other way around. I really don't think the '\' will be overused. New programmers do try a lot of strange things at first for making their own programs easier to read (I did), but usually they come around to what the community practices and recommends. > > > The reason \# works, but not \ #, is when the comment comes directly > after the back slash, it's removed and leaves a (backslash + new-line) > pair.
> > > Whether we require there be no space between \ and # or, conversely, > require there be at least one whitespace character should not be based on > the relative ease of patching the current code. Personally, I would prefer > the latter as I believe requiring a space after \ will increase readability. My preference is to allow any number of spaces before and after the '\', and that any comments after the slash not change what it means. A built-in single-space rule would probably frustrate people who don't want to do it that way. Like how you get an 'unexpected character after line continuation' error, even if it's only a single space. Yeah, it's how it works, but it's still annoying to get that error in that situation. A comment, of course, would still use up the rest of the line. So a '\' after '#' is just part of the comment. Cheers, Ron From haoyi.sg at gmail.com Thu May 23 18:36:47 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 23 May 2013 12:36:47 -0400 Subject: [Python-ideas] Let's be more orderly! In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> <5194E41F.6000006@stoneleaf.us> <5195F36E.3010905@pearwood.info> Message-ID: >The question is, how would you _specify_ that option? This seems like the perfect use case for function annotations, or a decorator. I imagine both cases would look rather pretty: def func(**kwargs: ordered): ... @ordered_kwargs def func(**kwargs): ... I'm not sure if this is a bad idea for other reasons (e.g. decorators/annotations being reserved for library code rather than core language features) but it does look right intuitively: you are annotating the function or the kwarg to change its behavior.
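[Editor's note: a rough sketch of what an ordered-kwargs decorator could look like if callers passed explicit (name, value) pairs. `ordered_pairs` is an invented workaround, not the annotation syntax Haoyi imagines -- that would need interpreter support, since a plain decorator never sees keyword arguments in call order.]

```python
from collections import OrderedDict

def ordered_pairs(func):
    # Callers pass positional (name, value) tuples; the wrapper hands the
    # wrapped function an OrderedDict built in call order.
    def wrapper(*pairs):
        return func(OrderedDict(pairs))
    return wrapper

@ordered_pairs
def show(kwargs):
    return list(kwargs.items())

print(show(("b", 2), ("a", 1)))  # [('b', 2), ('a', 1)] -- order kept
```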
Here's another thought: macros would be able to pretty trivially give you a syntax like:

>>> odict(a = 1, b = 2)
OrderedDict([('a', 1), ('b', 2)])
>>> odict(b = 2, a = 1)
OrderedDict([('b', 2), ('a', 1)])

or

>>> o%dict(a = 1, b = 2)
OrderedDict([('a', 1), ('b', 2)])
>>> o%dict(b = 2, a = 1)
OrderedDict([('b', 2), ('a', 1)])

for ordered dict literals. It wouldn't work for generally adding orderliness to other functions, since the macro won't know which bindings are named arguments and which bindings are **kwargs, but it also looks really pretty. Probably not something you'd want to put in the std lib, but it's fun if you want to try out the syntax in your own projects. -Haoyi On Tue, May 21, 2013 at 12:40 PM, Andrew Barnert wrote: > On May 18, 2013, at 19:13, Haoyi Li wrote: > > Forgive me if this has been mentioned before (I don't think it has) but > how about an option somehow to take the list of **kwargs as an > association-list? > > > The question is, how would you _specify_ that option? > > The best way I can think of is a function attribute, with a decorator to > set the attribute. Similar to my earlier suggestion for a function > attribute that takes a constructor callable. > > Your idea is simpler conceptually, but it's not much simpler to use, and > it's actually more complicated in implementation. > > The existing function calling machinery explicitly uses mapping > functionality, at least in CPython and PyPy. Not that it would be _hard_ to > rewrite it around a sequence instead, but it would still be harder than not > doing so.
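(As a historical footnote, the first `odict(...)` spelling above eventually needed no macro at all: once PEP 468, adopted in Python 3.6, guaranteed **kwargs order, a plain function suffices:)

```python
from collections import OrderedDict

def odict(**kwargs):
    # No macro required: since PEP 468 (Python 3.6), **kwargs
    # preserves the order of keyword arguments at the call site.
    return OrderedDict(kwargs)

assert list(odict(a=1, b=2).items()) == [('a', 1), ('b', 2)]
assert list(odict(b=2, a=1).items()) == [('b', 2), ('a', 1)]
```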
> > I am approaching this from a point of view of "why am I putting everything > into a hashmap just to iterate over it later", as you can see in the way > the namedtuple constructor is implemented: > > > http://docs.python.org/2/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields > > This may be rather out-there, and I'm not sure if it'll speed things up > much, but I'm guessing iterating over an assoc list is faster than > iterating over anything else. Building an assoc list is also probably > faster than building anything else and it's also the most easily > convertible (either to OrderedDict or unordered dict) since it preserves > all information. > > > But you're forgetting that all the existing kwargs code would get slower > if we first built a list of pairs and then constructed a dict from it, as > would any new code that wants to do lookup by name. So, you're slowing down > the 90% case to speed up the 10% case. > > Also, the existing functionality is something like this pseudocode:
>
> kwargs = dict(starstarargs)
> for arg, val in zip(namedargs, namedvals):
>     if arg not in f.paramnames:
>         kwargs[arg] = val
>
> (I linked to the actual CPython and PyPy code earlier in the thread.) > > So, if performance actually matters, presumably you're going to hash the > names anyway to do that "in" check, at which point the biggest cost of using > a dict is already incurred. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu May 23 23:06:09 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Thu, 23 May 2013 17:06:09 -0400 Subject: [Python-ideas] Let's be more orderly!
In-Reply-To: References: <5343C608-96B4-4716-9D06-38D9AFA327B2@yahoo.com> <1368640915.57259.YahooMailNeo@web184705.mail.ne1.yahoo.com> <1368653256.24377.YahooMailNeo@web184705.mail.ne1.yahoo.com> <51944F71.9080507@pearwood.info> <5194E41F.6000006@stoneleaf.us> <5195F36E.3010905@pearwood.info> Message-ID: On 5/23/2013 12:36 PM, Haoyi Li wrote: > >The question is, how would you _specify_ that option? > > This seems like the perfect use case for function annotations, or a > decorator. I imagine both cases would look rather pretty > > def func(**kwargs: ordered): > ... Guido has more or less rejected annotations because checking for an annotation would slow down every function call for the benefit of very few. > @ordered_kwargs > def func(**kwargs): > ... Returning a function with a custom ordered kwargs .__call__ method would not affect normal functions. func.__class__ cannot be changed from Python code (non-heap type) but I don't know about from C. In any case, the attributes (components) of func could be fed to an okw_function (ordered keyword function) class to produce a new object. The decorator approach immediately fails for a system without the decorator. Since it should only be used for funcs that require the ordering, I think that would be appropriate. tjr From anntzer.lee at gmail.com Fri May 24 04:10:50 2013 From: anntzer.lee at gmail.com (Antony Lee) Date: Thu, 23 May 2013 19:10:50 -0700 Subject: [Python-ideas] A posteriori implementation of an abstract method Message-ID: Currently, the implementation of abstract methods is not possible outside of the class statement:

from abc import ABCMeta, abstractmethod
class ABC(metaclass=ABCMeta):
    @abstractmethod
    def method(self): pass
class C(ABC): pass
C.method = lambda self: "actual implementation"
C()

results in a TypeError (complaining about abstract methods), as __abstractmethods__ is just computed once, at class definition.
Of course this example is a bit contrived, but perhaps a more legitimate use case would involve a class decorator or another way to define the implementation of the abstract methods out of the class, such as

@implementation(C, ABC)
def method():
    return "actual implementation"

I believe this behavior can be "fixed" (well, I don't know yet if this should actually be considered an error, and haven't actually tried to "fix" it) by defining ABCMeta.__setattr__ properly (to update __abstractmethods__ when necessary). What do you think? Best, Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Fri May 24 18:02:47 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 24 May 2013 12:02:47 -0400 Subject: [Python-ideas] A posteriori implementation of an abstract method In-Reply-To: References: Message-ID: On Thu, May 23, 2013 at 10:10 PM, Antony Lee wrote: > Currently, the implementation of abstract methods is not possible outside of > the class statement: Good! Readability is very important; separating related logic is a bad idea. There may be times when patching something is the least of evils, but it certainly shouldn't be encouraged. > from abc import ABCMeta, abstractmethod > class ABC(metaclass=ABCMeta): > @abstractmethod > def method(self): pass > class C(ABC): pass At this point, either C itself is also intended as abstract-only, or there is already an error. > C.method = lambda self: "actual implementation" > C() Much better to just create a new subclass of C. If that isn't possible, then I suppose you do have to update __abstractmethods__. I usually write my "abstract" methods to either just pass (implement it if you want something to happen; safe to ignore it otherwise) or to return NotImplemented. The authors of ABC have gone out of their way to say "You MUST implement this." So requiring an extra "yes, I really meant that" step for code that violates the contract is not unreasonable.
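(That extra explicit step exists today: __abstractmethods__ is an ordinary class attribute, so after patching the method in, one can clear it by hand. A sketch of the workaround, not a recommendation:)

```python
from abc import ABCMeta, abstractmethod

class ABC(metaclass=ABCMeta):
    @abstractmethod
    def method(self):
        pass

class C(ABC):
    pass

C.method = lambda self: "actual implementation"
# C() would still raise TypeError at this point, because
# __abstractmethods__ was computed once, at class definition.
# Updating it is the explicit "yes, I really meant that" step:
C.__abstractmethods__ = frozenset()
print(C().method())  # actual implementation
```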
> I believe this behavior can be "fixed" (well, I don't know yet if this > should actually be considered an error, and haven't actually tried to "fix" > it) by defining ABCMeta.__setattr__ properly (to update __abstractmethods__ > when necessary). What do you think? It would work, but I don't think it would be a good idea. (1) What if the new implementation is still only a partial implementation, to be used by super()? Then the class *should* stay abstract. (2) Even if the new implementation of that method resolved the only enforced barrier, it still isn't clear that the class should change type. There are reasons to say "this class will never be instantiated". Removing that invariant is important enough that it should be an explicit choice, instead of an implicit side effect. -jJ From wolfgang.maier at biologie.uni-freiburg.de Fri May 24 19:38:25 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Fri, 24 May 2013 17:38:25 +0000 (UTC) Subject: [Python-ideas] New str whitespace cleaning method? References: Message-ID: Joao S. O. Bueno writes: > > What about an STR method for just formatting whitespace? > > I propose starting with something that would just replace all occurrences of > white-space sequences with a single white-space, and another mode that > does the same, but preserves newlines. > > That way multiline strings could be used in a straight way to enter > any desired construct.
> > I know of textwrap.dedent - but it is not quite the same behavior, and this is a > case wher having method call postfixed to the string has clear advantages > over a function call with the same string - since the function name > would have to be placed between the "strign destiantion" - which > denotes its purpose, and the string text itself: > > Ex.: > > log.warn(dedent("""Hello dear sir, > I am sorry to inform you > the spanish inquisition > has arrived""")) > > Against: > log.warn("""Hello dear sir, > I am sorry to inform you > the spanish inquisition > has arrived""".lint()) > Haven't really thought that through, but couldn't you write a decorator that does the dedenting, and that you then use on your warn()? This way, you wouldn't have to care about the formatting at all in your actual code. Best, Wolfgang From wolfgang.maier at biologie.uni-freiburg.de Fri May 24 19:52:39 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Fri, 24 May 2013 17:52:39 +0000 (UTC) Subject: [Python-ideas] New str whitespace cleaning method? References: Message-ID: Wolfgang Maier writes: > > Joao S. O. Bueno ...> writes: > > > > > What about an STR method for just formatting whitespace? > > > > I propose starting with something that would just replace all occurrences of > > white-space sequences with a single white-space, and another mode that > > does the same, but preserves newlines. > > > > That way multiline strings could be used in a straight way to enter > > any desired construct. 
> > > > I know of textwrap.dedent - but it is not quite the same behavior, and > this is a > > case wher having method call postfixed to the string has clear advantages > > over a function call with the same string - since the function name > > would have to be placed between the "strign destiantion" - which > > denotes its purpose, and the string text itself: > > > > Ex.: > > > > log.warn(dedent("""Hello dear sir, > > I am sorry to inform you > > the spanish inquisition > > has arrived""")) > > > > Against: > > log.warn("""Hello dear sir, > > I am sorry to inform you > > the spanish inquisition > > has arrived""".lint()) > > > > Haven't really thought that through, but couldn't you write a decorator that > does the dedenting, and that you then use on your warn()? > This way, you wouldn't have to care about the formatting at all in your > actual code. > Best, > Wolfgang > ok, here's example code for it:

from textwrap import dedent

class dedentify(object):
    def __init__ (self, f):
        self.f = f

    def __call__ (self, message):
        self.f(dedent(message))

@dedentify
def warn (message):
    print (message)

warn("""\
Hello dear sir,
I am sorry to inform you
the spanish inquisition
has arrived""")

now prints:

Hello dear sir,
I am sorry to inform you
the spanish inquisition
has arrived

I hope this is what you had in mind ? Wolfgang From jsbueno at python.org.br Fri May 24 20:03:23 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Fri, 24 May 2013 15:03:23 -0300 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: Come on - this is Python, and we are talking about entering a string in code - One does not want to do a "from myutils.stringutils import dedenter" just to be able to use long strings in code. - in other languages it might be normal to go to arbitrary lengths of boilerplate code to do trivial stuff.
But ..seeing the attention here, I guess I am indeed one of the 2 programmers in the World who needs long strings with no white-space from indentation in code and consider it should be part of the syntax, not a possibility on the stdlib, so..whatever. :-/ On 24 May 2013 14:52, Wolfgang Maier wrote: > Wolfgang Maier writes: > >> >> Joao S. O. Bueno ...> writes: >> >> > >> > What about an STR method for just formatting whitespace? >> > >> > I propose starting with something that would just replace all occurrences of >> > white-space sequences with a single white-space, and another mode that >> > does the same, but preserves newlines. >> > >> > That way multiline strings could be used in a straight way to enter >> > any desired construct. >> > >> > I know of textwrap.dedent - but it is not quite the same behavior, and >> this is a >> > case wher having method call postfixed to the string has clear advantages >> > over a function call with the same string - since the function name >> > would have to be placed between the "strign destiantion" - which >> > denotes its purpose, and the string text itself: >> > >> > Ex.: >> > >> > log.warn(dedent("""Hello dear sir, >> > I am sorry to inform you >> > the spanish inquisition >> > has arrived""")) >> > >> > Against: >> > log.warn("""Hello dear sir, >> > I am sorry to inform you >> > the spanish inquisition >> > has arrived""".lint()) >> > >> >> Haven't really thought that through, but couldn't you write a decorator that >> does the dedenting, and that you then use on your warn()? >> This way, you wouldn't have to care about the formatting at all in your >> actual code. 
>> Best, >> Wolfgang >> > > ok, here's example code for it: > > from textwrap import dedent > > class dedentify(object): > def __init__ (self, f): > self.f = f > > def __call__ (self, message): > self.f(dedent(message)) > > @dedentify > def warn (message): > print (message) > > warn("""\ > Hello dear sir, > I am sorry to inform you > the spanish inquisition > has arrived""") > > now prints: > Hello dear sir, > I am sorry to inform you > the spanish inquisition > has arrived > > I hope this is what you had in mind ? > Wolfgang > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From abarnert at yahoo.com Fri May 24 21:07:27 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 24 May 2013 12:07:27 -0700 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: <7BA8407D-E851-478D-9549-C06DF904281A@yahoo.com> How is one import statement "arbitrary lengths of boilerplate"? How is putting the batteries into the stdlib instead of the language not pythonic? Sent from a random iPhone On May 24, 2013, at 11:03, "Joao S. O. Bueno" wrote: > Come on - this is Python, and we are talking about entering a string in code - > One doe snto want to do a "from myutils.stringutils import dedenter" > just to be able > to use long strings in code. - in other languages it might be normal to > go to arbtirary leenghts of boiler plate code to do trivial stuff. > > But ..seeing the attention here, I guess I am indeed one of the 2 programmers > in the World who needs long strings with no white-space from indentation > in code and consider it should be part of the syntax, not a > possibility on the stdlib, > so..whatever. :-/ > > > > > On 24 May 2013 14:52, Wolfgang Maier > wrote: >> Wolfgang Maier writes: >> >>> >>> Joao S. O. Bueno ...> writes: >>> >>>> >>>> What about an STR method for just formatting whitespace? 
>>>> >>>> I propose starting with something that would just replace all occurrences of >>>> white-space sequences with a single white-space, and another mode that >>>> does the same, but preserves newlines. >>>> >>>> That way multiline strings could be used in a straight way to enter >>>> any desired construct. >>>> >>>> I know of textwrap.dedent - but it is not quite the same behavior, and >>> this is a >>>> case wher having method call postfixed to the string has clear advantages >>>> over a function call with the same string - since the function name >>>> would have to be placed between the "strign destiantion" - which >>>> denotes its purpose, and the string text itself: >>>> >>>> Ex.: >>>> >>>> log.warn(dedent("""Hello dear sir, >>>> I am sorry to inform you >>>> the spanish inquisition >>>> has arrived""")) >>>> >>>> Against: >>>> log.warn("""Hello dear sir, >>>> I am sorry to inform you >>>> the spanish inquisition >>>> has arrived""".lint()) >>> >>> Haven't really thought that through, but couldn't you write a decorator that >>> does the dedenting, and that you then use on your warn()? >>> This way, you wouldn't have to care about the formatting at all in your >>> actual code. >>> Best, >>> Wolfgang >> >> ok, here's example code for it: >> >> from textwrap import dedent >> >> class dedentify(object): >> def __init__ (self, f): >> self.f = f >> >> def __call__ (self, message): >> self.f(dedent(message)) >> >> @dedentify >> def warn (message): >> print (message) >> >> warn("""\ >> Hello dear sir, >> I am sorry to inform you >> the spanish inquisition >> has arrived""") >> >> now prints: >> Hello dear sir, >> I am sorry to inform you >> the spanish inquisition >> has arrived >> >> I hope this is what you had in mind ? 
>> Wolfgang From wolfgang.maier at biologie.uni-freiburg.de Fri May 24 21:47:06 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Fri, 24 May 2013 19:47:06 +0000 (UTC) Subject: [Python-ideas] New str whitespace cleaning method? References: Message-ID: Joao S. O. Bueno writes: > > Come on - this is Python, and we are talking about entering a string in code - > One does not want to do a "from myutils.stringutils import dedenter" > just to be able > to use long strings in code. - in other languages it might be normal to > go to arbitrary lengths of boilerplate code to do trivial stuff. > > But ..seeing the attention here, I guess I am indeed one of the 2 programmers > in the World who needs long strings with no white-space from indentation > in code and consider it should be part of the syntax, not a > possibility on the stdlib, > so..whatever. :-/ > I guess the problem is not that nobody has a need for it, but rather that there are many different types of behaviour of that formatting method that one could imagine. You gave a few examples yourself and, personally, I'd prefer that method to remove a fixed specified number of spaces from the beginning of each line, so that you could do things like:

if format_output:
    print ("""\
                1)
                    Indent *only* the first line
                of a paragraph,
                2)
                    Have pretty-formatted numbered
                lists
                independent of the code indentation level.""".ltrim(16))

I think it's easy enough to come up with solutions to everybody's preferences, but that doesn't mean that they have to be built into Python.
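(One such quick solution, for the fixed-margin behaviour sketched above, could be a plain helper function; the ltrim name simply mirrors the hypothetical str method in the example:)

```python
def ltrim(text, n):
    # Remove up to n leading spaces from every line; indentation
    # beyond the n-column margin is preserved.
    lines = []
    for line in text.splitlines():
        lines.append(line[:n].lstrip(" ") + line[n:])
    return "\n".join(lines)

print(ltrim("    one\n        two\n  three", 4))
# one
#     two
# three
```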
Best, Wolfgang From wolfgang.maier at biologie.uni-freiburg.de Fri May 24 22:06:07 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Fri, 24 May 2013 20:06:07 +0000 (UTC) Subject: [Python-ideas] New str whitespace cleaning method? References: Message-ID: One clean solution might be (now don't shout at me. I know it would be a profound change with lots of compatibility problems, and its just a wild thought that I'm not advocating for): Triple-quoted strings could be made a subclass of str, and their formatting methods (at least strip, lstrip, ljust, rjust) could overwrite those of str to operate on a per-line basis, e.g., strip() and lstrip() could remove whitespace from the end and the beginning of each line. From jimjjewett at gmail.com Fri May 24 23:46:52 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 24 May 2013 17:46:52 -0400 Subject: [Python-ideas] New str whitespace cleaning method? In-Reply-To: References: Message-ID: On Fri, May 24, 2013 at 4:06 PM, Wolfgang Maier wrote: > Triple-quoted strings could be made a subclass of str, Not really, but the new-prefix solution (d""" ... """) is a moral equivalent. > and their formatting > methods (at least strip, lstrip, ljust, rjust) could overwrite those of str > to operate on a per-line basis, e.g., strip() and lstrip() could remove > whitespace from the end and the beginning of each line. I think most use cases treat either all lines or all lines but the first as a block. If the 4th line is indented relative to the 3rd, that is probably intentional. Note that doctests solve a related problem by adding a prefix (">>> " or "... ") that effectively increases the indent further. -jJ From ron3200 at gmail.com Sat May 25 02:04:30 2013 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 24 May 2013 19:04:30 -0500 Subject: [Python-ideas] New str whitespace cleaning method? 
In-Reply-To: References: Message-ID: <51A0000E.5050604@gmail.com> On 05/24/2013 04:46 PM, Jim Jewett wrote: > On Fri, May 24, 2013 at 4:06 PM, Wolfgang Maier > wrote: >> >Triple-quoted strings could be made a subclass of str, > Not really, but the new-prefix solution (d""" ... """) is a moral equivalent. I don't care for dedent, (even with the 'd' prefix syntax.) It only covers about 20% of the use cases that I care about. Which is why I generally don't bother using it. A much more useful alternative is to have a margin() method on strings that will take a width argument and works independent of the indent level the expression. It's not a dedent, or indent, or maybe it's both. ;-) message = """\ A simple paragraph of several lines. """.margin(2) A simple paragraph of several lines. message = """\ A simple paragraph of several lines. """.margin(16) A simple paragraph of several lines. A margin method covers 90 percent of times I think about using dedent, and is geared more towards what I really need in those cases. A margin of 0, gives us the dedent use. message = """\ A simple paragraph of several lines. """.margin(0) A simple paragraph of several lines. And calling a margin() method multiple times gives predictable results. You could use textwraps TextWrapper class with it's initial_indent() and subsequent_indent(), along with dedent() to do the same things as above. But it's a lot of work for simple cases like these. Cheers, Ron From greg.ewing at canterbury.ac.nz Sat May 25 03:49:57 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 25 May 2013 13:49:57 +1200 Subject: [Python-ideas] New str whitespace cleaning method? 
In-Reply-To: References: Message-ID: <51A018C5.9060008@canterbury.ac.nz> Wolfgang Maier wrote: > personally, I'd > prefer that method to remove a fixed specified number of spaces from the > beginning of each line, > > if format_output: > print ("""\ > 1) > Indent *only* the first line > of a paragraph, > 2) > Have pretty-formatted numbered > lists > independent of the code indentation level.""".ltrim(16)) This makes it abundantly unclear how much is being stripped without a close study of the code and then carefully counting whitespace characters, being sure to count tabs as one character. Also, if you change the indentation level of the code, you have to change the embedded numeric literal to match or it breaks. I'd prefer some kind of dedented-string-literal syntax that explicitly marks how much to trim from each line, e.g. print(d"""This text has no | whitespace at the beginning of any |line except the second one, |which begins with 3 spaces.""") This would be clear to read and easy to edit. -- Greg From devyncjohnson at gmail.com Sat May 25 15:49:03 2013 From: devyncjohnson at gmail.com (Devyn Collier Johnson) Date: Sat, 25 May 2013 09:49:03 -0400 Subject: [Python-ideas] Python3 Multitasking Message-ID: <51A0C14F.7070201@Gmail.com> It may benefit many programmers if Python3 code could be multithreaded as easy as BASH code. For example, in BASH, if I wish to multithread a command, I put an ampersand at the end of the line. Then, the BASH interpreter will execute that line while continuing the execution of the script instead of waiting to finish that command. For Python3, I have two ideas for how the Python3 multitasking syntax should look. Option 1: SOME_COMMAND & Option 2: multitask(SOME_COMMAND) Thank you, Devyn Collier Johnson DevynCJohnson at Gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Sat May 25 16:17:11 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Sat, 25 May 2013 10:17:11 -0400 Subject: [Python-ideas] Python3 Multitasking In-Reply-To: <51A0C14F.7070201@Gmail.com> References: <51A0C14F.7070201@Gmail.com> Message-ID: On 5/25/2013 9:49 AM, Devyn Collier Johnson wrote: > It may benefit many programmers if Python3 code could be multithreaded > as easy as BASH code. For example, in BASH, if I wish to multithread a > command, I put an ampersand at the end of the line. Last I knew, that makes a separate process in sh and csh. Has that changed, or is bash different? > Then, the BASH > interpreter will execute that line while continuing the execution of the > script instead of waiting to finish that command. For Python3, I have > two ideas for how the Python3 multitasking syntax should look. > > Option 1: SOME_COMMAND & > Option 2: multitask(SOME_COMMAND) Check out the threading, multiprocessing, and subprocess modules. From abarnert at yahoo.com Sun May 26 00:10:35 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 25 May 2013 15:10:35 -0700 (PDT) Subject: [Python-ideas] Python3 Multitasking In-Reply-To: <51A0C14F.7070201@Gmail.com> References: <51A0C14F.7070201@Gmail.com> Message-ID: <1369519835.67140.YahooMailNeo@web184703.mail.ne1.yahoo.com> > From: Devyn Collier Johnson >Sent: Saturday, May 25, 2013 6:49 AM >It may benefit many programmers if Python3 code could be multithreaded as easy as BASH code. For example, in BASH, if I wish to multithread a command, I put an ampersand at the end of the line. Then, the BASH interpreter will execute that line while continuing the execution of the script instead of waiting to finish that command. For Python3, I have two ideas for how the Python3 multitasking syntax should look. > >Option 1: SOME_COMMAND & & already has a meaning, as an operator. This would be confusing to both human readers and the parser. 
>Option 2: multitask(SOME_COMMAND) This one is fine, and you can do it trivially today:

    import multiprocessing

    jobs = []

    def multitask(command):
        jobs.append(multiprocessing.Process(target=command))

And you're done. If you want to limit the number of running processes, or use a pool instead of a process per task, or use threads instead of processes, or name the jobs and store them in a dict, or whatever, you can change the multitask function. From ram.rachum at gmail.com Sun May 26 14:00:13 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Sun, 26 May 2013 05:00:13 -0700 (PDT) Subject: [Python-ideas] Idea: Compressing the stack on the fly Message-ID: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Hi everybody, Here's an idea I had a while ago. Now, I'm an ignoramus when it comes to how programming languages are implemented, so this idea will most likely be either (a) completely impossible or (b) trivial knowledge. I was thinking about the implementation of the factorial in Python. I was comparing in my mind 2 different solutions: The recursive one, and the one that uses a loop. Here are example implementations for them:

def factorial_recursive(n):
    if n == 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_loop(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

I know that the recursive one is problematic, because it's putting a lot of items on the stack. In fact it's using the stack as if it was a loop variable. The stack wasn't meant to be used like that. Then the question came to me, why? Maybe the stack could be built to handle this kind of (ab)use? I read about tail-call optimization on Wikipedia. If I understand correctly, the gist of it is that the interpreter tries to recognize, on a frame-by-frame basis, which frames could be completely eliminated, and then it eliminates those. Then I read Guido's blog post explaining why he doesn't want it in Python.
In that post he outlined 4 different reasons why TCO shouldn't be implemented in Python. But then I thought, maybe you could do something smarter than eliminating individual stack frames. Maybe we could create something that is to the current implementation of the stack what `xrange` is to the old-style `range`. A smart object that allows access to any of a long list of items in it, without actually having to store those items. This would solve the first argument that Guido raises in his post, which I found to be the most substantial one. What I'm saying is: Imagine the stack of the interpreter when it runs the factorial example above for n=1000. It has around 1000 items in it and it's just about to explode. But then, if you'd look at the contents of that stack, you'd see it's embarrassingly regular, a compression algorithm's wet dream. It's just the same code location over and over again, with a different value for `n`. So what I'm suggesting is an algorithm to compress that stack on the fly. An algorithm that would detect regularities in the stack and, instead of saving each individual frame, save just the pattern. Then, there wouldn't be any problem with showing an informative stack trace: Despite not storing every individual frame, each individual frame could still be *accessed*, similarly to how `xrange` allows access to each individual member without having to store each of them. Then, the stack could store a lot more items, and tasks that currently require recursion (like pickling using the standard library) will be able to handle much deeper recursions. What do you think? Ram. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tjreedy at udel.edu Sun May 26 22:08:00 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Sun, 26 May 2013 16:08:00 -0400 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: On 5/26/2013 8:00 AM, Ram Rachum wrote:

> def factorial_recursive(n):
>     if n == 1:
>         return 1
>     return n * factorial_recursive(n - 1)

This fails for 0 (0! == 1 also), but this is easily fixed. It also fails for negative and non-integral values, but ignore that for the moment.

def factorial_tail(n, result=1):
    if n > 1:
        return factorial_tail(n-1, result * n)
    else:
        return result

def factorial_while(n, result=1):
    # written to maximize sameness with tail version
    while n > 1:
        result = result * n
        n = n - 1
    return result

If one can rewrite the 'body' recursion (my term) as tail recursion, the translation to a while loop is trivial, and for linear recursion on counts, the for loop version is simple and often even clearer as it explicitly identifies all the values used in the computation.

> def factorial_loop(n):
>     result = 1
>     for i in range(1, n + 1):
>         result *= i
>     return result

Now consider a production ready function that will always terminate:

def factorial(n):
    if n < 0 or n != int(n):
        raise ValueError('factorial input must be a count')

To do the same with recursion, you have to write the main function as a nested function to avoid uselessly doing the check more than once. Conclusion: for linear processing of collections, use a while or for loop. While I do not see much in the idea:

* Converting recursion to iteration compresses the stack to 1 frame.
* It slows down all computation for rare benefit.
* It does not really scale very well. In most languages, factorial overflows long before there is a stack problem. To be realistic, consider linearly processing a billion character string with recursion versus iteration.
* Recursion does not work with iterators as well as iteration. Iterators are one of Python's crown jewels.

with open('terabyte.txt') as f:
    for c in chars(f):
        process(c)
* Recursion does not work with iterators as well as iteration. Iterators are one of Python's crown jewels. with open('terabyte.txt') as f: for c in chars(f): process(c) is the right idiom for independently processing the characters of a terabyte file. -- Terry Jan Reedy From wuwei23 at gmail.com Mon May 27 03:30:01 2013 From: wuwei23 at gmail.com (alex23) Date: Sun, 26 May 2013 18:30:01 -0700 (PDT) Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: <9f9bd252-88a6-402a-9b58-8be92f954a5a@be10g2000pbd.googlegroups.com> On May 27, 6:23 am, Ram Rachum wrote: > So in those cases where you have to use recursion, it would be nice if the > stack could be compressed so the program could put 1000x as many frames on the > stack and will be less likely to crash. http://docs.python.org/2/library/sys.html#sys.setrecursionlimit From haoyi.sg at gmail.com Mon May 27 06:18:37 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 27 May 2013 00:18:37 -0400 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <9f9bd252-88a6-402a-9b58-8be92f954a5a@be10g2000pbd.googlegroups.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <9f9bd252-88a6-402a-9b58-8be92f954a5a@be10g2000pbd.googlegroups.com> Message-ID: I don't think setrecursionlimit is the answer here. It's like answering "hey maybe we can save some cash here" with "why not just make more money?". The fact that python's conventions encourage using iterators and explicit state machines instead of tail-recursion and tail-calls is worth something, but simply saying "it slows down all computation for rare benefit" and "does not scale well" is rather hand-wavy. We already are not going to process billion character strings with iteration in python (python loops are sloooow). There are some (not-factorial) algorithms which are more naturally expressed recursively than iteratively.
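A classic case is walking a nested structure, where the recursive version mirrors the shape of the data while the iterative one has to manage an explicit stack by hand. A quick illustrative sketch for comparison:

```python
# Quick sketch: summing the leaves of an arbitrarily nested list.
# The recursive version follows the shape of the data directly.
def sum_leaves(tree):
    if isinstance(tree, list):
        return sum(sum_leaves(child) for child in tree)
    return tree

# The iterative version has to simulate the call stack by hand.
def sum_leaves_iter(tree):
    total = 0
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            stack.extend(node)
        else:
            total += node
    return total
```

Both give the same answer on something like [1, [2, [3, 4]], 5], but only the first reads like the definition of the problem.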
Saying recursion is slower and more memory intensive than iteration (true), and thus we shouldn't think about making recursion less slow or less memory intensive doesn't really make much sense to me. This is an extremely novel idea, which raises the question: has it been done before? In any other language/runtime? And how did it turn out? A cursory googling doesn't get me anything, and my gut says this is more PhD thesis than quick hack. -Haoyi On Sun, May 26, 2013 at 9:30 PM, alex23 wrote: > On May 27, 6:23 am, Ram Rachum wrote: > > So in those cases where you have to use recursion, it would be nice if > the > > stack could be compressed so the program could put 1000x as many frames on > the > > stack and will be less likely to crash. > > http://docs.python.org/2/library/sys.html#sys.setrecursionlimit > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Mon May 27 07:01:03 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 27 May 2013 14:01:03 +0900 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <9f9bd252-88a6-402a-9b58-8be92f954a5a@be10g2000pbd.googlegroups.com> Message-ID: <87wqqlrp80.fsf@uwakimon.sk.tsukuba.ac.jp> Haoyi Li writes: > We already are not going to process billion character strings with > iteration in python (python loops are sloooow). Not all questions need to be answered before you ask. Mailman archives do regularly process multigigabyte mbox files when rebuilding an archive for some reason. This may take quite a few minutes even on a fast box, but it's still a reasonable cost once every decade or so (many archives never need rebuilding).
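(Incidentally, the `chars(f)` helper in Terry's idiom isn't a stdlib function; one plausible way to write it is as a buffered generator, sketched here:)

```python
import io

# One plausible definition of the chars(f) helper Terry assumed --
# not a stdlib function, just a buffered-generator sketch.
def chars(f, bufsize=64 * 1024):
    """Yield the characters of a text file one at a time, reading in chunks."""
    while True:
        chunk = f.read(bufsize)
        if not chunk:
            return
        yield from chunk

# Demo on an in-memory file; on a multigigabyte file the same loop
# never holds more than one buffer in memory.
sample = io.StringIO("spam")
letters = list(chars(sample, bufsize=2))
```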
> Saying recursion is slower and more memory intensive than iteration > (true), and thus we shouldn't think about making recursion less > slow or less memory intensive doesn't really make much sense to me. I think the point is more that tail recursion (where various forms of stack compression make a lot of sense) is equivalent to iteration, which Python does quite well in many ways. The augmented TOOWTDI principle (with the clause "and preferably, only one") suggests that work on optimizing tail recursion is both a YAGNI and an attractive nuisance, given that (for Python programmers) the iterative algorithms are equally "natural" and invariably faster and more memory-efficient. IOW, I doubt that Guido would explicitly veto contribution of a more efficient tail recursion as long as programs using it are equally debuggable, and the implementation not so complex as to invite lots of maintenance in the future. I suppose he just wants to encourage people to spend their available effort on more Pythonic lines of development. From guido at python.org Mon May 27 07:14:33 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 26 May 2013 22:14:33 -0700 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <87wqqlrp80.fsf@uwakimon.sk.tsukuba.ac.jp> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <9f9bd252-88a6-402a-9b58-8be92f954a5a@be10g2000pbd.googlegroups.com> <87wqqlrp80.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, May 26, 2013 at 10:01 PM, Stephen J. Turnbull wrote: > IOW, I doubt that Guido would explicitly veto contribution of a more > efficient tail recursion as long as programs using it are equally > debuggable, and the implementation not so complex as to invite lots of > maintenance in the future. I suppose he just wants to encourage > people to spend their available effort on more Pythonic lines of > development. 
Actually (as you can read in the blog: http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html) I would seriously frown upon that if it was a patch to CPython only. Tail recursion elimination needs to be addressed at the language level, not as an implementation trick. As to the OP's idea of applying compression to the stack, well, it definitely sounds novel and probably untried, and I recommend trying it in another language than Python first to see if it can be done efficiently at all. One issue is that there's actually quite a bit more on the stack than just the value of local variables, and all that needs to be taken into account. The first place where I would look for ideas is the work done on backwards execution -- I've never used or understood such systems, but I presume they must have some way of handling the stack, too. -- --Guido van Rossum (python.org/~guido) From abarnert at yahoo.com Mon May 27 07:23:01 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 26 May 2013 22:23:01 -0700 (PDT) Subject: [Python-ideas] Fw: Idea: Compressing the stack on the fly In-Reply-To: <1369593309.80727.YahooMailNeo@web184703.mail.ne1.yahoo.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <1369593309.80727.YahooMailNeo@web184703.mail.ne1.yahoo.com> Message-ID: <1369632181.77862.YahooMailNeo@web184702.mail.ne1.yahoo.com> Sorry, my reply got bounced because it went to the Google Groups address instead of the regular address. Let me try again. For those who don't want to read the whole thing, because others have since made many of the same points: I don't think this is likely to be a good optimization for CPython, much less something to require of all implementations, but it could be an interesting experiment for PyPy, especially when combined with the existing Stackless and JIT features. From: Ram Rachum Sent: Sunday, May 26, 2013 5:00 AM > def factorial_recursive(n): >> if n == 1: >> return 1 >>
return n * factorial_recursive(n - 1) This is not a tail-recursive implementation. It doesn't just return the value of a recursive call, it performs further computation on that value before returning it. You need to do something like this: def factorial_recursive(n, accum=1): if n == 1: return accum return factorial_recursive(n - 1, n * accum) (You may notice that the transformation from naive-recursive to tail-recursive looks a lot like the transformation to iterative.) > But then I thought, maybe you could do something smarter than eliminating > individual stack frames. Maybe we could create something that is to the current > implementation of the stack what `xrange` is to the old-style `range`. A smart > object that allows access to any of a long list of items in it, without actually > having to store those items. This would solve the first argument that Guido > raises in his post, which I found to be the most substantial one. One key difference here: xrange has a constant size. No matter how big the range is, you only need to store 3 values. Your suggestion of storing each of the separate n values means your stack is still linear. You might get a 5x, 10x, or even 50x improvement out of it, but it would still grow linearly. And there aren't that many programs for which 1000 is not enough but 10000 would be. > So what I'm suggesting is an algorithm to compress that stack on the > fly. An algorithm that would detect regularities in the stack and instead of > saving each individual frame, save just the pattern. Then, there wouldn't be > any problem with showing informative stack trace: Despite not storing every > individual frame, each individual frame could still be accessed, similarly to > how `xrange` allows access to each individual member without having to store each > of them. If you look at the way function calls are implemented in CPython, this would be very difficult.
However, if you look at Stackless CPython, or, especially, PyPy, that's another story. It's worth noting that storing the stack on the heap already gives you the ability to easily raise the recursion depth 50x, and nobody cares much about that, but people care a lot about the other benefits of the stackless implementation. So, your idea might be interesting even if your original motivation turns out not to be. For example, a 10x smaller stack might mean 10x as much cache locality, or your compression might provide enough feedback for the PyPy JIT to take advantage of. Also, you might want to look at how generator functions are implemented. Your original code can't be TCO'd because it has a non-empty continuation after the recursive call. But if you think about it, a set of local variables and a continuation point is basically what a generator state stores. If you take this far enough, a JIT might be able to optimize out not only tail recursive calls, but some non-tail-recursive implementations like yours. From rosuav at gmail.com Mon May 27 09:32:27 2013 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 27 May 2013 17:32:27 +1000 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <9f9bd252-88a6-402a-9b58-8be92f954a5a@be10g2000pbd.googlegroups.com> Message-ID: On Mon, May 27, 2013 at 2:18 PM, Haoyi Li wrote: > Saying recursion is slower and more memory intensive than iteration (true), > and thus we shouldn't think about making recursion less slow or less memory > intensive doesn't really make much sense to me. Nothing more than a gut feeling, but this sounds like trying to optimize a bubble sort. Sure, you could write a tighter implementation that halves the running time, but it's still an inefficient base method to use, so switching methods will give more benefit.
Actually, the biggest problem I foresee out of this is that it'll make microbenchmarking a lot harder, which would make debates with jmf 75% less productive... Still, it does sound like a fun piece of technology to play with. ChrisA From storchaka at gmail.com Mon May 27 11:29:14 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 27 May 2013 12:29:14 +0300 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: 26.05.13 15:00, Ram Rachum wrote: > I was thinking about the implementation of the factorial in Python. I > was comparing in my mind 2 different solutions: The recursive one, and > the one that uses a loop. And math.factorial() does not use any of them. From lkb.teichmann at gmail.com Mon May 27 11:55:33 2013 From: lkb.teichmann at gmail.com (Martin Teichmann) Date: Mon, 27 May 2013 09:55:33 +0000 (UTC) Subject: [Python-ideas] Advanced Line Continuation Message-ID: Hello List, often I am writing mathematical code, with rather long formulae, spanning several lines of code. So I am in need of line continuation, and I know that this is currently best done by putting the line in parentheses, as in a = (something_long * something_longer + something_else) those parentheses are mathematically unnecessary. Wouldn't it be nice if we could just write a = something_long * something_longer + something_else by simply adding a new line continuation rule: if a line ends in an operator, the next line is a continuation. Currently, lines ending in an operator are always a syntax error, so this rule would not break old code. This rule is rather intuitive, as putting operators at the end of a continued line is already encouraged by PEP8.
One exception certainly is the , (comma) operator, as it actually can be at the end of a line, so this new rule would not apply to this operator, but as , is actually rather different from other operators, I would not consider that an astonishing exception. The latter actually is a pity: with the operator-at-end-continues-line rule we would also get rid of the problem with long with statements, as in with something_long as something_longer, something_cool as something_cooler: which cannot be nicely split. So we could add the rule "if a line starts with 'with', even comma continues a line", but this rule might then sound a bit too arbitrary. Greetings Martin From goktug.kayaalp at gmail.com Mon May 27 12:41:50 2013 From: goktug.kayaalp at gmail.com (Göktuğ Kayaalp) Date: Mon, 27 May 2013 13:41:50 +0300 Subject: [Python-ideas] Custom string prefixes Message-ID: <87obbw4scx.fsf@gmail.com> Hi, I wanted to share this idea of (possibly not only) mine with Python Core developers. I think it would add some significant flexibility to Python to let users define custom string prefixes. What I mean by a string prefix is, a letter prefixing the string literal, modifying the behavior of it, e.g. >>> r"^\?\? (?P.*)$" is a so-called /raw string/ which does not interpret character escapes. I would suggest that providing the ability to define custom string prefixes would benefit the programmer, especially enrich Python-based DSLs. Let me provide some examples: - I know that there is a library called `decimal`, which provides facilities for finer floating point arithmetic. A `Decimal` class is used to express these numbers and operations, resulting in >>> decimal.Decimal ("1.6e-9") * decimal.Decimal ("1.0e9") which is a little bit messy. This can easily be cured by >>> from decimal import Decimal as D >>> D ("1.6e-9") * D ("1.0e9") but I'd enounce that the following is more concise and readable: >>> D"1.6e-9" * D"1.0e9" with removed parens.
- I use a little program I wrote instead of make and rake for my daily computer usage called poke[*], where I do a lot of string interpolation. from poke.utils import shell, build as _build OUTDIR, PROGRAM = "out", "lsvirtues" def build (): t0 = _build.CompilationTask ( ["laziness.c", "hubris.c", "impatience.c"], "{outdir}/{progn}".format(outdir=OUTDIR, progn=PROGRAM), "gcc -Wall -o {outfile} {infiles}") t0.commit () def clean (): shell.sh ("rm -fr {0} *.o .pokedb".format (OUTDIR)) Even though `str.format (*, **)` is cool, I think using an 'interpolated string' prefix can help clean up stuff a little bit: # ... def build (): t0 = _build.CompilationTask ([...], I"{OUTDIR}/{progn}", ...) def clean (): shell.sh (I"rm -fr {OUTDIR} *.o .pokedb") I thought of subclassing the `str` and overloading some unary operators, e.g. "~", but, I believe this is not very well suited because - still requires some mess: ~s("hello {name}") - or, a monkey-patched `str`: ~"we should{should_we} do that" - and, a tilde is not as meaningful as a letter. One can find many other uses too, which saves some redundant method calls and stuff for a particular task. And such an addition to the language would be compatible too, in the sense that it will be possible to run older scripts on a runtime with this implemented. I'm looking forward to your criticisms and advices. I've searched this online and asked in the chatroom (#python) and I'm nearly sure that I'm not asking for a feature that is already present. Being a beginner, I can say that I'm kind of nervous to post here, where really experienced people discuss the features of an internationally-adopted language. [*] https://github.com/gkya/poke Greetings, Göktuğ. -- Göktuğ Kayaalp From jsbueno at python.org.br Mon May 27 13:01:30 2013 From: jsbueno at python.org.br (Joao S. O.
Bueno) Date: Mon, 27 May 2013 08:01:30 -0300 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: And, besides everybody else's comments, I'd like to point out that tail recursion elimination _is_ feasible in Python as it is today. One can create a decorator meant only for suitable tail-recursion functions, that does the bookkeeping of whenever the function is called - and in this bookkeeping makes the call to the decorated function inside a try-except block. If it is called recursively, it then raises an appropriate exception, which will make the execution continue at the frame-stack level of the first call. I have a blog post with such a toy - nevertheless, it is just a toy. (If there was some small problem that could be elegantly approached in this fashion but not iteratively, it could be used in production though) http://metapython.blogspot.com.br/2010/11/tail-recursion-elimination-in-python.html js -><- On 26 May 2013 09:00, Ram Rachum wrote: > Hi everybody, > > Here's an idea I had a while ago. Now, I'm an ignoramus when it comes to how > programming languages are implemented, so this idea will most likely be > either (a) completely impossible or (b) trivial knowledge. > > I was thinking about the implementation of the factorial in Python. I was > comparing in my mind 2 different solutions: The recursive one, and the one > that uses a loop. Here are example implementations for them: > > def factorial_recursive(n): > if n == 1: > return 1 > return n * factorial_recursive(n - 1) > > def factorial_loop(n): > result = 1 > for i in range(1, n + 1): > result *= i > return result > > > I know that the recursive one is problematic, because it's putting a lot of > items on the stack. In fact it's using the stack as if it was a loop > variable. The stack wasn't meant to be used like that.
> > Then the question came to me, why? Maybe the stack could be built to handle > this kind of (ab)use? > > I read about tail-call optimization on Wikipedia. If I understand correctly, > the gist of it is that the interpreter tries to recognize, on a > frame-by-frame basis, which frames could be completely eliminated, and then > it eliminates those. Then I read Guido's blog post explaining why he doesn't > want it in Python. In that post he outlined 4 different reasons why TCO > shouldn't be implemented in Python. > > But then I thought, maybe you could do something smarter than eliminating > individual stack frames. Maybe we could create something that is to the > current implementation of the stack what `xrange` is to the old-style > `range`. A smart object that allows access to any of a long list of items in > it, without actually having to store those items. This would solve the first > argument that Guido raises in his post, which I found to be the most > substantial one. > > What I'm saying is: Imagine the stack of the interpreter when it runs the > factorial example above for n=1000. It has around 1000 items in it and it's > just about to explode. But then, if you'd look at the contents of that > stack, you'd see it's embarrassingly regular, a compression algorithm's wet > dream. It's just the same code location over and over again, with a > different value for `n`. > > So what I'm suggesting is an algorithm to compress that stack on the fly. An > algorithm that would detect regularities in the stack and instead of saving > each individual frame, save just the pattern. Then, there wouldn't be any > problem with showing informative stack trace: Despite not storing every > individual frame, each individual frame could still be accessed, similarly > to how `xrange` allow access to each individual member without having to > store each of them. 
> > Then, the stack could store a lot more items, and tasks that currently > require recursion (like pickling using the standard library) will be able to > handle much deeper recursions. > > What do you think? > > > Ram. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ubershmekel at gmail.com Mon May 27 14:28:44 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 27 May 2013 15:28:44 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87obbw4scx.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> Message-ID: On Mon, May 27, 2013 at 1:41 PM, Göktuğ Kayaalp wrote: > I think it would add some significant flexibility to Python to let users > define custom string prefixes. What I mean by a string prefix is, > a letter prefixing the string literal, modifying the behavior of it, > [...] > The feature isn't present. But it isn't really clear what you want either. The current string prefixes are not dynamically analyzed. Do you want static hooks for the lexer? A string method to register new prefixes? Do you want to specify new string prefixes in the existing framework? I think string prefixes aren't something that we want more of, this is already complicated enough: "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" "r" | "u" | "R" | "U" We should be more creative on how to get rid of them. Yuval Greenfield -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Mon May 27 14:47:45 2013 From: masklinn at masklinn.net (Masklinn) Date: Mon, 27 May 2013 14:47:45 +0200 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> Message-ID: <9A8CDC23-E3BF-4377-935A-FFE14712E036@masklinn.net> On 2013-05-27, at 14:28 , Yuval Greenfield wrote: > On Mon, May 27, 2013 at 1:41 PM, Göktuğ
Kayaalp wrote: >> I think it would add some significant flexibility to Python to let users >> define custom string prefixes. What I mean by a string prefix is, >> a letter prefixing the string literal, modifying the behavior of it, >> [...] >> > > The feature isn't present. But it isn't really clear what you want either. Regardless of the implementation details (which can be bikeshed separately afterwards if the idea is considered worthy), there seem to be 2 different suggestions in my reading: 1. Pluggable string prefixes allowing custom parsing of string literals and resulting in arbitrary types, e.g. a `d` prefix on a string literal would yield a decimal.Decimal (one can also imagine an `h`-string for markup-safe string literals[1] for instance) 2. Some sort of codegen/environment hook allowing (amongst other things?) the implementation of shell/ruby/perl string interpolation (Scala uses prefixes for interpolator specifications[2]) [1] https://pypi.python.org/pypi/MarkupSafe [2] http://docs.scala-lang.org/overviews/core/string-interpolation.html From goktug.kayaalp at gmail.com Mon May 27 15:09:35 2013 From: goktug.kayaalp at gmail.com (Göktuğ Kayaalp) Date: Mon, 27 May 2013 16:09:35 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: (Yuval Greenfield's message of "Mon, 27 May 2013 15:28:44 +0300") References: <87obbw4scx.fsf@gmail.com> Message-ID: <87k3mk4lio.fsf@gmail.com> > ... A string method to register new prefixes? ... This is what I want. I believe having this will be beneficial. > I think string prefixes aren't something that we want more of, this is already > complicated enough: I concur that all those prefixes are confusing. But they might be used for creating handy shorthand notations, eliminating a significant amount of syntax and thus making it easier for the user to recognize the important pieces of a program source when they are 'registerable' by the user.
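To make the idea concrete, the registration could be simulated today with an ordinary registry of callables (a hypothetical sketch -- `register_prefix` and `apply_prefix` are invented names here, not an existing or proposed API):

```python
import decimal

# Hypothetical sketch only: register_prefix/apply_prefix are invented
# names, simulating the proposal with a dict mapping prefix -> callable.
_prefix_registry = {}

def register_prefix(letter, handler):
    """Map a prefix letter to a one-argument callable."""
    _prefix_registry[letter] = handler

def apply_prefix(letter, text):
    """What the compiler would do on seeing D"1.6e-9": call the handler."""
    return _prefix_registry[letter](text)

register_prefix("D", decimal.Decimal)
product = apply_prefix("D", "1.6e-9") * apply_prefix("D", "1.0e9")
```

Under the proposal, writing D"1.6e-9" in source would then be shorthand for the apply_prefix("D", "1.6e-9") call above.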
I can see that the amount of work to implement this is not trivial, but, personally I think that the benefits it will bring are not trivial either. Yuval Greenfield writes: > On Mon, May 27, 2013 at 1:41 PM, Göktuğ Kayaalp > wrote: > > I think it would add some significant flexibility to Python to let users > define custom string prefixes. What I mean by a string prefix is, > a letter prefixing the string literal, modifying the behavior of it, > [...] > > > The feature isn't present. But it isn't really clear what you want either. The > current string prefixes are not dynamically analyzed. Do you want static hooks > for the lexer? A string method to register new prefixes? Do you want to > specify new string prefixes in the existing framework? > > I think string prefixes aren't something that we want more of, this is already > complicated enough: > "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" > "r" | "u" | "R" | "U" > > We should be more creative on how to get rid of them. > > Yuval Greenfield > -- Göktuğ Kayaalp From ubershmekel at gmail.com Mon May 27 15:13:13 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 27 May 2013 16:13:13 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <9A8CDC23-E3BF-4377-935A-FFE14712E036@masklinn.net> References: <87obbw4scx.fsf@gmail.com> <9A8CDC23-E3BF-4377-935A-FFE14712E036@masklinn.net> Message-ID: On Mon, May 27, 2013 at 3:47 PM, Masklinn wrote: > Regardless of the implementation details (which can be bikeshed > separately afterwards if the idea is considered worthy), there seem to > be 2 different suggestions in my reading: > > 1. Pluggable string prefixes allowing custom parsing of string literals > and their resulting in arbitrary types, e.g. a `d` prefix on a string > literal would yield a decimal.Decimal (one can also imagine an > `h`-string for markup-safe strings literals[1] for instance) > > For this specific use-case I'd suggest d = decimal.Decimal d('1.0') > 2.
Some sort of codegen/environment hook allowing (amongst other > things?) the implementation of shell/ruby/perl string interpolation > (Scala uses prefixes for interpolator specifications[2]) > > A friend of mine likes to do this: import sys '{sys}'.format(**locals()) Yuval Greenfield -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon May 27 15:13:43 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 27 May 2013 23:13:43 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87obbw4scx.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> Message-ID: <51A35C07.4080207@pearwood.info> On 27/05/13 20:41, Göktuğ Kayaalp wrote: > I'm looking forward to your criticisms and advices. I've searched this > online and asked in the chatroom (#python) and I'm nearly sure that I'm > not asking for a feature that is already present. Being a beginner, I > can say that I'm kind of nervous to post here, where really experienced > people discuss the features of an internationally-adopted language. Welcome, and I admire your bravery! So please don't take it personally when I say, your idea does not sound very good to me. In fact, it sounds terrible. You call this proposal "custom string prefixes", but what you describe is actually a second way to call a function, only not any function, but just functions that take a single string argument. So more like a function call that looks like a string. Let me start with your example: > >>> from decimal import Decimal as D > >>> D ("1.6e-9") * D ("1.0e9") > > but I'd enounce that the following is more concise and readable: > > >>> D"1.6e-9" * D"1.0e9" > > with removed parens. Just to save a couple of parentheses, you add a lot of complication to the language, make it harder for people to learn, and for no real benefit except to save a few keystrokes. Consider: String prefixes are currently part of Python's syntax, and can operate at compile-time.
With your proposal, they become run-time operations, like any function call. So this is redundant: we already have a perfectly good way of calling functions. Not just redundant, but also very limited, because most functions take more than one argument, or non-string arguments. Do you have any idea how you would implement this change? Do you at least have an idea for the API? What commands would the user give to define a new "string prefix"? How would the user query the "string prefixes" already defined? What happens when they combine multiple prefixes? Python is famous for being "executable pseudo-code". To a very large degree, code written in Python should be readable by people who are not Python programmers. What do you think s"ham" will mean to the reader? I think that it is better to encourage people to write meaningful names: make_sandwich("ham") than trying to save every last keystroke possible. Code is written once, but read over and over again. That half a second you save by typing s"ham" will cost other people dozens of seconds, maybe minutes, each time they read your code. -- Steven From rosuav at gmail.com Mon May 27 15:18:06 2013 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 27 May 2013 23:18:06 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87obbw4scx.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> Message-ID: On Mon, May 27, 2013 at 8:41 PM, Göktuğ Kayaalp wrote: > - I know that there is a library called `decimal`, which provides > facilities for finer floating point arithmetic. A `Decimal` > class is used to express these numbers and operations, resulting in > > >>> decimal.Decimal ("1.6e-9") * decimal.Decimal ("1.0e9") > > which is a little bit messy. This can easily be cured by > > >>> from decimal import Decimal as D > >>> D ("1.6e-9") * D ("1.0e9") > > but I'd enounce that the following is more concise and readable: > > >>> D"1.6e-9" * D"1.0e9" > > with removed parens.
Your wording is a little confusing, as you're no longer talking about a string literal; but ISTM you're talking about something very similar to the C++11 standard's new feature of user-defined literals: http://en.wikipedia.org/wiki/C%2B%2B11#User-defined_literals This may be a little too complex for what you're proposing, but it is along the same lines. I suspect a generic system for allowing Decimal and other literals would be welcomed here. > Even though `str.format (*, **)` is cool, I think using an > 'interpolated string' prefix can help clean up stuff a little bit: > > # ... > def build (): > t0 = _build.CompilationTask ([...], I"{OUTDIR}/{progn}", ...) > > def clean (): > shell.sh (I"rm -fr {OUTDIR} *.o .pokedb") Please no. It's far too easy to make extremely messy code this way. If you want it, spell it like this: shell.sh ("rm -fr {OUTDIR} *.o .pokedb".format(**globals())) (or locals() perhaps) so it's sufficiently obvious that you're just casually handing all your names to the format function. It's like avoiding Python 2's input() in favour of explicitly spelling it out as eval(raw_input()) - put the eval call right there where it can be seen. The system of interpolations as found in other languages (I'm most familiar with the PHP one as I have to work with it on a daily basis) is inevitably a target for more and more complexity and then edge cases; being explicit is the Python way, so unless there's a really good reason to make all your global names easily available, I would be strongly inclined to not. > I'm looking forward to your criticisms and advices. I've searched this > online and asked in the chatroom (#python) and I'm nearly sure that I'm > not asking for a feature that is already present. Being a beginner, I > can say that I'm kind of nervous to post here, where really experienced > people discuss the features of an internationally-adopted language. 
I'd recommend python-list at python.org or comp.lang.python rather than #python; you get much better responses when there's no requirement for people to be online simultaneously. But in this case you're right, there's no feature quite as you describe. In fact, I'd recommend you join python-list regardless, if only because we have awesome fun there :) You sound like you'd be the perfect sort to join in. ChrisA From rosuav at gmail.com Mon May 27 15:30:37 2013 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 27 May 2013 23:30:37 +1000 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: On Mon, May 27, 2013 at 9:01 PM, Joao S. O. Bueno wrote: > I have a blog post with such a toy - nevertheless, it is just a toy. > (If there was some small problem that could be elegantly approached > in this fashion but not interactively, it could be used in production > though) What can tail recursion do that can't be done by reassigning to the function parameters and 'goto' back to the top? Or, in the absence of an actual goto, a construct like this: def tail_recursive_function(some_arg): while True: # ... code if call_self: # return tail_recursive_function(some_other_arg) some_arg = some_other_arg continue # ... more code # falling off the end: break which basically amounts to a goto but using extra keywords to avoid the one that people hate. Is there any fundamental difference? I've never understood there to be any, but I'm only one, and possibly I'm wrong. ChrisA From ned at nedbatchelder.com Mon May 27 15:52:29 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Mon, 27 May 2013 09:52:29 -0400 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> Message-ID: <51A3651D.40403@nedbatchelder.com> On 5/27/2013 9:30 AM, Chris Angelico wrote: > On Mon, May 27, 2013 at 9:01 PM, Joao S. 
O. Bueno wrote: >> I have a blog post with such a toy - nevertheless, it is just a toy. >> (If there was some small problem that could be elegantly approached >> in this fashion but not interactively, it could be used in production >> though) > What can tail recursion do that can't be done by reassigning to the > function parameters and 'goto' back to the top? Or, in the absence of > an actual goto, a construct like this: > > def tail_recursive_function(some_arg): > while True: > # ... code > if call_self: > # return tail_recursive_function(some_other_arg) > some_arg = some_other_arg > continue > # ... more code > # falling off the end: > break > > which basically amounts to a goto but using extra keywords to avoid > the one that people hate. Is there any fundamental difference? I've > never understood there to be any, but I'm only one, and possibly I'm > wrong. That style can't handle mutually recursive procedures, or the extreme case: a state machine implemented with N functions, each of which calls the next state function at the end. Tail-call elimination isn't simply about noticing recursive calls. It's about noticing that a function ends with a function call, and not burning another stack frame in the process. --Ned. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From masklinn at masklinn.net Mon May 27 15:53:34 2013 From: masklinn at masklinn.net (Masklinn) Date: Mon, 27 May 2013 15:53:34 +0200 Subject: [Python-ideas] Make gettext support direct PO files input Message-ID: <76F0346A-251D-4C93-BBC8-F74625CD5666@masklinn.net> * The standard Python distribution provides an msgfmt tool able to compile PO files to MO files. * The gettext module only looks for MO files. This leads to 2 questions: * Why isn't the msgfmt behavior available programmatically from the stdlib even though all the pieces seem present? 
(as far as I can see, everything is implemented in msgfmt directly) * Why can't gettext use PO files directly, but requires separately generating MO files? From goktug.kayaalp at gmail.com Mon May 27 15:58:37 2013 From: goktug.kayaalp at gmail.com (=?utf-8?B?R8O2a3R1xJ8=?= Kayaalp) Date: Mon, 27 May 2013 16:58:37 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A35C07.4080207@pearwood.info> (Steven D'Aprano's message of "Mon, 27 May 2013 23:13:43 +1000") References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> Message-ID: <87d2sc4j8y.fsf@gmail.com> Thanks for your input, Steven, it's very blissful to be able to discuss with people here. > Do you have any idea how you would implement this change? Do you at least have > an idea for the API? I neither have an idea on how to implement this nor a good API design idea. It has not been more than a couple weeks since this came to my mind, and I've spent some time trying to figure out how one would benefit from this. But I can say that this idea is not original, as Scala does have a concept that is probably similar (I believe Scala lets one define a 'string interpolator', but I don't know very much detail). > ... So this is redundant: we already have a perfectly good way of > calling functions. Not just redundant, but also very limited, because most > functions take more than one argument, or non-string arguments. > ... > Python is famous for being "executable pseudo-code". To a very large degree, > code written in Python should be readable by people who are not Python > programmers. What do you think > ... > That half a second you save by typing s"ham" will cost other people > dozens of seconds, maybe minutes, each time they read your code. 
I'll say that these are not valid statements (in my perspective) for the following reasons: - We have decorators, which are only syntactic sugar for calling higher order functions, and only higher order functions that take at least one argument and return a callable. - It does not only save half a second, but it enables eliminating redundancy in code where string operations are very frequent. Some examples are scripts, build tools[*], command line programs that process stdin and write results to stdout, logging utilities, etc. [*] I use a utility that I've written, and I can say such a feature would reduce the density of the scripts I wrote. There is also Scons (and possibly others), but I don't know much about it. Steven D'Aprano writes: > On 27/05/13 20:41, Göktuğ Kayaalp wrote: > >> I'm looking forward to your criticisms and advices. I've searched this >> online and asked in the chatroom (#python) and I'm nearly sure that I'm >> not asking for a feature that is already present. Being a beginner, I >> can say that I'm kind of nervous to post here, where really experienced >> people discuss the features of an internationally-adopted language. > > Welcome, and I admire your bravery! So please don't take it personally when I > say, your idea does not sound very good to me. In fact, it sounds > terrible. You call this proposal "custom string prefixes", but what you > describe is actually a second way to call a function, only not any function, > but just functions that take a single string argument. So more like a function > call that looks like a string. > > Let me start with your example: > > >> >>> from decimal import Decimal as D >> >>> D ("1.6e-9") * D ("1.0e9") >> >> but I'd enounce that the following is more concise and readable: >> >> >>> D"1.6e-9" * D"1.0e9" >> >> with removed parens. 
> > Just to save a couple of parentheses, you add a lot of complication to the > language, make it harder for people to learn, and for no real benefit except > to save a few keystrokes. Consider: > > String prefixes are currently part of Python's syntax, and can operate at > compile-time. With your proposal, they become run-time operations, like any > function call. So this is redundant: we already have a perfectly good way of > calling functions. Not just redundant, but also very limited, because most > functions take more than one argument, or non-string arguments. > > Do you have any idea how you would implement this change? Do you at least have > an idea for the API? What commands would the user give to define a new "string > prefix"? How would the user query the "string prefixes" already defined? What > happens when they combine multiple prefixes? > > Python is famous for being "executable pseudo-code". To a very large degree, > code written in Python should be readable by people who are not Python > programmers. What do you think > > s"ham" > > will mean to the reader? I think that it is better to encourage people to write meaningful names: > > make_sandwich("ham") > > than trying to save every last keystroke possible. Code is written once, but > read over and over again. That half a second you save by typing s"ham" will > cost other people dozens of seconds, maybe minutes, each time they read your > code. -- Göktuğ 
Kayaalp From rosuav at gmail.com Mon May 27 16:05:10 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 28 May 2013 00:05:10 +1000 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: <51A3651D.40403@nedbatchelder.com> References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <51A3651D.40403@nedbatchelder.com> Message-ID: On Mon, May 27, 2013 at 11:52 PM, Ned Batchelder wrote: > > On 5/27/2013 9:30 AM, Chris Angelico wrote: >> What can tail recursion do that can't be done by reassigning to the >> function parameters and 'goto' back to the top? > > That style can't handle mutually recursive procedures, or the extreme case: > a state machine implemented with N functions, each of which calls the next > state function at the end. Tail-call elimination isn't simply about > noticing recursive calls. It's about noticing that a function ends with a > function call, and not burning another stack frame in the process. Ahh, gotcha. Of course. Mutual recursion would be a bit more of a problem to the compressor, too, though; it'd have to recognize a pattern that spans multiple stack frames. ChrisA From goktug.kayaalp at gmail.com Mon May 27 16:30:15 2013 From: goktug.kayaalp at gmail.com (=?utf-8?B?R8O2a3R1xJ8=?= Kayaalp) Date: Mon, 27 May 2013 17:30:15 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: (Chris Angelico's message of "Mon, 27 May 2013 23:18:06 +1000") References: <87obbw4scx.fsf@gmail.com> Message-ID: <878v304hs8.fsf@gmail.com> > ... but ISTM you're talking about something very similar > to the C++11 standard's new feature of user-defined literals ... I did not know this until now, but it looks like a fine idea. I wonder how people would react to the idea of having this in Python. I can also add that this is far better than what I propose. > In fact, I'd recommend you join python-list regardless, if only > because we have awesome fun there :) You sound like you'd be the > perfect sort to join in. 
I've just started to get into the community, and even though I haven't posted anything to python-list, I'm trying to read every message. The Python community is really awesome! Thanks for your input! -- Göktuğ Kayaalp From haoyi.sg at gmail.com Mon May 27 17:51:40 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 27 May 2013 11:51:40 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <878v304hs8.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> Message-ID: With macros, you can do string interpolation kinda like that:

>>> a, b = 1, 2
>>> s%"%{a} apple and %{b} bananas"
'1 apple and 2 bananas'

But enough with the self promotion =D. 
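The explicit spelling suggested above is easy to wrap in a helper; a minimal sketch, where the name `I` is hypothetical (echoing the proposed I"..." prefix) and the caller chooses which namespace `format()` sees:

```python
def I(template, namespace):
    # Hypothetical helper: interpolation stays an ordinary, explicit
    # function call, and the namespace handed to format() is visible
    # at the call site instead of being captured implicitly.
    return template.format(**namespace)

OUTDIR = "build"
progn = "pokegame"

print(I("{OUTDIR}/{progn}", globals()))             # build/pokegame
print(I("rm -fr {OUTDIR} *.o .pokedb", globals()))  # rm -fr build *.o .pokedb
```

Passing `globals()` (or `locals()`, or an explicit dict) keeps the "just casually handing all your names to the format function" step in plain sight, which is the point of the explicit spelling.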
> What do you think > > s"ham" > > will mean to the reader? I think that it is better to encourage people to write meaningful names: > > make_sandwich("ham") I disagree that custom string prefixes make code harder to read. Saying that s"ham" costs minutes of confusion over s("ham") is based purely on unfamiliarity, which can be removed over time. Why u"text" rather than make_unicode_string("text") if the latter is more meaningful? This may be entirely mistaken, but - If the parsing of string literals could be shifted from the lexer/parser into functions being called at import time to preprocess the interned literals - If hooking into this preprocessing could be made easy (via a simple desugaring, import-hook style, or otherwise). Naively making b"12" call the function "b" is probably a bad idea, since people use variables called "b" all the time. - If existing prefixes would continue with identical semantics and minimal performance changes (e.g. from a change from lex-time handling to user-land preprocessing) Then this could be pretty nice. > I think string prefixes aren't something that we want more of, this is already complicated enough: > "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" > "r" | "u" | "R" | "U" > > We should be more creative on how to get rid of them. If-if-if all that works out, you would be able to completely remove the ("b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" | "r" | "u" | "R" | "U") from the grammar specification! Not add more to it, remove it! Shifting the specification of all the different string prefixes into a user-land library. I'd say that's a pretty creative way of getting rid of that nasty blob of grammar =D. Now, that's a lot of "if"s, and I have no idea if any of them at all are true, but if-if-if they are all true, we could both simplify the lexer/parser, open up the prefixes for extensibility while maintaining the exact semantics for existing code. 
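The desugaring step in the if-if-ifs above can be prototyped in user land; a deliberately naive sketch (no escape sequences, no triple quotes, all names hypothetical) of the rewrite an import hook might perform before compiling a module:

```python
import re
from decimal import Decimal as D

# Rewrite NAME"..." into NAME("..."), the desugaring an import hook
# could apply to source text before compilation. The regex is
# deliberately naive -- no escapes, no triple quotes -- illustration only.
PREFIXED = re.compile(r'\b([A-Za-z_]\w*)"([^"\\]*)"')

def desugar(source):
    return PREFIXED.sub(r'\1("\2")', source)

src = 'result = D"1.6e-9" * D"1.0e9"'
print(desugar(src))   # result = D("1.6e-9") * D("1.0e9")

ns = {"D": D}
exec(desugar(src), ns)
print(ns["result"])   # 1.60
```

A real version would have to be much more careful (existing prefixes, escapes, names that merely abut a string literal), which is exactly where the "lot of ifs" live.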
-Haoyi -------------- next part -------------- An HTML attachment was scrubbed... URL: From goktug.kayaalp at gmail.com Mon May 27 18:26:37 2013 From: goktug.kayaalp at gmail.com (=?utf-8?B?R8O2a3R1xJ8=?= Kayaalp) Date: Mon, 27 May 2013 19:26:37 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: (Haoyi Li's message of "Mon, 27 May 2013 11:51:40 -0400") References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> Message-ID: <871u8s4cea.fsf@gmail.com> > I disagree that custom string prefixes make code harder to read. Saying that > s"ham" costs minutes of confusion over s("ham") is based purely on > unfamiliarity, which can be removed over time. Why u"text" rather than > make_unicode_string("text") if the latter is more meaningful? I concur. And, while we can't know what s("ham") does to the string "ham" (print it, or get the sum of ord(x) for x in "ham", or add it to a database) without context, we can understand that some parsing of the string is involved when we see s"ham". > * If hooking into this preprocessing could be made easy (via a simple > desugaring, import-hook style, or otherwise). Naively making b"12" call the > function "b" is probably a bad idea, since people use variables called "b" > all the time. I thought of something like >>> string.register_prefix ("x", callable) where callable expects at least one string argument and returns a string. I don't know how apt this would be practically. 
> If-if-if all that works out, you would be able to completely remove the > ("b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" | > "r" | "u" | "R" | "U") from the grammar specification! Not add more to it, > remove it! Shifting the specification of all the different string prefixes > into a user-land library. I'd say that's a pretty creative way of getting > rid of that nasty blob of grammar =D. Frankly, I never thought of this (simplifying the grammar) :) Greetings, Göktuğ 
Kayaalp From ned at nedbatchelder.com Mon May 27 18:59:40 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Mon, 27 May 2013 12:59:40 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> Message-ID: <51A390FC.3070306@nedbatchelder.com> On 5/27/2013 11:51 AM, Haoyi Li wrote: > If-if-if all that works out, you would be able to completely remove > the ("b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | > "RB" | "r" | "u" | "R" | "U") from the grammar specification! Not add > more to it, remove it! Shifting the specification of all the different > string prefixes into a user-land library. I'd say that's a pretty > creative way of getting rid of that nasty blob of grammar =D. > In order to achieve this ideal, and assuming you'd be keeping backward compatibility (!), you'd have to explain how to support both of these strings: "Hello\n" r"Hello\n" Implicit in your idea is that the plain literal creates a string of some kind, but the r-prefixed string would apply some user-land function to the string. But there is no function you can apply to string literals to make them be raw. The r prefix suppresses interpretation that happens in un-prefixed strings. By the time a user-land function got hold of the string, the interpretation has already been done, information has already been lost. --Ned. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Mon May 27 19:14:46 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 27 May 2013 10:14:46 -0700 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A390FC.3070306@nedbatchelder.com> References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <51A390FC.3070306@nedbatchelder.com> Message-ID: On Mon, May 27, 2013 at 9:59 AM, Ned Batchelder wrote: > Implicit in your idea is that the plain literal creates a string of some > kind, but the r-prefixed string would apply some user-land function to > the string. But there is no function you can apply to string literals to > make them be raw. The r prefix suppresses interpretation that happens in > un-prefixed strings. By the time a user-land function got hold of the > string, the interpretation has already been done, information has already > been lost. I'm not sure what to think of the whole proposal (except that it sounds Perl-ish :-), but this particular issue is easily dealt with: let the representation sent into the function always be the raw form. Then the 'r' prefix could be a no-op, while the default prefix would interpret escapes. -- --Guido van Rossum (python.org/~guido) From goktug.kayaalp at gmail.com Mon May 27 19:18:15 2013 From: goktug.kayaalp at gmail.com (=?utf-8?B?R8O2a3R1xJ8=?= Kayaalp) Date: Mon, 27 May 2013 20:18:15 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A390FC.3070306@nedbatchelder.com> (Ned Batchelder's message of "Mon, 27 May 2013 12:59:40 -0400") References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <51A390FC.3070306@nedbatchelder.com> Message-ID: <87wqqk2vfs.fsf@gmail.com> > In order to achieve this ideal, and assuming you'd be keeping backward > compatibility (!), you'd have to explain how to support both of these strings: > > "Hello\n" > r"Hello\n" A possible solution is: - At parse time, any string literal is a /raw string/, regardless of what prefix it has or if it even has a prefix. 
- The /raw string/ is then passed to user-land in this raw state, and then, if no prefix is applied, it is parsed as a standard string, otherwise the requested prefix is applied. - In case of a user-land raw string (e.g. r"yo"), the prefix function can be the identity function (e.g. f(x) = x). This is possibly not the most ideal solution, but it is a solution. Greetings, Göktuğ. Ned Batchelder writes: > On 5/27/2013 11:51 AM, Haoyi Li wrote: > > If-if-if all that works out, you would be able to completely remove the > ("b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" | "r" | > "u" | "R" | "U") from the grammar specification! Not add more to it, > remove it! Shifting the specification of all the different string prefixes > into a user-land library. I'd say that's a pretty creative way of getting > rid of that nasty blob of grammar =D. > > > > In order to achieve this ideal, and assuming you'd be keeping backward > compatibility (!), you'd have to explain how to support both of these strings: > > "Hello\n" > r"Hello\n" > > Implicit in your idea is that the plain literal creates a string of some kind, > but the r-prefixed string would apply some user-land function to the > string. But there is no function you can apply to string literals to make > them be raw. The r prefix suppresses interpretation that happens in > un-prefixed strings. By the time a user-land function got hold of the string, > the interpretation has already been done, information has already been lost. > > --Ned. -- Göktuğ
Kayaalp From solipsis at pitrou.net Mon May 27 20:11:09 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 27 May 2013 20:11:09 +0200 Subject: [Python-ideas] Make gettext support direct PO files input References: <76F0346A-251D-4C93-BBC8-F74625CD5666@masklinn.net> Message-ID: <20130527201109.13d5cc78@fsol> On Mon, 27 May 2013 15:53:34 +0200 Masklinn wrote: > * The standard Python distribution provides an msgfmt tool able to > compile PO files to MO files. > * The gettext module only looks for MO files. > > This leads to 2 questions: > * Why isn't the msgfmt behavior available programmatically from the stdlib > even though all the pieces seem present? (as far as I can see, > everything is implemented in msgfmt directly) Because noone did the job yet, I guess. Any stdlib addition to make gettext more usable would probably be welcome :-) > * Why can't gettext use PO files directly, but requires separately > generating MO files? I don't know, perhaps it's supposed to make loading times smaller. gettext is an old tool, some of its design decisions may not make sense nowadays (perhaps they didn't make sense at the time, either). Regards Antoine. From jeanpierreda at gmail.com Mon May 27 20:35:17 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Mon, 27 May 2013 14:35:17 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87wqqk2vfs.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <51A390FC.3070306@nedbatchelder.com> <87wqqk2vfs.fsf@gmail.com> Message-ID: On Mon, May 27, 2013 at 1:18 PM, G?ktu? Kayaalp wrote: > A possible solution is: > > - In parse time, any string literal is a /raw string/, regardless of > what prefix it has or if it even has a prefix. > > - The /raw string/ is then passed to user-land in this raw state, > and then, if no prefix is applied, it is parsed as a standard > string, otherwise the requested prefix is applied. > > - In case of a user-land raw string (e.g. 
r"yo"), the prefix > function can be the identity function (e.g. f(x) = x). > > This is possibly not the most ideal solution, but it is a solution. This is beginning to sound like E-style quasiliterals. http://www.erights.org/elang/grammar/quasi-overview.html -- Devin From masklinn at masklinn.net Mon May 27 21:01:08 2013 From: masklinn at masklinn.net (Masklinn) Date: Mon, 27 May 2013 21:01:08 +0200 Subject: [Python-ideas] Make gettext support direct PO files input In-Reply-To: <20130527201109.13d5cc78@fsol> References: <76F0346A-251D-4C93-BBC8-F74625CD5666@masklinn.net> <20130527201109.13d5cc78@fsol> Message-ID: On 2013-05-27, at 20:11 , Antoine Pitrou wrote: > On Mon, 27 May 2013 15:53:34 +0200 > Masklinn wrote: >> * The standard Python distribution provides an msgfmt tool able to >> compile PO files to MO files. >> * The gettext module only looks for MO files. >> >> This leads to 2 questions: >> * Why isn't the msgfmt behavior available programmatically from the stdlib >> even though all the pieces seem present? (as far as I can see, >> everything is implemented in msgfmt directly) > > Because noone did the job yet, I guess. Any stdlib addition to make > gettext more usable would probably be welcome :-) > >> * Why can't gettext use PO files directly, but requires separately >> generating MO files? > > I don't know, perhaps it's supposed to make loading times smaller. > gettext is an old tool, some of its design decisions may not make sense > nowadays (perhaps they didn't make sense at the time, either). So assuming somebody'd be interested in trying to fix these, what would be the best route, opening an issue on the tracker and providing some sort of proof of concept? 
From solipsis at pitrou.net Mon May 27 21:31:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 27 May 2013 21:31:52 +0200 Subject: [Python-ideas] Make gettext support direct PO files input References: <76F0346A-251D-4C93-BBC8-F74625CD5666@masklinn.net> <20130527201109.13d5cc78@fsol> Message-ID: <20130527213152.264d91fa@fsol> On Mon, 27 May 2013 21:01:08 +0200 Masklinn wrote: > On 2013-05-27, at 20:11 , Antoine Pitrou wrote: > > On Mon, 27 May 2013 15:53:34 +0200 > > Masklinn wrote: > >> * The standard Python distribution provides an msgfmt tool able to > >> compile PO files to MO files. > >> * The gettext module only looks for MO files. > >> > >> This leads to 2 questions: > >> * Why isn't the msgfmt behavior available programmatically from the stdlib > >> even though all the pieces seem present? (as far as I can see, > >> everything is implemented in msgfmt directly) > > > > Because noone did the job yet, I guess. Any stdlib addition to make > > gettext more usable would probably be welcome :-) > > > >> * Why can't gettext use PO files directly, but requires separately > >> generating MO files? > > > > I don't know, perhaps it's supposed to make loading times smaller. > > gettext is an old tool, some of its design decisions may not make sense > > nowadays (perhaps they didn't make sense at the time, either). > > So assuming somebody'd be interested in trying to fix these, what would > be the best route, opening an issue on the tracker and providing some > sort of proof of concept? I think that would be a good start. Note that I'm not a gettext expert, and I could be totally talking nonsense here. Regards Antoine. 
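[For concreteness, the kind of proof of concept discussed above could start very small: a toy PO reader that handles only single-line msgid/msgstr pairs, with no plural forms, multi-line strings, or headers. The parse_po function below is hypothetical; it is not part of gettext or the stdlib, just a sketch of the shape such a patch might take.]

```python
import ast


def parse_po(text):
    """Toy PO reader: parse single-line msgid/msgstr pairs into a
    {msgid: msgstr} catalog, similar to what gettext builds from a MO
    file. PO string literals use C-style quoting, which for simple
    cases matches Python's, so ast.literal_eval can decode them.
    """
    catalog = {}
    msgid = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("msgid "):
            # Remember the source string until its translation arrives.
            msgid = ast.literal_eval(line[len("msgid "):])
        elif line.startswith("msgstr ") and msgid is not None:
            catalog[msgid] = ast.literal_eval(line[len("msgstr "):])
            msgid = None
    return catalog


po = 'msgid "Hello"\nmsgstr "Bonjour"\n\nmsgid "Bye"\nmsgstr "Au revoir"\n'
print(parse_po(po))  # {'Hello': 'Bonjour', 'Bye': 'Au revoir'}
```

A real patch would of course need the full PO grammar (continuation lines, plural forms, escapes beyond Python's), which is why compiling to MO up front remains the simpler route today.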
From haoyi.sg at gmail.com Mon May 27 21:41:29 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Mon, 27 May 2013 15:41:29 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <51A390FC.3070306@nedbatchelder.com> <87wqqk2vfs.fsf@gmail.com> Message-ID: The scala people intend to use their version of this for things like xml literals: snippet = xml"hello world!" Which are parsed and interned at compile time. Generally, it would be useful for embedding snippets of non-python code in strings within a python program, and have them resolved at import-time and interned to prevent run-time errors or performance overheads. I'm thinking of regexes and libraries like Parsimonious, which take literal strings to do stuff with, would benefit from this. No more "should I or should I not re.compile() my regexes", no more worrying about whether the ad-hoc global compiled-regex cache is going to start evicting my regexes. For the vast majority of cases, regexes are static literals and can be compiled at import time without any fuss or global side effects (e.g. evicting others from the cache). On Mon, May 27, 2013 at 2:35 PM, Devin Jeanpierre wrote: > On Mon, May 27, 2013 at 1:18 PM, G?ktu? Kayaalp > wrote: > > A possible solution is: > > > > - In parse time, any string literal is a /raw string/, regardless of > > what prefix it has or if it even has a prefix. > > > > - The /raw string/ is then passed to user-land in this raw state, > > and then, if no prefix is applied, it is parsed as a standard > > string, otherwise the requested prefix is applied. > > > > - In case of a user-land raw string (e.g. r"yo"), the prefix > > function can be the identity function (e.g. f(x) = x). > > > > This is possibly not the most ideal solution, but it is a solution. > > This is beginning to sound like E-style quasiliterals. 
> > http://www.erights.org/elang/grammar/quasi-overview.html > > -- Devin > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From steve at pearwood.info Tue May 28 03:10:32 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 28 May 2013 11:10:32 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <51A390FC.3070306@nedbatchelder.com> <87wqqk2vfs.fsf@gmail.com> Message-ID: <51A40408.8070709@pearwood.info> On 28/05/13 05:41, Haoyi Li wrote: > The scala people intend to use their version of this for things like xml > literals: > > snippet = xml"hello world!" > > Which are parsed and interned at compile time. Generally, it would be > useful for embedding snippets of non-python code in strings within a python > program, and have them resolved at import-time and interned to prevent > run-time errors or performance overheads. "Prevent run-time errors" is not generally something that Python cares about, or could do even if it tried. In any case, that's not going to be practical for Python, since the xml function won't exist at compile-time. It won't exist until import-time, which is at run-time. But I wonder, why would you want your XML to be interned? I don't think I'd want large amounts of verbose XML in memory long after I've finished with it. -- Steven From steve at pearwood.info Tue May 28 04:09:42 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 28 May 2013 12:09:42 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <871u8s4cea.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <871u8s4cea.fsf@gmail.com> Message-ID: <51A411E6.4040002@pearwood.info> On 28/05/13 02:26, Göktuğ
Kayaalp wrote: >> I disagree that custom string prefixes make code harder to read. Saying s"ham" >> costs minutes of confusion over s("ham") is based purely on unfamiliarity, >> which can be removed over time. Why u"text" rather than make_unicode_string >> ("text") if the latter is more meaningful? > > I concur. And, while we can't know what s("ham") does to the string > "ham" (print it or get the sum of ord(x) for x in "ham" or add to a > database) without context, we can understand that some parsing of the > string is involved when we see s"ham". Not at all. Since your proposal tells us that the user can define s"" as an arbitrary function, it can do anything. We already have function call syntax. I don't believe we need two ways to call a function, especially when the second way is so limited (single letter name, one argument, must be a string). >> * If hooking into this preprocessing could be made easy (via a simple >> desugaring, import-hook style, or otherwise). Naively making b"12" call the >> function "b" is probably a bad idea, since people using variables called "b" >> all the time. > > I thought of something like > > >>> string.register_prefix ("x", callable) > > where callable expects at least one string argument and returns a > string. I don't know how apt would this be practically. This fails your earlier example: D"1.25" -> Decimal("1.25") >> If-if-if all that works out, you would be able to completely remove the ("b" | >> "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" | "r" | "u" | "R" >> | "U") from the grammar specification! Not add more to it, remove it! Shifting >> the specification of all the different string prefixes into a user-land >> library. I'd say that's a pretty creative way of getting rid of that nasty >> blob of grammar =D. > > Frankly, I never thought of this (simplifying the grammar) :) And how does that help the reader of the code? 
Whether all those string prefixes are defined as part of the language grammar, or in a standard library, makes absolutely no difference. They still need to be inserted into builtins so that code that uses them will continue to work. But now, instead of just a dozen fixed prefixes, you have an unlimited number of potential looks-like-strings-but-actually-are-anything prefixes. -- Steven From steve at pearwood.info Tue May 28 04:11:14 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 28 May 2013 12:11:14 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> Message-ID: <51A41242.3020107@pearwood.info> On 28/05/13 01:51, Haoyi Li wrote: > I disagree that custom string prefixes make code harder to read. Saying > s"ham" costs minutes of confusion over s("ham") is based purely on > unfamiliarity, which can be removed over time. Why u"text" rather than > make_unicode_string("text") if the latter is more meaningful? For backward compatibility, of course. u"text" only because Python 1.x didn't have Unicode strings. Of course people have to learn language features, and when reading an unfamiliar language you may be unfamiliar with the semantics of a specific function call. But it's much easier to guess that: Decimal("1.25") calls a function called "Decimal" with the string "1.25" as the single argument, since this is notation shared by nearly all mainstream programming languages. Then you have a good place to start: search for Decimal, and see where it is defined. On the other hand, one needs to be intimately familiar with the code-base to know that: D"1.25" means exactly the same thing. It's radically different from most programming languages, and it doesn't look like a function call even though it is a function call. Why have two ways to call the function Decimal just to save a pair of parentheses? Python includes Unicode literals so we can write strings in the natural way. 
In Python 3, where the u prefix is no longer required, I can say: equation = "2??" and it works perfectly naturally, without needing to mess around with bytes and encodings. This is important for strings, since they are a fundamental data type, but not every data type needs to have compiler support, and not every function is so important that saving two parentheses is worth the reduction in readability. -- Steven From stephen at xemacs.org Tue May 28 04:22:12 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 28 May 2013 11:22:12 +0900 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A40408.8070709@pearwood.info> References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <51A390FC.3070306@nedbatchelder.com> <87wqqk2vfs.fsf@gmail.com> <51A40408.8070709@pearwood.info> Message-ID: <87sj17sv1n.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > But I wonder, why would you want your XML to be interned? Server templates? From ncoghlan at gmail.com Tue May 28 04:31:19 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 May 2013 12:31:19 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87obbw4scx.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> Message-ID: On Mon, May 27, 2013 at 8:41 PM, Göktuğ Kayaalp wrote: > Hi, > > I wanted to share this idea of (not possibly only) mine with Python > Core developers. Hi Göktuğ, You may want to review the recent threads on MacroPy (such as [1]), as well as various other proposals for compile time string manipulation and AST based metaprogramming in the list archives. Cheers, Nick.
[1] http://mail.python.org/pipermail/python-ideas/2013-May/020499.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue May 28 05:06:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 May 2013 13:06:38 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> Message-ID: On Tue, May 28, 2013 at 1:51 AM, Haoyi Li wrote: > If-if-if all that works out, you would be able to completely remove the ("b" > | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" | "r" | "u" | > "R" | "U") from the grammar specification! Not add more to it, remove it! > Shifting the specification of all the different string prefixes into a > user-land library. I'd say that's a pretty creative way of getting rid of > that nasty blob of grammar =D. > > Now, that's a lot of "if"s, and I have no idea if any of them at all are > true, but if-if-if they are all true, we could both simplify the > lexer/parser, open up the prefixes for extensibility while maintaining the > exact semantics for existing code. 
Oops, should have read more of the thread before replying :) But, yeah, it would be nice if we could get to a mechanism that replaces the current horror show that is string prefix handling (see http://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals) with the functional equivalent of the following Python code: def str_raw(source_bytes, source_encoding): return source_bytes.decode(source_encoding) def str_with_escapes(source_bytes, source_encoding): # Handle escapes to create "s" return s def bytes_raw(source_bytes, source_encoding): return source_bytes def bytes_with_escapes(source_bytes, source_encoding): # Handle escapes to create "b" return b cache_token = "Marker for pyc validity checking" for prefix in (None, "u", "U"): ast.register_str_prefix(prefix, str_with_escapes, cache_token) for prefix in ("r", "R"): ast.register_str_prefix(prefix, str_raw, cache_token) for prefix in (None, "b", "B"): ast.register_str_prefix(prefix, bytes_with_escapes, cache_token) for prefix in ("br", "Br", "bR", "BR", "rb", "rB", "Rb", "RB"): ast.register_str_prefix(prefix, bytes_raw, cache_token) The module caching code would likely need to grow another header dict that stores a mapping of prefix implementation names to their cache tokens. If the cache file references an unregistered prefix then the import would fail, while if it references one with a mismatched cache token, then the cached file would need to be regenerated. We could either just live with the fact that running the same file with different registrations may regenerate the file in __pycache__, or else come up with a nonconflicting naming scheme (I suspect the latter would be too messy and too rarely needed to be worth the hassle). (Obviously, the four core handlers wouldn't work quite this way - they would always be present, and their cache invalidation would be handled with the existing global bytecode cookie. 
However, it's a useful demonstration of the value of the generalisation, and the issues any such generalisation will need to handle). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ron3200 at gmail.com Tue May 28 08:32:17 2013 From: ron3200 at gmail.com (Ron Adam) Date: Tue, 28 May 2013 01:32:17 -0500 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> Message-ID: <51A44F71.60709@gmail.com> On 05/27/2013 10:06 PM, Nick Coghlan wrote: > On Tue, May 28, 2013 at 1:51 AM, Haoyi Li wrote: >> >If-if-if all that works out, you would be able to completely remove the ("b" >> >| "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" | "r" | "u" | >> >"R" | "U") from the grammar specification! Not add more to it, remove it! >> >Shifting the specification of all the different string prefixes into a >> >user-land library. I'd say that's a pretty creative way of getting rid of >> >that nasty blob of grammar =D. >> > >> >Now, that's a lot of "if"s, and I have no idea if any of them at all are >> >true, but if-if-if they are all true, we could both simplify the >> >lexer/parser, open up the prefixes for extensibility while maintaining the >> >exact semantics for existing code. > Oops, should have read more of the thread before replying:) > > But, yeah, it would be nice if we could get to a mechanism that > replaces the current horror show that is string prefix handling (see > http://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals) > with the functional equivalent > of the following Python code: Yes, the grammar reference is a bit hard to grasp. But there is really only four cases there. It just seems like more when you consider the different variation of upper and lower case. 
> def str_raw(source_bytes, source_encoding): > return source_bytes.decode(source_encoding) > > def str_with_escapes(source_bytes, source_encoding): > # Handle escapes to create "s" > return s > > def bytes_raw(source_bytes, source_encoding): > return source_bytes > > def bytes_with_escapes(source_bytes, source_encoding): > # Handle escapes to create "b" > return b > > cache_token = "Marker for pyc validity checking" > > for prefix in (None, "u", "U"): > ast.register_str_prefix(prefix, str_with_escapes, cache_token) > > for prefix in ("r", "R"): > ast.register_str_prefix(prefix, str_raw, cache_token) > > for prefix in (None, "b", "B"): > ast.register_str_prefix(prefix, bytes_with_escapes, cache_token) > > for prefix in ("br", "Br", "bR", "BR", "rb", "rB", "Rb", "RB"): > ast.register_str_prefix(prefix, bytes_raw, cache_token) While playing around in tokenizer.c, it took me a bit to figure out how this worked... if (is_potential_identifier_start(c)) { /* Process b"", r"", u"", br"" and rb"" */ int saw_b = 0, saw_r = 0, saw_u = 0; while (1) { if (!(saw_b || saw_u) && (c == 'b' || c == 'B')) saw_b = 1; /* Since this is a backwards compatibility support literal we don't want to support it in arbitrary order like byte literals. */ else if (!(saw_b || saw_u || saw_r) && (c == 'u' || c == 'U')) saw_u = 1; /* ur"" and ru"" are not supported */ else if (!(saw_r || saw_u) && (c == 'r' || c == 'R')) saw_r = 1; else break; c = tok_nextc(tok); if (c == '"' || c == '\'') goto letter_quote; } It continues with the identifier section if it doesn't jump to the string section. I came up with a working alternative that I think is much easier to understand... /* Check for standard string */ if (c == '"' || c == '\'') goto letter_quote; /* Check for string literals b"", r"" or u"". 
*/ c2 = c; c = tok_nextc(tok); if ((c == '"' || c == '\'') && ((c2 == 'b') || (c2 == 'B') || (c2 == 'r') || (c2 == 'R') || (c2 == 'u') || (c2 == 'U'))) goto letter_quote; /* Check for string literals rb"" and br"". */ c3 = c; c = tok_nextc(tok); if ((c == '"' || c == '\'') && (c2 != c3) && ((c2 == 'r') || (c2 == 'R') || (c2 == 'b') || (c2 == 'B')) && ((c3 == 'b') || (c3 == 'B') || (c3 == 'r') || (c3 == 'R')) ) goto letter_quote; tok_backup(tok, c); tok_backup(tok, c3); c = c2; goto not_a_string; letter_quote: /* String */ { ... reads string to find its end. The jump to "not_a_string" just skips over the rest of the string section. Because it's not a loop, it takes a few more lines. This puts all the string code together in one place, and the identifier parts don't have any string testing lines in it. Maybe it's not quite as efficient, but I think it's much easier to understand. (And yes, I could've used if-else's and avoided the goto's, but I like the fall through pattern in this case without the deep indention.) I'm not sure how practical removing or moving string prefixes would be. Having only a few literals, is probably the best practical compromise between the two ideals. Moving them to the run time parser would make some things slower. Being able to add or register more prefix's would probably hurt Pythons readability when you want to review someone else's programs. I think it would only improve readability for programs we write ourselves, because then we know much more easily what we defined those prefixes to mean. That wouldn't be the case when we read someone else's code. There are some options for cleaning up parts of the interpreter. We could move all (or as much as is doable) of the compile time stuff to later in the chain, which probably means moving it to a place that it all can be done from ast.c. That would make the tokenizer simpler and cleaner. 
Alternatively, we could go the other way and move as much is doable to a preprocessor step, which would happen just before a program is tokenized. But I'm not sure either one of these options has much real benefit. I'm more interested in the mini core language that was suggested a while back to help solve some of the boot strapping issues. Is anyone working on that? Cheers, Ron From abarnert at yahoo.com Tue May 28 09:30:51 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 28 May 2013 00:30:51 -0700 Subject: [Python-ideas] Idea: Compressing the stack on the fly In-Reply-To: References: <3daafc97-115a-4525-907c-fdbf356fd749@googlegroups.com> <51A3651D.40403@nedbatchelder.com> Message-ID: <951E241A-3294-4361-B66E-2D8B480BFE48@yahoo.com> On May 27, 2013, at 7:05, Chris Angelico wrote: > On Mon, May 27, 2013 at 11:52 PM, Ned Batchelder wrote: >> >> On 5/27/2013 9:30 AM, Chris Angelico wrote: >>> What can tail recursion do that can't be done by reassigning to the >>> function parameters and 'goto' back to the top? >> >> That style can't handle mutually recursive procedures, or the extreme case: >> a state machine implemented with N functions, each of which calls the next >> state function at the end. Tail-call elimination isn't simply about >> noticing recursive calls. It's about noticing that a function ends with a >> function call, and not burning another stack frame in the process. > > Ahh, gotcha. Of course. Mutual recursion would be a bit more of a > problem to the compressor, too, though; it'd have to recognize a > pattern that spans multiple stack frames. That's exactly why I suggested that it would be more interesting in a PyPy-style tracing JIT than in a static compiler. At runtime, you can detect that your traced fast path has a FUNC opcode that calls a function whose body is already part of the trace, and turn that into code that stashes just enough information to generate stack frames if needed, then do a goto. 
In simple cases, that stashing could amount to no more than incrementing a counter; in complex cases, I think it might be closely related to stashing enough info to create generator states if you fall off the fast path. That should be much simpler than trying to detect patterns that will lead to tail calls. And it means you only make the effort when it makes a difference. But, most importantly, it means you can benefit even in non-tail-call cases, like the naive recursion in the original post. Even when you can't eliminate recursion you will still "compress" the state. And it might also let you optimize chains of generator yields or coroutine sends with not much extra effort (which might even be a basis for optimizing out explicit coroutine trampolines, which would be very cool). This is pretty different from traditional TCO, but given that TCO isn't relevant to the OP's proposal or to his example, I don't think that's a problem. From jamylak at gmail.com Wed May 29 07:27:27 2013 From: jamylak at gmail.com (James K) Date: Wed, 29 May 2013 15:27:27 +1000 Subject: [Python-ideas] collections.Counter multiplication Message-ID: Can we add a multiplication feature to collections.Counter, I don't see why not. From steve at pearwood.info Wed May 29 13:34:29 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 29 May 2013 21:34:29 +1000 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: Message-ID: <51A5E7C5.5080407@pearwood.info> On 29/05/13 15:27, James K wrote: > Can we add a multiplication feature to collections.Counter, I don't see why > not. What would this feature do, and under what circumstances would you use it?
-- Steven From storchaka at gmail.com Wed May 29 15:25:56 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 29 May 2013 16:25:56 +0300 Subject: [Python-ideas] Gzip and zip extra field Message-ID: Gzip files can contain an extra field [1] and some applications use this for extending the gzip format. The current GzipFile implementation ignores this field on input and doesn't allow creating a new file with an extra field. ZIP file entries can also contain an extra field [2]. Currently it is just saved as bytes in the `extra` attribute of ZipInfo. I propose to save an extra field for gzip files and provide structural access to subfields. f = gzip.GzipFile('somefile.gz', 'rb') f.extra_bytes # A raw extra field as bytes # iterating over all subfields for xid, data in f.extra_map.items(): ... # get Apollo file type information f.extra_map[b'AP'] # (or f.extra_map['AP']?) # creating gzip file with extra field f = gzip.GzipFile('somefile.gz', 'wb', extra=extrabytes) f = gzip.GzipFile('somefile.gz', 'wb', extra=[(b'AP', apollodata)]) f = gzip.GzipFile('somefile.gz', 'wb', extra={b'AP': apollodata}) # change Apollo file type information f.extra_map[b'AP'] = ... Issue #17681 [3] has preliminary patches. There is some open doubt about the interface. Isn't it over-engineered? Currently GzipFile supports seamlessly reading a sequence of separately compressed gzip files. Every such chunk can have its own extra field (this is used in dictzip, for example). It would be desirable to be able to read only until the end of the current chunk, in order not to miss an extra field.
[1] http://www.gzip.org/format.txt [2] http://www.pkware.com/documents/casestudies/APPNOTE.TXT [3] http://bugs.python.org/issue17681 From ned at nedbatchelder.com Wed May 29 16:29:29 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 29 May 2013 10:29:29 -0400 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: Message-ID: <51A610C9.4040405@nedbatchelder.com> On 5/29/2013 1:27 AM, James K wrote: > Can we add a multiplication feature to collections.Counter, I don't > see why not. James, welcome to the list. To get an idea accepted, you have to do a few things: 1) Explain the idea fully. I don't understand what "a multiplication feature" would do. 2) Explain why the idea is useful to enough people that it should be added to the standard library. These two criteria are not easy to meet. Sometimes an idea seems popular, but it turns out that different people want it to behave differently, or differently at different times (see the discussion about an itertools.chunked feature). Sometimes an idea is straightforward enough to describe, but is useful to too few people to justify adding it to the standard library. Discussing these things doesn't often result in a change to Python, but does often lead to useful discussion. --Ned. From jeanpierreda at gmail.com Wed May 29 21:23:56 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 29 May 2013 15:23:56 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87obbw4scx.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> Message-ID: On Mon, May 27, 2013 at 6:41 AM, Göktuğ Kayaalp wrote: > I think it would add some significant flexibility to Python to let users > define custom string prefixes.
What I mean by a string prefix is, > a letter prefixing the string literal, modifying the behavior of it, --snip-- Rather than Decimal, IMO a more compelling use case is SQL queries. At the moment, string literals make unsafe string formatting an attractive nuisance:

cur.execute("..." % (...))

versus

cur.execute("...", (...))

Something that custom string prefixes do, that cannot be done in Python, is make this confusion impossible. You could make the only way to create passable SQL expressions via the string sql:"...", which produces an SQL object. At no point in time does the programmer deal with strings that can be manipulated in unsafe ways to result in SQL injection vulnerabilities. Of course, then there is the issue of "what if you want to produce an SQL expression from a string"? Then you can make that difficult, rather than attractive, perhaps requiring the following code:

with sql.unsafe.disable_all_security_protections:
    expr = sql.unsafe.compile_string(my_string)
cur.execute(expr, (...))

As it stands today, it's very common for people to produce insecure code completely by accident. I see it on a regular basis in #python. There is no way to resolve this without something similar to E's quasiliterals, or these prefixed strings. -- Devin From jamylak at gmail.com Wed May 29 22:17:34 2013 From: jamylak at gmail.com (James K) Date: Thu, 30 May 2013 06:17:34 +1000 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A610C9.4040405@nedbatchelder.com> References: <51A610C9.4040405@nedbatchelder.com> Message-ID: It should work like this

>>> from collections import Counter
>>> Counter({'a': 1, 'b': 2}) * 2  # scalar
Counter({'b': 4, 'a': 2})
>>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2})  # multiplies matching keys
Counter({'b': 4})

This is intuitive behavior and therefore should be added. I am unsure about division as dividing by a non-existing key would be a division by 0, although division by a scalar is straightforward.
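For reference, the second example's elementwise semantics can be prototyped in a few lines without touching `collections` itself; `counter_mul` is a hypothetical name used only for illustration:

```python
from collections import Counter

def counter_mul(a, b):
    """Elementwise product of two Counters: a key missing from either side
    counts as 0, so only keys present in both survive (the c1 * c2
    behavior proposed above)."""
    return Counter({k: a[k] * b[k] for k in a.keys() & b.keys()})

print(counter_mul(Counter({'a': 1, 'b': 2}), Counter({'c': 1, 'b': 2})))
# Counter({'b': 4})
```

The `a.keys() & b.keys()` intersection relies on dict views supporting set operations, so no key ever multiplies against an implicit zero.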
On Thu, May 30, 2013 at 12:29 AM, Ned Batchelder wrote: > On 5/29/2013 1:27 AM, James K wrote: > > Can we add a multiplication feature to collections.Counter, I don't see > why not. > > > James, welcome to the list. To get an idea accepted, you have to do a few > things: > > 1) Explain the idea fully. I don't understand what "a multiplication > feature" would do. > 2) Explain why the idea is useful to enough people that it should be added > to the standard library. > > These two criteria are not easy to meet. Sometimes an idea seems popular, > but it turns out that different people want it to behave differently, or > differently at different times (see the discussion about an > itertools.chunked feature). Sometimes an idea is straightforward enough to > describe, but is useful to too few people to justify adding it to the > standard library. > > Discussing these things doesn't often result in a change to Python, but > does often lead to useful discussion. > > --Ned. > > > > _______________________________________________ > Python-ideas mailing listPython-ideas at python.orghttp://mail.python.org/mailman/listinfo/python-ideas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Wed May 29 22:21:56 2013 From: mertz at gnosis.cx (David Mertz) Date: Wed, 29 May 2013 13:21:56 -0700 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> Message-ID: That "intuitive" behavior certainly wouldn't have been my first--or second guess--on seeing the syntax. On Wed, May 29, 2013 at 1:17 PM, James K wrote: > It should work like this > > >>> from collections import Counter > >>> Counter({'a': 1, 'b': 2}) * 2 # scalar > Counter({'b': 4, 'a': 2}) > >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # multiplies > matching keys > Counter({'b': 4}) > > > This is intuitive behavior and therefore should be added. 
I am unsure > about division as dividing by a non-existing key would be a division by 0, > although division by a scalar is straightforward. > > > On Thu, May 30, 2013 at 12:29 AM, Ned Batchelder wrote: > >> On 5/29/2013 1:27 AM, James K wrote: >> >> Can we add a multiplication feature to collections.Counter, I don't see >> why not. >> >> >> James, welcome to the list. To get an idea accepted, you have to do a >> few things: >> >> 1) Explain the idea fully. I don't understand what "a multiplication >> feature" would do. >> 2) Explain why the idea is useful to enough people that it should be >> added to the standard library. >> >> These two criteria are not easy to meet. Sometimes an idea seems >> popular, but it turns out that different people want it to behave >> differently, or differently at different times (see the discussion about an >> itertools.chunked feature). Sometimes an idea is straightforward enough to >> describe, but is useful to too few people to justify adding it to the >> standard library. >> >> Discussing these things doesn't often result in a change to Python, but >> does often lead to useful discussion. >> >> --Ned. >> >> >> >> _______________________________________________ >> Python-ideas mailing listPython-ideas at python.orghttp://mail.python.org/mailman/listinfo/python-ideas >> >> >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jamylak at gmail.com Wed May 29 22:42:17 2013 From: jamylak at gmail.com (James K) Date: Thu, 30 May 2013 06:42:17 +1000 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> Message-ID: Oops I forgot Counter doesn't support operations with scalars, I'm still proposing the second example On Thu, May 30, 2013 at 6:21 AM, David Mertz wrote: > That "intuitive" behavior certainly wouldn't have been my first--or second > guess--on seeing the syntax. > > > On Wed, May 29, 2013 at 1:17 PM, James K wrote: > >> It should work like this >> >> >>> from collections import Counter >> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >> Counter({'b': 4, 'a': 2}) >> >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # >> multiplies matching keys >> Counter({'b': 4}) >> >> >> This is intuitive behavior and therefore should be added. I am unsure >> about division as dividing by a non-existing key would be a division by 0, >> although division by a scalar is straightforward. >> >> >> On Thu, May 30, 2013 at 12:29 AM, Ned Batchelder wrote: >> >>> On 5/29/2013 1:27 AM, James K wrote: >>> >>> Can we add a multiplication feature to collections.Counter, I don't see >>> why not. >>> >>> >>> James, welcome to the list. To get an idea accepted, you have to do a >>> few things: >>> >>> 1) Explain the idea fully. I don't understand what "a multiplication >>> feature" would do. >>> 2) Explain why the idea is useful to enough people that it should be >>> added to the standard library. >>> >>> These two criteria are not easy to meet. Sometimes an idea seems >>> popular, but it turns out that different people want it to behave >>> differently, or differently at different times (see the discussion about an >>> itertools.chunked feature). Sometimes an idea is straightforward enough to >>> describe, but is useful to too few people to justify adding it to the >>> standard library. 
>>> >>> Discussing these things doesn't often result in a change to Python, but >>> does often lead to useful discussion. >>> >>> --Ned. >>> >>> >>> _______________________________________________ >>> Python-ideas mailing listPython-ideas at python.orghttp://mail.python.org/mailman/listinfo/python-ideas >>> >>> >>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> >> > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Wed May 29 23:08:05 2013 From: mertz at gnosis.cx (David Mertz) Date: Wed, 29 May 2013 14:08:05 -0700 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> Message-ID: <48C729BE-258B-4D9F-AE1E-0B900176CEDA@gnosis.cx> I don't see the use case here. I guess the described behavior is mildly interesting, but it doesn't feel widely useful. For your own need, why not just write:

class MyCounter(Counter):
    def __mul__(self, x):
        new = Counter()
        if isinstance(x, int):
            for k in self.keys():
                new[k] = x * self[k]
            return new
        elif isinstance(x, Counter):
            for k in self.keys():
                if x[k]:
                    new[k] = x[k] * self[k]
            return new
        else:
            raise ValueError("Can only multiply MyCounter by int or Counter")

On May 29, 2013, at 1:42 PM, James K wrote: > Oops I forgot Counter doesn't support operations with scalars, I'm still proposing the second example > > > On Thu, May 30, 2013 at 6:21 AM, David Mertz wrote: > That "intuitive" behavior certainly wouldn't have been my first--or second guess--on seeing the syntax.
> > > On Wed, May 29, 2013 at 1:17 PM, James K wrote: > It should work like this > > >>> from collections import Counter > >>> Counter({'a': 1, 'b': 2}) * 2 # scalar > Counter({'b': 4, 'a': 2}) > >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # multiplies matching keys > Counter({'b': 4}) > > > This is intuitive behavior and therefore should be added. I am unsure about division as dividing by a non-existing key would be a division by 0, although division by a scalar is straightforward. > > > On Thu, May 30, 2013 at 12:29 AM, Ned Batchelder wrote: > On 5/29/2013 1:27 AM, James K wrote: >> Can we add a multiplication feature to collections.Counter, I don't see why not. > > James, welcome to the list. To get an idea accepted, you have to do a few things: > > 1) Explain the idea fully. I don't understand what "a multiplication feature" would do. > 2) Explain why the idea is useful to enough people that it should be added to the standard library. > > These two criteria are not easy to meet. Sometimes an idea seems popular, but it turns out that different people want it to behave differently, or differently at different times (see the discussion about an itertools.chunked feature). Sometimes an idea is straightforward enough to describe, but is useful to too few people to justify adding it to the standard library. > > Discussing these things doesn't often result in a change to Python, but does often lead to useful discussion. > > --Ned. 
> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> >> > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > -- mertz@ | The specter of free information is haunting the `Net! All the gnosis | powers of IP- and crypto-tyranny have entered into an unholy .cx | alliance...ideas have nothing to lose but their chains. Unite | against "intellectual property" and anti-privacy regimes! From haoyi.sg at gmail.com Wed May 29 23:25:46 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 29 May 2013 17:25:46 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> Message-ID: I disagree; I don't think adding an additional language feature just to make it harder to do a single thing (i.e. using % on string literals) is really worth it on its own. I think the main benefit of doing the process-at-import-time-and-intern thing using custom prefixes is that in a lot of cases, the magic strings you find in python programs aren't really magic data, but instead they're code. Things like:

- XML templates
- SQL queries
- Regexes
- Parser Grammars (e.g. Parsimonious)

may be in a funny syntax (which is why it needs to be put into a string), but fundamentally they are code just like the rest of the python programs, so it makes sense that they'd be compiled and interned at import-time like the rest of the python code. Granted, you can always do: my_sql_query = sql(...)
in the top level namespace to basically do the same thing manually, but that's basically manually spaghettifying your program by shifting pieces of code from where they're used (e.g. inside a loop, inside a function, inside an object) to somewhere far away (the global namespace). Not for any abstraction (e.g. wanting to use the query in more than once place), not for neatness, but purely in exchange for the added performance (it can be quite expensive re-parsing a big xml template each time). -Haoyi On Wed, May 29, 2013 at 3:23 PM, Devin Jeanpierre wrote: > On Mon, May 27, 2013 at 6:41 AM, G?ktu? Kayaalp > wrote: > > I think it would add some significant flexibility to Python to let users > > define custom string prefixes. What I mean by a string prefix is, > > a letter prefixing the string literal, modifying the behavior of it, > --snip-- > > Rather than Decimal, IMO a more compelling use case is SQL queries. At > the moment, string literals make unsafe string formatting an > attractive nuisance: > > cur.execute("..." % (...)) > > versus > > cur.execute("...", (...)) > > Something that custom string prefixes do, that cannot be done in > Python, is make this confusion impossible. You could make the only way > to create passable SQL expressions via the string sql:"...", which > produces an SQL object. At no point in time does the programmer deal > with strings that can be manipulated in unsafe ways to result in SQL > injection vulnerabilities. > > Of course, then there is the issue of "what if you want to produce an > SQL expression from a string"? Then you can make that difficult, > rather than attractive, perhaps requiring the following code: > > with sql.unsafe.disable_all_security_protections: > expr = sql.unsafe.compile_string(my_string) > cur.execute(expr, (...)) > > As it stands today, it's very common for people to produce insecure > code completely by accident. I see it on a regular basis in #python. 
> There is no way to resolve this without something similar to E's > quasiliterals, or these prefixed strings. > > -- Devin > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Thu May 30 00:47:40 2013 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 29 May 2013 23:47:40 +0100 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> Message-ID: <51A6858C.2020302@mrabarnett.plus.com> On 29/05/2013 21:17, James K wrote: > It should work like this > > >>> from collections import Counter > >>> Counter({'a': 1, 'b': 2}) * 2 # scalar > Counter({'b': 4, 'a': 2}) > >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # > multiplies matching keys > Counter({'b': 4}) > > > This is intuitive behavior and therefore should be added. I am unsure > about division as dividing by a non-existing key would be a division by > 0, although division by a scalar is straightforward. > Multiplying by scalars I understand, but by another Counter? That just feels wrong to me. For example:

>>> c = Counter({"apple": 3, "orange": 5})
>>> # Double everything.
>>> c * 2
Counter({"apple": 6, "orange": 10})

Fine, OK. But what does _this_ mean?

>>> d = Counter({"orange": 4, "pear": 2})
>>> c * d
???

From haoyi.sg at gmail.com Thu May 30 01:06:05 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 29 May 2013 19:06:05 -0400 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? Message-ID: I just finished writing a rather knotty piece of code to try and extract the original source code that went into creating an AST. The whole thing is a nasty hack, and I was wondering if there was a better way of doing things.
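The difficulty is easy to demonstrate: an AST node records where it starts but nothing records where it ends, so trailing parentheses, whitespace, and comments cannot be separated from the node. (For reference, much later CPython releases, 3.8 and up, added end_lineno/end_col_offset and ast.get_source_segment, which is essentially the information being requested here.)

```python
import ast

src = "x = (1 +  2)  # trailing comment"
node = ast.parse(src).body[0].value  # the BinOp between the parentheses
# Only the start position is recorded.  Nothing marks where the node ends,
# so the closing parenthesis, extra spaces, and comment cannot be told
# apart from the expression itself.
print(node.lineno, node.col_offset)  # 1 5
```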
In particular, some obvious techniques aren't sufficient:

- lineno/col_offset just tells you where the AST starts, not where it ends. The next AST's lineno/col_offset tells you where the next AST starts, which is not where the previous one ends: it could include a whole bunch of trash (whitespace/closing parentheses/comments/etc.) that I don't want.
- unparsing the AST (via unparser.py or similar) is not sufficient, because the original parsing has already thrown away a bunch of information from the original source, e.g. redundant parentheses, comments. You can get something which runs identically, but you can't get the exact original source.

I want the original source code for debugging/tracing purposes: I want my debugging asserts/tracing macros to show me the original source code of the condition which failed, and not source code + extra junk or source code + reshuffled parentheses (as would be the case with the two techniques used above). However, other possible uses come to mind:

- It would make tools like 2to3.py much simpler, since you could work purely at an AST level and just say "give me original source here!" for the parts which don't need to be changed. Currently it has its own lexer/parser, which is necessary (under the status quo) for reasons given above, but seems like a great waste when there's already a perfectly good lexer/parser in the ast module.
- Automatically extracting the source code from unit tests to insert into documentation, which would be much easier if I could work purely at an AST level.

So what's there to do?
I've described why the two techniques above are each insufficient, but together, you can:

- Bound the extent of an AST in the source code using the AST's subtree's minimal and maximal lineno/col_offset, along with its successor's minimal lineno/col_offset
- Scrub that extent with ast.parse, trying to parse each and every possible string and (if it parses) unparsing it to check semantic equality with the original AST

This is terribly hacky, the asymptotic performance is not good, and you could say many other nasty words about it. And all because I need to retrieve some information (the source_length of the AST) that the parser probably already had, but conveniently threw away before giving me the AST. Would it make sense to have the parser preserve the source_length in the ast.AST objects, along with the lineno and col_offset? This would take a minuscule amount of additional storage, is information that I'm sure it already has, and would greatly benefit the use cases I described above. -Haoyi -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu May 30 01:47:28 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 May 2013 09:47:28 +1000 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? In-Reply-To: References: Message-ID: <51A69390.8070905@pearwood.info> On 30/05/13 09:06, Haoyi Li wrote: > I want the original source code for debugging/tracing purposes Do you have an idea of the memory overhead of keeping the source code around? Is it worth having the ast module honour the __debug__ flag (-O and -OO switches), and *not* preserve source when optimizations are in effect? That may mean that some ast operations cannot run under -O or -OO, in the same way that code that cares about __doc__ strings cannot meaningfully run under -OO. That's okay by me.
-- Steven From mmr15 at case.edu Thu May 30 01:43:44 2013 From: mmr15 at case.edu (Matthew Ruffalo) Date: Wed, 29 May 2013 19:43:44 -0400 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A6858C.2020302@mrabarnett.plus.com> References: <51A610C9.4040405@nedbatchelder.com> <51A6858C.2020302@mrabarnett.plus.com> Message-ID: <51A692B0.8070202@case.edu> On 05/29/2013 06:47 PM, MRAB wrote: > On 29/05/2013 21:17, James K wrote: >> It should work like this >> >> >>> from collections import Counter >> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >> Counter({'b': 4, 'a': 2}) >> >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # >> multiplies matching keys >> Counter({'b': 4}) >> >> >> This is intuitive behavior and therefore should be added. I am unsure >> about division as dividing by a non-existing key would be a division by >> 0, although division by a scalar is straightforward. >> > Multiplying by scalars I understand, but by another Counter? That just > feels wrong to me. > > For example: > > >>> c = Counter("apple": 3, "orange": 5) > >>> # Double everything. > >>> c * 2 > Counter("apple": 6, "orange": 10) > > Fine, OK. > > But what does _this_ mean? > > >>> d = Counter("orange": 4, "pear": 2) > >>> c * d > ??? > James K is proposing pairwise multiplication of matching elements, with the normal behavior of a missing element having a value of 0. There's another perfectly reasonable interpretation of the * operator, however: the Cartesian product of two multisets. 
""" >>> from collections import Counter >>> from itertools import product >>> c = Counter(apple=3, orange=10) >>> d = Counter(orange=4, pear=2) >>> cd = Counter(product(c.elements(), d.elements())) # c * d >>> cd Counter({('orange', 'orange'): 40, ('orange', 'pear'): 20, ('apple', 'orange'): 12, ('apple', 'pear'): 6}) """ It would be nice to define * as the Cartesian product of two sets, also: """ >>> s1 = {'a', 'b', 'c'} >>> s2 = {'c', 'd'} >>> set(product(s1, s2)) # s1 * s2 {('a', 'd'), ('c', 'c'), ('c', 'd'), ('a', 'c'), ('b', 'd'), ('b', 'c')} """ The fact that there are two distinct possibilities for Counter.__mul__ seems problematic; these objects have set and arithmetic operations and * is meaningful in either context. Implementing set.__mul__ as a Cartesian product doesn't seem to have any obvious drawbacks, though. MMR... From haoyi.sg at gmail.com Thu May 30 02:04:44 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 29 May 2013 20:04:44 -0400 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? In-Reply-To: <51A69390.8070905@pearwood.info> References: <51A69390.8070905@pearwood.info> Message-ID: I don't need to keep the source code, I just need a single integer for each node. I would then be able to reconstruct the source snippet. On Wed, May 29, 2013 at 7:47 PM, Steven D'Aprano wrote: > On 30/05/13 09:06, Haoyi Li wrote: > > I want the original source code for debugging/tracing purposes >> > > Do you have an idea of the memory overhead of keeping the source code > around? > > Is it worth having the ast module honour the __debug__ flag (-O and -OO > switches), and *not* preserve source when optimizations are in effect? That > may mean that some ast operations cannot run under -O or -OO, in the same > way that code that cares about __doc__ strings cannot meaningfully run > under -OO. That's okay by me. 
> > > > -- > Steven > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu May 30 02:06:37 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 May 2013 10:06:37 +1000 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> Message-ID: <51A6980D.90909@pearwood.info> On 30/05/13 06:17, James K wrote: > It should work like this > > >>> from collections import Counter > >>> Counter({'a': 1, 'b': 2}) * 2 # scalar > Counter({'b': 4, 'a': 2}) Under what circumstances would you do this? What is it about Counters that they should support multiplication when no other mapping type does? For what it is worth, the above is trivially doable using dict comprehensions: py> from collections import Counter py> count = Counter({'a': 2, 'b': 3}) py> Counter({k:v*2 for k,v in count.items()}) Counter({'b': 6, 'a': 4}) > >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # multiplies > matching keys > Counter({'b': 4}) > > > This is intuitive behavior and therefore should be added. Not to me it isn't. I cannot guess what purpose this would hold, or why you would want to do this. > I am unsure about > division as dividing by a non-existing key would be a division by 0, > although division by a scalar is straightforward. Oh, so now you're proposing division as well? I presume you would want to support both / and // division, since they're both equally intuitive: Counter({'a': 10})/3 => Counter({'a': 3.3333333333333333}) Counter({'a': 10})//Counter({'a': 3, 'b': 4}) => Counter({'a': 3, 'b': 0}) What about other operations, like ** & | ^ >> << ? Is it your proposal that Counters should support every operation that ints support? 
-- Steven From steve at pearwood.info Thu May 30 02:09:13 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 May 2013 10:09:13 +1000 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? In-Reply-To: References: <51A69390.8070905@pearwood.info> Message-ID: <51A698A9.3020900@pearwood.info> On 30/05/13 10:04, Haoyi Li wrote: > I don't need to keep the source code, I just need a single integer for each > node. I would then be able to reconstruct the source snippet. And so you did say. Sorry for the noise. -- Steven From steve at pearwood.info Thu May 30 02:37:12 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 May 2013 10:37:12 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> Message-ID: <51A69F38.7070104@pearwood.info> On 30/05/13 05:23, Devin Jeanpierre wrote: > On Mon, May 27, 2013 at 6:41 AM, G?ktu? Kayaalp > wrote: >> I think it would add some significant flexibility to Python to let users >> define custom string prefixes. What I mean by a string prefix is, >> a letter prefixing the string literal, modifying the behavior of it, > --snip-- > > Rather than Decimal, IMO a more compelling use case is SQL queries. At > the moment, string literals make unsafe string formatting an > attractive nuisance: > > cur.execute("..." % (...)) > > versus > > cur.execute("...", (...)) > > Something that custom string prefixes do, that cannot be done in > Python, is make this confusion impossible. You could make the only way > to create passable SQL expressions via the string sql:"...", which > produces an SQL object. At no point in time does the programmer deal > with strings that can be manipulated in unsafe ways to result in SQL > injection vulnerabilities. I think that's wrong. *This* proposal, for custom user-land prefixes, will not help in this case. 
Your suggestion will only work if Python has a new built-in type, the "SQL Query", which does not support *any* form of string input, *and* the cur.execute method is changed to no longer accept strings (backwards compatibility be damned). The loss of backwards compatibility makes this a Python 4000 idea. But putting that aside, it has to be a built-in type only accessible as a literal, because if it is a function that takes a string argument, say, sql(), then you'll have exactly the same issue. Some people will write this: cur.execute(sql("..." % (...))) instead of one of these: cur.execute(sql("..."), (...)) cur.execute(sql:"...", (...)) So that effectively rules out any user-land solution. Given that Python is a language which allows the programmer to shoot themselves in the foot if they so choose, I'm not really so sure that even in Python 4000 we should be going to extraordinary efforts to prevent *this specific* toe from being shot off. -- Steven From mmr15 at case.edu Thu May 30 02:52:16 2013 From: mmr15 at case.edu (Matthew Ruffalo) Date: Wed, 29 May 2013 20:52:16 -0400 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A6980D.90909@pearwood.info> References: <51A610C9.4040405@nedbatchelder.com> <51A6980D.90909@pearwood.info> Message-ID: <51A6A2C0.5030004@case.edu> On 05/29/2013 08:06 PM, Steven D'Aprano wrote: > On 30/05/13 06:17, James K wrote: >> It should work like this >> >> >>> from collections import Counter >> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >> Counter({'b': 4, 'a': 2}) > > Under what circumstances would you do this? > > What is it about Counters that they should support multiplication when > no other mapping type does? Counters are different from other mapping types because they provide a natural Python stdlib implementation of multisets -- the docs explicitly state that "The Counter class is similar to bags or multisets in other languages.". 
The class already has behavior that is different from other mapping types: Counter.__init__ can also take an iterable of hashable objects instead of another mapping, and Counter.update adds counts instead of replacing them. > > For what it is worth, the above is trivially doable using dict > comprehensions: > > py> from collections import Counter > py> count = Counter({'a': 2, 'b': 3}) > py> Counter({k:v*2 for k,v in count.items()}) > Counter({'b': 6, 'a': 4}) > > >> >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # >> multiplies >> matching keys >> Counter({'b': 4}) >> >> >> This is intuitive behavior and therefore should be added. > > Not to me it isn't. I cannot guess what purpose this would hold, or > why you would want to do this. > > >> I am unsure about >> division as dividing by a non-existing key would be a division by 0, >> although division by a scalar is straightforward. > > Oh, so now you're proposing division as well? I presume you would want > to support both / and // division, since they're both equally intuitive: > > Counter({'a': 10})/3 > => Counter({'a': 3.3333333333333333}) > > Counter({'a': 10})//Counter({'a': 3, 'b': 4}) > => Counter({'a': 3, 'b': 0}) > > What about other operations, like ** & | ^ >> << ? Is it your proposal > that Counters should support every operation that ints support? > Counters already support & and | for multiset intersection and union. From http://docs.python.org/3/library/collections.html :

"""
>>> c = Counter(a=3, b=1)
>>> d = Counter(a=1, b=2)
>>> c + d  # add two counters together: c[x] + d[x]
Counter({'a': 4, 'b': 3})
>>> c - d  # subtract (keeping only positive counts)
Counter({'a': 2})
>>> c & d  # intersection: min(c[x], d[x])
Counter({'a': 1, 'b': 1})
>>> c | d  # union: max(c[x], d[x])
Counter({'a': 3, 'b': 2})
"""

The rationale for supporting multiplication by a scalar makes some sense when using Counters as multisets; multiplying by another Counter is questionable. MMR...
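Under that multiset reading, scalar multiplication is just n-fold repetition of the bag, which today takes a comprehension; a minimal sketch (`repeat` is a hypothetical helper, not stdlib):

```python
from collections import Counter

def repeat(multiset, n):
    """Scale every element's count by n -- the multiset reading of
    `counter * n` discussed above (hypothetical helper)."""
    return Counter({k: v * n for k, v in multiset.items()})

bag = Counter('aab')   # Counter({'a': 2, 'b': 1})
print(repeat(bag, 3))  # Counter({'a': 6, 'b': 3})
```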
From steve at pearwood.info Thu May 30 03:29:26 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 May 2013 11:29:26 +1000 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A6A2C0.5030004@case.edu> References: <51A610C9.4040405@nedbatchelder.com> <51A6980D.90909@pearwood.info> <51A6A2C0.5030004@case.edu> Message-ID: <51A6AB76.5090706@pearwood.info> On 30/05/13 10:52, Matthew Ruffalo wrote: > On 05/29/2013 08:06 PM, Steven D'Aprano wrote: >> On 30/05/13 06:17, James K wrote: >>> It should work like this >>> >>> >>> from collections import Counter >>> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >>> Counter({'b': 4, 'a': 2}) >> >> Under what circumstances would you do this? >> >> What is it about Counters that they should support multiplication when no other mapping type does? > > Counters are different from other mapping types because they provide a natural Python stdlib implementation of multisets -- the docs explicitly state that "The Counter class is similar to bags or multisets in other languages.". The class already has behavior that is different from other mapping types: Counter.__init__ can also take an iterable of hashable objects instead of another mapping, and Counter.update adds counts instead of replacing them. None of this answers my question. Under what circumstances would you multiply a counter by a scalar (let alone by another counter)? The fact that counters differ in some ways from other mappings doesn't justify every arbitrary change proposed. 
--
Steven

From goktug.kayaalp at gmail.com Thu May 30 03:48:40 2013
From: goktug.kayaalp at gmail.com (=?utf-8?B?R8O2a3R1xJ8=?= Kayaalp)
Date: Thu, 30 May 2013 04:48:40 +0300
Subject: [Python-ideas] Custom string prefixes
In-Reply-To: (Devin Jeanpierre's message of "Wed, 29 May 2013 15:23:56 -0400")
References: <87obbw4scx.fsf@gmail.com>
Message-ID: <87mwrdb5l3.fsf@gmail.com>

> Something that custom string prefixes do, that cannot be done in
> Python, is make this confusion impossible. You could make the only way
> to create passable SQL expressions via the string sql:"...", which
> produces an SQL object. At no point in time does the programmer deal
> with strings that can be manipulated in unsafe ways to result in SQL
> injection vulnerabilities.

IMO, a better decision would be to use an ORM for this. Abstracting away the SQL language with something like SQLAlchemy would result in code written in a single language, which in turn would possibly decrease the odds of making a mistake.

Greetings,
Göktuğ.

Devin Jeanpierre writes:

> On Mon, May 27, 2013 at 6:41 AM, Göktuğ Kayaalp wrote:
>> I think it would add some significant flexibility to Python to let users
>> define custom string prefixes. What I mean by a string prefix is,
>> a letter prefixing the string literal, modifying the behavior of it,
> --snip--
>
> Rather than Decimal, IMO a more compelling use case is SQL queries. At
> the moment, string literals make unsafe string formatting an
> attractive nuisance:
>
> cur.execute("..." % (...))
>
> versus
>
> cur.execute("...", (...))
>
> Something that custom string prefixes do, that cannot be done in
> Python, is make this confusion impossible. You could make the only way
> to create passable SQL expressions via the string sql:"...", which
> produces an SQL object. At no point in time does the programmer deal
> with strings that can be manipulated in unsafe ways to result in SQL
> injection vulnerabilities.
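The unsafe-vs-safe contrast quoted above can be made concrete with the stdlib's sqlite3 module. The table, its contents, and the attack string are invented for this demo; only the two `cur.execute` call shapes come from the thread.

```python
import sqlite3

# Invented demo table with a single row.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE users (name TEXT)")
cur.execute("INSERT INTO users VALUES ('alice')")

evil = "x' OR '1'='1"

# Unsafe: % formatting splices attacker-controlled text into the SQL itself.
rows_unsafe = cur.execute(
    "SELECT name FROM users WHERE name = '%s'" % evil).fetchall()

# Safe: the DB-API placeholder keeps the value out of the SQL text.
rows_safe = cur.execute(
    "SELECT name FROM users WHERE name = ?", (evil,)).fetchall()

print(rows_unsafe)  # every row matched, because '1'='1' is always true
print(rows_safe)    # empty: no user is literally named "x' OR '1'='1"
```

The formatted version returns the whole table; the parameterized version returns nothing, which is exactly the "attractive nuisance" being described.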
> > Of course, then there is the issue of "what if you want to produce an > SQL expression from a string"? Then you can make that difficult, > rather than attractive, perhaps requiring the following code: > > with sql.unsafe.disable_all_security_protections: > expr = sql.unsafe.compile_string(my_string) > cur.execute(expr, (...)) > > As it stands today, it's very common for people to produce insecure > code completely by accident. I see it on a regular basis in #python. > There is no way to resolve this without something similar to E's > quasiliterals, or these prefixed strings. > > -- Devin -- G?ktu? Kayaalp From abarnert at yahoo.com Thu May 30 03:47:10 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 29 May 2013 18:47:10 -0700 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> Message-ID: <2AF81C27-1353-4597-9343-69A3EB198861@yahoo.com> Given how simple this would be to implement, I think the obvious way forward is to write an implementation, put it on PyPI, and give it time to see if it gets any traction. If enough people use it, it can be easily added to the stdlib later. (IIRC, there's a project named something like more-collections that has OrderedSet, OrderedDefaultDict, etc., so you could submit this new class as a patch to that project instead of creating a new one, but that doesn't make too much difference.) If you don't know how to write the implementation yourself, just ask and someone will write it for you. Sent from a random iPhone On May 29, 2013, at 13:17, James K wrote: > It should work like this > > >>> from collections import Counter > >>> Counter({'a': 1, 'b': 2}) * 2 # scalar > Counter({'b': 4, 'a': 2}) > >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # multiplies matching keys > Counter({'b': 4}) > > > This is intuitive behavior and therefore should be added. 
I am unsure about division as dividing by a non-existing key would be a division by 0, although division by a scalar is straightforward. > > On Thu, May 30, 2013 at 12:29 AM, Ned Batchelder wrote: >> On 5/29/2013 1:27 AM, James K wrote: >>> Can we add a multiplication feature to collections.Counter, I don't see why not. >> >> James, welcome to the list. To get an idea accepted, you have to do a few things: >> >> 1) Explain the idea fully. I don't understand what "a multiplication feature" would do. >> 2) Explain why the idea is useful to enough people that it should be added to the standard library. >> >> These two criteria are not easy to meet. Sometimes an idea seems popular, but it turns out that different people want it to behave differently, or differently at different times (see the discussion about an itertools.chunked feature). Sometimes an idea is straightforward enough to describe, but is useful to too few people to justify adding it to the standard library. >> >> Discussing these things doesn't often result in a change to Python, but does often lead to useful discussion. >> >> --Ned. >> >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeanpierreda at gmail.com Thu May 30 03:49:23 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 29 May 2013 21:49:23 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A69F38.7070104@pearwood.info> References: <87obbw4scx.fsf@gmail.com> <51A69F38.7070104@pearwood.info> Message-ID: On Wed, May 29, 2013 at 8:37 PM, Steven D'Aprano wrote: > The loss of backwards compatibility makes this a Python 4000 idea. 
Definitely this was against a hypothetical SQL API, showcasing its benefit. One could replace it with, for example, a templating library. Then the risk that is mitigated is accidental XSS vulnerabilities, rather than accidental SQL injection vulnerabilities. Templating libraries are far more common, and non-standard, so they can make this change more easily than the DB-API can. > But putting that aside, it has to be a built-in type only accessible as a > literal, because if it is a function that takes a string argument, say, > sql(), then you'll have exactly the same issue. What's the problem with that? If the registry of prefix transformers is kept private by Python, then it's not easy at all to retrieve a function from that registry (although you can do it via ctypes). You can delete any references to the function that are not in the registry, and this is effectively hidden without crazy ctypes magic. And then, because this is Python, you can also add an unsafe API for users that need the flexibility and are aware of the security implications. The point isn't to make insecure code impossible -- that itself is impossible. But it'd be nice to make it obvious, when it's being done, that it is in fact insecure. > Given that Python is a language which allows the programmer to shoot > themselves in the foot if they so choose, I'm not really so sure that even > in Python 4000 we should be going to extraordinary efforts to prevent *this > specific* toe from being shot off. There are many instances where Python does try to protect people from shooting themselves in the foot (for example, immutable tuples*). But anyway, I think it's a little callous to dismiss at all the idea of protecting people from security-related problems. It's not programmers' feet I'm worried about. When programmers make security-related mistakes, the people that suffer are their users. It's not as if it's only a little suffering, either. It can be a lot of suffering. 
And I see this mistake very, very often. If adding this feature would enable better APIs in the future, that prevented pain in the future... maybe that prevented pain would outweigh the effort spent adding and maintaining yet another feature to Python, and any pain of additional work ("with unsafe"). If the only reason not to do it is a philosophical objection to helping people help themselves, then I think that philosophy should be ignored. -- Devin .. [*] The wound being prevented is that tuples could be hashable and mutable, but if they were placed inside a dict or set, and mutated in such a way that their hash changed, their bucket would not change. This makes them pseudo-invisible. A similar (but different) thing happens if you insert NaN, but Python does not protect you there. Of course, nothing is impossible when you use ctypes... From abarnert at yahoo.com Thu May 30 03:55:50 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 29 May 2013 18:55:50 -0700 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A44F71.60709@gmail.com> References: <87obbw4scx.fsf@gmail.com> <878v304hs8.fsf@gmail.com> <51A44F71.60709@gmail.com> Message-ID: On May 27, 2013, at 23:32, Ron Adam wrote: > Moving them to the run time parser would make some things slower. Being able to add or register more prefix's would probably hurt Pythons readability when you want to review someone else's programs. I think it would only improve readability for programs we write ourselves, because then we know much more easily what we defined those prefixes to mean. That wouldn't be the case when we read someone else's code. If there's a use case with wide applicability within a particular domain, I could see a prefix being nonstandard, but still a de facto standard within that domain, and therefore increasing readability. 
As a rough parallel, consider that numpy arrays don't quite implement the normal Sequence protocol, and have all kinds of behaviors that are nonsense for normal sequences, and yet indexing with tuples and ellipses doesn't cause any readability problems for numeric programmers--and in fact python 3.0 even added a small change to make it easier for them. The question is whether there's any comparable use case here. Is there some prefix that would be widely useful in numerics or django or XML processing or whatever? Or would prefixes only really be useful for local ad hoc domains? From rosuav at gmail.com Thu May 30 05:30:18 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 30 May 2013 13:30:18 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A69F38.7070104@pearwood.info> References: <87obbw4scx.fsf@gmail.com> <51A69F38.7070104@pearwood.info> Message-ID: On Thu, May 30, 2013 at 10:37 AM, Steven D'Aprano wrote: > But putting that aside, it has to be a built-in type only accessible as a > literal, because if it is a function that takes a string argument, say, > sql(), then you'll have exactly the same issue. Some people will write this: > > cur.execute(sql("..." % (...))) > > instead of one of these: > > cur.execute(sql("..."), (...)) > cur.execute(sql:"...", (...)) > > > So that effectively rules out any user-land solution. Actually, there is a user-land solution! It just isn't 100% perfect. My old-favorite, the linter... All you need is an intelligent code-parsing tool that gives you a warning if you call sql() with anything other than a literal - or, for that matter, skip the sql() check and give a warning if .execute()'s first parameter is not a literal. Put that into your makefile or repository pre-commit hook and you should have no trouble keeping yourself safe. Disadvantage: Doesn't actually protect you, just helps you keep yourself safe. 
Advantages: Works on existing releases of Python; can be customized to your own personal requirements. ChrisA From mertz at gnosis.cx Thu May 30 05:31:58 2013 From: mertz at gnosis.cx (David Mertz) Date: Wed, 29 May 2013 20:31:58 -0700 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <2AF81C27-1353-4597-9343-69A3EB198861@yahoo.com> References: <51A610C9.4040405@nedbatchelder.com> <2AF81C27-1353-4597-9343-69A3EB198861@yahoo.com> Message-ID: Well... I actually *did* implement it a few notes up-thread, it's not merely simple, but actual. If you really want, you can privately substitute my: class MyCounter(Counter): ... with: class Counter(Counter): ... Of course, I don't *recommend* doing that. :-) On Wed, May 29, 2013 at 6:47 PM, Andrew Barnert wrote: > Given how simple this would be to implement, I think the obvious way > forward is to write an implementation, put it on PyPI, and give it time to > see if it gets any traction. If enough people use it, it can be easily > added to the stdlib later. > > (IIRC, there's a project named something like more-collections that has > OrderedSet, OrderedDefaultDict, etc., so you could submit this new class as > a patch to that project instead of creating a new one, but that doesn't > make too much difference.) > > If you don't know how to write the implementation yourself, just ask and > someone will write it for you. > > Sent from a random iPhone > > On May 29, 2013, at 13:17, James K wrote: > > It should work like this > > >>> from collections import Counter > >>> Counter({'a': 1, 'b': 2}) * 2 # scalar > Counter({'b': 4, 'a': 2}) > >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # multiplies > matching keys > Counter({'b': 4}) > > > This is intuitive behavior and therefore should be added. I am unsure > about division as dividing by a non-existing key would be a division by 0, > although division by a scalar is straightforward. 
> > On Thu, May 30, 2013 at 12:29 AM, Ned Batchelder wrote: > >> On 5/29/2013 1:27 AM, James K wrote: >> >> Can we add a multiplication feature to collections.Counter, I don't see >> why not. >> >> >> James, welcome to the list. To get an idea accepted, you have to do a >> few things: >> >> 1) Explain the idea fully. I don't understand what "a multiplication >> feature" would do. >> 2) Explain why the idea is useful to enough people that it should be >> added to the standard library. >> >> These two criteria are not easy to meet. Sometimes an idea seems >> popular, but it turns out that different people want it to behave >> differently, or differently at different times (see the discussion about an >> itertools.chunked feature). Sometimes an idea is straightforward enough to >> describe, but is useful to too few people to justify adding it to the >> standard library. >> >> Discussing these things doesn't often result in a change to Python, but >> does often lead to useful discussion. >> >> --Ned. >> >> >> >> _______________________________________________ >> Python-ideas mailing listPython-ideas at python.orghttp://mail.python.org/mailman/listinfo/python-ideas >> >> >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ericsnowcurrently at gmail.com Thu May 30 07:21:57 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 29 May 2013 23:21:57 -0600 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <51A35C07.4080207@pearwood.info> References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> Message-ID: On Mon, May 27, 2013 at 7:13 AM, Steven D'Aprano wrote: > Welcome, and I admire your bravery! +1 > So please don't take it personally when > I say, your idea does not sound very good to me. In fact, it sounds > terrible. You call this proposal "custom string prefixes", but what you > describe is actually a second way to call a function, only not any function, > but just functions that take a single string argument. So more like a > function call that looks like a string. > > Let me start with your example: > > > >> >>> from decimal import Decimal as D >> >>> D ("1.6e-9") * D ("1.0e9") >> >> but I'd enounce that the following is more concise and readable: >> >> >>> D"1.6e-9" * D"1.0e9" >> >> with removed parens. > > > Just to save a couple of parentheses, you add a lot of complication to the > language, make it harder for people to learn, and for no real benefit except > to save a few keystrokes. Consider: > > String prefixes are currently part of Python's syntax, and can operate at > compile-time. With your proposal, they become run-time operations, like any > function call. So this is redundant: we already have a perfectly good way of > calling functions. Not just redundant, but also very limited, because most > functions take more than one argument, or non-string arguments. > > Do you have any idea how you would implement this change? Do you at least > have an idea for the API? What commands would the user give to define a new > "string prefix"? How would the user query the "string prefixes" already > defined? What happens when they combine multiple prefixes? > > Python is famous for being "executable pseudo-code". 
To a very large degree,
> code written in Python should be readable by people who are not Python
> programmers. What do you think
>
> s"ham"
>
> will mean to the reader? I think that it is better to encourage people to
> write meaningful names:
>
> make_sandwich("ham")
>
> than trying to save every last keystroke possible. Code is written once, but
> read over and over again. That half a second you save by typing s"ham" will
> cost other people dozens of seconds, maybe minutes, each time they read your
> code.

Spot on. The language needs to fit in our brains, and you could argue that we've already pushed past that threshold.

One concern I'd have is in how you would look up what some arbitrary prefix is supposed to do. Furthermore, it sounds like the intent is to apply the prefix at run-time. That means that the function that gets applied depends on what's in some registry at a given moment. So source with these custom string prefixes will be even more ambiguous when read.

I agree that one should just use the function rather than the prefix. I'd expect the prefix handler not to be stateful, in which case it truly would be a functional operation, always giving the same output for a given literal string input. In that case, why not just call the function at module scope, bind the result to a name there, and use that name wherever you like? It's effectively a constant, right?

Now, give a twist to this idea of some stable `literal -> constant` operation. Consider a new syntax or built-in function for indicating that, at compile-time, an expression should be considered equivalent to the literal to which it evaluates. Then the compiler would be free to substitute the literal for the expression and store the literal in the bytecode/constants/pyc file. From then on that expression would not be evaluated at run-time anymore. Of course, the expression would have to be entirely literal-based and evaluate to a literal.
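The module-scope binding just described can be sketched with the Decimal example from earlier in the thread; the constant names are invented for illustration.

```python
from decimal import Decimal

# The literal -> value conversion runs once, at import time; afterwards
# these names behave like constants at every point of use.
SMALL = Decimal("1.6e-9")
BIG = Decimal("1.0e9")

def product():
    # No Decimal("...") parsing per call -- just a name lookup.
    return SMALL * BIG

print(product())  # Decimal('1.60')
```

This gets most of the practical benefit of a compile-time constant without any new syntax: the conversion cost is paid once per import, and the names document the values.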
The catch is that the compiler would have to evaluate the expression, which would probably add disproportionate complexity to the compiler. The other catch is that, due to how dynamic Python is, any given expression could evaluate differently at run-time (e.g. someone monkey-patches str.__add__ or some other function you are calling on a string literal). Ultimately this probably isn't worth it. Just evaluate the expression at module level so it's only evaluated once at import-time and move on. If it's really expensive, pre-calculate the equivalent literal and stick that in your source (along with a comment on how the literal was generated). Well, that was diverting. -eric From abarnert at yahoo.com Thu May 30 07:24:52 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 29 May 2013 22:24:52 -0700 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> <2AF81C27-1353-4597-9343-69A3EB198861@yahoo.com> Message-ID: <23079C8D-0B0A-437A-A021-BB3E3445CCF8@yahoo.com> On May 29, 2013, at 20:31, David Mertz wrote: > Well... I actually *did* implement it a few notes up-thread, it's not merely simple, but actual. OK, so the OP can just copy and paste. I think I'd use numbers.Integral and collections.abc.Mapping rather than int and Counter. And maybe take any Container as a counter with the value 1 for each key, given that Counter is intended to work as a multiset. Also, I think the one-liner dict comprehension implementation is simpler than the explicit loop around setitem. And I think if you have mult you'll probably want imult as well. But bikeshedding aside, your implementation should be more than enough for the OP. > If you really want, you can privately substitute my: > > class MyCounter(Counter): ... > > with: > > class Counter(Counter): ... > > Of course, I don't *recommend* doing that. 
:-) Well, given that it will be in a separate module if the OP puts it on PyPI, I don't see any downside to giving it the same name. > > > On Wed, May 29, 2013 at 6:47 PM, Andrew Barnert wrote: >> Given how simple this would be to implement, I think the obvious way forward is to write an implementation, put it on PyPI, and give it time to see if it gets any traction. If enough people use it, it can be easily added to the stdlib later. >> >> (IIRC, there's a project named something like more-collections that has OrderedSet, OrderedDefaultDict, etc., so you could submit this new class as a patch to that project instead of creating a new one, but that doesn't make too much difference.) >> >> If you don't know how to write the implementation yourself, just ask and someone will write it for you. >> >> Sent from a random iPhone >> >> On May 29, 2013, at 13:17, James K wrote: >> >>> It should work like this >>> >>> >>> from collections import Counter >>> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >>> Counter({'b': 4, 'a': 2}) >>> >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2}) # multiplies matching keys >>> Counter({'b': 4}) >>> >>> >>> This is intuitive behavior and therefore should be added. I am unsure about division as dividing by a non-existing key would be a division by 0, although division by a scalar is straightforward. >>> >>> On Thu, May 30, 2013 at 12:29 AM, Ned Batchelder wrote: >>>> On 5/29/2013 1:27 AM, James K wrote: >>>>> Can we add a multiplication feature to collections.Counter, I don't see why not. >>>> >>>> James, welcome to the list. To get an idea accepted, you have to do a few things: >>>> >>>> 1) Explain the idea fully. I don't understand what "a multiplication feature" would do. >>>> 2) Explain why the idea is useful to enough people that it should be added to the standard library. >>>> >>>> These two criteria are not easy to meet. 
Sometimes an idea seems popular, but it turns out that different people want it to behave differently, or differently at different times (see the discussion about an itertools.chunked feature). Sometimes an idea is straightforward enough to describe, but is useful to too few people to justify adding it to the standard library. >>>> >>>> Discussing these things doesn't often result in a change to Python, but does often lead to useful discussion. >>>> >>>> --Ned. >>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Python-ideas mailing list >>>>> Python-ideas at python.org >>>>> http://mail.python.org/mailman/listinfo/python-ideas >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu May 30 07:41:39 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 29 May 2013 22:41:39 -0700 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> Message-ID: <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> On May 29, 2013, at 22:21, Eric Snow wrote: > Now, give a twist to this idea of some stable `literal -> contstant` > operation. 
Consider a new syntax or built-in function for indicating > that, at compile-time, an expression should be considered equivalent > to the literal to which it evaluates. Then the compiler would be free > to substitute the literal for the expression and store the literal in > the bytecode/constants/pyc file. From then on that expression would > not be evaluated at run-time anymore. Of course, the expression would > have to be entirely literal-based and evaluate to a literal. So you're suggesting that instead of C++11-style string prefixes, we should have C++11-style constexpr. I realize your ultimate conclusion was that we probably don't need _either_ feature. But still, it's amazing how C++11-ish this discussion is getting. Which may be a good hint that (as you suggest) this feature isn't a good fit for Python. Unless someone has a way of doing it through compile-time templates, of course. :) > The catch is that the compiler would have to evaluate the expression, > which would probably add disproportionate complexity to the compiler. I don't think it's that bad; the compiler just has to make a call to a PyEval* function. Also, given that your proposal is that it be explicitly an optional optimization that the compiler is free to ignore means there's an even more trivial implementation... From mertz at gnosis.cx Thu May 30 07:48:05 2013 From: mertz at gnosis.cx (David Mertz) Date: Wed, 29 May 2013 22:48:05 -0700 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <23079C8D-0B0A-437A-A021-BB3E3445CCF8@yahoo.com> References: <51A610C9.4040405@nedbatchelder.com> <2AF81C27-1353-4597-9343-69A3EB198861@yahoo.com> <23079C8D-0B0A-437A-A021-BB3E3445CCF8@yahoo.com> Message-ID: On Wed, May 29, 2013 at 10:24 PM, Andrew Barnert wrote: > On May 29, 2013, at 20:31, David Mertz wrote: > > Well... I actually *did* implement it a few notes up-thread, it's not > merely simple, but actual. 
> > I think I'd use numbers.Integral and collections.abc.Mapping rather than > int and Counter. And maybe take any Container as a counter with the value 1 > for each key, given that Counter is intended to work as a multiset. > It all still seems somewhat silly to me since the use-case eludes me. But... You are right about numbers.Integral, that is a better isinstance() test. However, I don't think so with collections.abc.Mapping--in that case I deliberately chose collections.Counter because the semantics seemed undefined in other cases. That is, I have no idea what meaning the OP would assign to: >>> Counter({'a':1,'b':2}) * OrderedDict((('a','x'),('b','y'))) Also, I think the one-liner dict comprehension implementation is simpler > than the explicit loop around setitem. > Yeah, that's probably simpler. Maybe less explicit, but the code is so short in any case. > And I think if you have mult you'll probably want imult as well. > Maybe. It wasn't in the original "spec" so who knows. But that's straightforward also, of course. > But bikeshedding aside, your implementation should be more than enough for > the OP. > That's my sense. Well, given that it will be in a separate module if the OP puts it on PyPI, > I don't see any downside to giving it the same name. > Fair enough. -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From abarnert at yahoo.com Thu May 30 08:31:21 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Wed, 29 May 2013 23:31:21 -0700
Subject: [Python-ideas] collections.Counter multiplication
In-Reply-To:
References: <51A610C9.4040405@nedbatchelder.com> <2AF81C27-1353-4597-9343-69A3EB198861@yahoo.com> <23079C8D-0B0A-437A-A021-BB3E3445CCF8@yahoo.com>
Message-ID: <83E97B86-22C0-4562-9A00-BD6D0FAB1A71@yahoo.com>

On May 29, 2013, at 22:48, David Mertz wrote:

> However, I don't think so with collections.abc.Mapping--in that case I
> deliberately chose collections.Counter because the semantics seemed
> undefined in other cases.
>
> That is, I have no idea what meaning the OP would assign to:
>
> >>> Counter({'a':1,'b':2}) * OrderedDict((('a','x'),('b','y')))

My immediate thought was TypeError. And then I went away for a few minutes, reread it, and my immediate thought was {'a': 'x', 'b': 'yy'} or the Ordered equivalent thereof. Which I think proves your point.

Honestly, I understand this half of the proposal far less than the scalar version. Any numpy or pandas user would find the scalar version obvious, and I can't imagine anything else it could mean. The Counter*Counter, I have no intuition for at all.

From haoyi.sg at gmail.com Thu May 30 12:42:34 2013
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Thu, 30 May 2013 06:42:34 -0400
Subject: [Python-ideas] Custom string prefixes
In-Reply-To: <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com>
References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com>
Message-ID:

> The language needs to fit in our brains and you could argue that we've
> already pushed past that threshold

I would argue that "prefix just calls the registered function at import time" fits with my brain better than "20 special cases in the lexer which result in magic things happening", but maybe that's just me.
> One concern I'd have is in how you would look up what some arbitrary prefix is supposed to do In theory PyCharm/Vim/Emacs would be able to just jump to the definition of the prefix, instantly giving you the docstring along with the source code of what it does, so I don't think this is a real problem. Tool support will catch up, as always, and you could always grep "@str_prefix\ndef blah" or something similar. > Furthermore, it sounds like the intent is to apply the prefix at run-time. Is there any other time in Python? There's only import-time and execution-time to choose from. > That means that the function that gets applied depends on what's in some registry at a given moment. So source with these custom string prefixes will be even more ambiguous when read. I don't think the second line follows from the first. You could say the same of PEP302 import hooks, which overload a fundamental operation (import xxx) to do whatever the hell you want, depending on what's in your sys.meta_path at any given moment. Heck, you could say the same of any function call my_func(...), that it depends on what my_func has been defined as at a given moment! Remember functions can be (and often are!) rebound multiple times, but people seem to get by just fine. > In that case, why not just call the function at the module scope, bind the result to a name there, and use that name wherever you like. It's effectively a constant, right? The main reason I'd give is that moving a whole bunch of things into module scope spaghettifies your program. If I'm making a bunch of SQL queries in a bunch of functions, I expect the SQL to be in the functions where I'm using them (nobody else cares about them, right?) rather than at module scope. Moving stuff to module scope for performance adds another layer of unnecessary indirection when i look at my_function and wonder what my_function_sql_query and my_function_xml_template is meant to do, for every single function. 
This quickly degenerates into a "huge mess of things in global scope which I don't know what they do" problem. Unless you're super disciplined with your naming conventions to allow you to see, instantly, what function uses what global. But that's the whole point of having a function-level scope in the first place! Exactly how annoying this is depends on how much "code in strings" you want to use. It may sound crazy, but I think Dropbox's Pyxl, Facebook's XHP and React show that it's pretty nice having external code (in this case HTML templates) localized to exactly the point of use, right in the source code. Rather than being forced to chuck it off into a whole separate top-level-declaration or (as it's usually done now) in a separate file. I'd argue that at least part of the reason external code templates tend to be large and clunky, rather than small and focused like Pyxl/XHP/React snippets tend to be, is exactly because of this problem: too many small, focused snippets means too many top-level files or declarations, which results in too much indirection and spaghetti, so naturally people make their templates large to minimize the number of things lying around in your global scope. > But still, it's amazing how C++11-ish this discussion is getting. Which may be a good hint that (as you suggest) this feature isn't a good fit for Python. I don't agree with this; while C++ is huge and terrible, C++11 actually has some pretty good stuff (e.g. real lambdas, with real closures and real lexical scoping! u"unicode" and r"raw" strings!). Dismissing something as bad just because it's something C++11 has is a terrible idea and immediately shuts out a whole range of interesting possibilities. -Haoyi On Thu, May 30, 2013 at 1:41 AM, Andrew Barnert wrote: > On May 29, 2013, at 22:21, Eric Snow wrote: > > > Now, give a twist to this idea of some stable `literal -> constant` > > operation.
Consider a new syntax or built-in function for indicating > > that, at compile-time, an expression should be considered equivalent > > to the literal to which it evaluates. Then the compiler would be free > > to substitute the literal for the expression and store the literal in > > the bytecode/constants/pyc file. From then on that expression would > > not be evaluated at run-time anymore. Of course, the expression would > > have to be entirely literal-based and evaluate to a literal. > > So you're suggesting that instead of C++11-style string prefixes, we > should have C++11-style constexpr. > > I realize your ultimate conclusion was that we probably don't need > _either_ feature. But still, it's amazing how C++11-ish this discussion is > getting. Which may be a good hint that (as you suggest) this feature isn't > a good fit for Python. > > Unless someone has a way of doing it through compile-time templates, of > course. :) > > > The catch is that the compiler would have to evaluate the expression, > > which would probably add disproportionate complexity to the compiler. > > I don't think it's that bad; the compiler just has to make a call to a > PyEval* function. > > Also, given that your proposal is that it be explicitly an optional > optimization that the compiler is free to ignore means there's an even more > trivial implementation... > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
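For what it's worth, CPython already performs a narrow form of the substitution Eric describes: literal-only arithmetic is folded at compile time and stored in the code object's constants (a CPython implementation detail, not a language guarantee):

```python
# CPython's peephole optimizer folds literal-only expressions at compile
# time, a limited version of the constexpr-style substitution discussed
# above; the expression below is never evaluated at run-time.
def seconds_per_day():
    return 60 * 60 * 24  # compiled down to the constant 86400

# The folded value shows up in co_consts (CPython-specific detail).
print(86400 in seconds_per_day.__code__.co_consts)  # True
```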
URL: From python at mrabarnett.plus.com Thu May 30 13:10:33 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 30 May 2013 12:10:33 +0100 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A6AB76.5090706@pearwood.info> References: <51A610C9.4040405@nedbatchelder.com> <51A6980D.90909@pearwood.info> <51A6A2C0.5030004@case.edu> <51A6AB76.5090706@pearwood.info> Message-ID: <51A733A9.8010607@mrabarnett.plus.com> On 30/05/2013 02:29, Steven D'Aprano wrote: > On 30/05/13 10:52, Matthew Ruffalo wrote: >> On 05/29/2013 08:06 PM, Steven D'Aprano wrote: >>> On 30/05/13 06:17, James K wrote: >>>> It should work like this >>>> >>>> >>> from collections import Counter >>>> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >>>> Counter({'b': 4, 'a': 2}) >>> >>> Under what circumstances would you do this? >>> >>> What is it about Counters that they should support multiplication when no other mapping type does? >> >> Counters are different from other mapping types because they provide a natural Python stdlib implementation of multisets -- the docs explicitly state that "The Counter class is similar to bags or multisets in other languages.". The class already has behavior that is different from other mapping types: Counter.__init__ can also take an iterable of hashable objects instead of another mapping, and Counter.update adds counts instead of replacing them. > > > None of this answers my question. Under what circumstances would you multiply a counter by a scalar (let alone by another counter)? The fact that counters differ in some ways from other mappings doesn't justify every arbitrary change proposed. > Well, you can add Counters together: >>> from collections import Counter >>> Counter({'a': 1, 'b': 2}) + Counter({'a': 1, 'b': 2}) Counter({'b': 4, 'a': 2}) so multiplying by a non-negative scalar would have the same behaviour. 
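That equivalence is easy to check with today's stdlib; `scale` below is only a sketch of what the proposed `counter * n` would do, not an existing Counter method:

```python
from collections import Counter

# Sketch: scalar multiplication as repeated addition. scale() is
# illustrative only; Counter has no such operation today.
def scale(counter, n):
    return Counter({k: v * n for k, v in counter.items()})

c = Counter({'a': 1, 'b': 2})
print(scale(c, 2))           # Counter({'b': 4, 'a': 2})
print(scale(c, 2) == c + c)  # True
```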
However, subtracting Counters won't return a negative value: >>> Counter({'a': 1, 'b': 2}) - Counter({'a': 1, 'b': 3}) Counter() and NOT: Counter({'b': -1}) From steve at pearwood.info Thu May 30 16:50:01 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 31 May 2013 00:50:01 +1000 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A733A9.8010607@mrabarnett.plus.com> References: <51A610C9.4040405@nedbatchelder.com> <51A6980D.90909@pearwood.info> <51A6A2C0.5030004@case.edu> <51A6AB76.5090706@pearwood.info> <51A733A9.8010607@mrabarnett.plus.com> Message-ID: <51A76719.3@pearwood.info> On 30/05/13 21:10, MRAB wrote: > On 30/05/2013 02:29, Steven D'Aprano wrote: >> On 30/05/13 10:52, Matthew Ruffalo wrote: >>> On 05/29/2013 08:06 PM, Steven D'Aprano wrote: >>>> On 30/05/13 06:17, James K wrote: >>>>> It should work like this >>>>> >>>>> >>> from collections import Counter >>>>> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >>>>> Counter({'b': 4, 'a': 2}) >>>> >>>> Under what circumstances would you do this? >>>> >>>> What is it about Counters that they should support multiplication when no other mapping type does? >>> >>> Counters are different from other mapping types because they provide a natural Python stdlib implementation of multisets -- the docs explicitly state that "The Counter class is similar to bags or multisets in other languages.". The class already has behavior that is different from other mapping types: Counter.__init__ can also take an iterable of hashable objects instead of another mapping, and Counter.update adds counts instead of replacing them. >> >> >> None of this answers my question. Under what circumstances would you multiply a counter by a scalar (let alone by another counter)? The fact that counters differ in some ways from other mappings doesn't justify every arbitrary change proposed. 
>> > Well, you can add Counters together: > >>>> from collections import Counter >>>> Counter({'a': 1, 'b': 2}) + Counter({'a': 1, 'b': 2}) > Counter({'b': 4, 'a': 2}) > > so multiplying by a non-negative scalar would have the same behaviour. With respect MRAB, you are also not answering my question, which is about *why* somebody might wish to multiply a Counter, not what it would do. I already know that multiplication by an integer is equivalent to repeated addition. What I'm asking for is a use-case, for why one might find this functionality useful, not what the functionality is. For example, the use-case for Counter addition might be: "I have used a Counter to count items from one data set, and another Counter to count items from a different data set. Now I want to count items from both data sets together. Rather than concatenate the two data sets (which may not even be possible, if they were from iterators) and re-count, I can simply add the two Counters." That's a great use-case, and it justifies supporting Counter + Counter. The use-case for Counter multiplication escapes me, unless it is this: "I have a Counter. I want to multiply all the counts by a constant, just because I can." To the Original Poster, James K: If the only justification for this proposal is that you think it is logical that Counters should support this behaviour, then I am against it. Adding this behaviour requires more code, more tests, more documentation, more things for users to learn. But if you have a good use-case showing some circumstances where a programmer would find it helpful to have this functionality, then please tell us.
-- Steven From python at mrabarnett.plus.com Thu May 30 17:18:25 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 30 May 2013 16:18:25 +0100 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A76719.3@pearwood.info> References: <51A610C9.4040405@nedbatchelder.com> <51A6980D.90909@pearwood.info> <51A6A2C0.5030004@case.edu> <51A6AB76.5090706@pearwood.info> <51A733A9.8010607@mrabarnett.plus.com> <51A76719.3@pearwood.info> Message-ID: <51A76DC1.4050006@mrabarnett.plus.com> On 30/05/2013 15:50, Steven D'Aprano wrote: > On 30/05/13 21:10, MRAB wrote: >> On 30/05/2013 02:29, Steven D'Aprano wrote: >>> On 30/05/13 10:52, Matthew Ruffalo wrote: >>>> On 05/29/2013 08:06 PM, Steven D'Aprano wrote: >>>>> On 30/05/13 06:17, James K wrote: >>>>>> It should work like this >>>>>> >>>>>> >>> from collections import Counter >>>>>> >>> Counter({'a': 1, 'b': 2}) * 2 # scalar >>>>>> Counter({'b': 4, 'a': 2}) >>>>> >>>>> Under what circumstances would you do this? >>>>> >>>>> What is it about Counters that they should support multiplication when no other mapping type does? >>>> >>>> Counters are different from other mapping types because they provide a natural Python stdlib implementation of multisets -- the docs explicitly state that "The Counter class is similar to bags or multisets in other languages.". The class already has behavior that is different from other mapping types: Counter.__init__ can also take an iterable of hashable objects instead of another mapping, and Counter.update adds counts instead of replacing them. >>> >>> >>> None of this answers my question. Under what circumstances would you multiply a counter by a scalar (let alone by another counter)? The fact that counters differ in some ways from other mappings doesn't justify every arbitrary change proposed. 
>>> >> Well, you can add Counters together: >> >>>>> from collections import Counter >>>>> Counter({'a': 1, 'b': 2}) + Counter({'a': 1, 'b': 2}) >> Counter({'b': 4, 'a': 2}) >> >> so multiplying by a non-negative scalar would have the same behaviour. > > With respect MRAB, you are also not answering my question, which is about *why* somebody might wish to multiply a Counter, not what it would do. I already know that multiplication by an integer is equivalent to repeated addition. What I'm asking for is a use-case, for why one might find this functionality useful, not on what the functionality is. > > For example, the use-case for Counter addition might be: > > "I have used a Counter to count items from one data set, and another Counter to count items from a different data set. Now I want to count items from both data sets together. Rather than concatenate the two data sets (which may not even be possible, if they were from iterators) and re-count, I can simply add the two Counters." > > That's a great use-case, and it justifies supporting Counter + Counter. > > The use-case for Counter multiplication escapes me, unless it is this: > > "I have a Counter. I want to multiply all the counts by a constant, just because I can." > "I have a batch of identical collections of items. How many do I have of each item?" > > To the Original Poster, James K: > > If the only justification for this proposal is that you think it is logical that Counters should support this behaviour, then I am against it. Adding this behaviour requires more code, more tests, more documentation, more things for users to learn. But if you have a good use-case showing some circumstances where a programmer would find it helpful to have this functionality, then please tell us. 
> From rosuav at gmail.com Thu May 30 17:20:27 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 31 May 2013 01:20:27 +1000 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: <51A76DC1.4050006@mrabarnett.plus.com> References: <51A610C9.4040405@nedbatchelder.com> <51A6980D.90909@pearwood.info> <51A6A2C0.5030004@case.edu> <51A6AB76.5090706@pearwood.info> <51A733A9.8010607@mrabarnett.plus.com> <51A76719.3@pearwood.info> <51A76DC1.4050006@mrabarnett.plus.com> Message-ID: On Fri, May 31, 2013 at 1:18 AM, MRAB wrote: > "I have a batch of identical collections of items. How many do I have > of each item?" Perfect use for dictionary comprehension. I don't think this sort of thing is common enough for builtin functionality... ChrisA From python at mrabarnett.plus.com Thu May 30 17:52:56 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 30 May 2013 16:52:56 +0100 Subject: [Python-ideas] collections.Counter multiplication In-Reply-To: References: <51A610C9.4040405@nedbatchelder.com> <51A6980D.90909@pearwood.info> <51A6A2C0.5030004@case.edu> <51A6AB76.5090706@pearwood.info> <51A733A9.8010607@mrabarnett.plus.com> <51A76719.3@pearwood.info> <51A76DC1.4050006@mrabarnett.plus.com> Message-ID: <51A775D8.8030107@mrabarnett.plus.com> On 30/05/2013 16:20, Chris Angelico wrote: > On Fri, May 31, 2013 at 1:18 AM, MRAB wrote: >> "I have a batch of identical collections of items. How many do I have >> of each item?" > > Perfect use for dictionary comprehension. I don't think this sort of > thing is common enough for builtin functionality... > ...and I haven't needed such functionality thus far, so I'll leave it at that. 
:-) From abarnert at yahoo.com Thu May 30 18:37:09 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 30 May 2013 09:37:09 -0700 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: <0E9EFAB7-3A92-4E6D-9403-5B2EE2F6F987@yahoo.com> On May 30, 2013, at 3:42, Haoyi Li wrote: > > But still, it's amazing how C++11-ish this discussion is getting. Which may be a good hint that (as you suggest) this feature isn't a good fit for Python. > > I don't agree with this; while C++ is huge and terrible, C++11 actually has some pretty good stuff (e.g. real lambdas, with real closures and real lexical scoping! u"unicode" and r"raw" strings!). Dismissing something as bad just because it's something C++11 has is a terrible idea and immediately shuts out a whole range of interesting possibilities. C++11 is a set of very clever, and often very nice, improvements, but nearly all of the improvements are solutions to problems unique to C++90. And if you step back, the problem most people are trying to solve here is that they don't want to do string processing inline within a function because they're worried about performance. (Of course the OP wasn't even worried about that; he just wanted to save a few keystrokes making decimals. But most of the follow ups haven't been about that.) I'd guess that 90% of the time those performance worries are misguided, and the remaining 10% can already be solved by moving the strings to module level, caching the results, etc. So, we're looking for syntactic sugar for a way to accomplish one of those solutions more nicely. That's a very different problem than making strings (and other types) accessible to a compile-time metalanguage that's radically different from the runtime language, which is what C++ was solving. (PS, you misattributed most of your quotes to me instead of Eric Snow.
But I'm sure he'll still see them.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Thu May 30 22:00:35 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 30 May 2013 14:00:35 -0600 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: On Thu, May 30, 2013 at 4:42 AM, Haoyi Li wrote: >> The language needs to fit in our brains and you could argue > that we've already pushed past that threshold > > I would argue that "prefix just calls the registered function at import > time" fits with my brain better then "20 special cases in the lexer which > result in magic things happening", but maybe that's just me. The existing prefixes are not going away, but they are easy to look up in the docs. So that brain-space is not going away. Additionally, under this proposal those prefixes could change meaning at run-time. Furthermore, you could encounter prefixes that are not in the docs and you would have to know how to figure out what they are supposed to do. Right now there is zero ambiguity about what the prefixes do. So this proposal would simply be adding one more thing that has to fit inside your brain without adding any power to the language. On top of that I'm not seeing why this would be very widely used, meaning people would probably not recognize what's going on at first. This is rough on the non-experts, and adds more complexity to the language where we already have a perfectly good way of doing the same thing (explicit function calls). > >> One concern I'd have > is in how you would look up what some arbitrary prefix is supposed to > do > > In theory PyCharm/Vim/Emacs would be able to just jump to the definition of > the prefix, instantly giving you the docstring along with the source code of > what it does, so I don't think this is a real problem.
Tool support will > catch up, as always, and you could always grep "@str_prefix\ndef blah" or > something similar. So you would have to use some editor with up-to-date support and do it at run-time. No web viewers. No distro-installed editors. This has a particular smell to it. > >> Furthermore, it sounds like the intent is to apply the prefix at > run-time. > > Is there any other time in Python? There's only import-time and > execution-time to choose from. 1. compile-time (during import with no up-to-date .pyc file) 2. run-time 2a. import time (execution of the module) 2b. call-time (execution of a function body) .pyc files allow us to skip this part after the first time. Currently string prefixes are handled at compile time. This proposal moves it to run-time. Furthermore, only the raw string literals would still actually be literals and able to be interned/compiled away/stored in a function's constants. But now there would be extra byte codes used in the compiled code to process the raw literals. All for the sake of hypothetical use cases that can already be handled by explicit function calls. > >> That means that the function that gets applied depends on > what's in some registry at a given moment. So source with these > custom string prefixes will be even more ambiguous when read. > > I don't think the second line follows from the first. You could say the same > of PEP302 import hooks, which overload a fundamental operation (import xxx) > to do whatever the hell you want, depending on what's in your sys.meta_path > at any given moment. Heck, you could say the same of any function call > my_func(...), that it depends on what my_func has been defined as at a given > moment! Remember functions can be (and often are!) rebound multiple times, > but people seem to get by just fine. Consider that the import system is a giant black box for most people and even gives the experts fits. Comparing this proposal to it really says something.
Furthermore, existing dynamic-language-induced challenges do not justify adding more. > >> In that case, why not just call the > function at the module scope, bind the result to a name there, and use > that name wherever you like. It's effectively a constant, right? > > The main reason I'd give is that moving a whole bunch of things into module > scope spaghettifies your program. If I'm making a bunch of SQL queries in a > bunch of functions, I expect the SQL to be in the functions where I'm using > them (nobody else cares about them, right?) rather than at module scope. > Moving stuff to module scope for performance adds another layer of > unnecessary indirection when i look at my_function and wonder what > my_function_sql_query and my_function_xml_template is meant to do, for every > single function. Then put them in a class with the related methods. However, my whole moving-things-around recommendation is orthogonal to this proposal. Handling a prefix during a function call would be equivalent to (or even less performant than) explicitly calling the handler. Bottom line: I'm still not seeing the utility of this feature over plain function calls. -eric p.s. Do you think this feature would ever be used in the standard library if it made it into the language? From haoyi.sg at gmail.com Thu May 30 22:22:13 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 30 May 2013 16:22:13 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: > So you would have to use some editor with up-to-date support and do it at run-time. No web viewers. No distro-installed editors. This has a particularly smell to it. Grep is hardly an editor with up-to-date support! It'd just be like looking up a function you don't know. Either it's in the docs, or if not, you start grepping. Literally, it will be exactly like looking up a function!
It's not like grepping for "@str_prefix\ndef regex" is any harder than grepping for "def regex"! > Then put them in a class with the related methods. However, my whole moving-things-around recommendation is orthogonal to this proposal. Handling a prefix during a function call would be equivalent (or even less performant) to explicitly calling the handler. The idea was to handle the prefix at either import or compile time (I'm not actually sure of the distinction, although I'm sure there is one), so it would be fully inlined by the time the code starts executing (over and over). I don't think "big class full of related static methods only called once" is a very good way of organizing code, and the reason people do it (and they do do it!) for use cases like this is because they have to, not because they like to. > p.s. Do you think this feature would ever be used in the standard library if it made it into the language? Maybe? I could imagine the regex module using it right away for a very nice syntax for: regex"".match(...) While simultaneously getting rid of the behaviourally unspecified global-compiled-regex-cache (what does "a few regular expressions at a time" mean anyway?) in favor of per-regex-literal interning of compiled regexes, which is what the global-compiled-regex-cache is trying to approximate anyway. The sqlite3 (or sqlite4 or sqlite10) module could eventually use this to pre-parse SQL literal queries, as well as the xml libraries for pre-parsing XML blobstrings. If any of the parser-combinator libraries like Parsimonious end up in the std lib, they could definitely use it too. Anyway, I don't expect everyone to get up and agree with me all of a sudden, or to have it used in the std lib tomorrow, so it's OK if you disagree with my judgement or your experience tells you it's a bad idea.
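The caching half of that regex example can be approximated today by hoisting the compile to module level; the regex"" literal syntax itself is hypothetical:

```python
import re

# What a hypothetical regex"" prefix would make implicit: compile the
# pattern literal once, rather than on every call. re's functions keep an
# internal cache too, but an explicit compile pins the cost down.
WORD = re.compile(r"\w+")

def words(text):
    return WORD.findall(text)

print(words("hello, world"))  # ['hello', 'world']
```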
I just thought it's worth a substantive debate because I think it's a neat idea that's started appearing in other places, and the OP brought it up, so there's some indication it's not just me being crazy. I've made all the points I want to make, so I should probably stop talking now. -Haoyi On Thu, May 30, 2013 at 4:00 PM, Eric Snow wrote: > On Thu, May 30, 2013 at 4:42 AM, Haoyi Li wrote: > >> The language needs to fit in our brains and you could argue > > that we've already pushed past that threshold > > > > I would argue that "prefix just calls the registered function at import > > time" fits with my brain better then "20 special cases in the lexer which > > result in magic things happening", but maybe that's just me. > > The existing prefixes are not going away, but they are easy to look up > in the docs. So that brain-space is not going way. Additionally, > under this proposal those prefixes could change meaning at run-time. > Furthermore, you could encounter prefixes that are not in the docs and > you would have to know how to figure out what they are supposed to do. > Right now there is zero ambiguity about what the prefixes do. > > So this proposal would simply be adding one more thing that has to fit > inside your brain without adding any power to the language. On top of > that I'm not seeing why this would be very widely used meaning people > would probably not recognize what's going on at first. This is rough > on the non-experts, and adds more complexity to the language where we > already have a perfectly good way of doing the same thing (explicit > function calls). > > > > >> One concern I'd have > > is in how you would look up what some arbitrary prefix is supposed to > > do > > > > In theory PyCharm/Vim/Emacs would be able to just jump to the definition > of > > the prefix, instantly giving you the docstring along with the source > code of > > what it does, so I don't think this is a real problem. 
Tool support will > > catch up, as always, and you could always grep "@str_prefix\ndef blah" or > > something similar. > > So you would have to use some editor with up-to-date support and do it > at run-time. No web viewers. No distro-installed editors. This has > a particularly smell to it. > > > > >> Furthermore, it sounds like the intent is to apply the prefix at > > run-time. > > > > Is there any other time in Python? There's only import-time and > > execution-time to choose from. > > 1. compile-time (during import with no up-to-date .pyc file) > 2. run-time > 2a. import time (execution of the module) > 2b. call-time (execution of a function body) > > .pyc files allow us to skip this part after the first time. Currently > string prefixes are handled at compile time. This proposal moves it > to run-time. Furthermore, only the raw string literals would still > actually be literals and able to be interned/compiled away/stored in a > functions constants. But now there would be extra byte codes used in > the compiled code to process the raw literals. All for the sake of > hypothetical use cases that can already be handled by explicit > function calls. > > > > >> That means that the function that gets applied depends on > > what's in some registry at a given moment. So source with these > > custom string prefixes will be even more ambiguous when read. > > > > I don't think the second line follows from the first. You could say the > same > > of PEP302 import hooks, which overload a fundamental operation (import > xxx) > > to do whatever the hell you want, depending on what's in your > sys.meta_path > > at any given moment. Heck, you could say the same of any function call > > my_func(...), that it depends on what my_func has been defined as at a > given > > moment! Remember functions can be (and often are!) rebound multiple > times, > > but people seem to get by just fine. 
> > Consider that the import system is a giant black box for most people > and even gives the experts fits. Comparing this proposal to it really > says something. Furthermore, existing dynamic-language-induced > challenges do not justify adding more. > > > > >> In that case, why not just call the > > function at the module scope, bind the result to a name there, and use > > that name wherever you like. It's effectively a constant, right? > > > > The main reason I'd give is that moving a whole bunch of things into > module > > scope spaghettifies your program. If I'm making a bunch of SQL queries > in a > > bunch of functions, I expect the SQL to be in the functions where I'm > using > > them (nobody else cares about them, right?) rather than at module scope. > > Moving stuff to module scope for performance adds another layer of > > unnecessary indirection when i look at my_function and wonder what > > my_function_sql_query and my_function_xml_template is meant to do, for > every > > single function. > > Then put them in a class with the related methods. However, my whole > moving-things-around recommendation is orthogonal to this proposal. > Handling a prefix during a function call would be equivalent (or even > less performant) to explicitly calling the handler. > > Bottom line: I'm still not seeing the utility of this feature over > plain function calls. > > -eric > > p.s. Do you think this feature would ever be used in the standard > library if it made it into the language? > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From goktug.kayaalp at gmail.com Thu May 30 22:15:39 2013 From: goktug.kayaalp at gmail.com (Göktuğ Kayaalp) Date: Thu, 30 May 2013 23:15:39 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <0E9EFAB7-3A92-4E6D-9403-5B2EE2F6F987@yahoo.com> (Andrew Barnert's message of "Thu, 30 May 2013 09:37:09 -0700") References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> <0E9EFAB7-3A92-4E6D-9403-5B2EE2F6F987@yahoo.com> Message-ID: <87bo7sw7f8.fsf@gmail.com> > Of course the OP wasn't even worried about that; he just wanted to > save a few keystrokes making decimals. I haven't used the decimal library for anything other than the example I've given in the first post of this thread. What I wanted was a more convenient way to express transformations on string literals that are usually expressed via functions and classes and what I proposed was a method to achieve this utilizing syntactic sugar and runtime evaluation of string literals. Reducing my *worries* to an *effort of mutating the language to save some keystrokes* is rude. Göktuğ. Andrew Barnert writes: > On May 30, 2013, at 3:42, Haoyi Li wrote: > > > > But still, it's amazing how C++11-ish this discussion is getting. Which > may be a good hint that (as you suggest) this feature isn't a good fit for > Python. > > > I don't agree with this; while C++ is huge and terrible, C++11 actually > has some pretty good stuff (e.g. real lambdas, with real closures and real > lexical scoping! u"unicode" and r"raw" strings!). Dismissing something as > bad just because it's something C++11 has is a terrible idea and > immediately shuts out a whole range of interesting possibilities. > > > C++11 is a set of very clever, and often very nice, improvements, but nearly > all of the improvements are solutions to problems unique to C++90.
> > And if you step back, the problem most people are trying to solve here is that > they don't want to do string processing inline within a function because they're > worried about performance. (Of course the OP wasn't even worried about that; > he just wanted to save a few keystrokes making decimals. But most of the > follow ups haven't been about that.) I'd guess that 90% of the time those > performance worries are misguided, and the remaining 10% can already be solved > by moving the strings to module level, caching the results, etc. So, we're > looking for syntactic sugar for a way to accomplish one of those solutions > more nicely. That's a very different problem than making strings (and other > types) accessible to a compile-time metalanguage that's radically different > from the runtime language, which is what C++ was solving. > > (PS, you misattributed most of your quotes to me instead of Eric Snow. But I'm > sure he'll still see then.) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Göktuğ Kayaalp From mertz at gnosis.cx Thu May 30 22:51:28 2013 From: mertz at gnosis.cx (David Mertz) Date: Thu, 30 May 2013 13:51:28 -0700 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87bo7sw7f8.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> <0E9EFAB7-3A92-4E6D-9403-5B2EE2F6F987@yahoo.com> <87bo7sw7f8.fsf@gmail.com> Message-ID: I am strongly opposed to any such syntax sugar. In all the posts in this thread, I have not seen anything that would do anything other than save 4 keystrokes. The things that express transformations on string literals are known as *functions* already. And while functions need a set of parentheses and perhaps an 'r' to make sure raw strings are passed in, this is exactly the same thing as the "registry of evaluated string types".
I.e. if you want:

    mydecimal = D"123.45"

Just settle for the perfectly good (and far less confusing and error-prone):

    mydecimal = D(r"123.45")

And likewise, this is fine:

    sql = S(r"SELECT * FROM foo WHERE field='value'")
    funny_string_type = f(r"Gets interpreted [in some] ")

On Thu, May 30, 2013 at 1:15 PM, Göktuğ Kayaalp wrote: > > Of course the OP wasn't even worried about that; he just wanted to > > save a few keystrokes making decimals. > > I haven't used the decimal library for anything other than the example > I've given in the first post of this thread. What I wanted was a more > convenient way to express transformations on string literals that are > usually expressed via functions and classes and what I proposed was a > method to achieve this utilizing syntactic sugar and runtime evaluation > of string literals. Reducing my *worries* to an *effort of mutating the > language to save some keystrokes* is rude. > > Göktuğ. > Andrew Barnert writes: > > > On May 30, 2013, at 3:42, Haoyi Li wrote: > > > > > > > But still, it's amazing how C++11-ish this discussion is getting. > Which > > may be a good hint that (as you suggest) this feature isn't a good > fit for > > Python. > > > > > > I don't agree with this; while C++ is huge and terrible, C++11 > actually > > has some pretty good stuff (e.g. real lambdas, with real closures > and real > > lexical scoping! u"unicode" and r"raw" strings!). Dismissing > something as > > bad just because it's something C++11 has is a terrible idea and > > immediately shuts out a whole range of interesting possibilities. > > > > > > C++11 is a set of very clever, and often very nice, improvements, but > nearly > > all of the improvements are solutions to problems unique to C++90. 
(Of course the OP wasn't even worried about > that; > > he just wanted to save a few keystrokes making decimals. But most of the > > follow ups haven't been about that.) I'd guess that 90% of the time those > > performance worries are misguided, and the remaining 10% can already be > solved > > by moving the strings to module level, caching the results, etc. So, > we're > > looking for syntactic sugar for a way to accomplish one of those > solutions > > more nicely. That's a very different problem than making strings (and > other > > types) accessible to a compile-time metalanguage that's radically > different > > from the runtime language, which is what C++ was solving. > > > > (PS, you misattributed most of your quotes to me instead of Eric Snow. > But I'm > > sure he'll still see then.) > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > -- > G?ktu? Kayaalp > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... 
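Mertz's function-call alternative above is easy to make concrete; a minimal sketch using the stdlib decimal module (the one-letter names D and S mirror his examples and are otherwise arbitrary):

```python
from decimal import Decimal

# A one-letter constructor alias does everything a D"..." prefix would,
# at the cost of two parentheses per call site.
def D(s):
    return Decimal(s)

def S(s):
    # Stand-in for a "tagged SQL string" type; here it just wraps str.
    return str(s)

mydecimal = D(r"123.45")
assert mydecimal == Decimal("123.45")
assert S(r"SELECT * FROM foo") == "SELECT * FROM foo"
```

This is the whole of the "registry of evaluated string types" in today's Python: a namespace of callables, scoped and overridable like any other names.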
URL: From abarnert at yahoo.com Thu May 30 23:03:47 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 30 May 2013 14:03:47 -0700 Subject: [Python-ideas] Custom string prefixes In-Reply-To: <87bo7sw7f8.fsf@gmail.com> References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> <0E9EFAB7-3A92-4E6D-9403-5B2EE2F6F987@yahoo.com> <87bo7sw7f8.fsf@gmail.com> Message-ID: On May 30, 2013, at 13:15, goktug.kayaalp at gmail.com (G?ktu? Kayaalp) wrote: >> Of course the OP wasn't even worried about that; he just wanted to >> save a few keystrokes making decimals. > > I haven't used the decimal library for anything other than the example > I've given in the first post of this thread. What I wanted was a more > convenient way to express transformations on string literals that are > usually expressed via functions and classes and what I proposed was a > method to achieve this utilizing syntactic sugar and runtime evaluation > of string literals. Reducing my *worries* to an *effort of mutating the > language to save some keystrokes* is rude. But that's exactly what it is. Whether its just for Decimal, or for a wide variety of different constructors and function calls, it saves literally two keystrokes for each use (or maybe 3, if your functions need raw strings). I don't see why it's rude to point that out, or how it's not obviously true. That doesn't mean it's necessarily a bad idea. As I said earlier, if there are application domains (numerics, XML processing, django, whatever) where a particular prefix would be of widespread use in making everyone's code more readable within that community, even though those prefixes wouldn't be used outside the community, your proposal might be a good idea. But it's still true that all it does is save a few keystrokes. 
Meanwhile, the further suggestion by Haoyi and others to move things to compile time makes it a very different idea (especially since you explicitly asked for runtime evaluation), with different pros and cons. And that's what I was pointing out. From ericsnowcurrently at gmail.com Thu May 30 23:14:29 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 30 May 2013 15:14:29 -0600 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: On Thu, May 30, 2013 at 2:22 PM, Haoyi Li wrote: > The idea was to handle the prefix at either import or compile time (i'm not > actually sure of the distinction, although i'm sure there is one), so it > would be fully inlined by the time the code starts executed (over and over). Then that is basically the same idea as the one I thought I was extrapolating last night. And I already said there why I think it's unnecessary. -eric From ncoghlan at gmail.com Fri May 31 00:25:25 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 31 May 2013 08:25:25 +1000 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: On 31 May 2013 07:15, "Eric Snow" wrote: > > On Thu, May 30, 2013 at 2:22 PM, Haoyi Li wrote: > > The idea was to handle the prefix at either import or compile time (i'm not > > actually sure of the distinction, although i'm sure there is one), so it > > would be fully inlined by the time the code starts executed (over and over). > > Then that is basically the same idea as the one I thought I was > extrapolating last night. And I already said there why I think it's > unnecessary. Folks, there are several prior discussions on this list regarding AST based metaprogramming. 
There *are* valid use cases for letting third party libraries hook into the compilation system to transform a raw text string into a different kind of object, with the three biggest examples being nice subprocess invocations, inline SQL and implicit string interpolation that only permit literals, thus avoiding most naive string injection vulnerabilities. Security is the main gain here, since many security vulnerabilities arise from developers passing untrusted input to unsafe functions. By providing a syntax that accepts only raw string literals, we could open up a new avenue for more secure API design, as literals are just as trusted as any other piece of source code. This is *not* an easy problem to solve, but framing an initial exploration as finding a way to replace the existing string prefix processing is a good way to ground a proposed solution in practical reality. Cheers, Nick. > > -eric > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Fri May 31 00:29:43 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 30 May 2013 18:29:43 -0400 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: > Then that is basically the same idea as the one I thought I was extrapolating last night. And I already said there why I think it's unnecessary. We've both answered each others points, and neither of us is convinced. Let's just say we disagree and leave it at that =); we don't *need* to convince each other right here right now. 
On Thu, May 30, 2013 at 6:25 PM, Nick Coghlan wrote: > > On 31 May 2013 07:15, "Eric Snow" wrote: > > > > On Thu, May 30, 2013 at 2:22 PM, Haoyi Li wrote: > > > The idea was to handle the prefix at either import or compile time > (i'm not > > > actually sure of the distinction, although i'm sure there is one), so > it > > > would be fully inlined by the time the code starts executed (over and > over). > > > > Then that is basically the same idea as the one I thought I was > > extrapolating last night. And I already said there why I think it's > > unnecessary. > > Folks, there are several prior discussions on this list regarding AST > based metaprogramming. There *are* valid use cases for letting third party > libraries hook into the compilation system to transform a raw text string > into a different kind of object, with the three biggest examples being nice > subprocess invocations, inline SQL and implicit string interpolation that > only permit literals, thus avoiding most naive string injection > vulnerabilities. > > Security is the main gain here, since many security vulnerabilities arise > from developers passing untrusted input to unsafe functions. By providing a > syntax that accepts only raw string literals, we could open up a new avenue > for more secure API design, as literals are just as trusted as any other > piece of source code. > > This is *not* an easy problem to solve, but framing an initial exploration > as finding a way to replace the existing string prefix processing is a good > way to ground a proposed solution in practical reality. > > Cheers, > Nick. > > > > > -eric > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... 
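The string-injection vulnerabilities Nick describes above can be illustrated with the stdlib sqlite3 module. This sketch shows the unsafe interpolation pattern that a literals-only API would rule out, next to the parameterized form; it demonstrates the problem being targeted, not the proposed syntax itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

untrusted = "nobody' OR '1'='1"

# Unsafe: the untrusted input becomes part of the SQL source itself,
# so the injected OR clause matches every row.
rows = conn.execute(
    "SELECT name FROM users WHERE name = '%s'" % untrusted).fetchall()
assert rows == [("alice",)]

# Safe: a parameterized query keeps the SQL text fixed at the call site;
# the untrusted input is passed as data and matches nothing.
rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (untrusted,)).fetchall()
assert rows == []
```

An API that only accepted string literals for the query text would make the first pattern a syntax error rather than a latent vulnerability.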
URL: From haoyi.sg at gmail.com Fri May 31 06:27:26 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Fri, 31 May 2013 00:27:26 -0400 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? In-Reply-To: <51A698A9.3020900@pearwood.info> References: <51A69390.8070905@pearwood.info> <51A698A9.3020900@pearwood.info> Message-ID: Anyone else have any thoughts about this? This seems like it would be a pretty straightforward thing to do, and I would be happy to go through the code and submit a patch. The only question is whether we want to do it in the first place; are there any reasons it can't/shouldn't be done that I'm not aware of? On Wed, May 29, 2013 at 8:09 PM, Steven D'Aprano wrote: > On 30/05/13 10:04, Haoyi Li wrote: > >> I don't need to keep the source code, I just need a single integer for >> each >> node. I would then be able to reconstruct the source snippet. >> > > And so you did say. Sorry for the noise. > > > -- > Steven > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri May 31 07:04:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 31 May 2013 15:04:13 +1000 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? In-Reply-To: References: <51A69390.8070905@pearwood.info> <51A698A9.3020900@pearwood.info> Message-ID: On 31 May 2013 14:28, "Haoyi Li" wrote: > > Anyone else have any thoughts about this? This seems like it would be a pretty straightforward thing to do, and I would be happy to go through the code and submit a patch. The only question is whether we want to do it in the first place; are there any reasons it can't/shouldn't be done that I'm not aware of? 
Seems reasonable to me, but would need to see a patch to give a definite yes or no. Cheers, Nick. > > > On Wed, May 29, 2013 at 8:09 PM, Steven D'Aprano wrote: >> >> On 30/05/13 10:04, Haoyi Li wrote: >>> >>> I don't need to keep the source code, I just need a single integer for each >>> node. I would then be able to reconstruct the source snippet. >> >> >> And so you did say. Sorry for the noise. >> >> >> -- >> Steven >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Fri May 31 09:02:17 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 31 May 2013 16:02:17 +0900 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: <87wqqfbpja.fsf@uwakimon.sk.tsukuba.ac.jp> >>>>> Haoyi Li writes: > Maybe? I could imagine the regex module using it right away for a > very nice syntax for: > regex"".match(...) I don't consider that particularly nice compared to the function call it greatly resembles. And there's no way I would copy-paste a to multiple places, so I'd need an appropriately- scoped variable for it anyway -- which might as well be initialized to the compiled regexp. That leaves possibility of regexp compilation at source compile time as the unique benefit. I don't think that's worth new syntax. I think those observations about *this* particular use case probably generalize to similar use cases. 
That leaves the "let's make it as easy as possible to use safe literals in places where variable interpolation is dangerous" rationale, which I think is a lot more plausible. My intuition says that is better addressed in templating languages for user interfaces, though. At least, that's where I regularly stick my nose (and occasionally my neck) into hanging nooses. > While simultaneously getting rid of the behaviourally unspecified > global-compiled-regex-cache (what does "a few regular expressions > at a > time" > mean anyway?) It means it's an optimization that throwaway scripts can take advantage of, leaving the choice to "intern" the regex to the user. > in favor of per-regex-literal interning of compiled regexes, which > is what the global-compiled-regex-cache is trying to approximate I'm not the author, so I'm not authoritative on Python, but in XEmacs we do the same thing because it dramatically speeds up loops where the regexp is inlined. (Emacs Lisp doesn't provide a compiled regexp type. Of course they'd be even faster if we could save the compiled regexp in a variable, but very strict compatibility with Emacs is required.) Somebody once tried attaching the compiled regexp to the strings as properties, but that meant interning the strings so string content storage alone increased by about 20% in XEmacs itself, and the compiled regexps added another 25% of string storage. So XEmacs itself grew by about 5%, and uncollectable strings plus compiled regexp baggage accumulated rapidly.[1] That was unacceptable given a lack of visible performance improvement over the cache. The global regexp cache is a reasonable compromise, that's all: an inexpensive, easily tuned optimization which provides substantial saving in a common case. Footnotes: [1] Granted, this was before we implemented weakrefs, which probably would mitigate this problem. 
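Stephen's two alternatives above, explicit interning via an appropriately scoped variable and the global compiled-regex cache, look like this in Python (note the cache is an implementation detail of the re module, not a documented guarantee):

```python
import re

# Explicit interning: compile once, bind to a scoped name, reuse the
# compiled pattern everywhere.
WORD = re.compile(r"[A-Za-z]+")
assert WORD.findall("a few words") == ["a", "few", "words"]

# Implicit interning: the module-level functions compile on first use and
# then hit re's bounded internal cache on repeated calls with the same
# pattern string -- the "inexpensive, easily tuned optimization" above.
for _ in range(3):
    assert re.findall(r"[A-Za-z]+", "a few words") == ["a", "few", "words"]
```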
From ubershmekel at gmail.com Fri May 31 09:32:23 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Fri, 31 May 2013 10:32:23 +0300 Subject: [Python-ideas] Custom string prefixes In-Reply-To: References: <87obbw4scx.fsf@gmail.com> <51A35C07.4080207@pearwood.info> <0808A36E-A5D1-4DA4-8B14-D8CB0C3D448C@yahoo.com> Message-ID: On Fri, May 31, 2013 at 1:25 AM, Nick Coghlan wrote: > [...] implicit string interpolation that only permit literals, thus > avoiding most naive string injection vulnerabilities. > > Security is the main gain here, since many security vulnerabilities arise > from developers passing untrusted input to unsafe functions. By providing a > syntax that accepts only raw string literals, we could open up a new avenue > for more secure API design, as literals are just as trusted as any other > piece of source code. > > [...] > > > Do you mean compile time string interpolation? Because if it's anything dynamic then it's still unsafe to interpolate raw string literals. Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Fri May 31 11:59:44 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Fri, 31 May 2013 05:59:44 -0400 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? In-Reply-To: References: <51A69390.8070905@pearwood.info> <51A698A9.3020900@pearwood.info> Message-ID: Ok, I'll give it a shot; I'm not familiar with the python codebase or build process, but i'll puzzle it out. Where's the place to go for help related to this sort of thing? python-dev? On Fri, May 31, 2013 at 1:04 AM, Nick Coghlan wrote: > > On 31 May 2013 14:28, "Haoyi Li" wrote: > > > > Anyone else have any thoughts about this? This seems like it would be a > pretty straightforward thing to do, and I would be happy to go through the > code and submit a patch. 
The only question is whether we want to do it in > the first place; are there any reasons it can't/shouldn't be done that I'm > not aware of? > > Seems reasonable to me, but would need to see a patch to give a definite > yes or no. > > Cheers, > Nick. > > > > > > > On Wed, May 29, 2013 at 8:09 PM, Steven D'Aprano > wrote: > >> > >> On 30/05/13 10:04, Haoyi Li wrote: > >>> > >>> I don't need to keep the source code, I just need a single integer for > each > >>> node. I would then be able to reconstruct the source snippet. > >> > >> > >> And so you did say. Sorry for the noise. > >> > >> > >> -- > >> Steven > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri May 31 15:47:15 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 31 May 2013 23:47:15 +1000 Subject: [Python-ideas] Could the ast module's ASTs preserve source_length in addition to lineno and col_offset? In-Reply-To: References: <51A69390.8070905@pearwood.info> <51A698A9.3020900@pearwood.info> Message-ID: On 31 May 2013 20:00, "Haoyi Li" wrote: > > Ok, I'll give it a shot; I'm not familiar with the python codebase or build process, but i'll puzzle it out. Where's the place to go for help related to this sort of thing? python-dev? Check the developer guide at docs. python.org/devguide, and if you have any follow-up questions, sign up to the core-mentorship at python.org list. Cheers, Nick. > > > On Fri, May 31, 2013 at 1:04 AM, Nick Coghlan wrote: >> >> >> On 31 May 2013 14:28, "Haoyi Li" wrote: >> > >> > Anyone else have any thoughts about this? 
This seems like it would be a pretty straightforward thing to do, and I would be happy to go through the code and submit a patch. The only question is whether we want to do it in the first place; are there any reasons it can't/shouldn't be done that I'm not aware of? >> >> Seems reasonable to me, but would need to see a patch to give a definite yes or no. >> >> Cheers, >> Nick. >> >> > >> > >> > On Wed, May 29, 2013 at 8:09 PM, Steven D'Aprano wrote: >> >> >> >> On 30/05/13 10:04, Haoyi Li wrote: >> >>> >> >>> I don't need to keep the source code, I just need a single integer for each >> >>> node. I would then be able to reconstruct the source snippet. >> >> >> >> >> >> And so you did say. Sorry for the noise. >> >> >> >> >> >> -- >> >> Steven >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> http://mail.python.org/mailman/listinfo/python-ideas >> > >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > http://mail.python.org/mailman/listinfo/python-ideas >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Fri May 31 18:35:43 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 31 May 2013 18:35:43 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery Message-ID: Hi, reading PEP 426, I made a connection to a (IMHO) longstanding issue: YAML not being in the stdlib. I?m no big fan of JSON, because it?s so strict and comparatively verbose compared with YAML. I just think YAML is more pythonic, and a better choice for any kind of human-written data format. So i devised 3 ideas: 1. *YAML in the stdlib* The stdlib shouldn?t get more C code; that?s what I?ve gathered. So let?s put a pure-python implementation of YAML into the stdlib. Let?s also strictly define the API and make it secure-by-naming?. 
What I mean is let's use the safe load function that doesn't instantiate user-defined classes (in PyYAML called "safe_load") as default load function "load", and call the unsafe one by a longer, explicit name (e.g. "unsafe_load" or "extended_load" or something). Let's base the parser on generators, since generators are cool, easy to debug, and allow us to emit and test the token stream (other than e.g. the HTML parser we have)

2. *Implementation discovery* People want fast parsing. That's incompatible with a pure python implementation. So let's define (or use, if there is one I'm not aware of) a discovery mechanism that allows implementations of certain APIs to register themselves as such. Let "import yaml" use this mechanism to import a compatible 3rd party implementation in preference to the stdlib one. Let's define a property of the implementation that tells the user which implementation he's using, and a way to select a specific implementation (Although that's probably easily done by just not doing "import yaml", but "import std_yaml" or "import pyyaml2")

3. Allow YAML to be used besides JSON as metadata like in PEP 426. (so including either pymeta.yaml or pymeta.json makes a valid package) I don't propose that we exclusively use YAML, but only because I think that PEP 426 shouldn't be hindered from being implemented ASAP by waiting for a new std-library to be ready.

What do you think? Is there a reason for not including a YAML lib that I didn't cover? Is there a reason JSON is used other than YAML not being in the stdlib? -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Fri May 31 19:02:53 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 31 May 2013 10:02:53 -0700 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: On May 31, 2013 9:46 AM, "Philipp A." 
wrote: > > I'm no big fan of JSON, because it's so strict and comparatively verbose compared with YAML. I just think YAML is more pythonic, and a better choice for any kind of human-written data format. > Considering json values are Python literals and yaml isn't I'd say you have the first part backwards. And as far as human-written data goes strictness helps prevent errors. But it doesn't have to be a competition. If there's value in having a standard yaml parser or value in accepting yaml in specific cases that value should stand by itself. --- Bruce (from my phone) -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Fri May 31 19:18:17 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 31 May 2013 19:18:17 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: json is both a subset of python literals (pyon, if you want) and of yaml, but stricter than both: it doesn't allow typewriter apostrophes (') for strings like python, and doesn't allow unicode, raw strings or byte literals. (yaml afaik knows all of those, albeit with different syntax than python. but it has them, allowing to easily represent more of python's capabilities) yaml reports errors and treats ambiguities as errors. with "strictness", i mean the ways something can be written: both python and yaml allow for synonymous ways to write the same object, whereas JSON only accepts variable whitespace: hardly a source of error! and of course there's value in a stdlib yaml parser: as i said it's much more human friendly, and even some python projects already use it for configuration because of this (i say "even", because of course JSON being in the stdlib is a strong argument to use it in dependency-free projects). 
also YAML is standardized, so having a parser in the stdlib doesn?t mean it?s a bad thing when development stalls, because a parsing library doesn?t need to evolve (was it kenneth reitz who said the stdlib is where projects go to die?) 2013/5/31 Bruce Leban > On May 31, 2013 9:46 AM, "Philipp A." wrote: > > > > I?m no big fan of JSON, because it?s so strict and comparatively verbose > compared with YAML. I just think YAML is more pythonic, and a better choice > for any kind of human-written data format. > > > > Considering json values are Python literals and yaml isn't I'd say you > have the first part backwards. And as far as human-written data goes > strictness helps prevent errors. > > But it doesn't have to be a competition. If there's value in having a > standard yaml parser or value in accepting yaml in specific cases that > value should stand by itself. > > --- Bruce > (from my phone) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Fri May 31 19:49:10 2013 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Fri, 31 May 2013 17:49:10 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?PEP_426=2C=09YAML_in_the_stdlib_and_impl?= =?utf-8?q?ementation_discovery?= References: Message-ID: Philipp A. writes: > Hi, reading PEP 426, I made a connection to a (IMHO) longstanding issue: > YAML not being in the stdlib. There have been security issues with YAML (which bit the Rails community not so long ago) because it allows the construction of arbitrary objects. So it may be that YAML is not the best format for scenarios where tools read YAML from untrusted sources. The PEP defines the metadata format as a Python dictionary - the serialising of metadata to a specific file format seems a secondary consideration. 
It's quite possible that some of the packaging tools that use the new metadata will support different serialisation mechanisms, perhaps including YAML, but ISTM that having YAML in the stdlib is orthogonal to the PEP. Do you have a specific YAML implementation in mind? I thought that the front-runner was PyYAML, but in my initial experiments with PyYAML and packaging metadata, I found bugs in the implementation (which I have reported on the PyYAML tracker) which made me switch to JSON. Regards, Vinay Sajip From solipsis at pitrou.net Fri May 31 20:00:07 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 31 May 2013 20:00:07 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery References: Message-ID: <20130531200007.6328cab2@fsol> Hello, On Fri, 31 May 2013 18:35:43 +0200 "Philipp A." wrote: > > What do you think? > > Is there a reason for not including a YAML lib that i didn?t cover? As for many topics, the #1 reason is that nobody proposed such an inclusion. By *proposing*, we do not mean just emitting the idea as you did (which is of course fine), but drafting a concrete proposal (under the form of a Python Enhancement Proposal), and promising - or finding someone else willing to promise - to handle maintenance and bugfixing of the new stdlib module within our development community (not eternally of course, but a couple of years would be nice so as to iron out most issues). Regards Antoine. From flying-sheep at web.de Fri May 31 20:13:18 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 31 May 2013 20:13:18 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: 2013/5/31 Vinay Sajip > There have been security issues with YAML (which bit the Rails community > not > so long ago) because it allows the construction of arbitrary objects. So it > may be that YAML is not the best format for scenarios where tools read YAML > from untrusted sources. 
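The arbitrary-object construction that bit the Rails community, quoted above, is the same class of problem as unpickling untrusted data; a contrived stdlib illustration (the Evil class exists only for this demo):

```python
import pickle

# Deserializers that can construct arbitrary objects can be made to run
# arbitrary code. This toy class is analogous to what a malicious YAML
# document does against a full (non-safe) loader.
class Evil:
    def __reduce__(self):
        # On unpickling, call print() with this argument.
        return (print, ("code ran during deserialization",))

payload = pickle.dumps(Evil())
result = pickle.loads(payload)  # executes print(...) while "loading"
assert result is None           # loads() returned print's return value
```

This is why PyYAML's safe_load restricts itself to plain tags (strings, numbers, lists, maps) and why a safe-by-default API was part of the proposal.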
> please read my post again: i specifically mention that issue and a possible solution. i?m just a little annoyed that you skipped that paragraph and attack a strawman now. but not too annoyed :) The PEP defines the metadata format as a Python dictionary - the serialising > of metadata to a specific file format seems a secondary consideration. It's > quite possible that some of the packaging tools that use the new metadata > will support different serialisation mechanisms, perhaps including YAML, > but > ISTM that having YAML in the stdlib is orthogonal to the PEP. > but in the future, package metadata won?t be specified in the setup.py anymore, so we need a metadata file (like setup.cfg would have been for distutils2). and we write those per hand. the involved metadata corresponds exactly to the one mentioned here, so what do you think that the format of that metadata file will be? Do you have a specific YAML implementation in mind? I thought that the > front-runner was PyYAML, but in my initial experiments with PyYAML and > packaging metadata, I found bugs in the implementation (which I have > reported on the PyYAML tracker) which made me switch to JSON. > i didn?t think of any, but i don?t think any available one would meet the proposed goals of a secure API (like i said in the paragraph you skipped) and a generator-based implementation/API. Regards, > Vinay Sajip > regards, phil -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Fri May 31 20:18:16 2013 From: masklinn at masklinn.net (Masklinn) Date: Fri, 31 May 2013 20:18:16 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: <9DB60555-0185-4EB7-BD1D-101EAC32A534@masklinn.net> On 2013-05-31, at 19:18 , Philipp A. 
wrote: > json is both subset of python literals (pyon, if you want) and yaml, but > stricter than both: it doesn?t allow typewriter apostrophes (') for strings > like python, and doesn?t allow unicode, raw strings or byte literals. All json strings are unicode literals: JSON strings can embed any unicode character literally aside from the double quote and backslash, and they support \u escapes (identical to Python's). The only major difference is that JSON does not support \U escapes (to escapes can only be used for BMP characters). > (yaml > afaik knows all of those, albeit with different syntax than python. but it > has them, allowing to easily represent more of python?s capabilities) YAML has no rawstrings that I know of. It also has no byte literals, there is a working draft for a binary tag encoded in base64[0]. Its failsafe schema only contains strings (unicode), same as JSON. > yaml reports errors and treats ambiguities as errors. That is not correct, the spec notes: > A YAML processor may recover from syntax errors, possibly by ignoring > certain parts of the input, but it must provide a mechanism for > reporting such errors. YAML implementations are absolutely free to resolve ambiguities on their own and not report any error by default, and the spec's "loading failure points" graph clearly indicates parsing a YAML document may yield a partial representation. > and of course there?s value in a stdlib yaml parser: as i said it?s much > more human friendly, and even some python projects already use it for > configuration because of this (i say ?even?, because of course JSON being > in the stdlib is a string aargument to use it in dependecy-free projects). > also YAML is standardized, so having a parser in the stdlib doesn?t mean > it?s a bad thing when development stalls, because a parsing library doesn?t > need to evolve (was it kenneth reitz who said the stdlib is where projects > go to die?) 
[0] http://yaml.org/type/binary.html From brett at python.org Fri May 31 20:35:48 2013 From: brett at python.org (Brett Cannon) Date: Fri, 31 May 2013 14:35:48 -0400 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: On Fri, May 31, 2013 at 12:35 PM, Philipp A. wrote: > Hi, reading PEP 426, I made a connection to a (IMHO) longstanding issue: > YAML not being in the stdlib. > > I'm no big fan of JSON, because it's so strict and comparatively verbose > compared with YAML. I just think YAML is more pythonic, and a better choice > for any kind of human-written data format. > > So i devised 3 ideas: > > YAML in the stdlib > The stdlib shouldn't get more C code; that's what I've gathered. > So let's put a pure-python implementation of YAML into the stdlib. > Let's also strictly define the API and make it "secure by naming". What i > mean is let's use the safe load function that doesn't instantiate > user-defined classes (in PyYAML called "safe_load") as the default load function > "load", and call the unsafe one by a longer, explicit name (e.g. > "unsafe_load" or "extended_load" or something) > Let's base the parser on generators, since generators are cool, easy to > debug, and allow us to emit and test the token stream (unlike e.g. the > HTML parser we have) So yaml is not going to end up in the stdlib. The format is not used widely enough to warrant being added, nor to justify maintaining a parser for such a complicated format. > Implementation discovery > People want fast parsing. That's incompatible with a pure python > implementation. > So let's define (or use, if there is one I'm not aware of) a discovery > mechanism that allows implementations of certain APIs to register themselves > as such. > Let 'import yaml'
use this mechanism to import a compatible 3rd party > implementation in preference to the stdlib one > Let's define a property of the implementation that tells the user which > implementation he's using, and a way to select a specific implementation > (Although that's probably easily done by just not doing 'import yaml', but > 'import std_yaml' or 'import pyyaml2') The standard practice is to place any accelerated code in something like _yaml and then in yaml.py do a ``from _yaml import *``. > Allow YAML to be used besides JSON as metadata like in PEP 426. (so > including either pymeta.yaml or pymeta.json makes a valid package) > I don't propose that we exclusively use YAML, but only because I think that > PEP 426 shouldn't be hindered from being implemented ASAP by waiting for a > new stdlib module to be ready. But that then creates a possible position where just to read metadata you must have a 3rd-party library installed, and I view that as a non-starter. > > What do you think? While I appreciate what you are suggesting, I don't see it happening. > > Is there a reason for not including a YAML lib that i didn't cover? Yes, see above. > > Is there a reason JSON is used other than YAML not being in the stdlib? It's simpler, it's Python syntax, it's faster to parse. If you don't like json and would rather specify metadata using YAML, I would write a tool that read YAML and then emitted the metadata.json file. That way you get to write your metadata in the format you want but without requiring YAML support in the stdlib. But making YAML a first-class citizen in all of this won't happen as long as YAML is not in the stdlib, and that is not a viable option.
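The "it's Python syntax" shorthand in the exchange above only holds approximately; a quick stdlib-only check shows where JSON and Python literal syntax part ways:

```python
import json

d = {"name": "spam", "enabled": True, "extra": None}

# json.dumps uses double quotes and lowercase singletons...
assert json.dumps(d) == '{"name": "spam", "enabled": true, "extra": null}'

# ...so repr() output is not valid JSON (single quotes, True/None):
try:
    json.loads(repr(d))
    raise AssertionError("unexpectedly parsed")
except json.JSONDecodeError:
    pass

# ...and JSON text is not valid Python either: eval() chokes on true/null.
try:
    eval('{"enabled": true}')
    raise AssertionError("unexpectedly evaluated")
except NameError:
    pass
```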
From guido at python.org Fri May 31 20:43:36 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 31 May 2013 11:43:36 -0700 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: On Fri, May 31, 2013 at 11:35 AM, Brett Cannon wrote: > So yaml is not going to end up in the stdlib. The format is not used > widely enough to warrant being added, nor to justify maintaining a parser for > such a complicated format. Hm. What makes you think it's not being used widely enough? I suppose JSON is more popular, but it's not like YAML is dying. AFAIK there's a 3rd party yaml parser that we could incorporate with its authors' permission -- this would save people from a pretty common dependency (just like we did for JSON). >> Is there a reason JSON is used other than YAML not being in the stdlib? > > It's simpler, it's Python syntax, it's faster to parse. I would warn strongly against the "JSON is Python syntax" meme. While you can usually read JSON with Python's eval(), *writing* it with repr() is a disaster because of JSON's requirement to use double string quotes. And as we know, eval() is unsafe, so the conclusion is that one should always use the json module, and never rely on the fact that it looks like Python (except that it makes the format easy to understand to humans familiar with Python). (I have no opinion on the use of YAML for metadata.) -- --Guido van Rossum (python.org/~guido) From abarnert at yahoo.com Fri May 31 20:43:54 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 31 May 2013 11:43:54 -0700 (PDT) Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: <1370025834.12159.YahooMailNeo@web184706.mail.ne1.yahoo.com> From: Philipp A. Sent: Friday, May 31, 2013 10:18 AM > also YAML is standardized, so having a parser in the stdlib doesn't > mean it's a bad thing when development stalls, because a parsing
> library doesn't need to evolve (was it kenneth reitz who said the > stdlib is where projects go to die?) But YAML is up to 1.2, and I believe most libraries (including PyYAML) only handle 1.1 so far. There are also known bugs in the 1.1 specification (e.g., "." is a valid float literal, but doesn't specify 0.0 or any other valid value) that each library has to work around. There are features of the standard, such as YAML<->XML bindings, that are still in early stages of design. Maybe a YAML-1.1-as-interpreted-by-the-majority-of-the-quasi-reference-implementations library doesn't need to evolve, but a YAML library does. From: Philipp A. Sent: Friday, May 31, 2013 9:35 AM > 1. YAML in the stdlib > The stdlib shouldn't get more C code; that's what I've gathered. > So let's put a pure-python implementation of YAML into the stdlib. Are you suggesting importing PyYAML (in modified form, and without the libyaml-binding "fast" implementation) into the stdlib, or building a new one? If the former, have you talked to Kirill Simonov? If the latter, are you proposing to build it, or just suggesting that it would be nice if somebody did? > Let's base the > parser on generators, since generators are cool, easy to debug, and > allow us to emit and test the token stream (unlike e.g. the HTML > parser we have) Do you mean adding a load_iter() akin to load_all() except that it yields one document at a time, or a SAX-like API instead of a simple load()? Or do you just mean we should write a new implementation from scratch that's, e.g., built around a character stream generator feeding a token generator feeding an event generator feeding a document generator? Since YAML is parseable by simple recursive descent, that isn't impossible, but most implementations make extensive use of peeking, which means you'd need to wrap each generator or add a lookahead stash at each level, which might destroy most of the benefits you're looking for.
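The layered-generator design sketched above can be illustrated on a toy key/value format (this is deliberately not YAML, and the lex/parse names are invented for illustration):

```python
# Toy illustration of a generator-based token stream: each stage is a
# generator, so the token stream can be inspected and unit-tested on its own.
def lex(text):
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, _, value = line.partition(":")
        yield ("KEY", key.strip())
        if value.strip():
            yield ("SCALAR", value.strip())

def parse(tokens):
    # builds a document from the token generator in a single pass
    doc, key = {}, None
    for kind, text in tokens:
        if kind == "KEY":
            key = text
        else:
            doc[key] = text
    return doc

tokens = list(lex("a: 1\nb: 2  # comment\n"))
assert tokens == [("KEY", "a"), ("SCALAR", "1"), ("KEY", "b"), ("SCALAR", "2")]
assert parse(iter(tokens)) == {"a": "1", "b": "2"}
```

A real YAML lexer would need the lookahead stash mentioned above; this sketch only shows why a testable token stream is attractive.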
Also, writing a new implementation from scratch isn't exactly trivial. Look at https://bitbucket.org/xi/pyyaml/src/804623320ab2/lib3/yaml?at=default compared to http://hg.python.org/cpython/file/16fea8b0f8c4/Lib/json. > 2. Implementation discovery > People want fast parsing. That's incompatible with a pure python implementation. > So let's define (or use, if there is one I'm not aware of) a discovery > mechanism that allows implementations of certain APIs to register > themselves as such. This is an interesting idea, but I think it might be better served by first applying it to something that's already finished, instead of to vaporware. For example, the third-party lxml library provides an implementation of the ElementTree API. For some use cases, it's better than the stdlib one. So, a lot of programs start off with this:

    try:
        from lxml import etree as ET
    except ImportError:
        from xml.etree import ElementTree as ET

Your registration mechanism would mean they don't have to do this; they just import from the stdlib, and if lxml is present and registered, it would be loaded instead. > Let 'import yaml' use this mechanism to import a compatible 3rd party implementation > in preference to the stdlib one > Let's define a property of the implementation that tells the user which > implementation he's using, and a way to select a specific implementation > (Although that's probably easily done by just not doing 'import yaml', > but 'import std_yaml' or 'import pyyaml2') There are a few examples of something similar, both in and out of the stdlib. For example: The dbm module basically works like this: you can import dbm.ndbm, or you can just import dbm to get the best available implementation. That isn't done by hooking the import, but rather by providing a handful of wrapper functions that forward to the chosen implementation. Is that reasonable for YAML, or are there too many top-level functions or too much module-level global state or something?
BeautifulSoup uses a different mechanism: you import bs4, but when you construct a BeautifulSoup object, you can optionally specify a parser by name, or leave it unspecified to have it pick the best. I don't think that applies here, as you'll probably be mostly calling top-level functions, not constructing and using parser objects. There's also been some discussion around how tulip/PEP 3156 could allow a "default event loop", which could be either tulip's or provided by some external library like Twisted. What all of these are missing is a way for an unknown third-party implementation to plug itself in as the new best. Of course you can always monkeypatch it at runtime (dbm._names.insert(0, __name__)), but you want to do it at _install_ time, which is a different story. One further issue is that sometimes the system administrator (or end user) might want to affect the default choice for programs running on his machine. For example, lxml is built around libxml2. Mac OS X 10.6, some linux distros, etc. come with old or buggy versions of libxml2. You might want to install lxml anyway and make it the default for BeautifulSoup, but not for ElementTree, or vice-versa. Finally, what happens if you install two modules on your system which both register as implementations of the same module? > 3. Allow YAML to be used besides JSON as metadata like in PEP 426. (so including either > pymeta.yaml or pymeta.json makes a valid package) > I don't propose > that we exclusively use YAML, but only because I think that PEP 426 > shouldn't be hindered from being implemented ASAP by waiting for a new > stdlib module to be ready. Note that JSON is a strict subset of YAML 1.2, and not too far from a subset of 1.1. So, you could propose exclusive YAML, and make sticking within the JSON schema and syntax required for packages compatible with Python 3.3 and earlier, but optional for 3.4+ packages. > What do you think?
> Is there a reason for not including a YAML lib that i didn't cover? > Is there a reason JSON is used other than YAML not being in the stdlib? >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >http://mail.python.org/mailman/listinfo/python-ideas > > > From flying-sheep at web.de Fri May 31 20:57:38 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 31 May 2013 20:57:38 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: 2013/5/31 Brett Cannon > So yaml is not going to end up in the stdlib. The format is not used > widely enough to warrant being added, nor to justify maintaining a parser for > such a complicated format. > [citation needed] it's omnipresent in the ruby community *because* it's nicer than JSON and XML, and *because* the ruby stdlib has a parser (my interpretation, of course, but not an unlikely one, imho). and again, to intercept the "unsafe" argument: naming the unsafe load function "load" creates human error. naming the safe one "load" prevents it. i'm sure of that, too: nobody can honestly say he didn't know that "unsafe_load" is unsafe. > (Although that's probably easily done by just not doing 'import yaml', > but > 'import std_yaml' or 'import pyyaml2') > > The standard practice is to place any accelerated code in something > like _yaml and then in yaml.py do a ``from _yaml import *``. > that's what i said. just that _name implies internal, implementation-specific, rapidly changing code, which doesn't fit my vision of a strict API that '_yaml' and compatible implementations should implement. but maybe an infobox in the stdlib yaml documentation telling the user about it is sufficient. But that then creates a possible position where just to read metadata > you must have a 3rd-party library installed, and I view that as a > non-starter.
> that's exactly why i presented those 3 ideas as one: they work together best (although the implementation discovery isn't mandatory) It's simpler, it's Python syntax, it's faster to parse. > wrong, wrong and irrelevant. it's only "simpler" for certain definitions of "simple". those definitions target compilers, not humans. python targets humans, not compilers. (that's e.g. why it favors readability over speed) also JSON is NOT python syntax, not even a subset: it has true, false and null instead of True, False and None, and also there's a little hack involving escaped newlines which breaks code based on this assumption in awesome ways ;) But making YAML a first-class citizen in all of this won't happen > as long as YAML is not in the stdlib and that is not a viable option. > says you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Fri May 31 21:01:01 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 31 May 2013 12:01:01 -0700 (PDT) Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: <9DB60555-0185-4EB7-BD1D-101EAC32A534@masklinn.net> References: <9DB60555-0185-4EB7-BD1D-101EAC32A534@masklinn.net> Message-ID: <1370026861.9343.YahooMailNeo@web184701.mail.ne1.yahoo.com> > From: Masklinn > Sent: Friday, May 31, 2013 11:18 AM > On 2013-05-31, at 19:18 , Philipp A. wrote: >> (yaml >> afaik knows all of those, albeit with different syntax than python. but it >> has them, allowing more of python's capabilities to be represented easily) > > YAML has no raw strings that I know of. Single-quoted strings are basically raw strings. They're different from raw strings in the same way all YAML strings are different from Python strings (newlines, and doubling the quotes to escape them), but they ignore escape sequences, which is the fundamental property of raw strings. See http://www.yaml.org/spec/1.2/spec.html#id2760844 for an example, and section 7.3.2 for specifics.
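The two scalar styles discussed above, in a small illustrative fragment:

```yaml
# Single-quoted scalars are raw-string-like: escape sequences are kept
# literally, and the only escape is a doubled quote.
raw:    'C:\temp\new'     # backslashes preserved: C:\temp\new
quote:  'it''s fine'      # doubled quote -> it's fine
cooked: "line1\nline2"    # double quotes interpret \n as a newline
```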
> It also has no byte literals; there > is a working draft for a binary tag encoded in base64[0]. Section 10.4 explicitly says that "It is strongly recommended that [interoperable] schemas make as much use as possible of the YAML tag repository at http://yaml.org/type/. This repository provides recommended global tags for increasing the portability of YAML documents between different applications." Even though most of the tags in the repository are officially working drafts, they're effectively standards. > Its failsafe > schema only contains strings (unicode), same as JSON. But the spec doesn't recommend using the failsafe schema for most purposes. Section 10.3 says "[The core schema] is the recommended default schema that a YAML processor should use unless instructed otherwise. It is also strongly recommended that other schemas should be based on it." Section 10.4 then implies that what most applications really should be using is something the spec doesn't quite name or define: the core schema plus all the tags from the repository. Note that all of this means you can't just say "use YAML" to specify anything; you have to say "use this YAML schema". So, if we were to follow the OP's suggestion of using YAML for metadata, it would have to be more specific than that. From masklinn at masklinn.net Fri May 31 21:14:46 2013 From: masklinn at masklinn.net (Masklinn) Date: Fri, 31 May 2013 21:14:46 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: <1370025834.12159.YahooMailNeo@web184706.mail.ne1.yahoo.com> References: <1370025834.12159.YahooMailNeo@web184706.mail.ne1.yahoo.com> Message-ID: <8ED3592D-5613-4412-B792-4CF9D170A722@masklinn.net> On 2013-05-31, at 20:43 , Andrew Barnert wrote: > > For example, the third-party lxml library provides an implementation of the ElementTree API. For some use cases, it's better than the stdlib one.
So, a lot of programs start off with this: > > try: > from lxml import etree as ET > except ImportError: > from xml.etree import ElementTree as ET > > Your registration mechanism would mean they don't have to do this; they just import from the stdlib, and if lxml is present and registered, it would be loaded instead. That seems rife with potential issues and unintended side-effects e.g. while lxml does for the most part provide ET's API, it also extends it and I do not know if it can run ET's testsuite. It also doesn't handle ET's implementation details for obvious reasons[0]. So while a developer who will try lxml and fall back on ET will keep in mind to stay compatible and not use ET implementation details, one who expects ET and gets lxml on a client machine will likely be slightly disappointed in the system. [0] and one of those implementation details can turn out deadly in unskilled hands: by default ElementTree will *not* remember namespace aliases and will rename most things from ns0 onwards:

    >>> ET.tostring(ET.fromstring('<x:root xmlns:x="urn:example"/>'))
    '<ns0:root xmlns:ns0="urn:example" />'

whereas lxml *will* save parsed namespace aliases in its internal namespace map(s?) and reuse them:

    >>> lxml.tostring(lxml.fromstring('<x:root xmlns:x="urn:example"/>'))
    '<x:root xmlns:x="urn:example"/>'

if a developer expects a re-prefixed output and gets lxml's, things are going to blow up. Yeah it's bad to do that, but I've seen enough supposedly-XML-based software which cared less for the namespace itself than for the alias to know that it's way too common. From flying-sheep at web.de Fri May 31 21:23:59 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 31 May 2013 21:23:59 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: <1370025834.12159.YahooMailNeo@web184706.mail.ne1.yahoo.com> References: <1370025834.12159.YahooMailNeo@web184706.mail.ne1.yahoo.com> Message-ID: 2013/5/31 Andrew Barnert > But YAML is up to 1.2, and I believe most libraries (including PyYAML) > only handle 1.1 so far.
There are also known bugs in the 1.1 specification > (e.g., "." is a valid float literal, but doesn't specify 0.0 or any other > valid value) that each library has to work around. There are features of > the standard, such as YAML<->XML bindings, that are still in early stages > of design. Maybe a > YAML-1.1-as-interpreted-by-the-majority-of-the-quasi-reference-implementations > library doesn't need to evolve, but a YAML library does. > afaik YAML 1.2 exists to clarify those mentioned bugs, since they have all been found and people needed a bug-free standard. also, could you mention a bug with a non-obvious solution? i don't think any yaml implementation is going to interpret "." as something other than 0.0 Are you suggesting importing PyYAML (in modified form, and without the > libyaml-binding "fast" implementation) into the stdlib, or building a new > one? If the former, have you talked to Kirill Simonov? If the latter, are > you proposing to build it, or just suggesting that it would be nice if > somebody did? > i don't know: would it aid my argument if i had asked him or written my own? (i've done nothing of the two because unless Guido says "I can see YAML in the stdlib" it would be pointless imho) Do you mean adding a load_iter() akin to load_all() except that it yields > one document at a time, or a SAX-like API instead of a simple load()? > no i meant that the lexer should be a generator (e.g. `[int(token) for token in YAMLLexer(open('myfile.yml')).lex()]` and/or an API accepting incomplete yaml chunks and emitting tokens, like `for token in lexer.feed(stream.read())`) but what you said is also necessary for the format: lexing from a long stream of documents coming in through the network doesn't make sense any other way) Your registration mechanism would mean they don't have to do this; they > just import from the stdlib, and if lxml is present and registered, it > would be loaded instead.
> exactly There are a few examples of something similar, both in and out of the > stdlib. For example: > > The dbm module basically works like this: you can import dbm.ndbm, or you > can just import dbm to get the best available implementation. That isn't > done by hooking the import, but rather by providing a handful of wrapper > functions that forward to the chosen implementation. Is that reasonable for > YAML, or are there too many top-level functions or too much module-level > global state or something? > i think so: as i said, we'd need to define an API. since it's "just" a serialization language, i think we could go with not much more than

- load(fileobj_or_filename, safe=True)  # maybe better than an unsafe_blah for each load-like function
- load_iter(fileobj_or_filename, safe=True)
- loads(fileobj_or_filename, safe=True)
- loads_iter(fileobj_or_filename, safe=True)
- dump()
- dumps()
- YAMLLexer  # with some methods and defined constructors
- YAMLParser  # accepting tokens from the lexer
- YAMLTokens  # one of the new, shiny enums

BeautifulSoup > [...] > tulip > also nice ideas > What all of these are missing is a way for an unknown third-party > implementation to plug itself in as the new best. Of course you can > always monkeypatch it at runtime (dbm._names.insert(0, __name__)), but you > want to do it at _install_ time, which is a different story. > > One further issue is that sometimes the system administrator (or end user) > might want to affect the default choice for programs running on his > machine. For example, lxml is built around libxml2. Mac OS X 10.6, some > linux distros, etc. come with old or buggy versions of libxml2. You might > want to install lxml anyway and make it the default for BeautifulSoup, but > not for ElementTree, or vice-versa. > > Finally, what happens if you install two modules on your system which both > register as implementations of the same module?
> i think we can't allow them to modify some system-global list, since everything would install itself as #1, so it would be pointless. i don't know how to select one, but we should expose a systemwide way to configure the used one (like .pythonrc?), as well as a way to directly use one from python (as said above). then it wouldn't matter much, since the admin is only required to install one, or configure the system to use the preferred one. the important things are imho to make the system discoverable and transparent, exposing the found implementations and the used one as well as we can. Note that JSON is a strict subset of YAML 1.2, and not too far from a > subset of 1.1. So, you could propose exclusive YAML, and make sticking > within the JSON schema and syntax required for packages compatible with > Python 3.3 and earlier, but optional for 3.4+ packages. > yeah. pretty nice. but i don't think a stdlib yaml can land before 3.5. -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Fri May 31 21:27:53 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 31 May 2013 21:27:53 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: <1370026861.9343.YahooMailNeo@web184701.mail.ne1.yahoo.com> References: <9DB60555-0185-4EB7-BD1D-101EAC32A534@masklinn.net> <1370026861.9343.YahooMailNeo@web184701.mail.ne1.yahoo.com> Message-ID: 2013/5/31 Guido van Rossum > On Fri, May 31, 2013 at 11:35 AM, Brett Cannon wrote: > > So yaml is not going to end up in the stdlib. The format is not used > > widely enough to warrant being added, nor to justify maintaining a parser for > > such a complicated format. > > Hm. What makes you think it's not being used widely enough? I suppose > JSON is more popular, but it's not like YAML is dying.
AFAIK there's a > 3rd party yaml parser that we could incorporate with its authors' > permission -- this would save people from a pretty common dependency > (just like we did for JSON). > i think ruby created its own reality here: yaml wasn't popular because it wasn't in the stdlib, and became popular as soon as it was. its advantages as arguably the most human-writable serialization format helped here. at least that's my interpretation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Fri May 31 21:39:54 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 31 May 2013 21:39:54 +0200 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: <8ED3592D-5613-4412-B792-4CF9D170A722@masklinn.net> References: <1370025834.12159.YahooMailNeo@web184706.mail.ne1.yahoo.com> <8ED3592D-5613-4412-B792-4CF9D170A722@masklinn.net> Message-ID: 2013/5/31 Masklinn > On 2013-05-31, at 20:43 , Andrew Barnert wrote: > > try: > > from lxml import etree as ET > > except ImportError: > > from xml.etree import ElementTree as ET > > > > Your registration mechanism would mean they don't have to do this; they > just import from the stdlib, and if lxml is present and registered, it > would be loaded instead. > > That seems rife with potential issues and unintended side-effects e.g. > while lxml does for the most part provide ET's API, it also extends it > and I do not know if it can run ET's testsuite. It also doesn't handle > ET's implementation details for obvious reasons. > and that's where my idea's "strict API" comes into play: compatible implementations would *have to* pass a test suite and implement a certain API and comply with the standard. unsure if and how to test the latter (surely running a testsuite when something wants to register isn't practical -- or is it?) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ethan at stoneleaf.us Fri May 31 21:51:32 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 31 May 2013 12:51:32 -0700 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: <51A8FF44.4010905@stoneleaf.us> On 05/31/2013 11:57 AM, Philipp A. wrote: > 2013/5/31 Brett Cannon wrote: > > The standard practice is to place any accelerated code in something > like _yaml and then in yaml.py do a ``from _yaml import *``. > > that's what i said. just that _name implies internal, > implementation-specific, rapidly changing code, [...] _name implies implementation detail and private. Very little in the stdlib is rapidly changing. -- ~Ethan~ From brett at python.org Fri May 31 21:54:30 2013 From: brett at python.org (Brett Cannon) Date: Fri, 31 May 2013 15:54:30 -0400 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: On Fri, May 31, 2013 at 2:57 PM, Philipp A. wrote: > 2013/5/31 Brett Cannon >> >> So yaml is not going to end up in the stdlib. The format is not used >> widely enough to warrant being added, nor to justify maintaining a parser for >> such a complicated format. > > [citation needed] OK, I claim it isn't as widely used as I think would warrant inclusion; you disagree. Asking for a citation could be thrown in either direction on any point in this discussion, and it comes off as aggressive. > > it's omnipresent in the ruby community *because* it's nicer than JSON and > XML, and *because* the ruby stdlib has a parser (my interpretation, of > course, but not an unlikely one, imho). That's fine, but that's not a reason to add it to Python's stdlib. Adding anything to the stdlib takes careful thought because the burden of maintenance is high for any code that lands there. From my POV a YAML module just isn't there. > and again, to intercept the "unsafe" > argument: naming the unsafe load function "load" creates human error.
naming > the safe one "load" prevents it. i'm sure of that, too: nobody can honestly > say he didn't know that "unsafe_load" is unsafe. > >> > (Although that's probably easily done by just not doing 'import yaml', >> > but >> > 'import std_yaml' or 'import pyyaml2') >> >> The standard practice is to place any accelerated code in something >> like _yaml and then in yaml.py do a ``from _yaml import *``. > > that's what i said. just that _name implies internal, > implementation-specific, rapidly changing code, which doesn't fit my vision of a strict > API that '_yaml' and compatible implementations should implement. but maybe > an infobox in the stdlib yaml documentation telling the user about it is > sufficient. > >> But that then creates a possible position where just to read metadata >> you must have a 3rd-party library installed, and I view that as a >> non-starter. > > that's exactly why i presented those 3 ideas as one: they work together best > (although the implementation discovery isn't mandatory) > >> It's simpler, it's Python syntax, it's faster to parse. > > wrong, wrong and irrelevant. It might be irrelevant to you, but it isn't irrelevant to everyone. Remember, this is for the stdlib, which means its use needs to extend beyond just what packaging wants. > > it's only "simpler" for certain definitions of "simple". those definitions > target compilers, not humans. python targets humans, not compilers. (that's > e.g. why it favors readability over speed) > also JSON is NOT python syntax, not even a subset: it has true, false and > null instead of True, False and None, For the purposes of what is being discussed here it is close enough (the PEP mentions the use of None once). > and also there's a little hack > involving escaped newlines which breaks code based on this assumption in > awesome ways ;) > >> But making YAML a first-class citizen in all of this won't happen >> as long as YAML is not in the stdlib and that is not a viable option. > > says you.
Yes, says me. It's my opinion and I am allowed to express it here. You are beginning to take this personally and become a bit hostile. Please take a moment to step back and realize this is just a discussion, and just because I disagree with it doesn't mean I think that's bad or I think negatively of you; I just disagree with you. From brett at python.org Fri May 31 22:11:10 2013 From: brett at python.org (Brett Cannon) Date: Fri, 31 May 2013 16:11:10 -0400 Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery In-Reply-To: References: Message-ID: On Fri, May 31, 2013 at 2:43 PM, Guido van Rossum wrote: > On Fri, May 31, 2013 at 11:35 AM, Brett Cannon wrote: >> So yaml is not going to end up in the stdlib. The format is not used >> widely enough to warrant being added, nor to justify maintaining a parser for >> such a complicated format. > > Hm. What makes you think it's not being used widely enough? In my purview it isn't. I mean, I know App Engine uses it, but I just don't come across it often enough to think that we should incorporate it. Heck, I think I may have suggested it years back when I came across a need, but at this moment in time I just don't have a feeling it's wide enough to want to maintain the code. > I suppose > JSON is more popular, but it's not like YAML is dying. AFAIK there's a > 3rd party yaml parser that we could incorporate with its authors' > permission -- this would save people from a pretty common dependency > (just like we did for JSON). Sure, but the pure Python version is naively 5,500 lines, just 500 lines less than decimal.py. The json package is 1,100. Sure, it's difficult to get right since the format is hard to parse, which is an argument for its addition, but this would not be a small import of code. And the popularity-to-code-size/complexity ratio just isn't an easy sell to me. Obviously it's just a gut feeling, but I just am not feeling it as worth the hassle. But we really won't know w/o asking python-dev about this.
> >>> Is there a reason JSON is used other than YAML not being in the stdlib?
>>
>> It's simpler, it's Python syntax, it's faster to parse.
>
> I would warn strongly against the "JSON is Python syntax" meme. While
> you can usually read JSON with Python's eval(), *writing* it with
> repr() is a disaster because of JSON's requirement to use double
> string quotes. And as we know, eval() is unsafe, so the conclusion is
> that one should always use the json module, and never rely on the fact
> that it looks like Python (except that it makes the format easy to
> understand to humans familiar with Python).

I'm talking purely from the perspective of writing it by hand, which is what sparked this conversation. There is no new format to really learn like with YAML: write a Python dict using double-quotes for strings, lowercase your singletons, and you're basically there.

-Brett

> (I have no opinion on the use of YAML for metadata.)
>
> --
> --Guido van Rossum (python.org/~guido)

From dholth at gmail.com Fri May 31 22:22:21 2013
From: dholth at gmail.com (Daniel Holth)
Date: Fri, 31 May 2013 16:22:21 -0400
Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery
In-Reply-To: References: <9DB60555-0185-4EB7-BD1D-101EAC32A534@masklinn.net> <1370026861.9343.YahooMailNeo@web184701.mail.ne1.yahoo.com>
Message-ID:

On Fri, May 31, 2013 at 3:27 PM, Philipp A. wrote:
> 2013/5/31 Guido van Rossum
>> On Fri, May 31, 2013 at 11:35 AM, Brett Cannon wrote:
>> > So yaml is not going to end up in the stdlib. The format is not used
>> > widely enough to warrant being added, nor is it worth having to
>> > maintain a parser for such a complicated format.
>>
>> Hm. What makes you think it's not being used widely enough? I suppose
>> JSON is more popular, but it's not like YAML is dying.
AFAIK there's a
>> 3rd party yaml parser that we could incorporate with its authors'
>> permission -- this would save people from a pretty common dependency
>> (just like we did for JSON).

> i think ruby created its own reality here: yaml wasn't popular because it
> wasn't in the stdlib, and became popular as soon as it was. its advantages
> as arguably the most human-writable serialization format helped here. at
> least that's my interpretation.

I kind of like the idea of a stdlib YAML. IIUC the format competes with XML and pickle in interesting ways, avoiding in-band signaling like "keys starting with a $" or whatever that sometimes happens with JSON.

It doesn't make sense as the serialization format for packaging metadata. *.dist-info/pymeta.json, to be parsed at install time to resolve dependency graphs, has no human readability/writability requirement, but must be fast, simple to implement, and available on Python 2.6+.

At least for the 100k+ existing PyPI-hosted sdists, the metadata input format is a Python program called "setup.py". YAML could be a good alternative. Perhaps you could try doing a YAML version of Bento's configuration language?

Right now the thinking in packaging is that there will not be *a* standard setup.py replacement. Instead, we will define hooks so that your build tool can generate the static metadata and build your package, and the install tools will be able to interoperate from there.

From mal at egenix.com Fri May 31 22:30:49 2013
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 31 May 2013 22:30:49 +0200
Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery
In-Reply-To: References: Message-ID: <51A90879.2010507@egenix.com>

On 31.05.2013 20:13, Philipp A. wrote:
> 2013/5/31 Vinay Sajip
>> The PEP defines the metadata format as a Python dictionary - the
>> serialising of metadata to a specific file format seems a secondary
>> consideration.
It's
>> quite possible that some of the packaging tools that use the new metadata
>> will support different serialisation mechanisms, perhaps including YAML,
>> but ISTM that having YAML in the stdlib is orthogonal to the PEP.
>
> but in the future, package metadata won't be specified in the setup.py
> anymore, so we need a metadata file (like setup.cfg would have been for
> distutils2). and we write those by hand. the involved metadata corresponds
> exactly to the one mentioned here, so what do you think the format of
> that metadata file will be?

Just as a data point: PEP 426 explicitly says "It is expected that these metadata files will be generated by build tools based on other input formats (such as setup.py) rather than being edited by hand."

Not sure where you got the idea from that anyone would write the JSON files by hand. The data will be extracted from the things you specify in setup.py at sdist or wheel build time and put into the JSON files.

So that particular use case is not very likely to happen. That's not to say there aren't any use cases, it's just not going to be this one :-).

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 31 2013)
>>> Python Projects, Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2013-07-01: EuroPython 2013, Florence, Italy ... 31 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From masklinn at masklinn.net Fri May 31 22:45:22 2013
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 31 May 2013 22:45:22 +0200
Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery
In-Reply-To: References: <1370025834.12159.YahooMailNeo@web184706.mail.ne1.yahoo.com> <8ED3592D-5613-4412-B792-4CF9D170A722@masklinn.net>
Message-ID:

On 2013-05-31, at 21:39 , Philipp A. wrote:
> 2013/5/31 Masklinn
>> On 2013-05-31, at 20:43 , Andrew Barnert wrote:
>>> try:
>>>     from lxml import etree as ET
>>> except ImportError:
>>>     from xml.etree import ElementTree as ET
>>>
>>> Your registration mechanism would mean they don't have to do this; they
>>> just import from the stdlib, and if lxml is present and registered, it
>>> would be loaded instead.
>>
>> That seems rife with potential issues and unintended side-effects e.g.
>> while lxml does for the most part provide ET's API, it also extends it
>> and I do not know if it can run ET's testsuite. It also doesn't handle
>> ET's implementation details for obvious reasons.
>
> and that's where my idea's "strict API" comes into play: compatible
> implementations would *have to* pass a test suite and implement a certain
> API and comply with the standard.

But that not being sufficient is the issue here: as I tried to point out, when somebody uses ElementTree they may be using more than just the API, they may well be relying on implementation details (e.g. namespace behavior or the _namespace_map). It might be bad, but it still happens. A lot. Hell, I've used _namespace_map in the past because I had to (I wanted to transform a maven script, and some script down the line, maybe maven itself, wanted exactly the `mvn` namespace alias).

This will usually be safe, especially with old packages with low to no evolution. But if you start swapping things with "API-compatible" libraries all bets are off.
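[For concreteness, the registration mechanism being debated here might look roughly like the sketch below. Nothing like this exists in the stdlib; every name is hypothetical, and, as noted above, passing an API test suite would still not protect users who depend on implementation details.]

```python
import types

# Hypothetical registry: third-party packages offer API-compatible
# replacements for a stdlib module, and a facade picks the best one.
_registry = []

def register(module, priority=0):
    """Offer `module` as an API-compatible implementation."""
    _registry.append((priority, module))

def best_implementation(default):
    """Return the highest-priority registered module, else the default."""
    if not _registry:
        return default
    return max(_registry, key=lambda entry: entry[0])[1]

# Dummy stand-ins for a stdlib module and an accelerated third-party one.
stdlib_impl = types.SimpleNamespace(kind="stdlib")
register(types.SimpleNamespace(kind="third-party"), priority=10)
chosen = best_implementation(stdlib_impl)  # the third-party one wins
```

With something like this, the try/except ImportError dance from the quoted lxml example would move out of user code and into the facade, which is exactly where the compatibility risk Masklinn describes would concentrate.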
> unsure if and how to test the latter (surely running a testsuite when
> something wants to register isn't practical - or is it?)

From solipsis at pitrou.net Fri May 31 23:05:21 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 31 May 2013 23:05:21 +0200
Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery
References: Message-ID: <20130531230521.3888f052@fsol>

On Fri, 31 May 2013 16:11:10 -0400 Brett Cannon wrote:
> On Fri, May 31, 2013 at 2:43 PM, Guido van Rossum wrote:
>> On Fri, May 31, 2013 at 11:35 AM, Brett Cannon wrote:
>>> So yaml is not going to end up in the stdlib. The format is not used
>>> widely enough to warrant being added, nor is it worth having to
>>> maintain a parser for such a complicated format.
>>
>> Hm. What makes you think it's not being used widely enough?
>
> In my purview it isn't. I mean I know App Engine uses it but I just
> don't come across it often enough to think that we should incorporate
> it.

YAML is used by both Salt and Ansible, two configuration management engines (*) written in Python with growing popularity.

http://docs.saltstack.com/topics/tutorials/starting_states.html#default-data-yaml
http://ansible.cc/docs/playbooks.html#playbook-language-example

(*) in other words, Chef / Puppet contenders
https://www.ohloh.net/p/compare?project_0=Chef&project_1=salt&project_2=Ansible

>> I suppose
>> JSON is more popular, but it's not like YAML is dying. AFAIK there's a
>> 3rd party yaml parser that we could incorporate with its authors'
>> permission -- this would save people from a pretty common dependency
>> (just like we did for JSON).
>
> Sure, but the pure Python version is roughly 5,500 lines, just 500
> lines less than decimal.py. The json package is 1,100. Sure, it's
> difficult to get right since the format is hard to parse, which is an
> argument for its addition, but this would not be a small import of
> code.
I agree that YAML being on the complex side is a bit of a warning sign for stdlib inclusion.

> I'm talking purely from the perspective of writing it by hand which is
> what sparked this conversation. There is no new format to really learn
> like with YAML: write a Python dict using double-quotes for strings,
> lowercase your singletons, and you're basically there.

But writing JSON by hand isn't really pleasant or convenient. It's ok for small lumps of data. Salt and Ansible don't (seem to) use YAML for anything complicated; they just want the much friendlier user experience.

Regards

Antoine.

From vinay_sajip at yahoo.co.uk Fri May 31 23:40:33 2013
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Fri, 31 May 2013 21:40:33 +0000 (UTC)
Subject: [Python-ideas] PEP 426, YAML in the stdlib and implementation discovery
References: Message-ID:

Philipp A. writes:

> please read my post again: i specifically mention that issue and a
> possible solution. i'm just a little annoyed that you skipped that
> paragraph and attack a strawman now. but not too annoyed :)

I did read it; perhaps I should have been more clear. I didn't say the security issue was a show-stopper, just tagged it as a possible problem area. There are already yaml libraries out in the wild whose load() is the unsafe version, and a user may not necessarily be able to control (or even know) which yaml library is installed (e.g. distro package managers are conservative about adopting recent versions of libs).

> i didn't think of any, but i don't think any available one would meet the
> proposed goals of a secure API (like i said in the paragraph you skipped)

It's chicken and egg. IMO it doesn't make sense to even think about YAML in the stdlib until there is a version outside the stdlib which has a reasonable level of adoption and battle-tested status. This is how JSON support came into the stdlib, for example.
At the moment PyYAML seems to be the most mature, but from what I can see on its Trac, the most recent version (3.10 AFAIK) is still not ready.

Regards,

Vinay Sajip
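[As a closing illustration of Daniel Holth's point earlier in the thread that *.dist-info/pymeta.json only needs fast machine parsing, not hand-editing: the field names below follow the PEP 426 draft (metadata version 2.0), but the file contents themselves are invented for this sketch.]

```python
import json

# Invented contents of a hypothetical *.dist-info/pymeta.json file.
raw = """{
  "metadata_version": "2.0",
  "name": "example-dist",
  "version": "1.0",
  "run_requires": [{"requires": ["requests (>=2.0)"]}]
}"""

# One stdlib call, available since Python 2.6 -- no YAML parser needed
# on the install-time fast path.
meta = json.loads(raw)
requirements = [
    req
    for clause in meta.get("run_requires", [])
    for req in clause.get("requires", [])
]
```

An install tool would read thousands of such files while resolving a dependency graph, which is why speed and ubiquity of the parser matter more there than human friendliness.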